17 research outputs found

    Alternatives to relational databases in precision medicine: comparison of NOSQL approaches for big data storage using supercomputers

    Improvements in medical and genomic technologies have dramatically increased the production of electronic data over the last decade. As a result, data management is rapidly becoming a major determinant of, and an urgent challenge for, the development of Precision Medicine. Although successful data management is achievable using Relational Database Management Systems (RDBMS), exponential data growth is a significant contributor to failure scenarios. Growing amounts of data can also be observed in other sectors, such as economics and business, which, together with the previous facts, suggests that alternative database approaches (NoSQL) may soon be required for efficient storage and management of big databases. However, this hypothesis has been difficult to test in the Precision Medicine field, since alternative database architectures are complex to assess and means to integrate heterogeneous electronic health records (EHR) with dynamic genomic data are not easily available. In this dissertation, we present a novel set of experiments for identifying NoSQL database approaches that enable effective data storage and management in Precision Medicine, using patients' clinical and genomic information from The Cancer Genome Atlas (TCGA). The first experiment assesses performance and scalability using biologically meaningful queries of differing complexity and database sizes. The second experiment measures performance and scalability in database updates without schema changes. The third experiment assesses performance and scalability in database updates with schema modifications due to dynamic data. We have identified two NoSQL approaches, based on Cassandra and Redis, which seem to be the ideal database management systems for our precision medicine queries in terms of performance and scalability. We present NoSQL approaches and show how they can be used to manage clinical and genomic big data.
Our research is relevant to public health, since we focus on one of the main challenges to the development of Precision Medicine and, consequently, investigate a potential solution to the progressively increasing demands on health care.

    Arquitecturas para sistemas de informação baseados em cloud computing

    Master's degree in Computer and Telematics Engineering. This work surveys the current state of cloud computing. It starts by analyzing the draft NIST definition and categorizing several commercial services according to the categories that definition proposes. Next, the free implementations distributed under Open Source licenses are analyzed. The conclusion is that several IaaS cloud implementations already exist, some of them of good quality, but that the PaaS area still needs a lot of work before a free implementation offers functionality comparable to that of the available commercial services. After a brief analysis of the integration of SOA with cloud computing, the conclusion is that PaaS is the most suitable service model for developing SOA applications. Given that no free PaaS exists yet, and that the existing ones present serious vendor lock-in problems, a complete, portable, and open framework is specified that allows a PaaS-type service to be deployed on private infrastructure or on top of an existing IaaS. The specified PaaS is based on existing technology whenever possible, with the exception of structured data storage, which does not yet meet the requirements. The implementation of the modules required to integrate the various PaaS components is left as future work. Wherever possible, however, technologies are suggested that would allow the implementation to remain portable and open.

    Analysis of alternatives to store genealogical trees using Graph Databases

    This master's thesis aims to offer solutions to a current, open problem. It considers different aspects of storing genealogical trees in advanced databases. The first important point of the work is the application of software selection techniques to find the best, or most suitable, DBMS for a concrete domain. The next point is the use of current graph DBMSs, some of which are still in early phases. Furthermore, the main purpose of this thesis is to set out different alternatives for storing this kind of information and to review the prior context from which we start.