An allocation algorithm for distributed biological databases (Um algoritmo de alocação para bancos de dados biológicos distribuídos)
Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação, Florianópolis, 2014.
Abstract: This work proposes a distributed data allocation algorithm based on data affinity and usage profiles, with a focus on biological relational databases. The proposal aims to advise database administrators (DBAs) on how to allocate data across the nodes of a cluster in order to obtain the best possible performance for queries and other user requests. The allocation schema is verified through laboratory tests. The experiments are run on the Intermine data warehouse (DW) system (SMITH et al., 2012) using pgGrid, which adds replication and fragmentation functions to PostgreSQL, and HadoopDB (an implementation of the Map-Reduce model for relational databases). Finally, the algorithm is compared with allocation schemes generated by algorithms developed in recent research.
A Dementia Care Mapping (DCM) data warehouse as a resource for improving the quality of dementia care. Exploring requirements for secondary use of DCM data using a user-driven approach and discussing their implications for a data warehouse
The secondary use of Dementia Care Mapping (DCM) data, if that data were
held in a data warehouse, could contribute to global efforts in monitoring and
improving dementia care quality. This qualitative study identifies
requirements for the secondary use of DCM data within a data warehouse
using a user-driven approach. The thesis critically analyses various technical
methodologies, then argues for the use of a modified grounded theory as a
user-driven methodology for a data warehouse and demonstrates its
applicability. Interviews were conducted with 29 DCM researchers,
trainers and practitioners in three phases. Nineteen interviews were face to
face and the others were conducted by Skype or telephone; individual
interviews lasted 45-60 minutes on average. The interview data was systematically analysed
using open, axial and selective coding techniques and constant comparison
methods.
The study data highlighted benchmarking, mappers’ support and research as
three perceived potential secondary uses of DCM data within a data
warehouse. DCM researchers identified concerns regarding the quality and
security of DCM data for secondary uses, which led to identifying the
requirements for additional provenance, ethical and contextual data to be
included in a warehouse alongside DCM data to meet requirements for
secondary uses of this data for research. The study data was also used to
derive three main factors that can influence the quality and
availability of DCM data for secondary uses: the individual mapper, the
organization and electronic data management. The study makes further
recommendations for designing a future DCM data warehouse.