5 research outputs found

    Join Execution Using Fragmented Columnar Indices on GPU and MIC

    The paper describes an approach to parallel natural join execution on computing clusters with GPU and MIC coprocessors. The approach is based on a decomposition of the natural join relational operator using column indices and domain-interval fragmentation. This decomposition allows resource-intensive relational operators to be executed in parallel without data transfers. All column index fragments are stored in main memory. To join two relations, each pair of index fragments corresponding to a particular domain interval is joined on a separate processor core. The described approach allows efficient parallel query processing for very large databases on modern computing cluster systems with many-core accelerators. A prototype DBMS coprocessor system was implemented using this technique. The results of computational experiments on GPU and Xeon Phi are presented. These results confirm the efficiency of the proposed approach.
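
The fragment-wise join described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`fragment`, `join_fragment`, `fragmented_join`), the `bounds` split points, and the use of a thread pool as a stand-in for separate processor cores are all assumptions made for illustration. A column index is modelled as a list of `(value, row_id)` pairs.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def fragment(index, bounds):
    """Split a column index [(value, row_id), ...] into domain-interval fragments.

    `bounds` are sorted split points (hypothetical); fragment i holds the
    entries whose value v satisfies bounds[i-1] <= v < bounds[i].
    """
    fragments = [[] for _ in range(len(bounds) + 1)]
    for value, row_id in index:
        fragments[bisect_right(bounds, value)].append((value, row_id))
    return fragments


def join_fragment(pair):
    """Hash-join two index fragments that cover the same domain interval."""
    left, right = pair
    by_value = {}
    for value, row_id in left:
        by_value.setdefault(value, []).append(row_id)
    return [(l_id, r_id) for value, r_id in right for l_id in by_value.get(value, [])]


def fragmented_join(left_index, right_index, bounds, workers=4):
    """Join two column indices fragment by fragment.

    Equal values always fall into the same interval, so each fragment pair
    can be joined independently with no data exchanged between workers.
    """
    pairs = zip(fragment(left_index, bounds), fragment(right_index, bounds))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(join_fragment, pairs))
    return sorted(match for part in parts for match in part)


left = [(1, "a"), (10, "b"), (15, "c"), (30, "d")]
right = [(10, "x"), (15, "y"), (15, "z"), (40, "w")]
print(fragmented_join(left, right, bounds=[10, 20]))
```

Because both indices are split at the same boundaries, no match can span two fragments, which is what makes the per-fragment joins independent. Python threads do not run CPU-bound work in parallel, so the pool here only shows the structure of the decomposition.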

    Aplicação de MonetDB na avaliação de desempenho de bases de dados verticais

    Dissertation submitted to Universidade Fernando Pessoa in partial fulfilment of the requirements for the degree of Master in Computer Engineering, branch of Information Systems and Multimedia.
This dissertation analyzes the application of the MonetDB database management system to the performance evaluation of vertical databases, comparing it with PostgreSQL and CitusDB. In recent years, vertical database systems have attracted great interest both in the scientific community and in business and organizational communities. This interest is related to their potential for better performance, to the way the databases are stored, to the possibility of data compression, and to their use in decision support in organizations. The growing interest in vertical, or columnar, databases over traditional row-oriented databases stems mainly from the storage layout and from performance gains in some situations. Row-oriented database systems store the tuples of a relation sequentially, by page, while columnar database systems store the values belonging to the same column contiguously, in the same page, which makes reading only a subset of a table's columns faster. This dissertation describes the main characteristics and advantages of columnar storage relative to row storage, analyzing its architecture and concepts, and highlighting the advantages of compression and of materialization techniques in query execution. These advantages show that, for queries typical of analytical applications, row-oriented databases perform worse than columnar databases.
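
The difference between the two storage layouts can be illustrated with a small Python sketch. It is only a toy model, not how MonetDB or PostgreSQL store data: the sample records, the helper names (`to_columns`, `total_row_store`, `total_column_store`), and the assumption that every record has the same attributes are all made up for illustration.

```python
# Row layout: each record keeps all of its attributes together.
rows = [
    {"id": 1, "region": "north", "amount": 120.0},
    {"id": 2, "region": "south", "amount": 80.0},
    {"id": 3, "region": "north", "amount": 50.0},
]


def to_columns(records):
    """Pivot row-oriented records into one contiguous list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(records, column):
    """Aggregate one attribute by visiting every full record."""
    return sum(record[column] for record in records)


def total_column_store(columns, column):
    """Aggregate one attribute by scanning only that column's values."""
    return sum(columns[column])


columns = to_columns(rows)
print(total_row_store(rows, "amount"), total_column_store(columns, "amount"))
```

Both functions return the same total. The columnar version touches only the `amount` values, which mirrors why analytical queries over a few columns tend to favour column stores.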

    Flexibility in Data Management

    With the ongoing expansion of information technology, new fields of application requiring data management emerge virtually every day. In our knowledge culture increasing amounts of data and work force organized in more creativity-oriented ways also radically change traditional fields of application and question established assumptions about data management. For instance, investigative analytics and agile software development move towards a very agile and flexible handling of data. As the primary facilitators of data management, database systems have to reflect and support these developments. However, traditional database management technology, in particular relational database systems, is built on assumptions of relatively stable application domains. The need to model all data up front in a prescriptive database schema earned relational database management systems the reputation among developers of being inflexible, dated, and cumbersome to work with. Nevertheless, relational systems still dominate the database market. They are a proven, standardized, and interoperable technology, well-known in IT departments with a work force of experienced and trained developers and administrators. This thesis aims at resolving the growing contradiction between the popularity and omnipresence of relational systems in companies and their increasingly bad reputation among developers. It adapts relational database technology towards more agility and flexibility. We envision a descriptive schema-comes-second relational database system, which is entity-oriented instead of schema-oriented; descriptive rather than prescriptive. 
The thesis provides four main contributions: (1) a flexible relational data model, which frees relational data management from having a prescriptive schema; (2) autonomous physical entity domains, which partition self-descriptive data according to their schema properties for better query performance; (3) a freely adjustable storage engine, which allows the physical data layout to be adapted to the properties of the data and of the workload; and (4) a self-managed indexing infrastructure, which autonomously collects and adapts index information in the presence of dynamic workloads and evolving schemas. The flexible relational data model is the thesis' central contribution. It describes the functional appearance of the descriptive schema-comes-second relational database system. The other three contributions improve components in the architecture of database management systems to increase the query performance and the manageability of descriptive schema-comes-second relational database systems. We are confident that these four contributions can help pave the way to a more flexible future for relational database management technology.
