123 research outputs found

    Evaluation of hardware architectures for parallel execution of complex database operations

    New database applications, primarily in the areas of engineering and knowledge-based systems, refer to complex objects (e.g. the representation of a CAD workpiece or a VLSI chip) while performing their tasks. Retrieval, maintenance, and integrity checking of such complex objects consume substantial computing resources, which conventional database management systems have traditionally used in a sequential manner. Rigid performance goals dictated by interactive use and design environments call for new approaches to master the functionality of complex objects within satisfactory time limits. Because of the object granularity, the set orientation of the database interface, and the complicated algorithms for object handling, exploiting parallelism within such operations seems promising. Our main goal is the investigation and evaluation of different hardware architectures and their suitability to cope efficiently with workloads generated by database operations on complex objects. Apparently, simply employing a number of processors is not a panacea for our database problem. The sheer horsepower of machines does not help very much when data synchronization and event serialization requirements play a major role during object handling. What are the critical hardware architecture properties? How can the existing MIPS be best utilized for the data management functions when processing complex objects? To answer these questions and related issues, we discuss different kinds of architectures combining multiple processors: loosely-, tightly-, and closely-coupled. Furthermore, we consider parallelism at different levels of abstraction: the distribution of (sub-)queries, or the decomposition of such queries and their concurrent evaluation at an inter- or intra-object level. Finally, we give some thoughts on the problems of load control and transaction management.

    Integrating analytics with relational databases

    The database research community has made tremendous strides in developing powerful database engines that allow for efficient analytical query processing. However, these powerful systems have gone largely unused by analysts and data scientists. This poor adoption is caused primarily by the state of database-client integration. In this thesis we attempt to overcome this challenge by investigating how we can facilitate efficient and painless integration of analytical tools and relational database management systems. We focus our investigation on the three primary methods for database-client integration: client-server connections, in-database processing, and embedding the database inside the client application.
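
As a rough illustration of the third method (embedding the database inside the client application), here is a minimal sketch using Python's standard-library `sqlite3` module. SQLite is only a stand-in here, not the system studied in the thesis; the `measurements` table and its rows are hypothetical.

```python
import sqlite3

# Embedded integration: the database engine runs inside the client process,
# so there is no server connection and no network transfer of results.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (sensor TEXT, value REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("a", 1.0), ("a", 3.0), ("b", 10.0)],  # hypothetical data
)

# Push the aggregation into the database instead of fetching every row
# into the client and aggregating there.
rows = conn.execute(
    "SELECT sensor, AVG(value) FROM measurements GROUP BY sensor ORDER BY sensor"
).fetchall()
conn.close()

print(rows)
```

In a client-server setup, the same query would instead be sent over a connection to a separate database process, and the result set would be serialized back to the client.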

    High Performance Frequent Subgraph Mining on Transactional Datasets

    Graph data mining has been a crucial and unavoidable area of research. Large amounts of graph data are produced in many areas, such as bioinformatics, cheminformatics, social networks, and the Web. Scalable graph data mining methods are becoming increasingly popular and necessary due to increased graph complexity. Frequent subgraph mining is one such area, where the task is to find frequently recurring patterns/subgraphs. To tackle this problem, many main-memory-based methods were proposed, which proved to be inefficient as the data size grew exponentially over time. In the past few years, several research groups have attempted to handle the frequent subgraph mining (FSM) problem in multiple ways. Many authors have tried to achieve better performance using Graphics Processing Units (GPUs), which offer a multi-fold improvement over in-memory methods when dealing with large datasets. Later, Google's MapReduce model with the Hadoop framework proved to be a major breakthrough in high-performance large-batch processing. Although MapReduce came with many benefits, its disk I/O and non-iterative model were of limited help in the FSM domain, since subgraph mining is an iterative process. In recent years, Spark has emerged as the de facto industry standard with its distributed in-memory computing capability. It is also a good fit for iterative styles of programming. In this work, we cover how high-performance computing has substantially improved FSM performance on transactional datasets of directed and undirected graphs, and we compare various FSM techniques based on experimental results.
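
To make "transactional" support concrete, here is a minimal sketch of the first level of frequent subgraph mining: counting in how many graphs of a dataset each single-edge pattern occurs. It is a simplification under stated assumptions, not a method from the work above: graphs are undirected, an edge is identified only by its two vertex labels, and the function name `frequent_edges` and the sample data are hypothetical. Real FSM algorithms go on to grow larger candidate subgraphs iteratively and need isomorphism checks.

```python
from collections import Counter

def frequent_edges(transactions, min_support):
    """Return labeled edges that occur in at least min_support graphs."""
    support = Counter()
    for edges in transactions:
        # Undirected edges: sort the labels so (O, C) and (C, O) match.
        # Using a set counts each edge at most once per graph.
        canonical = {tuple(sorted(edge)) for edge in edges}
        support.update(canonical)
    return {edge: n for edge, n in support.items() if n >= min_support}

# Hypothetical dataset: each transaction is one graph, given as label pairs.
graphs = [
    [("C", "O"), ("C", "N")],
    [("O", "C"), ("C", "C")],
    [("C", "N"), ("N", "O")],
]
result = frequent_edges(graphs, min_support=2)
print(result)
```

Each later iteration would extend the surviving patterns by one edge and recount support, which is why iterative, in-memory frameworks suit this workload better than single-pass batch models.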

    Refactoring rules for graph-based databases (Regras de refatoração para banco de dados baseado em grafos)

    Advisor: Luiz Camolesi Júnior. Dissertation (master's), Universidade Estadual de Campinas, Faculdade de Tecnologia. Degree: master's (Mestrado), area: Tecnologia e Inovação.
    Abstract: The information produced today keeps growing in volume and complexity, posing a technological challenge that demands more than relational databases can currently offer. This situation stimulates the use of different forms of storage, such as graph databases. Current graph databases are adapted to support database evolution automatically, but do not provide adequate resources for organizing the information. This task is left to the applications that access the database, compromising data integrity and reliability. The goal of this work is to define refactoring rules that support managing the evolution of graph databases. The rules presented are adaptations and extensions of established refactoring rules for relational databases, tailored to the characteristics of graph databases. The result is a catalog of refactoring rules that developers of graph database administration tools can use to guarantee the integrity of schema evolution operations and, consequently, of the related data.