
    10381 Summary and Abstracts Collection -- Robust Query Processing

    Dagstuhl seminar 10381 on robust query processing (held 19.09.10 - 24.09.10) brought together a diverse set of researchers and practitioners with a broad range of expertise to foster discussion and collaboration regarding causes, opportunities, and solutions for achieving robust query processing. The seminar strove to build a unified view across the loosely coupled system components responsible for the various stages of database query processing. Participants were chosen for their experience with database query processing and, where possible, their prior work in academic research or product development towards robustness in database query processing. To pave the way to motivate, measure, and protect future advances in robust query processing, seminar 10381 focused on developing tests for measuring the robustness of query processing. In these proceedings, we first review the seminar topics, goals, and results, and then present abstracts or notes from some of the seminar break-out sessions. We also include, as an appendix, the robust query processing reading list that was collected and distributed to participants before the seminar began, as well as summaries of a few of those papers contributed by some participants.

    Adaptive Execution of Compiled Queries

    Compiling queries to machine code is arguably the most efficient way to execute queries. One often overlooked problem with compilation, however, is the time it takes to generate machine code. Even with fast compilation frameworks like LLVM, generating machine code for complex queries routinely takes hundreds of milliseconds. Such compilation times can be a major disadvantage for workloads that execute many complex but quick queries. To solve this problem, we propose an adaptive execution framework, which dynamically and transparently switches from interpretation to compilation. We also propose a fast bytecode interpreter for LLVM, which can execute queries without costly translation to machine code and thereby dramatically reduces query latency. Adaptive execution is dynamic, fine-grained, and can execute different code paths of the same query using different execution modes. Our extensive evaluation shows that this approach achieves optimal performance in a wide variety of settings: low latency for small data sets and maximum throughput for large data sets.
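
    The switching mechanism described in this abstract can be illustrated with a minimal, self-contained sketch: execution begins in an interpreter while code generation runs in a background thread, and later batches take the compiled path once it becomes available. All names below (compile_pipeline, interpret_batch, execute_adaptively) are illustrative stand-ins, not the interfaces of the actual system.

        # Minimal sketch of adaptive query execution: start interpreting immediately
        # and switch to compiled code once background compilation finishes.
        import threading
        import time

        def compile_pipeline(plan):
            """Stand-in for code generation (e.g., via LLVM); assumed to be slow."""
            time.sleep(0.2)                                  # simulated compilation latency
            return lambda batch: [row * 2 for row in batch]  # "compiled" pipeline function

        def interpret_batch(plan, batch):
            """Stand-in for the slower bytecode / tuple-at-a-time interpreter."""
            return [row * 2 for row in batch]

        def execute_adaptively(plan, batches):
            compiled = {}
            ready = threading.Event()

            def worker():
                compiled["fn"] = compile_pipeline(plan)
                ready.set()

            threading.Thread(target=worker, daemon=True).start()

            results = []
            for batch in batches:                 # each batch picks its execution mode
                if ready.is_set():
                    results.extend(compiled["fn"](batch))         # compiled path
                else:
                    results.extend(interpret_batch(plan, batch))  # starts with zero delay
            return results

        print(execute_adaptively("plan", [[1, 2], [3, 4], [5, 6]]))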

    Query optimizers based on machine learning techniques

    Integrated master's dissertation in Informatics Engineering. Query optimizers are considered one of the most relevant and sophisticated components in a database management system. However, despite currently producing nearly optimal results, optimizers rely on statistical estimates and heuristics to reduce the search space of alternative execution plans for a single query. As a result, for more complex queries, errors may grow exponentially, often translating into sub-optimal plans and less-than-ideal performance. Recent advances in machine learning techniques have opened new opportunities for many of the existing problems related to system optimization. This document proposes a solution built on top of PostgreSQL that learns to select the most efficient set of optimizer strategy settings for a particular query. Instead of depending entirely on the optimizer's estimates to compare different plans under different configurations, it relies on a greedy selection algorithm that supports several types of predictive modeling techniques, from more traditional modeling techniques to a deep learning approach. The system is evaluated experimentally with the standard TPC-H and Join Ordering Benchmark workloads to measure the costs and benefits of adding machine learning capabilities to traditional query optimizers. This work is financed by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, within project UIDB/50014/2020.
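
    The greedy selection over optimizer strategy settings can be sketched as follows: starting from PostgreSQL's default planner settings (the enable_* flags), each flag is tentatively disabled and kept off only if a predictive cost model expects an improvement. The function predict_cost below is a placeholder for any of the learned models mentioned in the abstract and is assumed purely for illustration.

        # Sketch of a greedy search over PostgreSQL planner settings, keeping a
        # toggle only when the learned cost model predicts an improvement.
        PLANNER_FLAGS = [
            "enable_hashjoin", "enable_mergejoin", "enable_nestloop",
            "enable_seqscan", "enable_indexscan", "enable_bitmapscan",
        ]

        def predict_cost(query, settings):
            # Placeholder for the learned model (classical or deep learning);
            # a deterministic toy score so the sketch runs end to end.
            return len(query) + sum(hash((flag, on)) % 7 for flag, on in settings.items())

        def greedy_select_settings(query):
            settings = {flag: True for flag in PLANNER_FLAGS}   # PostgreSQL defaults
            best = predict_cost(query, settings)
            for flag in PLANNER_FLAGS:                          # one greedy pass
                trial = dict(settings, **{flag: False})
                cost = predict_cost(query, trial)
                if cost < best:                                 # keep only improvements
                    settings, best = trial, cost
            return settings

        # The chosen configuration would then be applied per query,
        # e.g. SET enable_nestloop = off; before running it.
        print(greedy_select_settings("SELECT * FROM lineitem JOIN orders ON ..."))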

    A Survey on Mapping Semi-Structured Data and Graph Data to Relational Data

    The data produced by various services should be stored and managed in an appropriate format so that valuable knowledge can be gained from it conveniently. This has led to the emergence of various data models, including relational, semi-structured, and graph models. Given that mature relational databases built on the relational data model still predominate in today's market, there is strong interest in storing and processing semi-structured data and graph data in relational databases, so that the capabilities of mature and powerful relational databases can be applied to these various kinds of data. In this survey, we review existing methods for mapping semi-structured data and graph data into relational tables, analyze their major features, and give a detailed classification of those methods. We also summarize the merits and demerits of each method, introduce open research challenges, and present future research directions. With this comprehensive investigation of existing methods and open problems, we hope this survey can motivate new mapping approaches by drawing lessons from each model's mapping strategies, as well as a new research topic: mapping multi-model data into relational tables. Peer reviewed.
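
    One of the classic strategies covered by such mappings is to store a property graph in plain relational tables, e.g., a node table and an edge table, after which graph traversals become ordinary joins. The sketch below uses SQLite purely for illustration; the table layout shown is one common choice, not the only one discussed in the survey.

        # Sketch: a small graph mapped to two relational tables (node, edge),
        # with a two-hop traversal expressed as a relational self-join.
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
            CREATE TABLE node (id INTEGER PRIMARY KEY, label TEXT);
            CREATE TABLE edge (src INTEGER REFERENCES node(id),
                               dst INTEGER REFERENCES node(id),
                               kind TEXT);
        """)
        conn.executemany("INSERT INTO node VALUES (?, ?)",
                         [(1, "Alice"), (2, "Bob"), (3, "Carol")])
        conn.executemany("INSERT INTO edge VALUES (?, ?, ?)",
                         [(1, 2, "knows"), (2, 3, "knows")])

        # "Friends of friends" as a self-join over the edge table.
        rows = conn.execute("""
            SELECT a.label, c.label
            FROM edge e1 JOIN edge e2 ON e1.dst = e2.src
            JOIN node a ON a.id = e1.src
            JOIN node c ON c.id = e2.dst
        """).fetchall()
        print(rows)   # [('Alice', 'Carol')]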

    Cost-Based Optimization of Integration Flows

    Integration flows are increasingly used to specify and execute data-intensive integration tasks between heterogeneous systems and applications. There are many application areas, such as real-time ETL and data synchronization between operational systems. Owing to growing data volumes, highly distributed IT infrastructures, and strict requirements for data consistency and up-to-date query results, many instances of integration flows are executed over time. Due to this high load and to blocking, synchronous source systems, the performance of the central integration platform is crucial for the whole IT infrastructure. To meet these high performance requirements, we introduce the concept of cost-based optimization of imperative integration flows, which relies on incremental statistics maintenance and inter-instance plan re-optimization. As a foundation, we introduce periodical re-optimization, including novel cost-based optimization techniques that are tailor-made for integration flows. Furthermore, we refine periodical re-optimization into on-demand re-optimization in order to avoid the many unnecessary re-optimization steps and the adaptation delays during which optimization opportunities are missed. This approach ensures low optimization overhead and fast workload adaptation.
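
    The on-demand variant can be sketched generically: statistics are maintained incrementally while flow instances execute, and re-optimization is triggered only when the observed values drift too far from those the current plan was optimized for. All names below (optimize, FlowExecutor, DRIFT_THRESHOLD) are illustrative, not the paper's actual components.

        # Sketch of on-demand re-optimization for an integration flow: re-optimize
        # only when incrementally maintained statistics drift past a threshold.
        DRIFT_THRESHOLD = 0.5   # re-optimize when a statistic is 50% off its planned value

        def optimize(stats):
            # Stand-in for the cost-based optimizer: order operators by observed size.
            return sorted(stats, key=stats.get)

        class FlowExecutor:
            def __init__(self, operators):
                self.stats = {op: 1.0 for op in operators}   # running averages
                self.plan_stats = dict(self.stats)           # values the plan was built for
                self.plan = optimize(self.plan_stats)

            def record(self, op, cardinality):
                # Incremental statistics maintenance (exponential moving average).
                self.stats[op] = 0.9 * self.stats[op] + 0.1 * cardinality
                drift = abs(self.stats[op] - self.plan_stats[op]) / self.plan_stats[op]
                if drift > DRIFT_THRESHOLD:                  # on-demand trigger
                    self.plan_stats = dict(self.stats)
                    self.plan = optimize(self.plan_stats)

        flow = FlowExecutor(["extract", "lookup", "write"])
        for _ in range(50):
            flow.record("lookup", 100.0)                     # workload shift
        print(flow.plan)                                     # lookup now ordered last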

    Consistency-by-Construction Techniques for Software Models and Model Transformations

    A model is consistent with a given set of specifications (specs) if and only if all of the specifications hold on the model, i.e., all specs are true (correct) for the model. Constructing consistent models (e.g., programs or artifacts) is vital during software development, especially in Model-Driven Engineering (MDE), where models are employed throughout the software development life cycle (analysis, design, implementation, and testing). Models are usually written in domain-specific modeling languages (DSMLs) and describe a domain problem or a system from different perspectives and at several levels of abstraction. If a model conforms to the definition of its DSML (usually given by a meta-model and integrity constraints), the model is consistent. Model transformations are an essential technology for manipulating models, including, e.g., refactoring and code generation in a (semi-)automated way. They are often supposed to have a well-defined behavior in the sense that their resulting models are consistent with regard to a set of constraints. Inconsistent results limit the applicability of such transformations, and the automation becomes untrustworthy and error-prone. The consistency of models and of model transformation results contributes to the quality of the overall modeled system. Although MDE has progressed significantly and become an accepted best practice in many application domains such as automotive and aerospace, several significant challenges still have to be tackled to realize the MDE vision in industry. Challenges such as handling and resolving inconsistent models (e.g., incomplete models), enabling and enforcing model consistency/correctness during construction, fostering trust in and use of model transformations (e.g., by ensuring that the resulting models are consistent), developing efficient (automated, standardized, and reliable) domain-specific modeling tools, and dealing with large models continually make the need for more research evident.

    In this thesis, we contribute four automated, interactive techniques for ensuring the consistency of models and model transformation results during the construction process. The first two contributions construct consistent models of a given DSML in an automated and interactive way; the construction can start from a seed model that is potentially inconsistent. Since enhancing a set of transformations to satisfy a set of constraints is a tedious and error-prone task that requires deep theoretical knowledge, we present the other two contributions: they ensure model consistency by enhancing the behavior of model transformations through automatically constructed application conditions. The resulting application conditions control the applicability of the transformations so that a set of constraints is respected. Moreover, we provide several optimizing strategies. Specifically, we present the following. First, we present a model repair technique for repairing models in an automated and interactive way. Our approach guides the modeler to repair the whole model by resolving all cardinality violations and thereby yields a desired, consistent model. Second, we introduce a model generation technique to efficiently generate large, consistent, and diverse models. Both techniques are DSML-agnostic, i.e., they can deal with any meta-model. We present meta-techniques to instantiate both approaches for a given DSML; namely, we develop meta-tools that automatically generate the corresponding DSML tools (model repair and generation) for a given meta-model. We show the soundness of our techniques and evaluate and discuss their features, such as scalability. Third, we develop a tool, based on a correct-by-construction technique, for translating OCL constraints into semantically equivalent graph constraints and integrating them as guaranteeing application conditions into a transformation rule in a fully automated way. A constraint-guaranteeing application condition ensures that a rule applies successfully to a model if and only if the resulting model after the rule application satisfies the constraint. Fourth, we propose an optimizing-by-construction technique for application conditions of transformation rules that need to be constraint-preserving. A constraint-preserving application condition ensures that a rule applies successfully to a consistent model (w.r.t. the constraint) if and only if the resulting model after the rule application still satisfies the constraint. We show the soundness of our techniques, develop them as ready-to-use tools, evaluate their efficiency (complexity and performance), and assess the overall approach in general. All four techniques are compliant with the Eclipse Modeling Framework (EMF), the practical realization of the corresponding OMG standard; thus, the interoperability and interchangeability of the techniques are ensured. Our techniques not only improve the quality of the modeled system but also increase software productivity by providing meta-tools that generate the DSML tool support and automate these tasks.
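
    The notion of a constraint-guaranteeing application condition can be conveyed with a generic sketch: a rule is applied only if the resulting model would satisfy the constraint. The thesis derives such conditions statically from OCL/graph constraints; the sketch below instead checks the candidate result directly, which merely mimics the guaranteed behaviour on a toy example, and every name in it is hypothetical.

        # Generic sketch of the "guaranteeing application condition" idea: a rule
        # application is accepted only if the resulting model satisfies the constraint.
        def apply_with_guarantee(model, rule, constraint):
            candidate = rule(model)              # tentative rule application
            if constraint(candidate):            # guaranteeing application condition
                return candidate                 # rule applies, result is consistent
            return model                         # rule is rejected, model unchanged

        # Toy example: a model is a dict of node -> list of children, and the
        # constraint is an upper-bound cardinality of at most 2 children per node.
        constraint = lambda m: all(len(children) <= 2 for children in m.values())
        add_child = lambda m: {**m, "root": m["root"] + ["new"]}

        model = {"root": ["a"]}
        model = apply_with_guarantee(model, add_child, constraint)   # accepted
        model = apply_with_guarantee(model, add_child, constraint)   # rejected (3 > 2)
        print(model)   # {'root': ['a', 'new']}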