
    Automatic physical database design : recommending materialized views

    This work discusses physical database design while focusing on the problem of selecting materialized views for improving the performance of a database system. We first address the satisfiability and implication problems for mixed arithmetic constraints. The results are used to support the construction of a search space for view selection problems. We propose an approach for constructing a search space based on identifying maximum commonalities among queries and on rewriting queries using views. These commonalities are used to define candidate views for materialization, from which an optimal or near-optimal set can be chosen as a solution to the view selection problem. Using a search space constructed this way, we address a specific instance of the view selection problem that aims at minimizing the view maintenance cost of multiple materialized views using multi-query optimization techniques. Further, we study this same problem in the context of a commercial database management system in the presence of memory and time restrictions. We also suggest a heuristic approach for maintaining the views while guaranteeing that the restrictions are satisfied. Finally, we consider a dynamic version of the view selection problem where the workload is a sequence of query and update statements. In this case, the views can be created (materialized) and dropped during the execution of the workload. We have implemented our approaches to the dynamic view selection problem and performed extensive experimental testing. Our experiments show that, in most cases, our approaches perform better than previous ones in terms of effectiveness and efficiency.

    Neo: A Learned Query Optimizer

    Query optimization is one of the most challenging problems in database systems. Despite the progress made over the past decades, query optimizers remain extremely complex components that require a great deal of hand-tuning for specific workloads and datasets. Motivated by this shortcoming and inspired by recent advances in applying machine learning to data management challenges, we introduce Neo (Neural Optimizer), a novel learning-based query optimizer that relies on deep neural networks to generate query execution plans. Neo bootstraps its query optimization model from existing optimizers and continues to learn from incoming queries, building upon its successes and learning from its failures. Furthermore, Neo naturally adapts to underlying data patterns and is robust to estimation errors. Experimental results demonstrate that Neo, even when bootstrapped from a simple optimizer like PostgreSQL, can learn a model that offers similar performance to state-of-the-art commercial optimizers, and in some cases even surpasses them.

    Declarative Ajax Web Applications through SQL++ on a Unified Application State

    Implementing even a conceptually simple web application requires an inordinate amount of time. FORWARD addresses three problems that reduce developer productivity: (a) Impedance mismatch across the multiple languages used at different tiers of the application architecture. (b) Distributed data access across the multiple data sources of the application (SQL database, user input of the browser page, session data in the application server, etc.). (c) Asynchronous, incremental modification of the pages, as performed by Ajax actions. FORWARD belongs to a novel family of web application frameworks that attack impedance mismatch by offering a single unifying language. FORWARD's language is SQL++, a minimally extended SQL. FORWARD's architecture is based on two novel cornerstones: (a) A Unified Application State (UAS), which is a virtual database over the multiple data sources. The UAS is accessed via distributed SQL++ queries, therefore resolving the distributed data access problem. (b) Declarative page specifications, which treat the data displayed by pages as rendered SQL++ page queries. The resulting pages are automatically incrementally modified by FORWARD. User input on the page becomes part of the UAS. We show that SQL++ captures the semi-structured nature of web pages and subsumes the data models of two important data sources of the UAS: SQL databases and JavaScript components. We show that simple markup is sufficient for creating Ajax displays and for modeling user input on the page as UAS data sources. Finally, we discuss the page specification syntax and semantics that are needed in order to avoid race conditions and conflicts between the user input and the automated Ajax page modifications. FORWARD has been used in the development of eight commercial and academic applications. An alpha-release web-based IDE (itself built in FORWARD) enables development in the cloud. Comment: Proceedings of the 14th International Symposium on Database Programming Languages (DBPL 2013), August 30, 2013, Riva del Garda, Trento, Italy

    Adaptive query parallelization in multi-core column stores

    With the rise of multi-core CPU platforms, their optimal utilization for in-memory OLAP workloads using column store databases has become one of the biggest challenges. Some of the inherent limitations in the achievable query parallelism are due to the degree of parallelism dependency on the data skew, the overheads incurred by thread coordination, and the hardware resource limits. Finding the right balance between the degree of parallelism and the multi-core utilization…

    PerfXplain: Debugging MapReduce Job Performance

    While users today have access to many tools that assist in performing large scale data analysis tasks, understanding the performance characteristics of their parallel computations, such as MapReduce jobs, remains difficult. We present PerfXplain, a system that enables users to ask questions about the relative performances (i.e., runtimes) of pairs of MapReduce jobs. PerfXplain provides a new query language for articulating performance queries and an algorithm for generating explanations from a log of past MapReduce job executions. We formally define the notion of an explanation together with three metrics, relevance, precision, and generality, that measure explanation quality. We present the explanation-generation algorithm based on techniques related to decision-tree building. We evaluate the approach on a log of past executions on Amazon EC2, and show that our approach can generate quality explanations, outperforming two naive explanation-generation methods. Comment: VLDB201

    TPC-H Analyzed: Hidden Messages and Lessons Learned from an Influential Benchmark

    The TPC-D benchmark was developed almost 20 years ago, and even though its current existence as TPC-H could be considered superseded by TPC-DS, one can still learn from it. We focus on the technical level, summarizing the challenges posed by the TPC-H workload as we now understand them, which w