
    Query optimization by using derivability in a data warehouse environment

    Materialized summary tables and cached query results are frequently used for the optimization of aggregate queries in a data warehouse. Query rewriting techniques are incorporated into database systems to use these materialized views and thus avoid accessing the possibly huge raw data. A rewriting is only possible if the query is derivable from these views. Several approaches can be found in the literature for checking derivability and finding query rewritings. The specific application scenario of a data warehouse, with its multidimensional perspective, allows much more semantic information to be taken into account, e.g. structural dependencies within the dimension hierarchies and different characteristics of measures. The motivation of this article is to use this information to present conditions for derivability in a large number of relevant cases that go beyond previous approaches.
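    To make the derivability condition concrete, the following minimal sketch (an illustration, not the article's algorithm) treats a query as answerable from a materialized view when each of its grouping attributes can be reached by rolling up a grouping attribute of the view along the dimension hierarchies, and when the measure's aggregate function can be re-applied to pre-aggregated data. The hierarchy, attribute names, and aggregate classification are assumptions made for the example.

    ```python
    # Illustrative derivability check for aggregate queries over a
    # materialized summary table (not the article's algorithm).

    # Dimension hierarchies: each attribute maps to the coarser levels it
    # functionally determines (e.g. day -> month -> quarter -> year).
    HIERARCHY = {
        "day": {"month", "quarter", "year"},
        "month": {"quarter", "year"},
        "quarter": {"year"},
        "year": set(),
        "product": {"category"},
        "category": set(),
    }

    # SUM/MIN/MAX can be re-applied to pre-aggregated values; COUNT must be
    # re-aggregated with SUM, and AVG is only derivable if the view stores
    # SUM and COUNT separately.
    REAGGREGATABLE = {"SUM", "MIN", "MAX"}

    def rolls_up_to(fine, coarse):
        """True if `coarse` equals `fine` or is a coarser hierarchy level."""
        return coarse == fine or coarse in HIERARCHY.get(fine, set())

    def derivable(q_groupby, q_agg, v_groupby, v_agg):
        """The query is derivable if its aggregate can be re-applied and each
        query grouping attribute rolls up from some view grouping attribute."""
        if q_agg != v_agg or q_agg not in REAGGREGATABLE:
            return False
        return all(any(rolls_up_to(v, q) for v in v_groupby) for q in q_groupby)

    # A view pre-aggregated by (product, month) answers a query grouped by
    # (category, quarter), but not the corresponding AVG query.
    print(derivable({"category", "quarter"}, "SUM", {"product", "month"}, "SUM"))  # True
    print(derivable({"category", "quarter"}, "AVG", {"product", "month"}, "AVG"))  # False
    ```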

    Reducing the View Selection Problem through Code Modeling: Static and Dynamic approaches

    Data warehouse systems aim to support decision making by providing users with the appropriate information at the right time. This task is particularly challenging in business contexts where large amounts of data are produced at high speed. To this end, data warehouses have been equipped with Online Analytical Processing tools that help users make fast and precise decisions through the execution of complex queries. Since the computation of these queries is time consuming, data warehouses precompute a set of materialized views answering the workload queries. This thesis defines a process to determine the minimal set of workload queries and the set of views to materialize. The set of queries is represented by an optimized lattice structure used to select the views to be materialized according to processing time costs and view storage space. The minimal set of required Online Analytical Processing queries is computed by analyzing the data model defined with the visual language CoDe (Complexity Design). The latter allows one to conceptually organize the visualization of data reports and to generate visualizations of data obtained from data-mart queries. CoDe adopts a hybrid modeling process combining two main methodologies: user-driven and data-driven. The first aims to create a model according to the user's knowledge, requirements, and analysis needs, whilst the latter is in charge of concretizing data and their relationships in the model through Online Analytical Processing queries. Since the materialized views change over time, we also propose a dynamic process that allows users to upgrade the CoDe model with a context-aware editor, build an optimized lattice structure able to minimize the effort of recalculating it, and propose the new set of views to materialize. Moreover, the process applies a Markov strategy to predict whether the views need to be recalculated or not according to the changes of the model. The effectiveness of the proposed techniques has been evaluated on a real-world data warehouse. The results revealed that the Markov strategy gives a better set of solutions in terms of storage space and total processing cost. [edited by author]
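    The thesis's CoDe-driven lattice construction and cost model are not reproduced here, but the core step of selecting views from a lattice under processing-cost and storage considerations can be sketched with the classic greedy benefit heuristic of Harinarayan, Rajaraman, and Ullman. The lattice nodes (p = product, c = customer, m = month) and row counts below are hypothetical.

    ```python
    # Greedy view selection over a small cube lattice (classic benefit
    # heuristic); lattice nodes and row counts are invented for the example.

    LATTICE = {  # view -> views derivable from it by further aggregation
        "pcm": {"pc", "pm", "cm", "p", "c", "m", "()"},
        "pc": {"p", "c", "()"},
        "pm": {"p", "m", "()"},
        "cm": {"c", "m", "()"},
        "p": {"()"}, "c": {"()"}, "m": {"()"}, "()": set(),
    }
    ROWS = {"pcm": 6_000_000, "pc": 6_000_000, "pm": 800_000, "cm": 120_000,
            "p": 200_000, "c": 50, "m": 12, "()": 1}

    def answer_cost(view, materialized):
        """Cost of answering `view` = size of the smallest materialized view
        it is derivable from (the base cuboid is always materialized)."""
        return min(ROWS[m] for m in materialized if view == m or view in LATTICE[m])

    def greedy(k):
        """Materialize k views (besides the base cuboid), each time picking
        the view whose materialization saves the most total processing cost."""
        chosen = {"pcm"}
        for _ in range(k):
            def benefit(v):
                return sum(max(0, answer_cost(w, chosen) - ROWS[v])
                           for w in LATTICE[v] | {v})
            chosen.add(max((v for v in LATTICE if v not in chosen), key=benefit))
        return chosen

    print(greedy(2))  # {'pcm', 'cm', 'pm'} under these made-up sizes
    ```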

    Physical Plan Instrumentation in Databases: Mechanisms and Applications

    Database management systems (DBMSs) are designed with the goal set to compile SQL queries to physical plans that, when executed, provide results to the SQL queries. Building on this functionality, an ever-increasing number of application domains (e.g., provenance management, online query optimization, physical database design, interactive data profiling, monitoring, and interactive data visualization) seek to operate on how queries are executed by the DBMS for a wide variety of purposes ranging from debugging and data explanation to optimization and monitoring. Unfortunately, DBMSs provide little, if any, support to facilitate the development of this class of important application domains. The effect is such that database application developers and database system architects either rewrite the database internals in ad-hoc ways; work around the SQL interface, if possible, with inevitable performance penalties; or even build new databases from scratch only to express and optimize their domain-specific application logic over how queries are executed. To address this problem in a principled manner, in this dissertation we introduce a prototype DBMS, namely Smoke, that exposes instrumentation mechanisms in the form of a framework that allows external applications to manipulate physical plans. Intuitively, a physical plan is the underlying representation that DBMSs use to encode how a SQL query will be executed, and providing instrumentation mechanisms at this representation level allows applications to express and optimize their logic on how queries are executed. Having such an instrumentation-enabled DBMS in place, we then consider how to express and optimize applications that base their logic on how queries are executed. To best demonstrate the expressive and optimization power of instrumentation-enabled DBMSs, we express and optimize applications across several important domains including provenance management, interactive data visualization, interactive data profiling, physical database design, online query optimization, and query discovery. Expressivity-wise, we show that Smoke can express known techniques, introduce novel semantics on known techniques, and introduce new techniques across domains. Performance-wise, we show case by case that Smoke is on par with or up to several orders of magnitude faster than state-of-the-art imperative and declarative implementations of important applications across domains. As such, we believe our contributions provide evidence for, and form the basis towards, a class of instrumentation-enabled DBMSs with the goal set to express and optimize applications across important domains whose core logic operates over how queries are executed by DBMSs.
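    Smoke's actual interfaces are not spelled out in this abstract, so the toy sketch below only illustrates the underlying idea: when the operators of a physical plan are first-class objects, an application can splice an instrumentation hook (here, lineage capture for provenance) between operators instead of rewriting SQL or the database internals. All names are hypothetical.

    ```python
    # Toy physical plan in the iterator model with an instrumentation hook
    # spliced between operators (an illustration, not Smoke's API).

    def scan(table):                 # leaf operator: emits (row_id, row)
        yield from enumerate(table)

    def select(child, pred):         # filter operator
        for rid, row in child:
            if pred(row):
                yield rid, row

    def instrument(child, lineage):
        """Hook: records which rows flow past this point in the plan,
        without changing the plan's results."""
        for rid, row in child:
            lineage.append(rid)
            yield rid, row

    sales = [("widget", 90), ("gadget", 250), ("gizmo", 400)]

    # Original plan: select(scan(sales), price > 100).
    # Instrumented plan: the hook is inserted above the filter.
    lineage = []
    plan = instrument(select(scan(sales), lambda r: r[1] > 100), lineage)

    print([row for _, row in plan])  # [('gadget', 250), ('gizmo', 400)]
    print(lineage)                   # [1, 2] -> provenance of the output rows
    ```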

    Resolution-based methods for linear temporal reasoning

    The aim of this thesis is to explore the potential of resolution-based methods for linear temporal reasoning. On the abstract level, this means developing new algorithms for automated reasoning about properties of systems which evolve in time. More concretely, we will: 1) show how to adapt the superposition framework to proving theorems in propositional Linear Temporal Logic (LTL), 2) use a connection between superposition and the CDCL calculus of modern SAT solvers to come up with an efficient LTL prover, 3) specialize the previous to reachability properties and discover a close connection to Property Directed Reachability (PDR), an algorithm recently developed for model checking of hardware circuits, 4) further improve PDR by providing a new technique for enhancing the clause propagation phase of the algorithm, and 5) adapt PDR to automated planning by replacing the SAT solver inside with a planning-specific procedure. We implemented the proposed ideas and provide experimental results which demonstrate their practical potential on representative benchmark sets. Our system LS4 is shown to be the strongest LTL prover currently publicly available. The mentioned enhancement of PDR substantially improves the performance of our implementation of the algorithm for hardware model checking in the multi-property setting; it is expected that other implementations would benefit from it in an analogous way. Finally, our planner PDRplan has been compared with state-of-the-art planners on the benchmarks from the International Planning Competition, with very promising results.
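    As a rough illustration of the clause propagation phase mentioned in point 4, the sketch below runs PDR-style propagation on a tiny boolean transition system, replacing the SAT queries of a real implementation with brute-force enumeration of states. The transition system and clause encoding are invented for the example.

    ```python
    # PDR clause propagation on a tiny boolean transition system, with SAT
    # queries replaced by brute-force enumeration (real PDR uses a SAT solver).
    from itertools import product

    N = 3                                  # a state is a tuple of N bits
    STATES = list(product((0, 1), repeat=N))

    def T(s):
        """Invented transitions: a 3-bit counter that sticks at 6, so the
        state 7 = (1, 1, 1) is never reached."""
        v = min((s[0] << 2 | s[1] << 1 | s[2]) + 1, 6)
        return [((v >> 2) & 1, (v >> 1) & 1, v & 1)]

    def sat_clause(s, clause):
        """A clause is a frozenset of literals: i+1 means bit i is 1,
        -(i+1) means bit i is 0; the clause holds if any literal does."""
        return any((s[abs(l) - 1] == 1) == (l > 0) for l in clause)

    def sat_frame(s, frame):
        return all(sat_clause(s, c) for c in frame)

    def propagate(frames):
        """Push a clause of F_i into F_{i+1} when no F_i-state satisfying it
        has a successor violating it; if F_i then equals F_{i+1}, that frame
        is an inductive invariant and the algorithm can stop."""
        for i in range(len(frames) - 1):
            for c in frames[i] - frames[i + 1]:
                if all(sat_clause(t, c)
                       for s in STATES
                       if sat_frame(s, frames[i]) and sat_clause(s, c)
                       for t in T(s)):
                    frames[i + 1].add(c)
            if frames[i] == frames[i + 1]:
                return i + 1               # fixpoint reached
        return None

    # "not7" says the three bits are never all 1; it is inductive, so it is
    # pushed into F_1, which then equals F_0: an invariant is found.
    not7 = frozenset({-1, -2, -3})
    frames = [{not7}, set()]
    print(propagate(frames), frames)       # 1 [{...not7...}, {...not7...}]
    ```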