Search CORE

14 research outputs found

Rewriting Declarative Query Languages

Author: Brantner Matthias
Publication venue: Universität Mannheim
Publication date: 01/01/2007
Field of study

Queries against databases are formulated in declarative languages. Examples are the relational query language SQL and XPath or XQuery for querying data stored in XML. Using a declarative query language, the querist does not need to know about or decide on anything about the actual strategy a system uses to answer the query. Instead, the system can freely choose among the algorithms it employs to answer a query. Predominantly, query processing in the relational context is accomplished using a relational algebra. To this end, the query is translated into a logical algebra. The algebra consists of logical operators which facilitate the application of various optimization techniques. For example, logical algebra expressions can be rewritten in order to yield more efficient expressions. In order to query XML data, XPath and XQuery have been developed. Both are declarative query languages and, hence, can benefit from powerful optimizations. For instance, they could be evaluated using an algebraic framework. However, in general, the existing approaches are not directly utilizable for XML query processing. This thesis has two goals. The first goal is to overcome the above-mentioned misfits of XML query processing, making it ready for industrial-strength settings. Specifically, we develop an algebraic framework that is designed for the efficient evaluation of XPath and XQuery. To this end, we define an order-aware logical algebra and a translation of XPath into this algebra. Furthermore, based on the resulting algebraic expressions, we present rewrites in order to speed up the execution of such queries. The second goal is to investigate rewriting techniques in the relational context. To this end, we present rewrites based on algebraic equivalences that unnest nested SQL queries with disjunctions. Specifically, we present equivalences for unnesting algebraic expressions with bypass operators to handle disjunctive linking and correlation. Our approach can be applied to quantified table subqueries as well as scalar subqueries. For all our results, we present experiments that demonstrate the effectiveness of the developed approaches

MAnnheim DOCument Server

Execution strategies for SQL subqueries

Author: César A. Galindo-legaria
Milind M. Joshi
Mostafa Elhemali
Torsten Grabs
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2007
Field of study

Optimizing SQL subqueries has been an active area in database research and the database industry throughout the last decades. Pre-vious work has already identified some approaches to efficiently execute relational subqueries. For satisfactory performance, proper choice of subquery execution strategies becomes even more essen-tial today with the increase in decision support systems and auto-matically generated SQL, e.g., with ad-hoc reporting tools. This goes hand in hand with increasing query complexity and growing data volumes – which all pose challenges for an industrial-strength query optimizer. This current paper explores the basic building blocks that Microsoft SQL Server utilizes to optimize and execute relational subqueries. We start with indispensable prerequisites such as detection and removal of correlations for subqueries. We identify a full spectrum of fundamental subquery execution strategies such as forward and reverse lookup as well as set-based approaches, explain the different execution strategies for subqueries implemented in SQL Server, and relate them to the current state of the art. To the best of our knowl-edge, several strategies discussed in this paper have not been pub-lished before. An experimental evaluation complements the paper. It quantifies the performance characteristics of the different approaches and shows that indeed alternative execution strategies are needed in different circumstances, which make a cost-based query optimizer indispen-sable for adequate query performance

CiteSeerX

Crossref

An Algebraic Approach to XQuery Optimization

Author: May Norman
Publication venue: Universität Mannheim
Publication date: 01/01/2007
Field of study

As more data is stored in XML and more applications need to process this data, XML query optimization becomes performance critical. While optimization techniques for relational databases have been developed over the last thirty years, the optimization of XML queries poses new challenges. Query optimizers for XQuery, the standard query language for XML data, need to consider both document order and sequence order. Nevertheless, algebraic optimization proved powerful in query optimizers in relational and object oriented databases. Thus, this dissertation presents an algebraic approach to XQuery optimization. In this thesis, an algebra over sequences is presented that allows for a simple translation of XQuery into this algebra. The formal definitions of the operators in this algebra allow us to reason formally about algebraic optimizations. This thesis leverages the power of this formalism when unnesting nested XQuery expressions. In almost all cases unnesting nested queries in XQuery reduces query execution times from hours to seconds or milliseconds. Moreover, this dissertation presents three basic algebraic patterns of nested queries. For every basic pattern a decision tree is developed to select the most effective unnesting equivalence for a given query. Query unnesting extends the search space that can be considered during cost-based optimization of XQuery. As a result, substantially more efficient query execution plans may be detected. This thesis presents two more important cases where the number of plan alternatives leads to substantially shorter query execution times: join ordering and reordering location steps in path expressions. Our algebraic framework detects cases where document order or sequence order is destroyed. However, state-of-the-art techniques for order optimization in cost-based query optimizers have efficient mechanisms to repair order in these cases. The results obtained for query unnesting and cost-based optimization of XQuery underline the need for an algebraic approach to XQuery optimization for efficient XML query processing. Moreover, they are applicable to optimization in relational databases where order semantics are considered

MAnnheim DOCument Server

Self Maintenance of Materialized XQuery Views via Query Containment and Re-Writing

Author: Nilekar Shirish K.
Publication venue: Digital WPI
Publication date: 24/04/2006
Field of study

In recent years XML, the eXtensible Markup Language has become the de-facto standard for publishing and exchanging information on the web and in enterprise data integration systems. Materialized views are often used in information integration systems to present a unified schema for efficient querying of distributed and possibly heterogenous data sources. On similar lines, ACE-XQ, an XQuery based semantic caching system shows the significant performance gains achieved by caching query results (as materialized views) and using these materialized views along with query containment techniques for answering future queries over distributed XML data sources. To keep data in these materialized views of ACE-XQ up-to-date, the view must be maintained i.e. whenever the base data changes, the corresponding cached data in the materialized view must also be updated. This thesis builds on the query containment ideas of ACE-XQ and proposes an efficient approach for self-maintenance of materialized views. Our experimental results illustrate the significant performance improvement achieved by this strategy over view re-computation for a variety of situations

DigitalCommons@WPI

Query translation and optimisation for complex value databases

Author: Liu Hong-Cheu
Publication venue
Publication date: 01/01/2000
Field of study

This thesis considers the theory of database queries on the complex value data model extended with external functions. In modern intelligent database systems, we expect that query systems be able to handle a wide range of calculus formulas correctly and efficiently. Accordingly, they will require general query translators and efficient optimisers. Motivated by these concerns, this thesis undertakes a· comprehensive study of query evaluation in the complex value model and investigates the following issues: • identifying recursive sets of complex value formulas which define domain independent queries; • implementing complex value calculus queries with the incorporation of functions; • solving the problem of how to process join operation in complex value databases; and • investigating some algebraic properties concerning nested relational operators. The first part of this thesis extends some classical properties of the relational theory - particularly those related to query safety - to the context of complex value databases with fixed external functions and investigates the problem of how to implement calculus queries. Two notions of syntactic criteria for queries which guarantee domain independence, namely, embedded evaluable and embedded allowed, are generalised for this data model. This thesis shows that all embedded-allowed calculus (or fix-point) queries are external-function domain independent and continuous. This thesis discusses the topic of "embedded allowed database programs" and proves that embedded allowed stratified programs satisfying certain constraints are embedded domain independent. It also develops an algorithm for translating embedded allowed queries into equivalent algebraic expressions as a basis for evaluating safe queries in all calculus-based query classes. The second part of this thesis considers the issue of query optimisation for nested relational databases. Within a restricted set of nested schema trees, a join operator, called P-join, is proposed. The P-join operator does not require as many restructuring operators and combines the advantages of the extended natural join and recursive join for efficient data access. A P-join algorithm which takes advantage of a decomposed storage model and various join techniques available in the standard relational model to reduce the cost of join operation in nested relational databases is also proposed. Finally, this thesis investigates some algebraic properties of nested relational operators which are useful for query optimisation in the nested relational model and outlines a heuristic optimisation algorithm for nested relational expressions by adopting algebraic transformation rules developed in this thesis and previous related work

The Australian National University

Logical Foundations of Object-Oriented and Frame-Based Languages

Author: Kifer Michael
Lausen Georg
Wu James
Publication venue
Publication date: 01/01/1990
Field of study

We propose a novel logic, called Frame Logic (abbr., F-logic), that accounts in a clean, declarative fashion for most of the structural aspects of object-oriented and frame-based languages. These features include object identity, complex objects, inheritance, polymorphic types, methods, encapsulation, and others. In a sense, F-logic stands in the same relationship to the object-oriented paradigm as classical predicate calculus stands to relational programming. The syntax of F-logic is higher-order, which, among other things, allows the user to explore data and schema using the same declarative language. F-logic has a model-theoretic semantics and a sound and complete resolution-based proof procedure. This paper also discusses various aspects of programming in declarative object-oriented languages based on F-logic

CiteSeerX

MAnnheim DOCument Server

Equivalence of Queries with Nested Aggregation

Author: DeHaan David
Publication venue: 'University of Waterloo'
Publication date: 01/01/2009
Field of study

Query equivalence is a fundamental problem within database theory. The correctness of all forms of logical query rewriting—join minimization, view flattening, rewriting over materialized views, various semantic optimizations that exploit schema dependencies, federated query processing and other forms of data integration—requires proving that the final executed query is equivalent to the original user query. Hence, advances in the theory of query equivalence enable advances in query processing and optimization. In this thesis we address the problem of deciding query equivalence between conjunctive SQL queries containing aggregation operators that may be nested. Our focus is on understanding the interaction between nested aggregation operators and the other parts of the query body, and so we model aggregation functions simply as abstract collection constructors. Hence, the precise language that we study is a conjunctive algebraic language that constructs complex objects from databases of flat relations. Using an encoding of complex objects as flat relations, we reduce the query equivalence problem for this algebraic language to deciding equivalence between relational encodings output by traditional conjunctive queries (not containing aggregation). This encoding-equivalence cleanly unifies and generalizes previous results for deciding equivalence of conjunctive queries evaluated under various processing semantics. As part of our study of aggregation operators that can construct empty sub-collections—so-called “scalar” aggregation—we consider query equivalence for conjunctive queries extended with a left outer join operator, a very practical class of queries for which the general equivalence problem has never before been analyzed. Although we do not completely solve the equivalence problem for queries with outer joins or with scalar aggregation, we do propose useful sufficient conditions that generalize previously known results for restricted classes of queries. Overall, this thesis offers new insight into the fundamental principles governing the behaviour of nested aggregation

University of Waterloo's Institutional Repository

Weiterentwicklung analytischer Datenbanksysteme

Author: Kipf Andreas Michael
Publication venue: Technische Universität München
Publication date
Field of study

This thesis contributes to the state of the art in analytical database systems. First, we identify and explore extensions to better support analytics on event streams. Second, we propose a novel polygon index to enable efficient geospatial data processing in main memory. Third, we contribute a new deep learning approach to cardinality estimation, which is the core problem in cost-based query optimization.Diese Arbeit trägt zum aktuellen Forschungsstand von analytischen Datenbanksystemen bei. Wir identifizieren und explorieren Erweiterungen um Analysen auf Eventströmen besser zu unterstützen. Wir stellen eine neue Indexstruktur für Polygone vor, die eine effiziente Verarbeitung von Geodaten im Hauptspeicher ermöglicht. Zudem präsentieren wir einen neuen Ansatz für Kardinalitätsschätzungen mittels maschinellen Lernens

Programming Languages and Systems

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

This open access book constitutes the proceedings of the 30th European Symposium on Programming, ESOP 2021, which was held during March 27 until April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The 24 papers included in this volume were carefully reviewed and selected from 79 submissions. They deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems

OAPEN Library

Computer science: the hardware software and heart of IT

Author: Edward K. Blum Alfred V. Aho
Publication venue: Springer
Publication date
Field of study

1st edition, 201

Eastern University Digital Library