
    Multiple Query Optimization on the D-Wave 2X Adiabatic Quantum Computer

    The D-Wave adiabatic quantum annealer solves hard combinatorial optimization problems by leveraging quantum physics. The newest version features over 1000 qubits and was released in August 2015. We were given access to such a machine, currently hosted at NASA Ames Research Center in California, to explore its potential for hard optimization problems that arise in the context of databases. In this paper, we tackle the problem of multiple query optimization (MQO). We show how an MQO problem instance can be transformed into a mathematical formula that complies with the restrictive input format accepted by the quantum annealer. This formula is translated into weights on and between qubits such that the configuration minimizing the input formula can be found via a process called adiabatic quantum annealing. We analyze how the number of required qubits grows asymptotically in the MQO problem dimensions, since the number of qubits is currently the main factor restricting applicability. We experimentally compare the performance of the quantum annealer against other MQO algorithms executed on a traditional computer. While the problem sizes that can be treated are currently limited, we already identify a class of problem instances where the quantum annealer is three orders of magnitude faster than the other approaches.
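    The abstract describes mapping an MQO instance onto the quadratic binary formula format (a QUBO) accepted by the annealer. The toy Python sketch below illustrates that general idea under invented assumptions: binary variables select one plan per query, negative quadratic weights reward plan pairs that share intermediate results, and a quadratic penalty enforces the one-plan-per-query constraint; exhaustive search stands in for the annealer, which would instead receive the same cost function as weights on and between qubits. The plan costs, savings, and penalty value are illustrative and not taken from the paper.

```python
"""Illustrative sketch: casting a tiny multiple-query optimization (MQO)
instance as a QUBO (quadratic unconstrained binary optimization) problem,
the kind of input format accepted by quantum annealers such as D-Wave's.
All plan costs and savings below are made-up example values."""

from itertools import product

# Two queries, each with two candidate plans; cost of executing each plan.
plan_cost = {("q1", "p1"): 5, ("q1", "p2"): 7,
             ("q2", "p1"): 6, ("q2", "p2"): 4}

# Saving when two plans (of different queries) share an intermediate result.
shared_saving = {(("q1", "p1"), ("q2", "p1")): 3}

variables = list(plan_cost)          # one binary variable per (query, plan) pair
PENALTY = 100                        # enforces "exactly one plan per query"

def energy(assignment):
    """QUBO energy: plan costs - sharing savings + constraint penalties."""
    e = sum(plan_cost[v] for v, bit in assignment.items() if bit)
    for (u, v), s in shared_saving.items():
        if assignment[u] and assignment[v]:
            e -= s
    for q in {q for q, _ in variables}:
        chosen = sum(assignment[v] for v in variables if v[0] == q)
        e += PENALTY * (chosen - 1) ** 2   # quadratic penalty, QUBO-compatible
    return e

# Exhaustive search stands in for the annealer on this toy instance.
best = min((dict(zip(variables, bits))
            for bits in product([0, 1], repeat=len(variables))),
           key=energy)
print({v for v, bit in best.items() if bit}, energy(best))
```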

    Automatic physical database design: recommending materialized views

    This work discusses physical database design, focusing on the problem of selecting materialized views to improve the performance of a database system. We first address the satisfiability and implication problems for mixed arithmetic constraints. The results are used to support the construction of a search space for view selection problems. We propose an approach for constructing a search space based on identifying maximum commonalities among queries and on rewriting queries using views. These commonalities are used to define candidate views for materialization, from which an optimal or near-optimal set can be chosen as a solution to the view selection problem. Using a search space constructed in this way, we address a specific instance of the view selection problem that aims at minimizing the maintenance cost of multiple materialized views using multi-query optimization techniques. We then study the same problem in the context of a commercial database management system in the presence of memory and time restrictions, and suggest a heuristic approach for maintaining the views while guaranteeing that the restrictions are satisfied. Finally, we consider a dynamic version of the view selection problem in which the workload is a sequence of query and update statements and views can be created (materialized) and dropped during the execution of the workload. We have implemented our approaches to the dynamic view selection problem and performed extensive experimental testing. Our experiments show that, in most cases, our approaches outperform previous ones in terms of effectiveness and efficiency.
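    As a rough illustration of the trade-off the abstract mentions (choosing materialized views under memory restrictions), the hypothetical Python sketch below greedily selects candidate views by net benefit per unit of memory. The candidate views, benefits, maintenance costs, and sizes are invented, and the thesis' actual search space, built from query commonalities and view-based rewritings, is far richer than this greedy heuristic.

```python
"""Illustrative sketch of view selection under a memory restriction:
greedily pick candidate views by (query benefit - maintenance cost) per
unit of space until the budget is exhausted. All values are invented."""

# name -> (query-cost benefit, maintenance cost, size in MB)
candidates = {
    "v_sales_by_day":  (120, 10, 200),
    "v_top_customers": (80,  5,  50),
    "v_region_rollup": (60,  20, 300),
}

def select_views(candidates, memory_budget_mb):
    chosen, used = [], 0
    # Rank candidates by net benefit per MB of memory consumed.
    ranked = sorted(candidates.items(),
                    key=lambda kv: (kv[1][0] - kv[1][1]) / kv[1][2],
                    reverse=True)
    for name, (benefit, maint, size) in ranked:
        if benefit - maint > 0 and used + size <= memory_budget_mb:
            chosen.append(name)
            used += size
    return chosen, used

print(select_views(candidates, memory_budget_mb=400))
```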

    Common Subexpression Processing in Multiple-Query Processing

    The efficiency of common subexpression identification is critical to the performance of multiple-query processing. In this paper, we develop a multi-graph for representing and facilitating the processing of multiple queries. Whereas traditional multiple-query processing approaches exploit common subexpressions only in the identical and subsumption cases, the proposed multi-graph processing also covers the overlap case. A performance study shows the viability of this technique when compared to an earlier multi-graph approach.
    Keywords: Query optimization, query processing, multiple query processing, subexpression identification, select-project-join, databases.
    1 Introduction. The main idea of multiple-query processing (MQP) is to optimize a set of queries together and execute the common operations once. Sellis shows that using MQP may lead to substantial savings over single-query processing (SQP) [12]. The major tasks in MQP are common operation/subexpression identification an…
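    The entry distinguishes three ways in which common subexpressions of two queries can relate: identical, subsumption, and overlap. The small Python sketch below only illustrates that three-way classification on queries modelled crudely as sets of (relation, predicate) atoms; the multi-graph representation developed in the paper is not reproduced here, and the sample queries are invented.

```python
"""Illustrative sketch of classifying how the subexpressions of two queries
relate: identical, subsumption (one contained in the other), or overlap
(a non-empty common part). Queries are modelled as frozensets of
(relation, predicate) atoms purely for demonstration."""

def classify(expr_a, expr_b):
    common = expr_a & expr_b
    if expr_a == expr_b:
        return "identical"
    if common == expr_a or common == expr_b:
        return "subsumption"
    if common:
        return "overlap"
    return "disjoint"

q1 = frozenset({("orders", "year = 2023"), ("customers", "region = 'EU'")})
q2 = frozenset({("orders", "year = 2023")})
q3 = frozenset({("orders", "year = 2023"), ("items", "price > 10")})

print(classify(q1, q2))  # subsumption: q2's atoms are contained in q1
print(classify(q1, q3))  # overlap: the shared 'orders' atom could be computed once
```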

    From Massive Parallelization to Quantum Computing: Seven Novel Approaches to Query Optimization

    The goal of query optimization is to map a declarative query (describing the data to generate) to a query plan (describing how to generate the data) with optimal execution cost. Query optimization is required to support declarative query interfaces. It is a core problem in the area of database systems and has received tremendous attention in the research community, starting with an initial publication in 1979. In this thesis, we revisit the query optimization problem. This visit is motivated by several developments that change the context of query optimization and that are not reflected in prior literature. First, advances in query execution platforms and processing techniques have changed the context of query optimization. Novel provisioning models and processing techniques such as Cloud computing, crowdsourcing, or approximate processing make it possible to trade between different execution cost metrics (e.g., execution time versus monetary execution fees in the case of Cloud computing). This makes it necessary to compare alternative execution plans according to multiple cost metrics during query optimization. While this is a common scenario nowadays, the literature on query optimization with multiple cost metrics (a generalization of the classical problem variant with a single execution cost metric) is surprisingly sparse. Whereas prior methods take hours to optimize even moderately sized queries when considering multiple cost metrics, we propose a multitude of approaches that make query optimization in such scenarios practical. A second development that we address in this thesis is the availability of novel software and hardware platforms that can be exploited for optimization. We show that integer programming solvers, massively parallel clusters (which nowadays are commonly used for query execution), and adiabatic quantum annealers enable us to solve query optimization problem instances that are far beyond the capabilities of prior approaches. In summary, we propose seven novel approaches to query optimization that significantly increase the size of the problem instances that can be addressed (measured by the query size and by the number of considered execution cost metrics). These approaches fall into three broad categories: moving query optimization before run time to relax constraints on optimization time; trading optimization time for relaxed optimality guarantees (leading to approximation schemes, incremental algorithms, and randomized algorithms for query optimization with multiple cost metrics); and reducing optimization time by leveraging novel software and hardware platforms (integer programming solvers, massively parallel clusters, and adiabatic quantum annealers). They are novel because they address new problem variants of query optimization introduced in this thesis, because they are the first of their kind for their respective problem variant (e.g., we propose the first randomized algorithm for query optimization with multiple cost metrics), or because they have never before been used for optimization problems in the database domain (e.g., this is the first time that quantum computing is used to solve a database-specific optimization problem).
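    One building block implied by optimizing with multiple cost metrics is comparing plans by Pareto dominance rather than by a single scalar cost. The hypothetical Python sketch below shows that comparison on invented plans with two metrics (execution time and monetary fee); it is only an illustration of the basic idea, not one of the seven approaches proposed in the thesis.

```python
"""Illustrative sketch of comparing query plans by multiple cost metrics
(e.g., execution time and monetary fee in a Cloud setting) via Pareto
dominance: a plan is kept only if no other plan is at least as good on
every metric and strictly better on at least one. All values are invented."""

def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_frontier(plans):
    return {name: cost for name, cost in plans.items()
            if not any(dominates(other, cost)
                       for o, other in plans.items() if o != name)}

# plan name -> (execution time in seconds, monetary fee in cents)
plans = {"hash_join_local":   (12.0, 0.0),
         "sort_merge_cloud":  (4.0, 9.0),
         "nested_loop_cloud": (15.0, 11.0)}   # dominated on both metrics

print(pareto_frontier(plans))
```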

    An investigation of computer based nominal data record linkage

    The Internet now provides access to vast volumes of nominal data (data associated with names, e.g. birth/death records, parish records, text articles, multimedia) collected for a range of different purposes. This research focuses on parish registers containing baptism, marriage, and burial records. Mining these data resources involves linkage: investigating how two records are related with respect to attributes such as surname, spatio-temporal location, legal association, and inter-relationships. Furthermore, as well as handling the implicit constraints of nominal data, such a system must also be able to handle automatically a range of temporal and spatial rules and constraints. The research examines the linkage rules that apply and how such rules interact. This investigation reports on current practices in several disciplines (e.g. history, demography, genealogy, and epidemiology) and how these are implemented in current computer and database systems. The practical aspects of this study, and the workbench approach proposed, are centred on the extensive Lancashire & Cheshire Parish Register archive held on the MIMAS database computer located at Manchester University. The research also proposes how these findings can have wider applications. This thesis describes some initial research into this problem. It describes three prototypes of a nominal data workbench that allow the specification and examination of several linkage types, and discusses the merits of alternative name matching methods, name grouping techniques, and method comparisons. The conclusion is that, in the cases examined so far, effective nominal data linkage is essentially a query optimisation process. The process is made more efficient if linkage-specific indexes exist, and the work suggests that query re-organisation based on these indexes, though a complex process, is entirely feasible. To facilitate the use of indexes and to guide the optimisation process, the work suggests the use of formal ontologies.
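    Among the name grouping techniques whose merits the thesis compares, phonetic codes such as Soundex are a standard example. The Python sketch below implements a simplified Soundex and uses it to group invented sample names into candidate blocks; it illustrates the general flavour of name grouping for linkage, not the specific methods evaluated in the thesis.

```python
"""Illustrative sketch of name grouping for record linkage using a
simplified Soundex code. Grouping records by code is a cheap blocking
step, so that expensive comparisons only consider candidates within a
group. The sample names are invented."""

from collections import defaultdict

CODES = {**{c: "1" for c in "bfpv"}, **{c: "2" for c in "cgjkqsxz"},
         **{c: "3" for c in "dt"}, "l": "4",
         **{c: "5" for c in "mn"}, "r": "6"}

def soundex(name):
    """Simplified American Soundex: first letter plus up to three digits."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return "0000"
    digits, prev = [], CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        prev = code
    return (letters[0].upper() + "".join(digits) + "000")[:4]

def group_by_soundex(names):
    groups = defaultdict(list)
    for n in names:
        groups[soundex(n)].append(n)
    return dict(groups)

print(group_by_soundex(["Smith", "Smyth", "Smithe", "Taylor", "Tailor"]))
```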