43 research outputs found

Parallel evaluation of multi-join queries

Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/1995

Automatic Parallelization of Database Queries

Author: Dietz Henry G.
Kang Myong H.
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/1990

Although automatic parallelization of conventional language programs is now widely accepted, relatively little emphasis has been placed on automatic parallelization of database query programs (sometimes referred to as “multiple queries”). In this paper, we discuss the unique problems associated with automatic parallelization of database programs. From this discussion, we derive a complete approach to automatic parallelization of database programs. Besides integrating a number of existing techniques, our approach relies heavily on several new concepts, including the concepts of “algorithm-level” analysis and hybrid static/dynamic scheduling.

Purdue E-Pubs

Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)

Author: Ailamaki A.
Boncz P. A.
Chen M.-S.
DeWitt D. J.
Lang H.
Li Y.
Liu B.
Lohman G. M.
Lu H.
Manegold S.
Ono K.
Schneider D. A.
Shatdal A.
Stillger M.
Zhang N.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 21/07/2015

Database management systems (DBMSs) carefully optimize complex multi-join queries to avoid expensive disk I/O. As servers today feature tens or hundreds of gigabytes of RAM, a significant fraction of many analytic databases becomes memory-resident. Even after careful tuning for an in-memory environment, a linear disk I/O model such as the one implemented in PostgreSQL may make query response time predictions that are up to 2X slower than the optimal multi-join query plan over memory-resident data. This paper introduces a memory I/O cost model to identify good evaluation strategies for complex query plans with multiple hash-based equi-joins over memory-resident data. The proposed cost model is carefully validated for accuracy using three different systems, including an Amazon EC2 instance, to control for hardware-specific differences. Prior work in parallel query evaluation has advocated right-deep and bushy trees for multi-join queries due to their greater parallelization and pipelining potential. A surprising finding is that the conventional wisdom from shared-nothing disk-based systems does not directly apply to the modern shared-everything memory hierarchy. As corroborated by our model, the performance gap between the optimal left-deep and right-deep query plan can grow to about 10X as the number of joins in the query increases. Comment: 15 pages, 8 figures, extended version of the paper to appear in SoCC'1

arXiv.org e-Print Archive

Crossref

PRISMA/DB: A Parallel Main Memory Relational DBMS

Author: Apers P.M.G. (Peter)
Berg C.A. van den
Flokstra J.
Grefen P.W.P.J.
Kersten M.L. (Martin)
Wilschut A.N.
Publication venue: I.E.E.E. Computer Society Press
Publication date: 01/12/1992

CWI's Institutional Repository

Parallel evaluation of multi-join queries

Publication venue: 'Springer Science and Business Media LLC'

Crossref

Analytical response time estimation in parallel relational database systems

Author: Broughton P.
Burger A.
Dempster E.
King P. J. B.
Taylor H.
Tomov N.
Williams Howard
Publication venue: 'Elsevier BV'
Publication date: 01/02/2004

Techniques for performance estimation in parallel database systems are well established for parameters such as throughput, bottlenecks and resource utilisation. However, response time estimation is a complex activity which is difficult to predict and has attracted research for a number of years. Simulation is one option for predicting response time, but this is a costly process. Analytical modelling is a less expensive option, but it requires approximations and assumptions about the queueing networks built up in real parallel database machines, which are often questionable, and few of the papers on analytical approaches are backed by results from validation against real machines. This paper describes a new analytical approach for response time estimation that is based on a detailed study of different approaches and assumptions. The approach has been validated against two commercial parallel DBMSs running on actual parallel machines and is shown to produce acceptable accuracy.

Heriot Watt Pure

Abertay Research Portal

Memory aware query scheduling in a database cluster

Author: Kersten M.L. (Martin)
Waas F.
Publication venue: CWI
Publication date: 01/01/2000

Query throughput is one of the primary optimization goals in interactive web-based information systems in order to achieve the performance necessary to serve large user communities. Queries in this application domain differ significantly from those in traditional database applications: they are of lower complexity and almost exclusively read-only. The architecture we propose here is specifically tailored to take advantage of the query characteristics. It is based on a large parallel shared-nothing database cluster where each node runs a separate server with a fully replicated copy of the database. A query is assigned to and entirely executed on one single node, avoiding network contention or synchronization effects. However, the actual key to enhanced throughput is a resource-efficient scheduling of the arriving queries. We develop a simple and robust scheduling scheme that takes the data currently resident in memory at each server into account and trades off memory re-use and execution time, reordering queries as necessary. Our experimental evaluation demonstrates its effectiveness when scaling the system beyond hundreds of nodes, showing super-linear speedup.

CWI's Institutional Repository

On applying hash filters to improving the execution of multi-join queries

Author: Hui-I Hsiao
Ming-Syan Chen
Philip S. Yu
Publication venue: 'Springer Science and Business Media LLC'

Crossref