Thinking Big in a Small World — Efficient Query Execution on Small-Scale SMPs
Many techniques developed for parallel database systems were focused on large-scale, often prototypical, hardware platforms. Therefore, most results cannot easily be transferred to widely available workstation clusters such as multiprocessor workstations.
In this paper we address the exploitation of pipelining parallelism in query processing on small multiprocessor environments. We present DTE/R, a strategy for executing pipelining segments of arbitrary length by replicating the segment's operators. DTE/R thereby avoids the static processor-to-operator assignment of conventional processing techniques and, as a consequence, achieves automatic load balancing and skew handling. Furthermore, DTE/R substantially outperforms conventional pipelining execution techniques.
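The idea of replicating a whole pipeline segment, rather than pinning each operator to one processor, can be illustrated with a minimal sketch. This is not DTE/R itself: the operators `scan_filter` and `probe`, the row layout, and the batch size are hypothetical, and a thread pool stands in for the processors of a small SMP.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_filter(rows):
    # Hypothetical first operator of the segment: drop rows with no quantity.
    return [r for r in rows if r["qty"] > 0]


def probe(rows, hash_table):
    # Hypothetical second operator: hash-join probe against a prebuilt table.
    return [{**r, "name": hash_table[r["item"]]}
            for r in rows if r["item"] in hash_table]


def run_segment(batch, hash_table):
    # Every worker runs the complete segment on its batch, so no operator
    # is statically bound to a particular processor.
    return probe(scan_filter(batch), hash_table)


def execute_replicated(rows, hash_table, workers=4, batch_size=2):
    # Split the input into small batches; idle workers pick up the next
    # batch, which spreads skewed work across the pool.
    batches = [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: run_segment(b, hash_table), batches)
        return [row for part in parts for row in part]
```

Because each batch flows through the full segment on whichever worker is free, a slow batch delays only that worker instead of stalling a fixed pipeline stage.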
Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)
Database management systems (DBMSs) carefully optimize complex multi-join
queries to avoid expensive disk I/O. As servers today feature tens or hundreds
of gigabytes of RAM, a significant fraction of many analytic databases becomes
memory-resident. Even after careful tuning for an in-memory environment, a
linear disk I/O model such as the one implemented in PostgreSQL may make query
response time predictions that are up to 2X slower than the optimal multi-join
query plan over memory-resident data. This paper introduces a memory I/O cost
model to identify good evaluation strategies for complex query plans with
multiple hash-based equi-joins over memory-resident data. The proposed cost
model is carefully validated for accuracy using three different systems,
including an Amazon EC2 instance, to control for hardware-specific differences.
Prior work in parallel query evaluation has advocated right-deep and bushy
trees for multi-join queries due to their greater parallelization and
pipelining potential. A surprising finding is that the conventional wisdom from
shared-nothing disk-based systems does not directly apply to the modern
shared-everything memory hierarchy. As corroborated by our model, the
performance gap between the optimal left-deep and right-deep query plan can
grow to about 10X as the number of joins in the query increases.
Comment: 15 pages, 8 figures, extended version of the paper to appear in
SoCC'1
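The difference between the two plan shapes can be sketched with in-memory hash joins. This is an illustration only, not the paper's cost model: the helper names and the dictionary row layout are hypothetical, and it assumes the common convention that the left input of a hash join is the build side.

```python
def build(rows, key):
    # Build phase: hash every row of the build input on the join key.
    table = {}
    for r in rows:
        table.setdefault(r[key], []).append(r)
    return table


def probe(rows, table, key):
    # Probe phase: stream rows past the hash table and emit joined rows.
    for r in rows:
        for match in table.get(r[key], []):
            yield {**match, **r}


def right_deep(fact, dims):
    # Right-deep plan: build a hash table on every base relation first,
    # then pipeline the probe input through all joins without
    # materializing any intermediate result.
    tables = [(build(rows, key), key) for rows, key in dims]
    stream = iter(fact)
    for table, key in tables:
        stream = probe(stream, table, key)
    return list(stream)


def left_deep(fact, dims):
    # Left-deep plan: each join's result is materialized and becomes the
    # build input of the next join; the base relation is the probe side.
    current = list(fact)
    for rows, key in dims:
        table = build(current, key)
        current = list(probe(rows, table, key))
    return current
```

Both functions return the same join result; they differ in which inputs are hashed and in whether intermediate results are held in memory, which is the kind of difference a memory I/O cost model has to account for.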
Using Segmented Right-Deep Trees for the Execution of Pipelined Hash Joins
In this paper, we explore the execution of pipelined hash joins in a multiprocessor-based database system. To improve the query execution, an innovative approach on query execution tree selection is proposed to exploit segmented right-deep trees, which are bushy trees of right-deep subtrees. We first derive an analytical model for the execution of a pipeline segment, and then, in light of the model, develop heuristic schemes to determine the query execution plan based on a segmented right-deep tree so that the query can be efficiently executed. As shown by our simulation, the proposed approach, without incurring additional overhead on plan execution, possesses more flexibility in query plan generation, and leads to query plans of significantly better performance than those achievable by the previous schemes using right-deep trees.
Hochleistungs-Transaktionssysteme: Konzepte und Entwicklungen moderner Datenbankarchitekturen (High-Performance Transaction Systems: Concepts and Developments of Modern Database Architectures)
This book is intended for computer scientists in study, teaching, research, and
development who are interested in recent developments in the field of
transaction and database systems. It is a revised version of my habilitation
thesis, which was accepted in February 1993 by the Department of Computer
Science at the University of Kaiserslautern. In addition to presenting new
research results, it provides a broad introduction to the subject and an
overview of various implementation approaches, with emphasis placed on a
presentation that is as generally accessible as possible. The text is annotated
throughout with marginal notes, which further clarify the structure of the
chapters and are intended to help readers quickly locate specific content.