26 research outputs found

Analytical response time estimation in parallel relational database systems

Author: Broughton P.
Burger A.
Dempster E.
King P. J. B.
Taylor H.
Tomov N.
Williams Howard
Publication venue: Elsevier BV
Publication date: 01/02/2004

Techniques for performance estimation in parallel database systems are well established for parameters such as throughput, bottlenecks and resource utilisation. However, response time is difficult to predict, and its estimation has attracted research for a number of years. Simulation is one option for predicting response time, but it is a costly process. Analytical modelling is a less expensive option, but it requires approximations and assumptions about the queueing networks built up in real parallel database machines, and these are often questionable. Moreover, few of the papers on analytical approaches are backed by results from validation against real machines. This paper describes a new analytical approach for response time estimation that is based on a detailed study of different approaches and assumptions. The approach has been validated against two commercial parallel DBMSs running on actual parallel machines and is shown to produce acceptable accuracy.

Heriot Watt Pure

Abertay Research Portal

Thinking Big in a Small World — Efficient Query Execution on Small-Scale SMPs

Author: A Wilschut
B Bergsten
C Chekuri
C Walton
D DeWitt
D Schneider
F Cariño
G Graefe
J Srivastava
M-S Chen
P Apers
S Manegold
W Hasan
WHM Stonebraker
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/1998

Many techniques developed for parallel database systems were focused on large-scale, often prototypical, hardware platforms. Therefore, most results cannot easily be transferred to widely available workstation clusters such as multiprocessor workstations. In this paper we address the exploitation of pipelining parallelism in query processing on small multiprocessor environments. We present DTE/R, a strategy for executing pipelining segments of arbitrary length by replicating the segment's operator. DTE/R thereby avoids the static processor-to-operator assignment of conventional processing techniques. Consequently, DTE/R achieves automatic load-balancing and skew-handling. Furthermore, DTE/R outperforms conventional pipelining execution techniques substantially.

Crossref

CWI's Institutional Repository

Efficient resource utilization in shared-everything environments

Author: Manegold S. (Stefan)
Obermaier J.K.
Publication venue: CWI
Publication date: 01/01/1997

Efficient resource usage is key to achieving better performance in parallel database systems. Up to now, most research has focussed on balancing the load on several resources of the same type, i.e. balancing either CPU load or I/O load. In this paper, we present floating probe, a strategy for parallel evaluation of pipelining segments in a shared-everything environment that provides dynamic load balancing between CPU and I/O resources. The key idea of floating probe is to overlap, as far as data dependencies allow, the I/O-bound build phase and the CPU-bound probe phase of pipelining segments to improve resource utilization. Simulation results show that floating probe achieves shorter execution times while consuming less memory than conventional pipelining strategies.

CWI's Institutional Repository

Parallel evaluation of multi-join queries

Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/1995

Crossref

MDCC: Multi-Data Center Consistency

Author: Fekete Alan
Franklin Michael J.
Kraska Tim
Madden Samuel R.
Pang Gene
Publication date: 01/01/2012

Replicating data across multiple data centers not only moves the data closer to the user, and thus reduces latency for applications, but also increases availability in the event of a data center failure. Therefore, it is not surprising that companies like Google, Yahoo, and Netflix already replicate user data across geographically different regions. However, replication across data centers is expensive. Inter-data center network delays are in the hundreds of milliseconds and vary significantly. Synchronous wide-area replication is therefore considered to be infeasible with strong consistency, and current solutions either settle for asynchronous replication, which implies the risk of losing data in the event of failures, restrict consistency to small partitions, or give up consistency entirely. With MDCC (Multi-Data Center Consistency), we describe the first optimistic commit protocol that does not require a master or partitioning and is strongly consistent at a cost similar to eventually consistent protocols. MDCC can commit transactions in a single round-trip across data centers in the normal operational case. We further propose a new programming model which empowers the application developer to handle longer and unpredictable latencies caused by inter-data center communication. Our evaluation using the TPC-W benchmark with MDCC deployed across 5 geographically diverse data centers shows that MDCC is able to achieve throughput and latency similar to eventually consistent quorum protocols and that MDCC is able to sustain a data center outage without a significant impact on response times while guaranteeing strong consistency.

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Crossref