1,188 research outputs found
Database architecture evolution: Mammals flourished long before dinosaurs became extinct
The holy grail for database architecture research is to find a solution that is Scalable & Speedy, to run on anything from small ARM processors up to globally distributed compute clusters, Stable & Secure, to service a broad user community, Small & Simple, to be comprehensible to a small team of programmers, Self-managing, to let it run out-of-the-box without hassle. In this paper, we provide a trip report on this quest, covering both past experiences, ongoing research on hardware-conscious algorithms, and novel ways towards self-management specifically focused on column store solutions
Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)
Database management systems (DBMSs) carefully optimize complex multi-join
queries to avoid expensive disk I/O. As servers today feature tens or hundreds
of gigabytes of RAM, a significant fraction of many analytic databases becomes
memory-resident. Even after careful tuning for an in-memory environment, a
linear disk I/O model such as the one implemented in PostgreSQL may make query
response time predictions that are up to 2X slower than the optimal multi-join
query plan over memory-resident data. This paper introduces a memory I/O cost
model to identify good evaluation strategies for complex query plans with
multiple hash-based equi-joins over memory-resident data. The proposed cost
model is carefully validated for accuracy using three different systems,
including an Amazon EC2 instance, to control for hardware-specific differences.
Prior work in parallel query evaluation has advocated right-deep and bushy
trees for multi-join queries due to their greater parallelization and
pipelining potential. A surprising finding is that the conventional wisdom from
shared-nothing disk-based systems does not directly apply to the modern
shared-everything memory hierarchy. As corroborated by our model, the
performance gap between the optimal left-deep and right-deep query plan can
grow to about 10X as the number of joins in the query increases.Comment: 15 pages, 8 figures, extended version of the paper to appear in
SoCC'1
Staircase Join: Teach a Relational DBMS to Watch its (Axis) Steps
Relational query processors derive much of their effectiveness from the awareness of specific table properties like sort order, size, or absence of duplicate tuples. This text applies (and adapts) this successful principle to database-supported XML and XPath processing: the relational system is made tree aware, i.e., tree properties like subtree size, intersection of paths, inclusion or disjointness of subtrees are made explicit. We propose a local change to the database kernel, the staircase join, which encapsulates the necessary tree knowledge needed to improve XPath performance. Staircase join operates on an XML encoding which makes this knowledge available at the cost of simple integer operations (e.g., +, <=). We finally report on quite promising experiments with a staircase join enhanced main-memory database kernel
Runtime Optimizations for Prediction with Tree-Based Models
Tree-based models have proven to be an effective solution for web ranking as
well as other problems in diverse domains. This paper focuses on optimizing the
runtime performance of applying such models to make predictions, given an
already-trained model. Although exceedingly simple conceptually, most
implementations of tree-based models do not efficiently utilize modern
superscalar processor architectures. By laying out data structures in memory in
a more cache-conscious fashion, removing branches from the execution flow using
a technique called predication, and micro-batching predictions using a
technique called vectorization, we are able to better exploit modern processor
architectures and significantly improve the speed of tree-based models over
hard-coded if-else blocks. Our work contributes to the exploration of
architecture-conscious runtime implementations of machine learning algorithms
MonetDB: Two Decades of Research in Column-oriented Database Architectures
MonetDB is a state-of-the-art open-source column-store database management system targeting applications in need for analytics over large collections of data. MonetDB is actively used nowadays in
health care, in telecommunications as well as in scientific databases and in data management research,
accumulating on average more than 10,000 downloads on a monthly basis. This paper gives a brief
overview of the MonetDB technology as it developed over the past two decades and the main research
highlights which drive the current MonetDB design and form the basis for its future evolution
One size does not fit all : accelerating OLAP workloads with GPUs
GPU has been considered as one of the next-generation platforms for real-time query processing databases. In this paper we empirically demonstrate that the representative GPU databases [e.g., OmniSci (Open Source Analytical Database & SQL Engine,, 2019)] may be slower than the representative in-memory databases [e.g., Hyper (Neumann and Leis, IEEE Data Eng Bull 37(1):3-11, 2014)] with typical OLAP workloads (with Star Schema Benchmark) even if the actual dataset size of each query can completely fit in GPU memory. Therefore, we argue that GPU database designs should not be one-size-fits-all; a general-purpose GPU database engine may not be well-suited for OLAP workloads without careful designed GPU memory assignment and GPU computing locality. In order to achieve better performance for GPU OLAP, we need to re-organize OLAP operators and re-optimize OLAP model. In particular, we propose the 3-layer OLAP model to match the heterogeneous computing platforms. The core idea is to maximize data and computing locality to specified hardware. We design the vector grouping algorithm for data-intensive workload which is proved to be assigned to CPU platform adaptive. We design the TOP-DOWN query plan tree strategy to guarantee the optimal operation in final stage and pushing the respective optimizations to the lower layers to make global optimization gains. With this strategy, we design the 3-stage processing model (OLAP acceleration engine) for hybrid CPU-GPU platform, where the computing-intensive star-join stage is accelerated by GPU, and the data-intensive grouping & aggregation stage is accelerated by CPU. This design maximizes the locality of different workloads and simplifies the GPU acceleration implementation. Our experimental results show that with vector grouping and GPU accelerated star-join implementation, the OLAP acceleration engine runs 1.9x, 3.05x and 3.92x faster than Hyper, OmniSci GPU and OmniSci CPU in SSB evaluation with dataset of SF = 100.Peer reviewe
- …