Search CORE

4 research outputs found

Decomposing and re-composing lightweight compression schemes - and why it matters

Author: Rozenberg E. (Eyal)
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/04/2018

We argue for a richer view of the space of lightweight compression schemes for columnar DBMSes: We demonstrate how even simple schemes used in DBMSes decompose into constituent schemes through a columnar perspective on their decompression. With our concrete examples, we touch briefly on what follows from these and other decompositions: Composition of alternative compression schemes as well as other practical and analytical implications.
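To make the idea of decomposition concrete, here is a minimal Python sketch. It is not taken from the paper: the choice of run-length encoding (RLE) as the example, the function name `rle_decompress_decomposed`, and the three-step split are all assumptions made for illustration. It expresses RLE decompression as a chain of simple column operations: a prefix sum, a position lookup, and a gather.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def rle_decompress_decomposed(run_values: Sequence[T],
                              run_lengths: Sequence[int]) -> List[T]:
    """Illustrative (hypothetical) decomposition of RLE decompression."""
    # Step 1: an inclusive prefix sum turns run lengths into run end positions.
    # This is the same operation that decompresses a delta-encoded column.
    run_ends = list(accumulate(run_lengths))
    total = run_ends[-1] if run_ends else 0

    # Step 2: for every output position, find the index of the run covering it.
    run_index = [bisect_right(run_ends, position) for position in range(total)]

    # Step 3: gather the run values through that index column.
    return [run_values[r] for r in run_index]


decoded = rle_decompress_decomposed(["a", "b", "c"], [2, 1, 3])
```

Each step operates on whole columns, which is the kind of view that lets a compound scheme be split into parts and recombined with alternatives.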

Crossref

CWI's Institutional Repository

Faster across the PCIe bus: A GPU library for lightweight decompression including support for patched compression schemes

Author: Boncz P.A. (Peter)
Rozenberg E. (Eyal)
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/05/2017

This short paper presents a collection of GPU lightweight decompression algorithm implementations within a FOSS library, Giddy, the first to be published to offer such functionality. As the use of compression is important in ameliorating PCIe data transfer bottlenecks, we believe this library and its constituent implementations can serve as useful building blocks in GPU-accelerated DBMSes, as well as in other data-intensive systems. The paper also includes an initial exploration of GPU-oriented patched compression schemes. Patching makes the compression ratio robust against outliers, and is important with real-life data, which (in contrast to many synthetic benchmark datasets) exhibits non-uniform data distributions and noise. An experimental evaluation of both the unpatched and the patched schemes in Giddy is included.
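The effect of patching on outliers can be sketched in a few lines of Python. This is an assumption-laden illustration, not Giddy's API or the paper's algorithm: the function names `patched_for_compress` and `patched_for_decompress`, the use of frame-of-reference encoding as the base scheme, and the sample data are all hypothetical.

```python
from typing import List, Tuple

Patch = Tuple[int, int]  # (position, original value)


def patched_for_compress(values: List[int],
                         width: int) -> Tuple[int, List[int], List[Patch]]:
    """Hypothetical frame-of-reference encoder with a fixed offset width.

    Offsets that do not fit in `width` bits are stored as patches instead.
    """
    reference = min(values)
    limit = 1 << width
    offsets: List[int] = []
    patches: List[Patch] = []
    for position, value in enumerate(values):
        offset = value - reference
        if offset < limit:
            offsets.append(offset)
        else:
            offsets.append(0)  # placeholder slot, overwritten on decompression
            patches.append((position, value))
    return reference, offsets, patches


def patched_for_decompress(reference: int, offsets: List[int],
                           patches: List[Patch]) -> List[int]:
    """Decode all offsets first, then overwrite the patched positions."""
    decoded = [reference + offset for offset in offsets]
    for position, value in patches:
        decoded[position] = value
    return decoded


values = [100, 103, 101, 9000, 102, 107]
reference, offsets, patches = patched_for_compress(values, width=3)
restored = patched_for_decompress(reference, offsets, patches)

# Offset width an unpatched scheme would need to cover the outlier as well.
unpatched_width = (max(values) - min(values)).bit_length()
```

In this sample, a single outlier would force every offset to 14 bits without patching, while the patched variant keeps 3-bit offsets and stores one exception separately.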

CWI's Institutional Repository

Optimizing group-by and aggregation using GPU-CPU co-processing

Author: Boncz P.A. (Peter)
Gomes Tomé D. (Diego)
Gubner T.K. (Tim)
Raasveldt M. (Mark)
Rozenberg E. (Eyal)
Publication date: 27/08/2018

While GPU query processing is a well-studied area, real adoption is limited in practice, as GPU execution is typically only significantly faster than CPU execution if the data resides in GPU memory, which limits scalability to small-data scenarios where performance tends to be less critical. Another problem is that not all query code (e.g. UDFs) will realistically be able to run on GPUs. We therefore investigate CPU-GPU co-processing, where both the CPU and GPU are involved in evaluating the query in scenarios where the data does not fit in the GPU memory. As we wish to deeply explore opportunities for optimizing execution speed, we narrow our focus further to a specific well-studied OLAP scenario, amenable to such co-processing, in the form of the TPC-H benchmark Query 1. For this query, and at large scale factors, we are able to improve performance significantly over the state-of-the-art for GPU implementations; we present competitive performance of a GPU versus a state-of-the-art multi-core CPU baseline, a novelty for data exceeding GPU memory size; and finally, we show that co-processing does provide significant additional speedup over any of the processors individually. We achieve this performance improvement by utilizing parallelism-friendly compression to alleviate the PCIe transfer bottleneck, query-compilation-like fusion of the processing operations, and a simple yet effective scheduling mechanism. We hope that some of these features can inspire future work on GPU-focused and heterogeneous analytic DBMSes.

CWI's Institutional Repository

Overtaking CPU DBMSes with a GPU in whole-query analytic processing with parallelism-friendly execution plan optimization

Author: Agbaria A. (Adnan)
Huawei Research
Minor D. (David)
Peterfreund N. (Natan)
Rosenberg O. (Ofer)
Rozenberg E. (Eyal)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/09/2016

Existing work on accelerating analytic DB query processing with (discrete) GPUs fails to fully realize their potential for speedup through parallelism: Published results do not achieve significant speedup over more performant CPU-only DBMSes when processing complete queries. This paper presents a successful

CWI's Institutional Repository