4 research outputs found

    Decomposing and re-composing lightweight compression schemes - and why it matters

    We argue for a richer view of the space of lightweight compression schemes for columnar DBMSes: We demonstrate how even simple schemes used in DBMSes decompose into constituent schemes through a columnar perspective on their decompression. With our concrete examples, we touch briefly on what follows from these and other decompositions: composition of alternative compression schemes, as well as other practical and analytical implications.

    Faster across the PCIe bus: A GPU library for lightweight decompression including support for patched compression schemes

    This short paper presents a collection of GPU lightweight decompression algorithm implementations within a FOSS library, Giddy, the first to be published to offer such functionality. As the use of compression is important in ameliorating PCIe data transfer bottlenecks, we believe this library and its constituent implementations can serve as useful building blocks in GPU-accelerated DBMSes, as well as in other data-intensive systems. The paper also includes an initial exploration of GPU-oriented patched compression schemes. Patching makes the compression ratio robust against outliers, and is important with real-life data, which (in contrast to many synthetic benchmark datasets) exhibits non-uniform data distributions and noise. An experimental evaluation of both the unpatched and the patched schemes in Giddy is included.

    Optimizing group-by and aggregation using GPU-CPU co-processing

    While GPU query processing is a well-studied area, real adoption is limited in practice, as GPU execution is typically only significantly faster than CPU execution if the data resides in GPU memory, which limits scalability to small data scenarios where performance tends to be less critical. Another problem is that not all query code (e.g. UDFs) will realistically be able to run on GPUs. We therefore investigate CPU-GPU co-processing, where both the CPU and GPU are involved in evaluating the query in scenarios where the data does not fit in the GPU memory. As we wish to deeply explore opportunities for optimizing execution speed, we narrow our focus further to a specific well-studied OLAP scenario, amenable to such co-processing, in the form of the TPC-H benchmark Query 1. For this query, and at large scale factors, we are able to improve performance significantly over the state-of-the-art for GPU implementations; we present competitive performance of a GPU versus a state-of-the-art multi-core CPU baseline, a novelty for data exceeding GPU memory size; and finally, we show that co-processing does provide significant additional speedup over any of the processors individually. We achieve this performance improvement by utilizing parallelism-friendly compression to alleviate the PCIe transfer bottleneck, query-compilation-like fusion of the processing operations, and a simple yet effective scheduling mechanism. We hope that some of these features can inspire future work on GPU-focused and heterogeneous analytic DBMSes.

    Overtaking CPU DBMSes with a GPU in whole-query analytic processing with parallelism-friendly execution plan optimization

    Existing work on accelerating analytic DB query processing with (discrete) GPUs fails to fully realize their potential for speedup through parallelism: Published results do not achieve significant speedup over more performant CPU-only DBMSes when processing complete queries. This paper presents a successful