Decomposing and re-composing lightweight compression schemes - and why it matters
We argue for a richer view of the space of lightweight compression schemes for columnar DBMSes: we demonstrate how even simple schemes used in DBMSes decompose into constituent schemes when their decompression is viewed from a columnar perspective. With concrete examples, we touch briefly on what follows from these and other decompositions: the composition of alternative compression schemes, as well as other practical and analytical implications.
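To make the idea of composing simple columnar schemes concrete, here is a minimal illustrative sketch (not the paper's actual decomposition) of two textbook schemes, DELTA and RLE, and how chaining them collapses an arithmetic column into a pair of runs:

```python
# Illustrative sketch only: two simple columnar schemes and one composition.

def delta_encode(col):
    # DELTA: store each value as its difference from the predecessor.
    return [col[0]] + [b - a for a, b in zip(col, col[1:])]

def rle_encode(col):
    # RLE: store (value, run_length) pairs.
    runs = []
    for v in col:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

# Composing DELTA then RLE: an arithmetic sequence collapses to two runs.
col = [10, 12, 14, 16, 18, 20]
print(rle_encode(delta_encode(col)))  # [(10, 1), (2, 5)]
```

The point of the composition view is exactly this kind of reuse: each constituent scheme stays trivial, and new schemes arise by chaining them.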
The Bitonic Sorting Method on the GPU
The shift in computer architecture toward multiprocessors allows more processes to be executed simultaneously, but it does not significantly increase the speed of each individual process. Speeding up an individual process can instead be achieved by speeding up the software, and software speed is largely determined by its algorithm. Finding a faster algorithm is not easy, but with multiprocessor computers, faster algorithms can be designed by parallelizing the computation. One example of a multiprocessor implementation in graphics is the GPU (graphical processing unit), pioneered by NVIDIA. GPUs run parallel-computing algorithms; one such problem is sorting.
Sorting is one of the fundamental problems frequently raised in parallel processing. A common solution strategy is Divide and Conquer: a large problem is recursively divided into smaller parts until each part can be solved directly, and the partial solutions are then combined into a complete solution. A sorting method of this kind is bitonic sort.
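The divide-and-conquer structure described above can be sketched as a short sequential program. In each merge step the compare-and-swap operations are independent of one another, which is what makes the network map well onto GPU threads. A minimal sketch, assuming a power-of-two input length:

```python
def bitonic_sort(a, ascending=True):
    # Divide: sort one half ascending and the other descending,
    # producing a bitonic sequence; then merge it.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    first = bitonic_sort(a[:mid], True)
    second = bitonic_sort(a[mid:], False)
    return bitonic_merge(first + second, ascending)

def bitonic_merge(a, ascending):
    # Conquer: compare-and-swap across the two halves. These
    # comparisons are mutually independent, so on a GPU each one
    # can be assigned to its own thread.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    for i in range(mid):
        if (a[i] > a[i + mid]) == ascending:
            a[i], a[i + mid] = a[i + mid], a[i]
    return (bitonic_merge(a[:mid], ascending)
            + bitonic_merge(a[mid:], ascending))

print(bitonic_sort([3, 1, 2, 4]))  # [1, 2, 3, 4]
```

This sequential version only illustrates the data flow; a GPU implementation would replace each recursion level with one parallel kernel launch.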
Efficient Cross-Device Query Processing
The increasing diversity of hardware within a single system promises large performance gains but also poses a challenge for data management systems. Strategies for the efficient use of hardware with large performance differences are still lacking. For example, existing research on GPU-supported data management largely handles the GPU in isolation from the system's CPU: the GPU is considered the central processor, and the CPU is used only to mitigate the GPU's weaknesses where necessary. To make efficient use of all available devices, we developed a processing strategy that lets unequal devices like GPU and CPU combine their strengths rather than work in isolation. To this end, we decompose relational data into individual bits and place the resulting partitions on the appropriate devices. Operations are processed in phases, each phase executed on one device. This way, we achieve significant performance gains and good load distribution among the available devices in a limited real-life use case. To grow this idea into a generic system, we identify challenges as well as potential hardware configurations and applications that can benefit from this approach.
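The core data layout, decomposing a column into per-bit partitions that can live on different devices, can be illustrated with a minimal sketch (the function names here are our own, not the paper's):

```python
def bit_partitions(values, width):
    # One partition per bit position: partition b holds bit b of every
    # value. Each partition can then be placed on a different device.
    return [[(v >> b) & 1 for v in values] for b in range(width)]

def recompose(parts):
    # Reassemble the original values from the bit-partitions.
    return [sum(bit << b for b, bit in enumerate(col))
            for col in zip(*parts)]

col = [5, 3, 0, 7]
parts = bit_partitions(col, 3)
print(parts)       # [[1, 1, 0, 1], [0, 1, 0, 1], [1, 0, 0, 1]]
print(recompose(parts))  # [5, 3, 0, 7]
```

A phase of processing then touches only the partitions resident on one device, which is what keeps PCIe traffic low.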
Faster across the PCIe bus: A GPU library for lightweight decompression including support for patched compression schemes
This short paper presents a collection of GPU implementations of lightweight decompression algorithms within Giddy, a FOSS library and the first published to offer such functionality. As the use of compression is important in ameliorating PCIe data transfer bottlenecks, we believe this library and its constituent implementations can serve as useful building blocks in GPU-accelerated DBMSes, as well as in other data-intensive systems.
The paper also includes an initial exploration of GPU-oriented patched compression schemes. Patching makes compression ratio robust against outliers, and is important with real-life data, which (in contrast to many synthetic benchmark datasets) exhibits non-uniform data distributions and noise.
An experimental evaluation of both the unpatched and the patched schemes in Giddy is included.
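The patching idea, storing outliers as exceptions so they cannot blow up the bit width of the packed data, can be sketched in a PFOR-style toy encoder (a simplified illustration, not Giddy's actual format):

```python
def pfor_encode(values, base, width):
    # Frame-of-reference with patching: values that fit in `width` bits
    # above `base` are stored packed; outliers become (position, value)
    # exception "patches" stored on the side.
    limit = 1 << width
    packed, patches = [], []
    for i, v in enumerate(values):
        d = v - base
        if 0 <= d < limit:
            packed.append(d)
        else:
            packed.append(0)          # placeholder, repaired on decode
            patches.append((i, v))
    return packed, patches

def pfor_decode(packed, patches, base):
    # Bulk-decode everything, then apply the (few) patches last; on a
    # GPU the bulk phase is perfectly data-parallel.
    out = [base + d for d in packed]
    for i, v in patches:
        out[i] = v
    return out

values = [100, 101, 103, 9999, 102]
packed, patches = pfor_encode(values, base=100, width=2)
print(packed, patches)   # [0, 1, 3, 0, 2] [(3, 9999)]
print(pfor_decode(packed, patches, base=100) == values)  # True
```

Without the patch list, the single outlier 9999 would force every value to be stored in 14 bits instead of 2, which is why patching keeps the compression ratio robust on noisy real-life data.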
Query processing on low-energy many-core processors
Aside from performance, energy efficiency is an increasing challenge in database systems. To tackle both aspects in an integrated fashion, we pursue a hardware/software co-design approach. To meet the energy requirement from the hardware perspective, we utilize a low-energy processor design that allows us to place hundreds to millions of chips on a single board without any thermal restrictions. Furthermore, we address the performance requirement by developing several database-specific instruction set extensions to customize each core, though no single core carries all extensions. Our hardware foundation is therefore a low-energy processor consisting of a high number of heterogeneous cores. In this paper, we introduce our hardware setup at the system level and present several challenges for query processing. Based on these challenges, we describe two implementation concepts and compare them. Finally, we conclude the paper with some lessons learned and an outlook on our upcoming research directions.
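One consequence of cores carrying different extension subsets is that the query processor must place each operator on a core that actually supports it. A hypothetical placement sketch (the core names and extension names below are invented for illustration, not the paper's actual extensions):

```python
# Hypothetical: each heterogeneous core advertises the database-specific
# instruction set extensions it carries; no core carries all of them.
CORES = {
    "core0": {"scan", "filter"},
    "core1": {"hash_join"},
    "core2": {"aggregate", "filter"},
}

def place(operator):
    # Pick the first core whose extension set covers the operator.
    for core, exts in CORES.items():
        if operator in exts:
            return core
    raise ValueError(f"no core supports {operator}")

plan = ["scan", "filter", "hash_join", "aggregate"]
print([place(op) for op in plan])  # ['core0', 'core0', 'core1', 'core2']
```

In a real system this placement would also weigh load balancing and data movement between cores, which is precisely where the challenges discussed in the paper arise.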
ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU
This paper presents implementations of a few selected SQL operations using the CUDA programming framework on the GPU platform. Nowadays, the GPU's parallel architectures give a high speed-up on certain problems, so the number of non-graphical problems that can be run and sped up on the GPU keeps increasing. In particular, there has been a lot of research in data mining on GPUs, and in many cases it proves the advantage of offloading processing from the CPU to the GPU. At the beginning of our project we chose the set of SELECT WHERE and SELECT JOIN instructions as the most common operations used in databases. We parallelized these SQL operations using three main mechanisms in CUDA: thread group hierarchy, shared memory, and barrier synchronization. Our results show that the implemented highly parallel SELECT WHERE and SELECT JOIN operations on the GPU platform can be significantly faster than their sequential counterparts in a database system run on the CPU.
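The data-parallel pattern behind a GPU SELECT ... WHERE can be sketched sequentially: each "thread" evaluates the predicate on one row, and a prefix sum over the match flags gives every matching row a contention-free output slot. A minimal sketch (the rows and predicate are invented examples, and real CUDA code would run the two phases as kernels):

```python
rows = [(1, "alice", 34), (2, "bob", 17), (3, "carol", 52)]

def predicate(row):
    return row[2] >= 18        # WHERE age >= 18

# Phase 1: each thread writes a 0/1 flag for its row, independently.
flags = [int(predicate(r)) for r in rows]

# Phase 2: an exclusive prefix sum over the flags yields each match's
# output slot, letting threads scatter results without contention.
slots = [sum(flags[:i]) for i in range(len(flags))]
result = [None] * sum(flags)
for r, f, s in zip(rows, flags, slots):
    if f:
        result[s] = r

print(result)  # [(1, 'alice', 34), (3, 'carol', 52)]
```

The prefix-sum step is the part that benefits from CUDA's barrier synchronization and shared memory: threads in a block cooperate on partial sums before writing their compacted output.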
X-Device Query Processing by Bitwise Distribution
The diversity of hardware components within a single system calls for strategies for efficient cross-device data processing. For example, existing approaches to CPU/GPU co-processing distribute individual relational operators to the "most appropriate" device. While pleasantly simple, this strategy has a number of problems: it may leave the "inappropriate" devices idle while overloading the "appropriate" device and putting high pressure on the PCI bus. To address these issues we distribute data among the devices by partially decomposing relations at the granularity of individual bits. Each of the resulting bit-partitions is stored and processed on one of the available devices. Using this strategy, we implemented a processor for spatial range queries that makes efficient use of all available devices. The performance gains achieved indicate that bitwise distribution makes a good cross-device processing strategy.
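To see how a range predicate can be evaluated directly over bit-partitions, here is a bit-sliced comparison sketch: scanning the partitions from the most significant bit down decides `value < c` for every row, and each per-bit pass could run as one phase on whichever device holds that partition. A simplified illustration, not the paper's implementation:

```python
def less_than_bitsliced(values, c, width):
    # Decompose into bit-partitions (partition b holds bit b of each value).
    parts = [[(v >> b) & 1 for v in values] for b in range(width)]
    n = len(values)
    lt = [0] * n   # lt[i] = 1 once values[i] < c is decided
    eq = [1] * n   # eq[i] = 1 while values[i] matches c on all bits so far
    # Scan from the most significant bit down; each pass touches exactly
    # one partition, i.e. one per-device phase in the paper's terms.
    for b in reversed(range(width)):
        cb = (c >> b) & 1
        for i in range(n):
            lt[i] |= eq[i] & (1 - parts[b][i]) & cb
            eq[i] &= 1 - (parts[b][i] ^ cb)
    return lt

print(less_than_bitsliced([3, 7, 5, 1], c=5, width=3))  # [1, 0, 0, 1]
```

A two-sided range predicate combines a `<` and a `>=` pass; crucially, an answer is often decided after the few high-order partitions, so the low-order partitions (and the devices holding them) may never need to be touched.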