Search CORE

906 research outputs found

ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU

Author: Pietron Marcin
Russek Pawel
Wiatr Kazimierz
Publication venue: 'AGHU University of Science and Technology Press'
Publication date: 01/01/2013
Field of study

This paper presents implementations of a few selected SQL operations using theCUDA programming framework on the GPU platform. Nowadays, the GPU’sparallel architectures give a high speed-up on certain problems. Therefore, thenumber of non-graphical problems that can be run and sped-up on the GPUstill increases. Especially, there has been a lot of research in data mining onGPUs. In many cases it proves the advantage of oﬄoading processing fromthe CPU to the GPU. At the beginning of our project we chose the set ofSELECT WHERE and SELECT JOIN instructions as the most common op-erations used in databases. We parallelized these SQL operations using threemain mechanisms in CUDA: thread group hierarchy, shared memories, andbarrier synchronization. Our results show that the implemented highly parallelSELECT WHERE and SELECT JOIN operations on the GPU platform canbe signiﬁcantly faster than the sequential one in a database system run on theCPU

AGH (Akademia Górniczo-Hutnicza) University of Science and Technology: Journals

Computer Science Journal (AGH University of Science and Technology, Krakow)

Biblioteka Nauki - repozytorium artykuÅÃ³w

Directory of Open Access Journals

Saber: window-based hybrid stream processing for heterogeneous architectures

Author: Costa P
Fernandez R
Koliousis A
Pietzuch P
Weidlich M
Wolf A
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/09/2015
Field of study

Modern servers have become heterogeneous, often combining multicore CPUs with many-core GPGPUs. Such heterogeneous architectures have the potential to improve the performance of data-intensive stream processing applications, but they are not supported by current relational stream processing engines. For an engine to exploit a heterogeneous architecture, it must execute streaming SQL queries with sufficient data-parallelism to fully utilise all available heterogeneous processors, and decide how to use each in the most effective way. It must do this while respecting the semantics of streaming SQL queries, in particular with regard to window handling. We describe SABER, a hybrid high-performance relational stream processing engine for CPUs and GPGPUs. SABER executes windowbased streaming SQL queries in a data-parallel fashion using all available CPU and GPGPU cores. Instead of statically assigning query operators to heterogeneous processors, SABER employs a new adaptive heterogeneous lookahead scheduling strategy, which increases the share of queries executing on the processor that yields the highest performance. To hide data movement costs, SABER pipelines the transfer of stream data between different memory types and the CPU/GPGPU. Our experimental comparison against state-ofthe-art engines shows that SABER increases processing throughput while maintaining low latency for a wide range of streaming SQL queries with small and large windows sizes

Spiral - Imperial College Digital Repository

An empirical evaluation of High-Level Synthesis languages and tools for database acceleration

Author: Arcas Abella Oriol
Armejach Adrià
Cristal Kestelman Adrián
Ghasempour Mohsen
Lujan Mikel
Mawer John
Navaridas Javier
Ndu Geoffrey
Song Wei
Sönmez Nehir
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014
Field of study

High Level Synthesis (HLS) languages and tools are emerging as the most promising technique to make FPGAs more accessible to software developers. Nevertheless, picking the most suitable HLS for a certain class of algorithms depends on requirements such as area and throughput, as well as on programmer experience. In this paper, we explore the different trade-offs present when using a representative set of HLS tools in the context of Database Management Systems (DBMS) acceleration. More specifically, we conduct an empirical analysis of four representative frameworks (Bluespec SystemVerilog, Altera OpenCL, LegUp and Chisel) that we utilize to accelerate commonly-used database algorithms such as sorting, the median operator, and hash joins. Through our implementation experience and empirical results for database acceleration, we conclude that the selection of the most suitable HLS depends on a set of orthogonal characteristics, which we highlight for each HLS framework.Peer ReviewedPostprint (author’s final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC

CuDB : a Relational Database Engine Boosted by Graphics Processing Units

Author: Bagein Michel
Cremer Samuel
Mahmoudi Saïd
Manneback Pierre
Publication venue
Publication date: 01/01/2016
Field of study

Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016) Timisoara, Romania. February 8-11, 2016.GPUs benefit from much more computation power with the same order of energy consumption than CPUs. Thanks to their massive data parallel architecture, GPUs can outperform CPUs, especially on Single Program Multiple Data (SPMD) programming paradigm on a large amount of data. Database engines are now everywhere, from different sizes and complexities, for multiple usages, embedded or distributed; in 2012, 500 million of SQLite active instances were estimated over the world. Our goal is to exploit the computation power of GPUs to improve performance of SQLite, which is a key software component of many applications and systems. In this paper, we introduce CuDB, a GPU-boosted in-memory database engine (IMDB) based on SQLite. The SQLite API remains unchanged, allowing developers to easily upgrade database engine from SQlite to CuDB even on already existing applications. Preliminary results show significant speedups of 70x with join queries on datasets of 1 million records. We also demonstrate the "memory bounded" character of GPU-databases and show the energy efficiency of our approach.European Cooperation in Science and Technology. COS

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Overtaking CPU DBMSes with a GPU in whole-query analytic processing with parallelism-friendly execution plan optimization

Author: Agbaria A. (Adnan)
Huawei Research
Minor D. (David)
Peterfreund N. (Natan)
Rosenberg O. (Ofer)
Rozenberg E. (Eyal)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/09/2016
Field of study

Existing work on accelerating analytic DB query processing with (discrete) GPUs fails to fully realize their potential for speedup through parallelism: Published results do not achieve significant speedup over more performant CPU-only DBMSes when processing complete queries. This paper presents a successful

CWI's Institutional Repository

One size does not fit all : accelerating OLAP workloads with GPUs

Author: Han Ruichen
Liu Zhuan
Lu Jiaheng
Wang Shan
Zhang Yansong
Zhang Yu
Publication venue
Publication date: 31/07/2020
Field of study

GPU has been considered as one of the next-generation platforms for real-time query processing databases. In this paper we empirically demonstrate that the representative GPU databases [e.g., OmniSci (Open Source Analytical Database & SQL Engine,, 2019)] may be slower than the representative in-memory databases [e.g., Hyper (Neumann and Leis, IEEE Data Eng Bull 37(1):3-11, 2014)] with typical OLAP workloads (with Star Schema Benchmark) even if the actual dataset size of each query can completely fit in GPU memory. Therefore, we argue that GPU database designs should not be one-size-fits-all; a general-purpose GPU database engine may not be well-suited for OLAP workloads without careful designed GPU memory assignment and GPU computing locality. In order to achieve better performance for GPU OLAP, we need to re-organize OLAP operators and re-optimize OLAP model. In particular, we propose the 3-layer OLAP model to match the heterogeneous computing platforms. The core idea is to maximize data and computing locality to specified hardware. We design the vector grouping algorithm for data-intensive workload which is proved to be assigned to CPU platform adaptive. We design the TOP-DOWN query plan tree strategy to guarantee the optimal operation in final stage and pushing the respective optimizations to the lower layers to make global optimization gains. With this strategy, we design the 3-stage processing model (OLAP acceleration engine) for hybrid CPU-GPU platform, where the computing-intensive star-join stage is accelerated by GPU, and the data-intensive grouping & aggregation stage is accelerated by CPU. This design maximizes the locality of different workloads and simplifies the GPU acceleration implementation. Our experimental results show that with vector grouping and GPU accelerated star-join implementation, the OLAP acceleration engine runs 1.9x, 3.05x and 3.92x faster than Hyper, OmniSci GPU and OmniSci CPU in SSB evaluation with dataset of SF = 100.Peer reviewe

Helsingin yliopiston digitaalinen arkisto

Novel Selectivity Estimation Strategy for Modern DBMS

Author: Shin Jun Hyung
Publication venue
Publication date: 01/01/2018
Field of study

Selectivity estimation is important in query optimization, however accurate estimation is difficult when predicates are complex. Instead of existing database synopses and statistics not helpful for such cases, we introduce a new approach to compute the exact selectivity by running an aggregate query during the optimization phase. Exact selectivity can be achieved without significant overhead for in-memory and GPU-accelerated databases by adding extra query execution calls. We implement a selection push-down extension based on the novel selectivity estimation strategy in the MapD database system. Our approach records constant and less than 30 millisecond overheads in any circumstances while running on GPU. The novel strategy successfully generates better query execution plans which result in performance improvement up to 4.8 times from TPC-H benchmark SF-50 queries and 7.3 times from star schema benchmark SF-80 queries

arXiv.org e-Print Archive

Ezid

eScholarship - University of California