Search CORE

207,135 research outputs found

An Application-Specific Instruction Set for Accelerating Set-Oriented Database Primitives

Author: Arnold Oliver
Fettweis Gerhard
Haas Sebastian
Kissinger Thomas
Lehner Wolfgang
Schlegel Benjamin
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/06/2022
Field of study

The key task of database systems is to efficiently manage large amounts of data. A high query throughput and a low query latency are essential for the success of a database system. Lately, research focused on exploiting hardware features like superscalar execution units, SIMD, or multiple cores to speed up processing. Apart from these software optimizations for given hardware, even tailor-made processing circuits running on FPGAs are built to run mostly stateless query plans with incredibly high throughput. A similar idea, which was already considered three decades ago, is to build tailor-made hardware like a database processor. Despite their superior performance, such application-specific processors were not considered to be beneficial because general-purpose processors eventually always caught up so that the high development costs did not pay off. In this paper, we show that the development of a database processor is much more feasible nowadays through the availability of customizable processors. We illustrate exemplarily how to create an instruction set extension for set-oriented database rimitives. The resulting application-specific processor provides not only a high performance but it also enables very energy-efficient processing. Our processor requires in various configurations more than 960x less energy than a high-end x86 processor while providing the same performance

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa

Heterogeneity-Aware Operator Placement in Column-Store DBMS

Author: Habich Dirk
Karnagel Tomas
Lehner Wolfgang
Schlegel Benjamin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/02/2023
Field of study

Due to the tremendous increase in the amount of data efficiently managed by current database systems, optimization is still one of the most challenging issues in database research. Today’s query optimizer determine the most efficient composition of physical operators to execute a given SQL query, whereas the underlying hardware consists of a multi-core CPU. However, hardware systems are more and more shifting towards heterogeneity, combining a multi-core CPU with various computing units, e.g., GPU or FPGA cores. In order to efficiently utilize the provided performance capability of such heterogeneous hardware, the assignment of physical operators to computing units gains importance. In this paper, we propose a heterogeneity-aware physical operator placement strategy (HOP) for in-memory columnar database systems in a heterogeneous environment. Our placement approach takes operators from the physical query execution plan as an input and assigns them to computing units using a cost model at runtime. To enable this runtime decision, our cost model uses the characteristics of the computing units, execution properties of the operators, as well as runtime data to estimate execution costs for each unit. We evaluated our approach on full TPC-H queries within a prototype database engine. As we are going to show, the placement in a heterogeneous hardware system has a high influence on query performance

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa

Query processing on low-energy many-core processors

Author: Asmussen Nils
Fettweis Gerhard
Habich Dirk
Karnagel Tomas
Lehner Wolfgang
Nöthen Benedikt
Ungethüm Annett
Völp Marcus
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/01/2023
Field of study

Aside from performance, energy efficiency is an increasing challenge in database systems. To tackle both aspects in an integrated fashion, we pursue a hardware/software co-design approach. To fulfill the energy requirement from the hardware perspective, we utilize a low-energy processor design offering the possibility to us to place hundreds to millions of chips on a single board without any thermal restrictions. Furthermore, we address the performance requirement by the development of several database-specific instruction set extensions to customize each core, whereas each core does not have all extensions. Therefore, our hardware foundation is a low-energy processor consisting of a high number of heterogeneous cores. In this paper, we introduce our hardware setup on a system level and present several challenges for query processing. Based on these challenges, we describe two implementation concepts and a comparison between these concepts. Finally, we conclude the paper with some lessons learned and an outlook on our upcoming research directions

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa

High Performance Computing Framework for Tera-Scale Database Search of Mass Spectrometry Data

Author: Haseeb Muhammad
Saeed Fahad
Publication venue: FIU Digital Commons
Publication date: 01/08/2021
Field of study

Database peptide search algorithms deduce peptides from mass spectrometry data. There has been substantial effort in improving their computational efficiency to achieve larger and more complex systems biology studies. However, modern serial and high-performance computing (HPC) algorithms exhibit suboptimal performance mainly due to their ineffective parallel designs (low resource utilization) and high overhead costs. We present an HPC framework, called HiCOPS, for efficient acceleration of the database peptide search algorithms on distributed-memory supercomputers. HiCOPS provides, on average, more than tenfold improvement in speed and superior parallel performance over several existing HPC database search software. We also formulate a mathematical model for performance analysis and optimization, and report near-optimal results for several key metrics including strong-scale efficiency, hardware utilization, load-balance, inter-process communication and I/O overheads. The core parallel design, techniques and optimizations presented in HiCOPS are search-algorithm-independent and can be extended to efficiently accelerate the existing and future algorithms and software

PubMed Central

DigitalCommons@Florida International University

Recommended from our members

Analytical Query Execution Optimized for all Layers of Modern Hardware

Author: Polychroniou Orestis
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2018
Field of study

Analytical database queries are at the core of business intelligence and decision support. To analyze the vast amounts of data available today, query execution needs to be orders of magnitude faster. Hardware advances have made a profound impact on database design and implementation. The large main memory capacity allows queries to execute exclusively in memory and shifts the bottleneck from disk access to memory bandwidth. In the new setting, to optimize query performance, databases must be aware of an unprecedented multitude of complicated hardware features. This thesis focuses on the design and implementation of highly efficient database systems by optimizing analytical query execution for all layers of modern hardware. The hardware layers include the network across multiple machines, main memory and the NUMA interconnection across multiple processors, the multiple levels of caches across multiple processor cores, and the execution pipeline within each core. For the network layer, we introduce a distributed join algorithm that minimizes the network traffic. For the memory hierarchy, we describe partitioning variants aware to the dynamics of the CPU caches and the NUMA interconnection. To improve the memory access rate of linear scans, we optimize lightweight compression variants and evaluate their trade-offs. To accelerate query execution within the core pipeline, we introduce advanced SIMD vectorization techniques generalizable across multiple operators. We evaluate our algorithms and techniques on both mainstream hardware and on many-integrated-core platforms, and combine our techniques in a new query engine design that can better utilize the features of many-core CPUs. In the era of hardware becoming increasingly parallel and datasets consistently growing in size, this thesis can serve as a compass for developing hardware-conscious databases with truly high-performance analytical query execution

Columbia University Academic Commons

Characterization of the Impact of Hardware Islands on OLTP

Author: Ailamaki Anastasia
De Oliveira Branco Miguel Sérgio
Pandis Ippokratis
Porobic Danica
Tozun Pinar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/11/2015
Field of study

Modern hardware is abundantly parallel and increasingly heterogeneous. The numerous processing cores have non-uniform access latencies to the main memory and processor caches, which causes variability in the communication costs. Unfortunately, database systems mostly assume that all processing cores are the same and that microarchitecture differences are not significant enough to appear in critical database execution paths. As we demonstrate in this paper, however, non-uniform core topology does appear in the critical path and conventional database architectures achieve suboptimal and even worse, unpredictable performance. We perform a detailed performance analysis of OLTP deployments in servers with multiple cores per CPU (multicore) and multiple CPUs per server (multisocket). We compare different database deployment strategies where we vary the number and size of independent database instances running on a single server, from a single shared-everything instance to fine-grained shared-nothing configurations. We quantify the impact of non-uniform hardware on various deployments by (a) examining how efficiently each deployment uses the available hardware resources and (b) measuring the impact of distributed transactions and skewed requests on different workloads. We show that no strategy is optimal for all cases and that the best choice depends on the combination of hardware topology and workload characteristics. Finally, we argue that transaction processing systems must be aware of the hardware topology in order to achieve predictably high performance

Infoscience - École polytechnique fédérale de Lausanne