
513 research outputs found

Learning a Partitioning Advisor with Deep Reinforcement Learning

Authors: Binnig Carsten; Hilprecht Benjamin; Roehm Uwe
Publication date: 01/01/2019

Commercial data analytics products such as Microsoft Azure SQL Data Warehouse or Amazon Redshift provide ready-to-use scale-out database solutions for OLAP-style workloads in the cloud. While the provisioning of a database cluster is usually fully automated by cloud providers, customers typically still have to make important design decisions that were traditionally made by the database administrator, such as selecting the partitioning scheme. In this paper we introduce a learned partitioning advisor for analytical OLAP-style workloads based on Deep Reinforcement Learning (DRL). The main idea is that a DRL agent learns its decisions from experience by monitoring the rewards for different workloads and partitioning schemes. We evaluate our learned partitioning advisor experimentally with different database schemata and workloads of varying complexity. In the evaluation, we show that our advisor is not only able to find partitionings that outperform existing approaches for automated partitioning design but can also easily adjust to different deployments. This is especially important in cloud setups where customers can easily migrate their cluster to a new set of (virtual) machines.

arXiv.org e-Print Archive

TUbiblio

Crossref

Memory-aware sizing for in-memory databases

Authors: Casale G; Molka K; Molka T; Moore L
Publication venue: Department of Computing, Imperial College London
Publication date: 01/01/2014

In-memory database systems are among the technological drivers of big data processing. In this paper we apply analytical modeling to enable efficient sizing of in-memory databases. We present novel response time approximations under online analytical processing workloads to model thread-level fork-join and per-class memory occupation. We combine these approximations with a non-linear optimization program to minimize memory swapping in in-memory database clusters. We compare our approach with state-of-the-art response time approximations and trace-driven simulation using real data from an SAP HANA in-memory system and show that our optimization model is significantly more accurate than existing approaches at similar computational costs.

Crossref

Spiral - Imperial College Digital Repository

The University of Manchester - Institutional Repository

Design of efficient and elastic storage in the cloud

Author: VO HOANG TAM
Publication date: 14/08/2012

Ph.D. (Doctor of Philosophy)

ScholarBank@NUS

SAP HANA distributed in-memory database system: Transaction, session, and metadata management

Authors: Bensberg Christian; Färber Franz; Kwon Yong Sik; Lee Arthur H.; Lee Chulwon; Lee Joo Yeon; Lee Juchang; Lehner Wolfgang; Muehle Michael
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 12/01/2023

One of the core principles of the SAP HANA database system is comprehensive support for distributed query processing. Supporting scale-out scenarios was one of the major design principles of the system from the very beginning. Within this paper, we first give an overview of the overall functionality with respect to data allocation, metadata caching and query routing. We then go into detail on specific topics and explain features and methods not common in traditional disk-based database systems. In summary, the paper provides a comprehensive overview of distributed query processing in the SAP HANA database, which allows it to scale to large databases and heterogeneous types of workloads.

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa

ClusterMiner: High Performance for Data, Text and Web Mining

Authors: Baião Fernanda; Costa Myriam; Ebecken Nelson; Evsukoff Alexandre; Mattoso Marta; Terra Guilherme; Zaverucha Gerson
Publication venue: Universidade Federal do Estado do Rio de Janeiro UNIRIO
Publication date: 24/11/2008

Universidade Federal do Estado do Rio de Janeiro: Portal de Revistas da UNIRIO

On-line analytical processing in distributed data warehouses

Authors: Albrecht Jens; Lehner Wolfgang
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 14/04/2022

The concepts of 'data warehousing' and 'on-line analytical processing' have seen growing interest in the research and commercial product communities. Today, the trend is moving away from complex centralized data warehouses to distributed data marts integrated in a common conceptual schema. However, as the first part of this paper demonstrates, there are many problems and few solutions for large distributed decision support systems in corporations operating worldwide. After showing the benefits and problems of the distributed approach, this paper outlines possibilities for achieving performance in distributed online analytical processing. Finally, the architectural framework of the prototypical distributed OLAP system CUBESTAR is outlined.

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa