Learning a Partitioning Advisor with Deep Reinforcement Learning
Commercial data analytics products such as Microsoft Azure SQL Data Warehouse
or Amazon Redshift provide ready-to-use scale-out database solutions for
OLAP-style workloads in the cloud. While the provisioning of a database cluster
is usually fully automated by cloud providers, customers typically still have
to make important design decisions that were traditionally made by the
database administrator, such as selecting a partitioning scheme.
In this paper we introduce a learned partitioning advisor for analytical
OLAP-style workloads based on Deep Reinforcement Learning (DRL). The main idea
is that a DRL agent learns its decisions based on experience by monitoring the
rewards for different workloads and partitioning schemes. We evaluate our
learned partitioning advisor in an experimental evaluation with different
databases schemata and workloads of varying complexity. In the evaluation, we
show that our advisor is not only able to find partitionings that outperform
existing approaches for automated partitioning design but that it also can
easily adjust to different deployments. This is especially important in cloud
setups where customers can easily migrate their cluster to a new set of
(virtual) machines.
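The advisor's core idea can be illustrated with a minimal reinforcement-learning sketch: an agent monitors the reward (negative runtime) of candidate partitioning schemes and converges on the best one. This is a toy bandit-style version under assumed scheme names and simulated runtimes; the paper's actual agent is a deep RL model over full workload features.

```python
import random

# Hypothetical candidate partitioning schemes for a star-schema workload.
SCHEMES = ["hash(orders.id)", "hash(orders.cust)", "replicate(dims)"]

# Simulated workload runtimes (seconds) per scheme -- stand-ins for the
# real measurements the advisor would collect from the cluster.
RUNTIME = {"hash(orders.id)": 12.0, "hash(orders.cust)": 7.5, "replicate(dims)": 9.0}

def train_advisor(episodes=500, eps=0.2, alpha=0.5, seed=0):
    rng = random.Random(seed)
    q = {s: 0.0 for s in SCHEMES}          # learned value per scheme
    for _ in range(episodes):
        # epsilon-greedy: mostly exploit the best-known scheme, sometimes explore
        scheme = rng.choice(SCHEMES) if rng.random() < eps else max(q, key=q.get)
        reward = -RUNTIME[scheme]          # lower runtime => higher reward
        q[scheme] += alpha * (reward - q[scheme])  # incremental value update
    return max(q, key=q.get)               # recommended scheme

print(train_advisor())  # converges on the lowest-runtime scheme
```

With deterministic rewards the agent quickly settles on `hash(orders.cust)`; the real advisor faces noisy rewards and a far larger scheme space, which is what motivates deep RL.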
Deep Learning Data and Indexes in a Database
A database stores and retrieves data and is a critical component of any software application. Databases require configuration for efficiency, yet there are tens of configuration parameters, making manual configuration a challenging task. Furthermore, a database must be reconfigured regularly to keep up with new data and workloads. The goal of this thesis is to use the query workload history to autonomously configure the database and improve its performance. The proposed work proceeds in four stages: (i) we develop an index recommender using deep reinforcement learning for a standalone database and evaluate its effectiveness against several state-of-the-art approaches; (ii) we build a real-time index recommender that dynamically creates and removes indexes for better performance in response to sudden changes in the query workload; (iii) we develop a database advisor framework that learns latent patterns from a workload, enabling it to enhance a query, recommend interesting queries, and summarize a workload; (iv) we develop LinkSocial, a fast, scalable, and accurate framework to gain deeper insights from heterogeneous data.
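The first stage, recommending indexes from the query workload history, can be sketched in miniature: score candidate single-column indexes by how often the workload filters on them, net of a maintenance cost. The benefit/cost model below is invented for illustration; the thesis's recommender learns these trade-offs with deep reinforcement learning.

```python
from collections import Counter

def recommend_indexes(workload, max_indexes=2, maintenance_cost=1.0):
    # workload: list of tuples of columns appearing in each query's predicates.
    # Count how often each column is filtered on across the workload history.
    usage = Counter(col for query_cols in workload for col in query_cols)
    # Score = workload benefit (frequency) minus a flat maintenance cost.
    scored = [(count - maintenance_cost, col) for col, count in usage.items()]
    scored.sort(reverse=True)
    # Keep only indexes whose estimated benefit exceeds their cost.
    return [col for score, col in scored[:max_indexes] if score > 0]

workload = [("user_id",), ("user_id", "created_at"), ("status",), ("user_id",)]
print(recommend_indexes(workload))  # ['user_id']
```

Columns queried only once never repay their maintenance cost here, so only `user_id` is recommended; an RL agent replaces these hand-set weights with values learned from observed query latencies.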
Workload-Aware Performance Tuning for Autonomous DBMSs
Optimal configuration is vital for a database management system (DBMS) to achieve high performance. There is no one-size-fits-all configuration that works for different workloads, since each workload has varying patterns with different resource requirements. There is a relationship between configuration, workload, and system performance: if a configuration cannot adapt to the dynamic changes of a workload, the overall performance of the DBMS can degrade significantly unless a sophisticated administrator continuously re-configures the DBMS. In this tutorial, we focus on autonomous workload-aware performance tuning, which is expected to automatically and continuously tune the configuration as the workload changes. We survey three research directions: 1) workload classification, 2) workload forecasting, and 3) workload-based tuning. While the first two topics address the issue of obtaining accurate workload information, the third tackles the problem of how to properly use that information to optimize performance. We also identify research challenges and open problems, and give real-world examples of leveraging workload information for database tuning in commercial products (e.g., Amazon Redshift). We will demonstrate workload-aware performance tuning in Amazon Redshift in the presentation.
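The first direction, workload classification, can be illustrated with a deliberately simple rule-based sketch: label a window of queries as OLTP-like or OLAP-like from basic statistics. The features and thresholds below are invented for illustration; real classifiers in this line of work use much richer query and resource features.

```python
def classify_workload(queries):
    # Each query is (rows_scanned, is_write); a stand-in for real query features.
    avg_scan = sum(q[0] for q in queries) / len(queries)
    write_ratio = sum(1 for q in queries if q[1]) / len(queries)
    if write_ratio > 0.5 and avg_scan < 1_000:
        return "OLTP-like"    # many small writes -> tune for latency
    if write_ratio < 0.1 and avg_scan > 100_000:
        return "OLAP-like"    # large read scans -> tune for throughput
    return "mixed"

print(classify_workload([(50, True), (10, True), (200, False)]))    # OLTP-like
print(classify_workload([(5_000_000, False), (2_000_000, False)]))  # OLAP-like
```

The classification then drives tuning: an OLTP-like label might favor smaller buffer-flush intervals and more indexes, while an OLAP-like label favors larger scan-oriented memory allocations.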
Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning
Fine tuning distributed systems is considered to be a craftsmanship, relying
on intuition and experience. This becomes even more challenging when the
systems need to react in near real time, as streaming engines have to do to
maintain pre-agreed service quality metrics. In this article, we present an
automated approach that builds on a combination of supervised and reinforcement
learning methods to recommend the most appropriate lever configurations based
on previous load. With this, streaming engines can be automatically tuned
without requiring a human to determine the right way and proper time to deploy
them. This opens the door to new configurations that are not being applied
today since the complexity of managing these systems has surpassed the
abilities of human experts. We show how reinforcement learning systems can find
substantially better configurations in less time than their human counterparts
and adapt to changing workloads.
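The lever-recommendation idea can be sketched as a contextual agent that learns, per observed load level, which parallelism setting minimizes latency. The latency model and the load/parallelism values below are invented stand-ins for real measurements from a streaming engine; the article combines this kind of reinforcement learning with supervised load prediction.

```python
import random

def simulated_latency(load, parallelism):
    # Toy model: work is shared across workers, but each worker adds overhead.
    return load / parallelism + 2 * parallelism

def learn_config(loads=(8, 32, 128), levels=(1, 2, 4, 8), episodes=2000, seed=1):
    rng = random.Random(seed)
    q = {(l, p): 0.0 for l in loads for p in levels}   # value per (state, lever)
    for _ in range(episodes):
        load = rng.choice(loads)                       # observed load = the state
        if rng.random() < 0.3:                         # explore a random lever
            p = rng.choice(levels)
        else:                                          # exploit best-known lever
            p = max(levels, key=lambda lv: q[(load, lv)])
        # Reward is negative latency; incremental value update.
        q[(load, p)] += 0.5 * (-simulated_latency(load, p) - q[(load, p)])
    return {load: max(levels, key=lambda lv: q[(load, lv)]) for load in loads}

print(learn_config())  # higher load -> higher recommended parallelism
```

Under this toy model the agent learns to scale parallelism with load, which is exactly the kind of lever schedule a human operator would otherwise have to hand-tune.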
Learning-Based Data Storage [Vision] (Technical Report)
Deep neural network (DNN) and its variants have been extensively used for a
wide spectrum of real applications such as image classification, face/speech
recognition, fraud detection, and so on. In addition to many important machine
learning tasks, as artificial networks emulating the way brain cells function,
DNNs also show the capability of storing non-linear relationships between input
and output data, which exhibits the potential of storing data via DNNs. We
envision a new paradigm of data storage, "DNN-as-a-Database", where data are
encoded in well-trained machine learning models. Compared with conventional
data storage that directly records data in raw formats, learning-based
structures (e.g., DNN) can implicitly encode data pairs of inputs and outputs
and compute/materialize actual output data of different resolutions only if
input data are provided. This new paradigm can greatly enhance the data
security by allowing flexible data privacy settings on different levels,
achieve low space consumption and fast computation with the acceleration of new
hardware (e.g., Diffractive Neural Network and AI chips), and can be
generalized to distributed DNN-based storage/computing. In this paper, we
propose this novel concept of learning-based data storage, which utilizes a
learning structure called learning-based memory unit (LMU), to store, organize,
and retrieve data. As a case study, we use DNNs as the engine in the LMU, and
study the data capacity and accuracy of the DNN-based data storage. Our
preliminary experimental results show the feasibility of the learning-based
data storage by achieving high (100%) accuracy of the DNN storage. We explore
and design effective solutions to utilize the DNN-based data storage to manage
and query relational tables. We discuss how to generalize our solutions to
other data types (e.g., graphs) and environments such as distributed DNN
storage/computing.
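The core claim, that data pairs can be stored in model weights and materialized only when an input is provided, can be shown with the simplest possible learned structure: a single linear layer over one-hot keys, whose weight matrix exactly encodes the stored values. This toy stand-in for the paper's LMU (which uses trained DNNs) shows how a forward pass recovers stored records with 100% accuracy.

```python
def build_storage(pairs):
    # pairs: list of (key, value_vector) records to encode in the "model".
    keys = [k for k, _ in pairs]
    index = {k: i for i, k in enumerate(keys)}
    # Weight row i holds the value vector for key i -- the data lives in weights.
    weights = [list(v) for _, v in pairs]
    def retrieve(key):
        one_hot = [1 if i == index[key] else 0 for i in range(len(keys))]
        # Matrix-vector product: the one-hot input selects its weight row.
        return [sum(x * w[j] for x, w in zip(one_hot, weights))
                for j in range(len(weights[0]))]
    return retrieve

lookup = build_storage([("alice", (1, 70)), ("bob", (2, 85))])
print(lookup("bob"))  # [2, 85] -- recovered exactly from the weights
```

A real DNN replaces this exact lookup with a compressed, approximate encoding, which is where the paper's questions about capacity, accuracy, and resolution of the materialized outputs arise.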
Improvement of Decision on Coding Unit Split Mode and Intra-Picture Prediction by Machine Learning
High Efficiency Video Coding (HEVC) is the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The reference software (i.e., HM) includes implementations of the guidelines in compliance with the new standard, covering both encoder and decoder functionality.
Machine learning (ML) processes data to discover patterns that can later be used to analyze new trends. ML plays a key role in a wide range of critical applications, such as data mining, natural language processing, image recognition, and expert systems.
In this research project, in compliance with the H.265 standard, we focus on improving encoding/decoding performance by optimizing the partitioning of prediction blocks in coding units with the help of supervised machine learning. We used the Keras library as the main tool to implement the experiments, and tuned key parameters of the model in our convolutional neural network. The coding tree unit mode decision time produced by the model was compared with that produced by the HM software and shown to improve significantly. The intra-picture prediction mode decision was also investigated with a modified model and yielded satisfactory results.
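The decision the learned model replaces is binary: split a coding unit into four sub-blocks or keep it whole, which HM otherwise settles by exhaustive rate-distortion search. As a hedged stand-in for the paper's CNN, the sketch below decides from pixel variance alone, a classical hand-crafted proxy; the threshold is invented for illustration.

```python
def should_split(block, threshold=100.0):
    # block: 2D list of pixel intensities for one coding unit.
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    # Homogeneous blocks keep one large CU; textured blocks are split.
    return variance > threshold

flat = [[128] * 8 for _ in range(8)]             # uniform 8x8 block
edge = [[0] * 4 + [255] * 4 for _ in range(8)]   # sharp vertical edge
print(should_split(flat), should_split(edge))    # False True
```

A trained CNN replaces the variance rule with features learned from examples labeled by the exhaustive search, which is how the mode decision time improves without re-running the full search per block.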
Comparative Analysis of Decision Tree Algorithms for Data Warehouse Fragmentation
One of the main problems faced by Data Warehouse designers is fragmentation. Several studies have proposed data mining-based horizontal fragmentation methods; however, no existing horizontal fragmentation technique uses a decision tree. This paper presents an analysis of different decision tree algorithms to select the best one for implementing the fragmentation method. The analysis was performed with version 3.9.4 of Weka, considering four evaluation metrics (Precision, ROC Area, Recall, and F-measure) on different data sets selected from the Star Schema Benchmark. The results showed that the two best algorithms were J48 and Random Forest in most cases; nevertheless, J48 was selected because it is more efficient in building the model.
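Three of the four evaluation metrics used in the comparison follow directly from confusion counts. A minimal sketch on hypothetical predictions (ROC Area additionally needs ranked scores, so it is omitted here):

```python
def prf(y_true, y_pred, positive=1):
    # Confusion counts for the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure: harmonic mean of precision and recall.
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

print(prf([1, 1, 0, 1, 0], [1, 0, 0, 1, 1]))  # precision = recall = F = 2/3
```

Weka reports these per class and weighted across classes, which is the form in which the J48 vs. Random Forest comparison above is made.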
Towards Scalable Circuit Partitioning for Multi-Core Quantum Architectures with Deep Reinforcement Learning
Quantum computing holds immense potential for solving classically intractable problems by leveraging the unique properties of qubits. However, the scalability of quantum architectures remains a significant challenge. To address this issue, multi-core quantum architectures are proposed. Yet, the realization of such multi-core architectures poses multiple challenges in hardware, algorithms, and the interface between them. In particular, one of these challenges is how to optimally partition the algorithms to fit within the cores of a multi-core quantum computer.
This thesis presents a novel approach for scalable circuit partitioning on multi-core quantum architectures using Deep Reinforcement Learning. The objective is to surpass existing meta-heuristic algorithms, such as FGP-rOEE's partitioning algorithm, in terms of accuracy and scalability. This research contributes to the advancement of both quantum computing and graph partitioning techniques, offering new insights into the optimization of quantum systems. By addressing the challenges associated with scaling quantum computers, we pave the way for their practical implementation in solving computationally challenging problems.
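The underlying problem is graph partitioning: qubits are nodes, two-qubit gates are edges, and a good assignment to cores is a balanced split minimizing edges that cross cores (each crossing implies an expensive inter-core qubit transfer). As a hedged classical baseline, in the spirit of the heuristics the DRL agent aims to outperform, here is a single greedy swap pass over a two-core split on a made-up circuit graph:

```python
def cut_size(edges, part):
    # Number of two-qubit interactions that cross the core boundary.
    return sum(1 for u, v in edges if part[u] != part[v])

def greedy_two_core(edges, n):
    part = {q: q % 2 for q in range(n)}            # balanced initial split
    best = cut_size(edges, part)
    improved = True
    while improved:                                 # repeat until no swap helps
        improved = False
        for u in range(n):
            for v in range(u + 1, n):
                if part[u] != part[v]:
                    part[u], part[v] = part[v], part[u]      # trial swap
                    c = cut_size(edges, part)
                    if c < best:
                        best, improved = c, True             # keep the swap
                    else:
                        part[u], part[v] = part[v], part[u]  # revert
    return part, best

# Two tightly-coupled 3-qubit groups joined by one gate: optimal cut is 1.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part, cut = greedy_two_core(edges, 6)
print(cut)  # 1
```

Greedy swaps can stall in local optima on larger circuits, and circuits also evolve over time slices, which is precisely the scalability gap a learned partitioner targets.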