Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning
Fine-tuning distributed systems is often considered a craft, relying on
intuition and experience. This becomes even more challenging when the
systems need to react in near real time, as streaming engines must to
maintain pre-agreed service quality metrics. In this article, we present an
automated approach that combines supervised and reinforcement learning
methods to recommend the most appropriate lever configurations based on
previous load. With this, streaming engines can be tuned automatically,
without requiring a human to determine the right configuration or the
proper time to deploy it. This opens the door to configurations that are
not applied today, since the complexity of managing these systems has
surpassed the abilities of human experts. We show that reinforcement
learning systems can find substantially better configurations in less time
than their human counterparts and adapt to changing workloads.
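The loop this abstract describes (observe load, try a lever configuration, learn from the resulting performance) can be sketched as a bandit-style tuner. Everything below is illustrative: the lever (operator parallelism), the latency model, and the configuration values are invented stand-ins, not the authors' actual system.

```python
import random

# Illustrative lever: parallelism of a streaming operator.
CONFIGS = [1, 2, 4, 8, 16]

def simulated_latency(parallelism, load):
    """Stand-in for a measured end-to-end latency (ms) under a given load:
    queueing cost shrinks with parallelism, coordination cost grows with it."""
    return load / parallelism + 2.0 * parallelism

def tune(load, episodes=500, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit over lever configurations; reward = -latency."""
    rng = random.Random(seed)
    value = {c: 0.0 for c in CONFIGS}
    count = {c: 0 for c in CONFIGS}
    for _ in range(episodes):
        if rng.random() < epsilon:
            c = rng.choice(CONFIGS)                   # explore a random config
        else:
            c = max(CONFIGS, key=lambda k: value[k])  # exploit the best so far
        reward = -simulated_latency(c, load)
        count[c] += 1
        value[c] += (reward - value[c]) / count[c]    # incremental mean
    return max(CONFIGS, key=lambda k: value[k])
```

Under this toy cost model the tuner settles on the parallelism that balances queueing against coordination cost, e.g. `tune(200.0)` returns 8; re-running `tune` as the load changes is the "adapt to changing workloads" part.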
Deep Learning Data and Indexes in a Database
A database is used to store and retrieve data, a critical function for any software application. Databases require configuration for efficiency, yet there are tens of configuration parameters, and manually configuring a database is a challenging task. Furthermore, a database must be reconfigured on a regular basis to keep up with newer data and workloads. The goal of this thesis is to use the query workload history to autonomously configure the database and improve its performance. We carry out the proposed work in four stages: (i) we develop an index recommender using deep reinforcement learning for a standalone database and evaluate its effectiveness by comparing it with several state-of-the-art approaches; (ii) we build a real-time index recommender that can dynamically create and remove indexes for better performance in response to sudden changes in the query workload; (iii) we develop a database advisor framework that learns latent patterns from a workload and is able to enhance a query, recommend interesting queries, and summarize a workload; (iv) we develop LinkSocial, a fast, scalable, and accurate framework to gain deeper insights from heterogeneous data.
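Stage (i), reinforcement learning for index selection, can be made concrete with a much-simplified tabular stand-in for the thesis's deep-RL recommender: state = set of built indexes, action = build one more index or stop, reward = estimated cost reduction. The candidate indexes and the cost model below are invented for illustration.

```python
import random

# Toy setting: three candidate indexes; numbers are invented, not from the thesis.
CANDIDATES = ("idx_user_id", "idx_order_date", "idx_status")

def workload_cost(built):
    """Estimated workload cost for a set of built indexes, including a
    per-index write/maintenance overhead."""
    cost = 100.0
    if "idx_user_id" in built:
        cost -= 40.0
    if "idx_order_date" in built:
        cost -= 25.0
    if "idx_status" in built:
        cost -= 5.0
    return cost + 8.0 * len(built)   # maintenance overhead per index

def recommend(episodes=2000, alpha=0.5, epsilon=0.2, seed=1):
    """Tabular Q-learning: state = frozenset of built indexes, action = build
    one more index or "stop", reward = cost reduction from the build."""
    rng = random.Random(seed)
    Q = {}
    def q(s, a):
        return Q.get((s, a), 0.0)
    for _ in range(episodes):
        s = frozenset()
        while True:
            actions = [c for c in CANDIDATES if c not in s] + ["stop"]
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda x: q(s, x))
            if a == "stop":
                break
            s2 = s | {a}
            nxt = [c for c in CANDIDATES if c not in s2] + ["stop"]
            target = workload_cost(s) - workload_cost(s2) + max(q(s2, x) for x in nxt)
            Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))
            s = s2
    # Greedy rollout of the learned policy.
    s = frozenset()
    while True:
        actions = [c for c in CANDIDATES if c not in s] + ["stop"]
        a = max(actions, key=lambda x: q(s, x))
        if a == "stop" or q(s, a) <= 0:
            break
        s = s | {a}
    return s
```

Here the agent learns to skip `idx_status`, whose benefit is smaller than its maintenance overhead; the thesis replaces the table with a neural network so the policy generalizes across workloads.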
Magpie: Automatically Tuning Static Parameters for Distributed File Systems using Deep Reinforcement Learning
Distributed file systems are widely used nowadays, yet using their default
configurations is often not optimal. At the same time, tuning configuration
parameters is typically challenging and time-consuming. It demands expertise
and tuning operations can also be expensive. This is especially the case for
static parameters, where changes take effect only after a restart of the system
or workloads. We propose a novel approach, Magpie, which utilizes deep
reinforcement learning to tune static parameters by strategically exploring and
exploiting configuration parameter spaces. To boost the tuning of the static
parameters, our method employs both server and client metrics of distributed
file systems to understand the relationship between static parameters and
performance. Our empirical evaluation shows that Magpie can noticeably
improve the performance of the distributed file system Lustre: after tuning
towards a single performance indicator, our approach achieves on average
91.8% higher throughput than the default configuration, and it reaches
39.7% more throughput gains than the baseline.
Comment: Accepted at The IEEE International Conference on Cloud Engineering
(IC2E) conference 202
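Magpie's use of both server-side and client-side metrics can be pictured as assembling a single observation vector for the agent. The metric names, scales, and the static parameter below are illustrative assumptions, not Magpie's actual feature set.

```python
def build_observation(server_metrics, client_metrics, config):
    """Concatenate server-side and client-side metrics (plus the current
    static-parameter values) into one observation vector for an RL agent.
    Keys and scales are hypothetical, for illustration only."""
    server_keys = ("cpu_util", "disk_io_mbps", "net_rx_mbps")
    client_keys = ("throughput_mbps", "latency_ms")
    obs = [server_metrics[k] for k in server_keys]
    obs += [client_metrics[k] for k in client_keys]
    obs += [float(v) for v in config.values()]
    # Roughly normalize to [0, 1] so no single metric dominates learning.
    scale = [100.0, 1000.0, 1000.0, 1000.0, 100.0] + [1.0] * len(config)
    return [x / s for x, s in zip(obs, scale)]
```

Because the tuned parameters are static, the agent only observes a fresh vector after a restart, which is why sample-efficient exploration of the parameter space matters here.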
A Unified and Efficient Coordinating Framework for Autonomous DBMS Tuning
Recently, using machine learning (ML) based techniques to optimize modern
database management systems has attracted intensive interest from both
industry and academia. With the objective of tuning a specific component of
a DBMS (e.g., index selection, knob tuning), ML-based tuning agents have
been shown to find better configurations than experienced database
administrators.
However, one critical yet challenging question remains unexplored -- how to
make those ML-based tuning agents work collaboratively. Existing methods do not
consider the dependencies among the multiple agents, and the model used by each
agent only studies the effect of changing the configurations in a single
component. To tune different components for DBMS, a coordinating mechanism is
needed to make the multiple agents cognizant of each other. Also, we need to
decide how to allocate the limited tuning budget among the agents to maximize
the performance. Such a decision is difficult to make since the distribution of
the reward for each agent is unknown and non-stationary. In this paper, we
study the above question and present a unified coordinating framework to
efficiently utilize existing ML-based agents. First, we propose a message
propagation protocol that specifies the collaboration behaviors for agents and
encapsulates the global tuning messages in each agent's model. Second, we
combine Thompson Sampling, a well-studied reinforcement learning algorithm,
with a memory buffer so that our framework can allocate the budget
judiciously in a non-stationary environment. Our framework defines
interfaces adapted to a broad class of ML-based tuning agents, yet simple
enough for integration with existing implementations and future extensions.
We show that it can effectively utilize different ML-based agents and find
better configurations with 1.4x to 14.1x speedups in workload execution
time compared with baselines.
Comment: Accepted at the 2023 International Conference on Management of Data
(SIGMOD '23)
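The budget-allocation idea, Thompson Sampling over agents with a memory buffer that ages out stale evidence, can be sketched as follows. A Gaussian sampling model with a fixed noise scale is assumed here for simplicity; the paper's actual posterior and buffer policy may differ.

```python
import random

class SlidingWindowThompson:
    """Thompson Sampling over tuning agents with a fixed-size memory buffer,
    so old observations age out and the sampler can track non-stationary
    rewards. The Gaussian model below is an illustrative simplification."""

    def __init__(self, agents, window=50, seed=0):
        self.rng = random.Random(seed)
        self.buffers = {a: [] for a in agents}
        self.window = window

    def choose(self):
        """Sample a plausible reward for each agent; pick the argmax."""
        def sample(agent):
            buf = self.buffers[agent]
            n = len(buf)
            if n == 0:
                return float("inf")        # force initial exploration
            mean = sum(buf) / n
            return self.rng.gauss(mean, 1.0 / n ** 0.5)
        return max(self.buffers, key=sample)

    def update(self, agent, reward):
        buf = self.buffers[agent]
        buf.append(reward)
        if len(buf) > self.window:
            buf.pop(0)                     # forget stale evidence
```

Each tuning round, the framework would `choose()` which agent (e.g., the index advisor or the knob tuner) receives the next slice of budget, run it, and `update()` with the observed performance gain; the window keeps the posterior responsive when an agent's payoff distribution shifts.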
Learning a Partitioning Advisor with Deep Reinforcement Learning
Commercial data analytics products such as Microsoft Azure SQL Data Warehouse
or Amazon Redshift provide ready-to-use scale-out database solutions for
OLAP-style workloads in the cloud. While the provisioning of a database cluster
is usually fully automated by cloud providers, customers typically still have
to make important design decisions which were traditionally made by the
database administrator such as selecting the partitioning schemes.
In this paper we introduce a learned partitioning advisor for analytical
OLAP-style workloads based on Deep Reinforcement Learning (DRL). The main idea
is that a DRL agent learns its decisions based on experience by monitoring the
rewards for different workloads and partitioning schemes. We evaluate our
learned partitioning advisor experimentally with different database
schemata and workloads of varying complexity. In the evaluation, we show
that our advisor is not only able to find partitionings that outperform
existing approaches for automated partitioning design but also that it can
easily adjust to different deployments. This is especially important in
cloud setups, where customers can easily migrate their cluster to a new set
of (virtual) machines.
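The learning loop, try a partitioning, time the workload, use the (negative) runtime as reward, can be sketched as a small policy-gradient learner over a handful of schemes. The schemes, runtimes, and hyperparameters below are invented; the paper's DRL agent is far richer (per-table decisions, a neural policy).

```python
import math
import random

SCHEMES = ("replicate", "hash_by_key", "round_robin")
# Stand-in for executing and timing the workload under each scheme (seconds).
RUNTIME = {"replicate": 12.0, "hash_by_key": 7.5, "round_robin": 10.0}

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def advise(episodes=5000, lr=0.1, seed=7):
    """Policy-gradient (gradient bandit) simplification of the DRL advisor:
    a softmax policy over partitioning schemes, pushed toward schemes that
    yield lower observed workload runtime."""
    rng = random.Random(seed)
    prefs = [0.0] * len(SCHEMES)
    baseline = 0.0
    for t in range(1, episodes + 1):
        probs = softmax(prefs)
        i = rng.choices(range(len(SCHEMES)), weights=probs)[0]
        reward = -RUNTIME[SCHEMES[i]]          # monitor runtime as reward
        baseline += (reward - baseline) / t    # running average baseline
        for j in range(len(SCHEMES)):
            grad = (1.0 if j == i else 0.0) - probs[j]
            prefs[j] += lr * (reward - baseline) * grad
    return SCHEMES[max(range(len(SCHEMES)), key=lambda j: prefs[j])]
```

Because the policy is learned from observed rewards rather than a hand-built cost model, re-running the loop after a migration to new (virtual) machines lets the advisor re-adapt, which is the adjustability the abstract highlights.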
Workload-Aware Performance Tuning for Autonomous DBMSs
Optimal configuration is vital for a Database Management System (DBMS) to achieve high performance. There is no one-size-fits-all configuration that works for different workloads, since each workload has varying patterns with different resource requirements; there is a relationship between configuration, workload, and system performance. If a configuration cannot adapt to the dynamic changes of a workload, the overall performance of the DBMS can degrade significantly unless a sophisticated administrator continuously re-configures it. In this tutorial, we focus on autonomous workload-aware performance tuning, which is expected to automatically and continuously tune the configuration as the workload changes. We survey three research directions: 1) workload classification, 2) workload forecasting, and 3) workload-based tuning. While the first two topics address the issue of obtaining accurate workload information, the third tackles the problem of how to properly use that information to optimize performance. We also identify research challenges and open problems, and give real-world examples of leveraging workload information for database tuning in commercial products (e.g., Amazon Redshift). We will demonstrate workload-aware performance tuning in Amazon Redshift in the presentation.
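Of the three directions, workload classification is the easiest to make concrete: derive coarse features from a window of queries and map them to a class that selects a knob preset. The classes, thresholds, and knob names below are illustrative assumptions, not taken from any of the surveyed systems.

```python
def classify_workload(window):
    """Toy workload classifier: compute a read ratio and a scan-heaviness
    ratio over a window of SQL statements, then bucket the workload."""
    reads = sum(1 for q in window if q.lstrip().upper().startswith("SELECT"))
    read_ratio = reads / len(window)
    scans = sum("JOIN" in q.upper() or "GROUP BY" in q.upper() for q in window)
    if read_ratio > 0.8 and scans / len(window) > 0.5:
        return "analytical"     # OLAP-like: large sorts/joins dominate
    if read_ratio < 0.5:
        return "write_heavy"    # OLTP-like: commit path dominates
    return "mixed"

# Hypothetical knob presets keyed by workload class.
CONFIG_FOR = {
    "analytical":  {"work_mem_mb": 256, "max_parallel_workers": 8},
    "write_heavy": {"work_mem_mb": 16,  "wal_buffers_mb": 64},
    "mixed":       {"work_mem_mb": 64},
}
```

Workload forecasting would then predict the next window's class ahead of time, and workload-based tuning would apply (or learn) the preset before the shift happens rather than after.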