66 research outputs found
Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning
Fine tuning distributed systems is considered to be a craftsmanship, relying
on intuition and experience. This becomes even more challenging when the
systems need to react in near real time, as streaming engines have to do to
maintain pre-agreed service quality metrics. In this article, we present an
automated approach that builds on a combination of supervised and reinforcement
learning methods to recommend the most appropriate lever configurations based
on previous load. With this, streaming engines can be automatically tuned
without requiring a human to determine the right way and proper time to deploy
them. This opens the door to new configurations that are not being applied
today since the complexity of managing these systems has surpassed the
abilities of human experts. We show how reinforcement learning systems can find
substantially better configurations in less time than their human counterparts
and adapt to changing workloads
Query-Driven Learning for Next Generation Predictive Modeling & Analytics
As data-size is increasing exponentially, new paradigm shifts have to emerge allowing fast exploitation of data by every- body. Large-scale predictive analytics is restricted to wealthy organizations as small-scale enterprises (SMEs) struggle to compete and are inundated by the sheer monetary cost of either procuring data infrastructures or analyzing datasets over the Cloud. The aim of this work is to study mechanisms which can democratize analytics, in the sense of making them affordable, while at the same time ensuring high efficiency, scalability, and accuracy. The crux of this proposal lies in developing query-driven solutions that can be used off the Cloud thus minimizing costs. Our query-driven approach will learn and adapt on-the-fly machine learning models, based solely on query-answer interactions, which can be used for answering analytical queries. In this abstract we describe the methodology followed for the implementation and evaluation of the system designed
Learning a Partitioning Advisor with Deep Reinforcement Learning
Commercial data analytics products such as Microsoft Azure SQL Data Warehouse
or Amazon Redshift provide ready-to-use scale-out database solutions for
OLAP-style workloads in the cloud. While the provisioning of a database cluster
is usually fully automated by cloud providers, customers typically still have
to make important design decisions which were traditionally made by the
database administrator such as selecting the partitioning schemes.
In this paper we introduce a learned partitioning advisor for analytical
OLAP-style workloads based on Deep Reinforcement Learning (DRL). The main idea
is that a DRL agent learns its decisions based on experience by monitoring the
rewards for different workloads and partitioning schemes. We evaluate our
learned partitioning advisor in an experimental evaluation with different
databases schemata and workloads of varying complexity. In the evaluation, we
show that our advisor is not only able to find partitionings that outperform
existing approaches for automated partitioning design but that it also can
easily adjust to different deployments. This is especially important in cloud
setups where customers can easily migrate their cluster to a new set of
(virtual) machines
- …