ReoptSMART: A Learning Query Plan Cache
The task of query optimization in modern relational database systems is important but can be computationally expensive. Parametric query optimization (PQO) aims to predict optimal query execution plans based on historical results, without consulting the query optimizer. We develop machine learning techniques that can accurately model the output of a query optimizer. Our algorithms handle non-linear boundaries in plan space and achieve high prediction accuracy even when a limited amount of data is available for training. We use both predicted and actual query execution times for learning, and are the first to demonstrate a total net win of a PQO method over a state-of-the-art query optimizer for some workloads. ReoptSMART realizes savings not only in optimization time, but also in query execution time, for an overall improvement of more than an order of magnitude in some cases.
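The idea of predicting a plan from historical results can be illustrated with a minimal sketch. It assumes a hypothetical history of (selectivity vector, plan) pairs and uses a plain k-nearest-neighbour vote; this is a stand-in for illustration, not the learning algorithm the abstract describes, and the plan names and `predict_plan` are invented here.

```python
from collections import Counter
from math import dist

def predict_plan(history, point, k=3):
    """Return the majority plan among the k historical points nearest to `point`.

    history: list of (selectivity_vector, plan_id) pairs (hypothetical data).
    point:   selectivity vector of the incoming query instance.
    """
    nearest = sorted(history, key=lambda rec: dist(rec[0], point))[:k]
    votes = Counter(plan for _, plan in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical optimizer decisions recorded for earlier parameter bindings.
history = [
    ((0.01, 0.02), "index_nested_loop"),
    ((0.02, 0.05), "index_nested_loop"),
    ((0.03, 0.01), "index_nested_loop"),
    ((0.70, 0.80), "hash_join"),
    ((0.85, 0.60), "hash_join"),
    ((0.90, 0.90), "hash_join"),
]

low_selectivity_plan = predict_plan(history, (0.05, 0.03))
high_selectivity_plan = predict_plan(history, (0.80, 0.70))
```

A nearest-neighbour vote can follow non-linear plan boundaries without any parametric assumption, which is why it is a convenient toy model here; a production plan cache would also need a policy for points far from all training data.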
Kepler: Robust Learning for Faster Parametric Query Optimization
Most existing parametric query optimization (PQO) techniques rely on traditional query optimizer cost models, which are often inaccurate and result in suboptimal query performance. We propose Kepler, an end-to-end learning-based approach to PQO that demonstrates significant speedups in query latency over a traditional query optimizer. Central to our method is Row Count Evolution (RCE), a novel plan generation algorithm based on perturbations in the sub-plan cardinality space. While previous approaches require accurate cost models, we bypass this requirement by evaluating candidate plans via actual execution data and training an ML model to predict the fastest plan given parameter binding values. Our models leverage recent advances in neural network uncertainty in order to robustly predict faster plans while avoiding regressions in query performance. Experimentally, we show that Kepler achieves significant improvements in query runtime on multiple datasets on PostgreSQL.
Comment: SIGMOD 202
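Two steps mentioned in this abstract, labelling by actual execution data and avoiding regressions under uncertainty, can be sketched as follows. This is a simplified illustration under stated assumptions: the latency table, the per-plan probabilities, and the `min_confidence` threshold are hypothetical, and the fallback rule is one plausible way to use model uncertainty, not necessarily Kepler's.

```python
def fastest_plan_labels(latencies):
    """Map each parameter binding to the plan with the lowest measured latency.

    latencies: {binding: {plan_id: measured_seconds}} (hypothetical measurements).
    """
    return {binding: min(per_plan, key=per_plan.get)
            for binding, per_plan in latencies.items()}

def choose_plan(plan_probs, default_plan, min_confidence=0.9):
    """Use the model's top plan only when it is confident enough.

    plan_probs:   {plan_id: predicted probability of being fastest}, assumed to
                  come from some trained model (not implemented here).
    default_plan: the plan the traditional optimizer would have picked.
    """
    best_plan = max(plan_probs, key=plan_probs.get)
    if plan_probs[best_plan] >= min_confidence:
        return best_plan
    return default_plan  # fall back to limit the risk of a regression

measured = {
    ("US", 2020): {"p0": 1.2, "p1": 0.3},
    ("FR", 2021): {"p0": 0.4, "p1": 0.9},
}
labels = fastest_plan_labels(measured)
confident_choice = choose_plan({"p0": 0.05, "p1": 0.95}, default_plan="p0")
uncertain_choice = choose_plan({"p0": 0.45, "p1": 0.55}, default_plan="p0")
```

The threshold trades speedup opportunities against safety: a higher `min_confidence` falls back to the optimizer's plan more often.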
Optimal web-scale tiering as a flow problem
We present a fast online solver for large-scale parametric max-flow problems as they occur in portfolio optimization, inventory management, computer vision, and logistics. Our algorithm solves an integer linear program in an online fashion. It exploits total unimodularity of the constraint matrix and a Lagrangian relaxation to solve the problem as a convex online game. The algorithm generates approximate solutions of max-flow problems by performing stochastic gradient descent on a set of flows. We apply the algorithm to optimize tier arrangement of over 84 million web pages on a layered set of caches to serve an incoming query stream optimally.
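The role of a Lagrangian relaxation in tiering can be illustrated on a toy two-tier case. The sketch below relaxes a single capacity constraint on the fast tier into a price and adjusts that price with a deterministic subgradient step; it is a much simpler technique than the stochastic online max-flow solver described in the abstract, and the page values, step size, and function name are assumptions made for illustration.

```python
def tier_by_price(values, capacity, step=0.5, iterations=50):
    """Pick pages for the fast tier by pricing its limited capacity.

    values:   hypothetical per-page benefit of being served from the fast tier.
    capacity: number of pages the fast tier can hold.
    Returns (indices of pages in the fast tier, final price).
    """
    price = 0.0  # Lagrange multiplier on the capacity constraint
    for _ in range(iterations):
        # With the constraint relaxed, each page decides independently.
        fast = [i for i, v in enumerate(values) if v > price]
        # Raise the price when over capacity, lower it when under (never below 0).
        price = max(0.0, price + step * (len(fast) - capacity))
    fast = [i for i, v in enumerate(values) if v > price]
    return fast, price

fast_tier, final_price = tier_by_price([9.0, 7.0, 5.0, 3.0, 1.0], capacity=2)
```

On this input the price rises until only the two most valuable pages remain above it. With a constant step the price can oscillate on other inputs, so a decaying step size is the usual choice.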
Multi-Objective Parametric Query Optimization
Classical query optimization compares query plans according to one cost metric and associates each plan with a constant cost value. In this paper, we introduce the Multi-Objective Parametric Query Optimization (MPQ) problem, where query plans are compared according to multiple cost metrics and the cost of a given plan according to a given metric is modeled as a function that depends on multiple parameters. The cost metrics may, for instance, include execution time or monetary fees; a parameter may represent the selectivity of a query predicate that is unspecified at optimization time. MPQ generalizes parametric query optimization (which allows multiple parameters but only one cost metric) and multi-objective query optimization (which allows multiple cost metrics but no parameters). We formally analyze the novel MPQ problem and show why existing algorithms are inapplicable. We present a generic algorithm for MPQ and a specialized version for MPQ with piecewise-linear plan cost functions. We prove that both algorithms find all relevant query plans and experimentally evaluate the performance of our second algorithm in a cloud computing scenario.
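To make the notion of a relevant plan concrete, the sketch below models each plan's cost as a (time, fee) pair that depends linearly on one selectivity parameter, and collects every plan that is Pareto-optimal at some sampled parameter value. Sampling a grid only approximates the set of relevant plans and is not the algorithm from the paper; the plans and their cost functions are hypothetical.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_plans(costs):
    """Return the plans whose cost vectors are not dominated by any other plan."""
    return {p for p, c in costs.items()
            if not any(dominates(d, c) for q, d in costs.items() if q != p)}

# Hypothetical plans: selectivity s -> (execution time, monetary fee).
PLANS = {
    "index_scan": lambda s: (1 + 20 * s, 1.0),
    "full_scan": lambda s: (6.0, 1.0),
    "parallel_scan": lambda s: (2.0, 4.0),
}

def relevant_plans(plans, grid):
    """Union of the Pareto-optimal plans over the sampled parameter values."""
    relevant = set()
    for s in grid:
        relevant |= pareto_plans({name: cost(s) for name, cost in plans.items()})
    return relevant

at_low = pareto_plans({name: cost(0.0) for name, cost in PLANS.items()})
at_high = pareto_plans({name: cost(0.5) for name, cost in PLANS.items()})
all_relevant = relevant_plans(PLANS, [0.0, 0.1, 0.5, 1.0])
```

At low selectivity the index scan dominates both alternatives, while at higher selectivity it is itself dominated by the full scan, so which plans matter depends on both the parameter value and the trade-off between metrics.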