1 research outputs found
Guided Automated Learning for query workload re-Optimization
Query optimization is a hallmark of database systems enabling complex SQL
queries of today's applications to be run efficiently. The query optimizer
often fails to find the best plan, when logical subtleties in business queries
and schemas circumvent it. When a query runs more expensively than is viable or
warranted, determination of the performance issues is usually performed
manually in consultation with experts through the analysis of query's execution
plan (QEP). However, this is an excessively time consuming, human error-prone,
and costly process. GALO is a novel system that automates this process. The
tool automatically learns recurring problem patterns in query plans over
workloads in an offline learning phase, to build a knowledge base of
plan-rewrite remedies. It then uses the knowledge base online to re-optimize
queries queued for execution to improve performance, often quite drastically.
GALO's knowledge base is built on RDF and SPARQL, W3C graph database
standards, which is well suited for manipulating and querying over SQL query
plans, which are graphs themselves. GALO acts as a third-tier of
re-optimization, after query rewrite and cost-based optimization, as a query
plan rewrite. Since the knowledge base is not tied to the context of supplied
QEPs, table and column names are matched automatically during the
re-optimization phase. Thus, problem patterns learned over a particular query
workload can be applied in other query workloads. GALO's knowledge base is also
an invaluable tool for database experts to debug query performance issues by
tracking to known issues and solutions as well as refining the optimizer with
new tuned techniques by the development team. We demonstrate an experimental
study of the effectiveness of our techniques over synthetic TPC-DS and real IBM
client query workloads.Comment: 12 pages, 15 figure