194 research outputs found
Deep Reinforcement Learning for Join Order Enumeration
Join order selection plays a significant role in query performance. However,
modern query optimizers typically employ static join enumeration algorithms
that do not receive any feedback about the quality of the resulting plan.
Hence, optimizers often repeatedly choose the same bad plan, as they do not
have a mechanism for "learning from their mistakes". In this paper, we argue
that existing deep reinforcement learning techniques can be applied to address
this challenge. These techniques, powered by artificial neural networks, can
automatically improve decision making by incorporating feedback from their
successes and failures. Towards this goal, we present ReJOIN, a
proof-of-concept join enumerator, and present preliminary results indicating
that ReJOIN can match or outperform the PostgreSQL optimizer in terms of plan
quality and join enumeration efficiency.
Distributed Triangle Counting in the Graphulo Matrix Math Library
Triangle counting is a key algorithm for large graph analysis. The Graphulo
library provides a framework for implementing graph algorithms on the Apache
Accumulo distributed database. In this work we adapt two algorithms for
counting triangles, one that uses the adjacency matrix and another that also
uses the incidence matrix, to the Graphulo library for server-side processing
inside Accumulo. Cloud-based experiments show a similar performance profile for
these different approaches on the family of power law Graph500 graphs, for
which data skew becomes an increasing bottleneck. These results motivate the design of
skew-aware hybrid algorithms that we propose for future work.
Comment: Honorable mention in the 2017 IEEE HPEC Graph Challenge
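As a rough illustration of the adjacency-matrix approach mentioned in this abstract, the sketch below uses the textbook identity that an undirected simple graph with adjacency matrix A contains trace(A^3) / 6 triangles. It is a single-machine toy, not Graphulo's server-side Accumulo implementation; the function names and the dense list-of-lists matrix are assumptions made for this example.

```python
def adjacency_matrix(n, edges):
    """Build a dense symmetric 0/1 adjacency matrix for n vertices."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        if u != v:  # ignore self-loops; they never form triangles
            a[u][v] = 1
            a[v][u] = 1
    return a


def matmul(x, y):
    """Plain dense matrix product of two n-by-n matrices."""
    n = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def count_triangles(n, edges):
    """Count triangles as trace(A^3) / 6 (hypothetical helper)."""
    a = adjacency_matrix(n, edges)
    a2 = matmul(a, a)
    # Because A is symmetric, the elementwise sum of (A^2 * A)
    # equals trace(A^3), the number of closed walks of length 3.
    closed_walks = sum(
        a2[i][j] * a[i][j] for i in range(n) for j in range(n)
    )
    # Each triangle is counted once per starting vertex and direction.
    return closed_walks // 6
```

For example, `count_triangles(4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])` counts the triangles of the complete graph on four vertices. High-degree vertices contribute many terms to the product A·A, which gives some intuition for why skewed power-law graphs are costly.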
Wavelet-Based Histograms for Query Selectivity Estimation
Histograms based on cumulative data values give a good approximation while using a limited amount of memory. A histogram construction method based on the multiresolution wavelet transform can be applied to databases, statistics, and modeling. Fast algorithms for constructing histograms are proposed, along with their use for online selectivity estimation
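To make the idea of a wavelet-based histogram more concrete, here is a minimal sketch under stated assumptions: it applies an unnormalized Haar transform to a frequency vector whose length is a power of two, keeps the k coefficients with the largest absolute value, and estimates range selectivity from the reconstruction. The abstract does not specify the wavelet basis, the thresholding rule, or the exact input vector, so all of these choices and all function names are illustrative rather than the authors' method.

```python
def haar_decompose(values):
    """Unnormalized Haar transform: [overall average, details coarse to fine]."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = [float(v) for v in values]
    details = []
    while len(data) > 1:
        pairs = range(0, len(data), 2)
        avgs = [(data[i] + data[i + 1]) / 2 for i in pairs]
        diffs = [(data[i] - data[i + 1]) / 2 for i in pairs]
        details = diffs + details  # coarser levels go in front
        data = avgs
    return data + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose."""
    data = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(data)]
        pos += len(data)
        expanded = []
        for avg, diff in zip(data, diffs):
            expanded.extend((avg + diff, avg - diff))
        data = expanded
    return data


def keep_largest(coeffs, k):
    """Zero out all but the k largest-magnitude coefficients (assumed rule)."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:k])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


def estimate_selectivity(freqs, lo, hi, k):
    """Estimate the fraction of rows with value index in [lo, hi]."""
    approx = haar_reconstruct(keep_largest(haar_decompose(freqs), k))
    return sum(approx[lo:hi + 1]) / sum(freqs)
```

With `k` equal to the vector length the estimate is exact; smaller `k` trades accuracy for memory, which is the trade-off the abstract refers to.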
Constructing Optimal Bushy Processing Trees for Join Queries is NP-hard
We show that constructing optimal bushy processing trees for join queries is NP-hard. More specifically, we show that even the construction of optimal bushy trees for computing the cross product of a set of relations is NP-hard
Query Optimization in Database Systems on Parallel Architectures
The paper presents the most widespread types of parallel relational DBMSs. It defines the notion of query execution time
(the system's response time to a query) in a parallel system and proposes efficient methods
for calculating response time
- …