13,112 research outputs found

Optimal Joins Using Compact Data Structures

Author: Navarro Gonzalo
Reutter Juan L.
Rojas-Ledesma Javiel
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Conference on Database Theory (ICDT 2020)
Publication date: 01/01/2020

Worst-case optimal join algorithms have gained a lot of attention in the database literature. We now have several algorithms that are optimal in the worst case, and many of them have been implemented and validated in practice. However, the implementation of these algorithms often requires an enhanced indexing structure: to achieve optimality we either need to build completely new indexes, or we must populate the database with several instantiations of indexes such as B+-trees. Either way, this means spending extra storage space that may be non-negligible. We show that optimal algorithms can be obtained directly from a representation that regards the relations as point sets in variable-dimensional grids, without the need for extra storage. Our representation is a compact quadtree for the static indexes, and a dynamic quadtree sharing subtrees (which we dub a qdag) for intermediate results. We develop a compositional algorithm to process full join queries under this representation, and show that the running time of this algorithm is worst-case optimal in data complexity. Remarkably, we can extend our framework to evaluate more expressive queries from relational algebra by introducing a lazy version of qdags (lqdags). Once again, we can show that the running time of our algorithms is worst-case optimal.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Labeling Schemes with Queries

Author: Korman Amos
Kutten Shay
Publication date: 01/01/2006

We study the question of "how robust are the known lower bounds of labeling schemes when one increases the number of consulted labels". Let $f$ be a function on pairs of vertices. An $f$-labeling scheme for a family of graphs $\mathcal{F}$ labels the vertices of all graphs in $\mathcal{F}$ such that for every graph $G \in \mathcal{F}$ and every two vertices $u, v \in G$, the value $f(u,v)$ can be inferred by merely inspecting the labels of $u$ and $v$. This paper introduces a natural generalization: the notion of $f$-labeling schemes with queries, in which the value $f(u,v)$ can be inferred by inspecting not only the labels of $u$ and $v$ but possibly the labels of some additional vertices. We show that inspecting the label of a single additional vertex (one query) enables us to reduce the label size of many labeling schemes significantly.
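As an illustration of the definition above (and not a construction taken from this paper), the following minimal sketch shows a classic labeling scheme without queries: ancestry in a rooted tree, where each vertex is labeled with its depth-first-search interval. The function names `interval_labels` and `is_ancestor` and the example tree are assumptions made for this sketch.

```python
def interval_labels(children, root):
    """Label every vertex with its (enter, exit) DFS interval.

    `children` maps a vertex to the list of its children (hypothetical input format).
    """
    labels = {}
    enter = {}
    clock = 0
    stack = [(root, False)]
    while stack:
        node, finished = stack.pop()
        if finished:
            # All descendants have been numbered; close the interval.
            labels[node] = (enter[node], clock - 1)
            continue
        enter[node] = clock
        clock += 1
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return labels


def is_ancestor(label_u, label_v):
    """Decide f(u, v) = 'u is an ancestor of v' from the two labels alone.

    A vertex counts as its own ancestor in this sketch.
    """
    return label_u[0] <= label_v[0] and label_v[1] <= label_u[1]


tree = {"a": ["b", "c"], "b": ["d", "e"], "c": []}
labels = interval_labels(tree, "a")
print(is_ancestor(labels["a"], labels["e"]))
print(is_ancestor(labels["c"], labels["d"]))
```

Here the decoder sees only the two labels, never the tree. A scheme with queries, in the sense of the abstract, would additionally be allowed to look up the label of some third vertex.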

arXiv.org e-Print Archive

CiteSeerX

Old Techniques for New Join Algorithms: A Case Study in RDF Processing

Author: Aberger Christopher R.
Olukotun Kunle
Ré Christopher
Tu Susan
Publication date: 10/02/2016

Recently there has been significant interest in designing specialized RDF engines, as traditional query processing mechanisms incur orders-of-magnitude performance gaps on many RDF workloads. At the same time, researchers have released new worst-case optimal join algorithms which can be asymptotically better than the join algorithms in traditional engines. In this paper we apply worst-case optimal join algorithms to a standard RDF workload, the LUBM benchmark, for the first time. We do so using two worst-case optimal engines: (1) LogicBlox, a commercial database engine, and (2) EmptyHeaded, our prototype research engine with enhanced worst-case optimal join algorithms. We show that without any added optimizations both LogicBlox and EmptyHeaded outperform two state-of-the-art specialized RDF engines, RDF-3X and TripleBit, by up to 6x on cyclic join queries, the queries where traditional optimizers are suboptimal. On the remaining, less complex queries in the LUBM benchmark, we show that three classic query optimization techniques enable EmptyHeaded to compete with RDF engines, even when there is no asymptotic advantage to the worst-case optimal approach. We validate that our design has merit as EmptyHeaded outperforms MonetDB by three orders of magnitude and LogicBlox by two orders of magnitude, while remaining within an order of magnitude of RDF-3X and TripleBit.
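To make the idea of a cyclic join concrete, here is a minimal sketch of the triangle query evaluated one attribute at a time, which is the general shape that worst-case optimal join algorithms follow. It is not the EmptyHeaded or LogicBlox implementation, it makes no claim of meeting the optimal bound, and the names `index` and `triangle_join` are assumptions made for this sketch.

```python
from collections import defaultdict


def index(pairs):
    """Map each first component to the set of second components."""
    idx = defaultdict(set)
    for x, y in pairs:
        idx[x].add(y)
    return idx


def triangle_join(R, S, T):
    """Enumerate (a, b, c) with (a, b) in R, (b, c) in S and (a, c) in T.

    Each variable is bound by intersecting the candidate sets from all
    relations that mention it, instead of joining two relations at a time.
    """
    r_idx, s_idx, t_idx = index(R), index(S), index(T)
    out = []
    # a must appear as a first component in both R and T.
    for a in set(r_idx) & set(t_idx):
        # b must follow a in R and start some pair in S.
        for b in r_idx[a]:
            if b not in s_idx:
                continue
            # c must follow b in S and follow a in T.
            for c in s_idx[b] & t_idx[a]:
                out.append((a, b, c))
    return sorted(out)


R = [(1, 2), (1, 3), (2, 3)]
S = [(2, 3), (3, 1)]
T = [(1, 3), (2, 1)]
print(triangle_join(R, S, T))
```

A pairwise plan would first materialize the join of two relations, which can be much larger than the final result on cyclic queries; the attribute-at-a-time loop avoids building that intermediate result.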

arXiv.org e-Print Archive

Crossref

Deductive Optimization of Relational Data Storage

Author: Feser John K.
Madden Samuel
Solar-Lezama Armando
Tang Nan
Publication date: 05/02/2020

Optimizing the physical data storage and retrieval of data are two key database management problems. In this paper, we propose a language that can express a wide range of physical database layouts, going well beyond the row- and column-based methods that are widely used in database management systems. We use deductive synthesis to turn a high-level relational representation of a database query into a highly optimized low-level implementation which operates on a specialized layout of the dataset. We build a compiler for this language and conduct experiments using a popular database benchmark, which show that the performance of these specialized queries is competitive with a state-of-the-art in-memory compiled database system.

arXiv.org e-Print Archive

DSpace@MIT

On Efficient Distributed Construction of Near Optimal Routing Schemes

Author: Elkin Michael
Neiman Ofer
Publication date: 20/11/2016

Given a distributed network represented by a weighted undirected graph $G=(V,E)$ on $n$ vertices, and a parameter $k$, we devise a distributed algorithm that computes a routing scheme in $(n^{1/2+1/k}+D)\cdot n^{o(1)}$ rounds, where $D$ is the hop-diameter of the network. The running time matches the lower bound of $\tilde{\Omega}(n^{1/2}+D)$ rounds (which holds for any scheme with polynomial stretch), up to lower order terms. The routing tables are of size $\tilde{O}(n^{1/k})$, the labels are of size $O(k\log^2 n)$, and every packet is routed on a path suffering stretch at most $4k-5+o(1)$. Our construction nearly matches the state-of-the-art for routing schemes built in a centralized sequential manner. The previous best algorithms for building routing tables in a distributed small messages model were by [LP13] (STOC 2013) and [LP15] (PODC 2015). The former has similar properties but suffers from substantially larger routing tables of size $O(n^{1/2+1/k})$, while the latter has sub-optimal running time of $\tilde{O}(\min\{(nD)^{1/2}\cdot n^{1/k},\, n^{2/3+2/(3k)}+D\})$.

arXiv.org e-Print Archive

Crossref

Physics-inspired Performance Evaluation of a Structured Peer-to-Peer Overlay Network

Author: Aurell Erik
El-Ansary Sameh
Haridi Seif
Publication date: 01/01/2005

In the majority of structured peer-to-peer overlay networks a graph with a desirable topology is constructed. In most cases, the graph is maintained by a periodic activity performed by each node in the graph to preserve the desirable structure in the face of the continuous change of the set of nodes. The interaction of the autonomous periodic activities of the nodes renders the performance analysis of such systems complex, and simulation at the scales of interest can be prohibitive. Physicists, however, are accustomed to dealing with scale by characterizing a system using intensive variables, i.e. variables that are size-independent. The approach has proved its usefulness when applied to satisfiability theory. This work is the first attempt to apply it in the area of distributed systems. The contribution of this paper is two-fold. First, we describe a methodology to be used for analyzing the performance of large-scale distributed systems. Second, we show how we applied the methodology to find an intensive variable that describes the characteristic behavior of the Chord overlay network, namely, the ratio of the magnitude of perturbation of the network (joins/failures) to the magnitude of periodic stabilization of the network.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive