1,869 research outputs found

Tight Lower Bounds for Query Processing on Streaming and External Memory Data

Authors: Grohe Martin, Koch Christoph, Schweikardt Nicole
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/06/2011

Infoscience - École polytechnique fédérale de Lausanne

Lower Bounds for Multi-Pass Processing of Multiple Data Streams

Author: Schweikardt Nicole
Publication date: 01/01/2009

This paper gives a brief overview of computation models for data stream processing, and it introduces a new model for multi-pass processing of multiple streams, the so-called mp2s-automata. Two algorithms for solving the set disjointness problem with these automata are presented. The main technical contribution of this paper is the proof of a lower bound on the size of memory and the number of heads that are required for solving the set disjointness problem with mp2s-automata.
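For readers unfamiliar with the set disjointness problem named in this abstract, the following is a minimal illustrative sketch of the trade-off between memory and number of passes. It is not one of the paper's mp2s-automata algorithms, which this listing does not describe; the function name `disjoint_multipass` and its `memory` parameter are hypothetical, and the inputs are assumed to be re-readable sequences rather than true streams.

```python
from typing import Sequence, Tuple


def disjoint_multipass(a: Sequence[int], b: Sequence[int], memory: int) -> Tuple[bool, int]:
    """Decide whether a and b share no element, storing at most `memory` items.

    Hypothetical illustration only: returns (is_disjoint, passes_over_b).
    """
    if memory < 1:
        raise ValueError("memory must be at least 1")
    passes = 0
    # Load a in chunks that fit in memory; each chunk costs one scan of b.
    for start in range(0, len(a), memory):
        chunk = set(a[start:start + memory])
        passes += 1
        for item in b:
            if item in chunk:
                return False, passes  # common element found, stop early
    return True, passes
```

With less memory the sketch needs more passes over the second input, which is the kind of trade-off that lower bounds in this area make precise.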

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server

Hochschulschriftenserver - Universität Frankfurt am Main

Worst-Case Optimal Algorithms for Parallel Query Processing

Authors: Beame Paul, Koutris Paraschos, Suciu Dan
Publication date: 01/01/2016

In this paper, we study the communication complexity for the problem of computing a conjunctive query on a large database in a parallel setting with $p$ servers. In contrast to previous work, where upper and lower bounds on the communication were specified for particular structures of data (either data without skew, or data with specific types of skew), in this work we focus on worst-case analysis of the communication cost. The goal is to find worst-case optimal parallel algorithms, similar to the work of [18] for sequential algorithms. We first show that for a single round we can obtain an optimal worst-case algorithm. The optimal load for a conjunctive query $q$ when all relations have size equal to $M$ is $O(M/p^{1/\psi^*})$, where $\psi^*$ is a new query-related quantity called the edge quasi-packing number, which is different from both the edge packing number and edge cover number of the query hypergraph. For multiple rounds, we present algorithms that are optimal for several classes of queries. Finally, we show a surprising connection to the external memory model, which allows us to translate parallel algorithms to external memory algorithms. This technique allows us to recover (within a polylogarithmic factor) several recent results on the I/O complexity for computing join queries, and also obtain optimal algorithms for other classes of queries.
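To give a feel for the single-round load bound quoted in this abstract, here is a small sketch that evaluates the expression M/p^{1/ψ*} numerically. It is an assumption-laden illustration: the function name `one_round_load` is hypothetical, the constant factor hidden by the O-notation is ignored, and the value of ψ* used in the example is made up rather than taken from any particular query.

```python
def one_round_load(m: float, p: int, psi_star: float) -> float:
    """Evaluate M / p**(1/psi*), ignoring the constant hidden by O(.).

    m: size of every relation, p: number of servers,
    psi_star: edge quasi-packing number (value supplied by the caller).
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    if psi_star <= 0:
        raise ValueError("psi_star must be positive")
    return m / p ** (1.0 / psi_star)


# Hypothetical values: one million tuples per relation, 64 servers, psi* = 2.
example_load = one_round_load(1_000_000, 64, 2.0)
```

Under these assumed values the expression gives 1,000,000 / 64^{1/2} = 125,000 tuples per server, compared with 15,625 if ψ* were 1; a larger ψ* means each server must receive more data.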

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

On The Communication Complexity of Linear Algebraic Problems in the Message Passing Model

Authors: Li Yi, Sun Xiaoming, Wang Chengu, Woodruff David P.
Publication date: 01/01/2014

We study the communication complexity of linear algebraic problems over finite fields in the multi-player message passing model, proving a number of tight lower bounds. Specifically, for a matrix which is distributed among a number of players, we consider the problem of determining its rank, of computing entries in its inverse, and of solving linear equations. We also consider related problems such as computing the generalized inner product of vectors held on different servers. We give a general framework for reducing these multi-player problems to their two-player counterparts, showing that the randomized $s$-player communication complexity of these problems is at least $s$ times the randomized two-player communication complexity. Provided the problem has a certain amount of algebraic symmetry, which we formally define, we can show the hardest input distribution is a symmetric distribution, and therefore apply a recent multi-player lower bound technique of Phillips et al. Further, we give new two-player lower bounds for a number of these problems. In particular, our optimal lower bound for the two-player version of the matrix rank problem resolves an open question of Sun and Wang. A common feature of our lower bounds is that they apply even to the special "threshold promise" versions of these problems, wherein the underlying quantity, e.g., rank, is promised to be one of just two values, one on each side of some critical threshold. These kinds of promise problems are commonplace in the literature on data streaming as sources of hardness for reductions giving space lower bounds.

arXiv.org e-Print Archive

MPG.PuRe

New Algorithms and Lower Bounds for Sequential-Access Data Compression

Author: Gagie Travis
Publication date: 01/01/2009

This thesis concerns sequential-access data compression, i.e., compression by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by character, outputting each character's self-delimiting codeword before reading the next one. We show how to encode and decode each character in constant worst-case time while producing an encoding whose length is worst-case optimal. In another chapter we consider one-pass compression with memory bounded in terms of the alphabet size and context length, and prove a nearly tight tradeoff between the amount of memory we can use and the quality of the compression we can achieve. In a third chapter we consider compression in the read/write streams model, which allows us passes and memory both polylogarithmic in the size of the input. We first show how to achieve universal compression using only one pass over one stream. We then show that one stream is not sufficient for achieving good grammar-based compression. Finally, we show that two streams are necessary and sufficient for achieving entropy-only bounds. Comment: draft of PhD thesis.

arXiv.org e-Print Archive

Publications at Bielefeld University