Recognizing well-parenthesized expressions in the streaming model
Motivated by a concrete problem and with the goal of understanding the sense
in which the complexity of streaming algorithms is related to the complexity of
formal languages, we investigate the problem Dyck(s) of checking matching
parentheses, with s different types of parentheses.
We present a one-pass randomized streaming algorithm for Dyck(2) with space
O(\sqrt{n} \log n), time per letter polylog(n), and one-sided error.
We prove that this one-pass algorithm is optimal, up to a polylog(n) factor,
even when two-sided error is allowed. For the lower bound, we prove a direct
sum result on hard instances by following the "information cost" approach, but
with a few twists. Indeed, we play a subtle game between public and private
coins. This mixture between public and private coins results from a balancing
act between the direct sum result and a combinatorial lower bound for the base
case.
Surprisingly, the space requirement shrinks drastically if we have access to
the input stream in reverse. We present a two-pass randomized streaming
algorithm for Dyck(2) with space O((\log n)^2), time polylog(n), and
one-sided error, where the second pass is in the reverse direction. Both
algorithms can be extended to Dyck(s) since this problem is reducible to
Dyck(2) for a suitable notion of reduction in the streaming model.
Comment: 20 pages, 5 figures
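For contrast with the sublinear-space bounds quoted above, the textbook deterministic one-pass check for Dyck(2) can be sketched as follows. This is only a baseline and not the randomized algorithm of the paper: its stack may grow to n entries. The bracket alphabet `()[]` and the name `is_dyck2` are assumptions made for illustration.

```python
# Baseline sketch: deterministic one-pass Dyck(2) check with a stack.
# Worst-case space is linear in the input length, unlike the streaming
# algorithms described in the abstract.

PAIRS = {")": "(", "]": "["}      # closing bracket -> matching opener
OPENERS = set(PAIRS.values())


def is_dyck2(stream):
    """Return True if the symbols in `stream` form a well-parenthesized word."""
    stack = []
    for ch in stream:
        if ch in OPENERS:
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
        else:
            raise ValueError(f"unexpected symbol: {ch!r}")
    # Every opener must have been closed.
    return not stack
```

The word `([)]` is rejected at the third symbol, while `((` is rejected only at the end of the stream, because the stack is still non-empty.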
Streaming Complexity of Checking Priority Queues
This work is in the line of designing efficient checkers for testing the reliability of massive data structures. Given sequential access to the insert/extract operations on such a structure, one would like to decide, a posteriori only, whether the sequence corresponds to the evolution of a reliable structure.
In a context of massive data, one would like to minimize both the amount of reliable memory of the checker and the number of passes on the sequence of operations.
Chu, Kannan and McGregor (M. Chu, S. Kannan, and A. McGregor, 2007) initiated the study of checking priority queues in this setting. They showed that timestamps make it possible to check a priority queue with a single pass and memory space \tilde{O}(\sqrt{N}). Later, Chakrabarti, Cormode, Kondapally and McGregor (A. Chakrabarti, G. Cormode, R. Kondapally, and A. McGregor, 2010) removed the use of timestamps, and proved that more passes do not help.
We show that, even in the presence of timestamps, more passes do not help, solving an open problem of (M. Chu, S. Kannan, and A. McGregor, 2007; A. Chakrabarti, G. Cormode, R. Kondapally, and A. McGregor). On the other hand, we show that a second pass in the reverse direction shrinks the memory space to \tilde{O}((\log N)^2), extending a phenomenon first observed by Magniez, Mathieu and Nayak (F. Magniez, C. Mathieu, and A. Nayak, 2010) for checking well-parenthesized expressions.
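To make the checking task concrete, here is a minimal sketch of a naive checker that replays the transcript against a trusted heap. It uses memory linear in the number of pending items, which is exactly what the streaming checkers above avoid. The transcript encoding (`("ins", v)` / `("ext", v)` pairs), the min-queue convention, and the function name are assumptions made for illustration, not taken from the papers.

```python
import heapq

# Naive reference checker: replay the operations on a trusted min-heap.
# Memory grows with the number of items currently in the queue.


def is_valid_pq_transcript(ops):
    """Return True if every extract returns the current minimum.

    `ops` is an iterable of (kind, value) pairs with kind "ins" or "ext".
    """
    heap = []
    for kind, value in ops:
        if kind == "ins":
            heapq.heappush(heap, value)
        elif kind == "ext":
            # An extract is valid only if the queue is non-empty and the
            # reported value equals the smallest pending value.
            if not heap or heapq.heappop(heap) != value:
                return False
        else:
            raise ValueError(f"unknown operation: {kind!r}")
    return True
```

For example, inserting 3 and 1 and then extracting 3 is rejected, since a correct min-queue would have returned 1 first.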
Streaming Property Testing of Visibly Pushdown Languages
In the context of language recognition, we demonstrate the superiority of
streaming property testers over streaming algorithms and property testers,
when they are not combined. Initiated by Feigenbaum et al., a streaming
property tester is a streaming algorithm recognizing a language under the
property testing approximation: it must distinguish inputs of the language from
those that are \varepsilon-far from it, while using the smallest possible
memory (rather than limiting its number of input queries).
Our main result is a streaming \varepsilon-property tester for visibly
pushdown languages (VPL) with one-sided error using memory space
poly((\log n)/\varepsilon).
This construction relies on a (non-streaming) property tester for weighted
regular languages based on a previous tester by Alon et al. We provide a simple
application of this tester for streaming testing special cases of instances of
VPL that are already hard for both streaming algorithms and property testers.
Our main algorithm starts from an original simulation of visibly
pushdown automata using a stack of small height whose items may have linear
size. In a second step, those items are replaced by small sketches. Those
sketches rely on a notion of suffix-sampling that we introduce. This sampling is
the key idea connecting our streaming tester algorithm to property testers.
Comment: 23 pages. Major modifications in the presentation
Streaming algorithms for language recognition problems
We study the complexity of the following problems in the streaming model.
Membership testing for DLIN: We show that every language in DLIN can be
recognised by a randomized one-pass O(\log n) space algorithm with inverse
polynomial one-sided error, and by a deterministic p-pass O(n/p) space
algorithm. We show that these algorithms are optimal.
Membership testing for LL(k): For languages generated by LL(k) grammars
with a bound of r on the number of nonterminals at any stage in the left-most
derivation, we show that membership can be tested by a randomized one-pass
O(r \log n) space algorithm with inverse polynomial (in n) one-sided error.
Membership testing for DCFL: We show that randomized algorithms as efficient
as the ones described above for DLIN and LL(k) (which are subclasses of
DCFL) cannot exist for all of DCFL: there is a language in VPL (a subclass
of DCFL) for which any randomized p-pass algorithm with error bounded by
\epsilon < 1/2 must use \Omega(n/p) space.
Degree sequence problem: We study the problem of determining, given a sequence
d_1, d_2, ..., d_n and a graph G, whether the degree sequence of G is
precisely d_1, d_2, ..., d_n. We give a randomized one-pass O(\log n) space
algorithm with inverse polynomial one-sided error probability. We show that our
algorithms are optimal.
Our randomized algorithms are based on the recent work of Magniez et al.
[MMN09]; our lower bounds are obtained by considering related
communication complexity problems.
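One natural way to obtain a small-space, one-pass, one-sided-error test for the degree sequence problem is a polynomial fingerprint; the sketch below illustrates that generic technique and is not claimed to be the authors' construction. It assumes the stream lists the claimed degrees for vertices 1..n first and the edges afterwards, and that n and all degrees are far smaller than the modulus. The names `degree_sequence_matches` and `P` are hypothetical.

```python
import random

# Assumption: a fixed Mersenne prime as modulus; inputs are much smaller.
P = (1 << 61) - 1


def degree_sequence_matches(degrees, edges, rng=random):
    """Fingerprint test: is the degree of vertex v equal to degrees[v-1]?

    Evaluates sum_v (claimed_v - actual_v) * x^v mod P at a random x.
    A correct sequence always passes; an incorrect one passes with
    probability at most about n / P.
    """
    x = rng.randrange(P)
    fingerprint = 0
    # First part of the stream: the claimed degrees d_1, ..., d_n.
    for v, d in enumerate(degrees, start=1):
        fingerprint = (fingerprint + d * pow(x, v, P)) % P
    # Second part: each edge (u, v) contributes one unit to both endpoints.
    for u, v in edges:
        fingerprint = (fingerprint - pow(x, u, P) - pow(x, v, P)) % P
    return fingerprint == 0
```

Only `x` and the running fingerprint are stored, so the working memory is a constant number of machine words regardless of the number of edges.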
Improved bounds for testing Dyck languages
In this paper we consider the problem of deciding membership in Dyck
languages, a fundamental family of context-free languages, comprised of
well-balanced strings of parentheses. In this problem we are given a string of
length n in the alphabet of parentheses of m types and must decide if it is
well-balanced. We consider this problem in the property testing setting, where
one would like to make the decision while querying as few characters of the
input as possible.
Property testing of strings for Dyck language membership for m = 1, with a
number of queries independent of the input size n, was provided in [Alon,
Krivelevich, Newman and Szegedy, SICOMP 2001]. Property testing of strings for
Dyck language membership for m >= 2 was first investigated in [Parnas, Ron
and Rubinfeld, RSA 2003]. They showed an upper bound and a lower bound for
distinguishing strings belonging to the language from strings that are far (in
terms of the Hamming distance) from the language, which are respectively (up to
polylogarithmic factors) the 2/3 power and the 1/11 power of the input size
n.
Here we improve the power of n in both bounds. For the upper bound, we
introduce a recursion technique that, together with a refinement of the methods
in the original work, provides a test for any power of n larger than 2/5.
For the lower bound, we introduce a new problem called Truestring Equivalence,
which is easily reducible to the 2-type Dyck language property testing
problem. For this new problem, we show a lower bound of n to the power of 1/5.
Information Cost Tradeoffs for Augmented Index and Streaming Language Recognition
This paper makes three main contributions to the theory of communication
complexity and stream computation. First, we present new bounds on the
information complexity of AUGMENTED-INDEX. In contrast to analogous results for
INDEX by Jain, Radhakrishnan and Sen [J. ACM, 2009], we have to overcome the
significant technical challenge that protocols for AUGMENTED-INDEX may violate
the "rectangle property" due to the inherent input sharing. Second, we use
these bounds to resolve an open problem of Magniez, Mathieu and Nayak [STOC,
2010] that asked about the multi-pass complexity of recognizing Dyck languages.
This results in a natural separation between the standard multi-pass model and
the multi-pass model that permits reverse passes. Third, we present the first
passive memory checkers that verify the interaction transcripts of priority
queues, stacks, and double-ended queues. We obtain tight upper and lower bounds
for these problems, thereby addressing an important sub-class of the memory
checking framework of Blum et al. [Algorithmica, 1994].