One-variable word equations in linear time
In this paper we consider word equations with one variable (and arbitrarily
many appearances of it). A recent technique of recompression, which is
applicable to general word equations, is shown to be suitable also in this
case. While in the general case it is non-deterministic, it determinises in the case of
one variable and the obtained running time is O(n + #_X log n), where #_X is
the number of appearances of the variable in the equation. This matches the
previously-best algorithm due to Dąbrowski and Plandowski. Then, using a
couple of heuristics as well as a more detailed time analysis, the running time is
lowered to O(n) in the RAM model. Unfortunately, no new properties of solutions are
shown.
Comment: submitted to a journal, general overhaul over the previous version
Compressed Membership for NFA (DFA) with Compressed Labels is in NP (P)
In this paper, a compressed membership problem for finite automata, both
deterministic and non-deterministic, with compressed transition labels is
studied. The compression is represented by straight-line programs (SLPs), i.e.
context-free grammars generating exactly one string. A novel technique of
dealing with SLPs is introduced: the SLPs are recompressed, so that substrings
of the input text are encoded in SLPs labelling the transitions of the NFA
(DFA) in the same way as in the SLP representing the input text. To this end,
the SLPs are locally decompressed and then recompressed in a uniform way.
Furthermore, such recompression induces only small changes in the automaton, in
particular, the size of the automaton remains polynomial.
Using this technique it is shown that the compressed membership for NFA with
compressed labels is in NP, thus confirming the conjecture of Plandowski and
Rytter and extending the partial result of Lohrey and Mathissen; as this
problem is already known to be NP-hard, we settle its exact computational
complexity. Moreover, the same technique applied to the compressed membership
for DFA with compressed labels yields that this problem is in P; for this
problem, only the trivial PSPACE upper bound was known.
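As a minimal illustration of the SLP notion used in the abstract above (a context-free grammar generating exactly one string), the following sketch computes the length of the generated string without decompressing it and, for comparison, expands it. The dictionary encoding and the names `RULES`, `slp_length` and `slp_expand` are assumptions made for this example; they are not taken from the paper and do not implement its recompression technique.

```python
from functools import lru_cache

# Hypothetical encoding: each nonterminal maps either to a terminal string
# or to a pair of nonterminals (rules in the style of Chomsky normal form).
RULES = {
    "A": "a",
    "B": "b",
    "C": ("A", "B"),   # generates "ab"
    "D": ("C", "C"),   # generates "abab"
    "E": ("D", "C"),   # generates "ababab"
}


def slp_length(rules, start):
    """Length of the unique string generated from `start`, without expanding it."""
    @lru_cache(maxsize=None)
    def length(sym):
        rhs = rules[sym]
        if isinstance(rhs, str):
            return len(rhs)
        return length(rhs[0]) + length(rhs[1])

    return length(start)


def slp_expand(rules, start):
    """Fully decompress the string generated from `start` (may be exponentially long)."""
    rhs = rules[start]
    if isinstance(rhs, str):
        return rhs
    return slp_expand(rules, rhs[0]) + slp_expand(rules, rhs[1])
```

The length computation touches each rule once, whereas expansion can take time exponential in the grammar size; this gap is what makes problems on SLP-compressed inputs non-trivial.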
A really simple approximation of smallest grammar
In this paper we present a really simple linear-time algorithm constructing a
context-free grammar of size O(g log (N/g)) for the input string, where N is
the size of the input string and g the size of the optimal grammar generating
this string. The algorithm works for alphabets of arbitrary size, but the running
time is linear assuming that the alphabet Sigma of the input string can be
identified with numbers from 1, ..., N^c for some constant c. Algorithms with
such an approximation guarantee and running time are known, however all of them
were non-trivial and their analyses were involved. The algorithm presented here
computes the LZ77 factorisation and transforms it in phases to a grammar. In
each phase it maintains an LZ77-like factorisation of the word with at most l
factors as well as additional O(l) letters, where l was the size of the
original LZ77 factorisation. In one phase, in a greedy way (by a left-to-right
sweep with the help of the factorisation), we choose a set of pairs of consecutive
letters to be replaced with new symbols, i.e. nonterminals of the constructed
grammar. We choose at least 2/3 of the letters in the word and there are O(l)
many different pairs among them. Hence there are O(log N) phases, each of them
introduces O(l) nonterminals to a grammar. A more precise analysis yields a
bound O(l log(N/l)). As l ≤ g, this yields the desired bound O(g log(N/g)).
Comment: Accepted for CPM 201
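The idea of building a grammar in phases, each replacing pairs of consecutive letters with new nonterminals, can be sketched as follows. This is a deliberately simplified stand-in: it pairs adjacent symbols in a plain left-to-right sweep, does not use the LZ77 factorisation, and carries no approximation guarantee. The names `pair_grammar` and `expand` are hypothetical and chosen only for this example.

```python
def pair_grammar(text):
    """Build a grammar for `text` by repeatedly pairing adjacent symbols.

    Returns (start_symbol, rules), where `rules` maps each nonterminal
    (an int) to the pair of symbols it replaces. Terminals are the
    one-character strings of `text`.
    """
    if not text:
        raise ValueError("text must be non-empty")
    word = list(text)
    rules = {}   # nonterminal -> (left, right)
    names = {}   # (left, right) -> nonterminal, so equal pairs share one rule
    while len(word) > 1:          # one iteration of this loop is one phase
        nxt = []
        i = 0
        while i + 1 < len(word):
            pair = (word[i], word[i + 1])
            if pair not in names:
                names[pair] = len(rules)
                rules[names[pair]] = pair
            nxt.append(names[pair])
            i += 2
        if i < len(word):         # odd length: the last symbol is carried over
            nxt.append(word[i])
        word = nxt
    return word[0], rules


def expand(symbol, rules):
    """Decompress the string generated by `symbol`."""
    if isinstance(symbol, str):
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)
```

Each phase of this sketch roughly halves the word, so the number of phases is logarithmic in the input length; the algorithm in the abstract additionally bounds the number of new nonterminals per phase by O(l) through the LZ77-like factorisation.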
Context unification is in PSPACE
Contexts are terms with one 'hole', i.e. a place in which we can substitute
an argument. In context unification we are given an equation over terms with
variables representing contexts and ask about the satisfiability of this
equation. Context unification is a natural subvariant of second-order
unification, which is undecidable, and a generalization of word equations,
which are decidable, at the same time. It is the only problem between those
two whose decidability has remained open (for almost two decades already). In this
paper we show that context unification is in PSPACE. The result holds under
a (usual) assumption that the first-order signature is finite.
This result is obtained by an extension of the recompression technique,
recently developed by the author and used in particular to obtain a new PSPACE
algorithm for satisfiability of word equations, to context unification. The
recompression is based on performing simple compression rules (replacing pairs
of neighbouring function symbols), which are (conceptually) applied on the
solution of the context equation and modifying the equation in a way so that
such compression steps can be in fact performed directly on the equation,
without the knowledge of the actual solution.
Comment: 27 pages, submitted, small notation changes and small improvements
over the previous text
Finding All Solutions of Equations in Free Groups and Monoids with Involution
The aim of this paper is to present a PSPACE algorithm which yields a finite
graph of exponential size and which describes the set of all solutions of
equations in free groups as well as the set of all solutions of equations in
free monoids with involution in the presence of rational constraints. This
became possible due to the recently invented recompression technique of
the second author.
He successfully applied the recompression technique for pure word equations
without involution or rational constraints. In particular, his method could not
be used as a black box for free groups (even without rational constraints).
Actually, the presence of an involution (inverse elements) and rational
constraints complicates the situation and some additional analysis is
necessary. Still, the recompression technique is general enough to accommodate
both extensions. In the end, it simplifies proofs that solving word equations
is in PSPACE (Plandowski 1999) and the corresponding result for equations in
free groups with rational constraints (Diekert, Hagenah and Gutierrez 2001). As
a byproduct we obtain a direct proof that it is decidable in PSPACE whether or
not the solution set is finite.
Comment: A preliminary version of this paper was presented as an invited talk
at CSR 2014 in Moscow, June 7 - 11, 2014
Equations over free inverse monoids with idempotent variables
We introduce the notion of idempotent variables for studying equations in
inverse monoids.
It is proved that it is decidable in singly exponential time (DEXPTIME)
whether a system of equations in idempotent variables over a free inverse
monoid has a solution. The result is proved by a direct reduction to solving
language equations with one-sided concatenation and a known complexity result
by Baader and Narendran: Unification of concept terms in description logics,
2001. We also show that the problem becomes DEXPTIME-hard as soon as the
quotient group of the free inverse monoid has rank at least two.
Decidability for systems of typed equations over a free inverse monoid with
one irreducible variable and at least one unbalanced equation is proved with
the same complexity for the upper bound.
Our results improve known complexity bounds by Deis, Meakin, and Senizergues:
Equations in free inverse monoids, 2007.
Our results also apply to larger families of equations where no decidability
has been previously known.
Comment: 28 pages. The conference version of this paper appeared in the
proceedings of 10th International Computer Science Symposium in Russia, CSR
2015, Listvyanka, Russia, July 13-17, 2015. Springer LNCS 9139, pp. 173-188
(2015)
Efficient LZ78 factorization of grammar compressed text
We present an efficient algorithm for computing the LZ78 factorization of a
text, where the text is represented as a straight line program (SLP), which is
a context free grammar in the Chomsky normal form that generates a single
string. Given an SLP of size representing a text of length , our
algorithm computes the LZ78 factorization of in time
and space, where is the number of resulting LZ78 factors.
We also show how to improve the algorithm so that the term in the
time and space complexities becomes either , where is the length of the
longest LZ78 factor, or where is a quantity
which depends on the amount of redundancy that the SLP captures with respect to
substrings of of a certain length. Since where
is the alphabet size, the latter is asymptotically at least as fast as
a linear time algorithm which runs on the uncompressed string when is
constant, and can be more efficient when the text is compressible, i.e. when
and are small.
Comment: SPIRE 201
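For reference, the LZ78 factorization itself can be sketched on an uncompressed string as below. This is the textbook trie-based method, not the SLP-based algorithm of the abstract; the function name `lz78_factorize` and the dictionary-backed trie are assumptions made for this example.

```python
def lz78_factorize(text):
    """Return the LZ78 factors of `text` as a list of strings."""
    trie = {}        # (parent_node, char) -> node id; node 0 is the root
    factors = []
    node, start = 0, 0
    for i, ch in enumerate(text):
        child = trie.get((node, ch))
        if child is None:
            # Longest previously seen factor plus one new character:
            # record the new trie node and close the current factor.
            trie[(node, ch)] = len(trie) + 1
            factors.append(text[start:i + 1])
            node, start = 0, i + 1
        else:
            node = child
    if start < len(text):
        # Trailing factor that equals an earlier one (input ended first).
        factors.append(text[start:])
    return factors
```

For example, `lz78_factorize("abababab")` yields the factors `a`, `b`, `ab`, `aba`, `b`. This baseline reads every character of the text, which is the cost the compressed-input algorithm aims to avoid when the text is highly repetitive.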
Rpair: Rescaling RePair with Rsync
Data compression is a powerful tool for managing massive but repetitive datasets, especially schemes such as grammar-based compression that support computation over the data without decompressing it. In the best case such a scheme takes a dataset so big that it must be stored on disk and shrinks it enough that it can be stored and processed in internal memory. Even then, however, the scheme is essentially useless unless it can be built on the original dataset reasonably quickly while keeping the dataset on disk. In this paper we show how we can preprocess such datasets with context-triggered piecewise hashing such that afterwards we can apply RePair and other grammar-based compressors more easily. We first give our algorithm, then show how a variant of it can be used to approximate the LZ77 parse, then leverage that to prove theoretical bounds on compression, and finally give experimental evidence that our approach is competitive in practice
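The general idea behind context-triggered piecewise hashing can be sketched as follows: a chunk boundary is declared wherever a rolling hash of the last few bytes meets a fixed condition, so boundaries depend only on local content. This is a generic content-defined chunking sketch, not the paper's preprocessing; the function names and the parameters `window`, `modulus`, `base` and `prime` are hypothetical values chosen for illustration.

```python
def chunk_boundaries(data, window=4, modulus=16, base=257, prime=1_000_003):
    """Return the end offsets of the chunks of `data` (a bytes object).

    A chunk ends after position i whenever the polynomial hash of the last
    `window` bytes is 0 modulo `modulus`; the final chunk ends at len(data).
    """
    cuts = []
    h = 0
    top = pow(base, window - 1, prime)   # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= window:
            # Drop the contribution of the byte that slides out of the window.
            h = (h - data[i - window] * top) % prime
        h = (h * base + byte) % prime
        if i + 1 >= window and h % modulus == 0:
            cuts.append(i + 1)
    if not cuts or cuts[-1] != len(data):
        cuts.append(len(data))
    return cuts


def chunks(data, **params):
    """Split `data` into chunks at the offsets given by chunk_boundaries."""
    out, prev = [], 0
    for cut in chunk_boundaries(data, **params):
        out.append(data[prev:cut])
        prev = cut
    return out
```

Because each boundary is determined by a fixed-size window of content, inserting a byte at the front of the data shifts the later boundaries by one position instead of changing them, so repeated regions of a dataset tend to be cut into identical chunks.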
Regular Matching and Inclusion on Compressed Tree Patterns with Context Variables
We study the complexity of regular matching and inclusion for compressed tree patterns extended by context variables. The addition of context variables to tree patterns permits us to properly capture compressed string patterns but also compressed patterns for unranked trees with tree and hedge variables. Regular inclusion for the latter is relevant to certain query answering on XML streams with references.