Faster Compact On-Line Lempel-Ziv Factorization
We present a new on-line algorithm for computing the Lempel-Ziv factorization
of a string that runs in time and uses only bits
of working space, where is the length of the string and is the
size of the alphabet. This is a notable improvement compared to the performance
of previous on-line algorithms using the same order of working space but
running in either time (Okanohara & Sadakane 2009) or
time (Starikovskaya 2012). The key to our new algorithm is in the
utilization of an elegant but less popular index structure called Directed
Acyclic Word Graphs, or DAWGs (Blumer et al. 1985). We also present an
opportunistic variant of our algorithm, which, given the run length encoding of
size of a string of length , computes the Lempel-Ziv factorization
on-line, in time
and bits of space, which is faster and more space efficient when
the string is run-length compressible.
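To make the two objects in this abstract concrete, here is a minimal sketch of a Lempel-Ziv factorization and of run-length encoding. It is a naive quadratic-or-worse illustration, not the DAWG-based on-line algorithm the abstract describes. The function names `lz_factorize` and `run_length_encode` are hypothetical, and the factor definition used (longest prefix that also starts at an earlier position, overlaps allowed, else one fresh character) is one common variant assumed here.

```python
from itertools import groupby


def lz_factorize(s: str) -> list[str]:
    """Naive LZ factorization (assumed variant, self-references allowed).

    Each factor is the longest prefix of the remaining suffix that also
    starts at some earlier position, or a single character if none exists.
    """
    factors = []
    n = len(s)
    i = 0
    while i < n:
        longest = 0
        for j in range(i):  # every earlier start position is a candidate
            k = 0
            while i + k < n and s[j + k] == s[i + k]:
                k += 1
            longest = max(longest, k)
        length = max(longest, 1)  # a fresh character forms its own factor
        factors.append(s[i:i + length])
        i += length
    return factors


def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Run-length encoding: one (character, run length) pair per maximal run."""
    return [(ch, len(list(run))) for ch, run in groupby(s)]


print(lz_factorize("abababab"))       # ['a', 'b', 'ababab']
print(run_length_encode("aaabbbbc"))  # [('a', 3), ('b', 4), ('c', 1)]
```

The factors always concatenate back to the input, and a highly repetitive string yields few factors and few runs, which is the regime the opportunistic variant targets.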
Fully dynamic data structure for LCE queries in compressed space
A Longest Common Extension (LCE) query on a text of length asks for
the length of the longest common prefix of the suffixes starting at two given
positions. We show that the signature encoding of size [Mehlhorn et al., Algorithmica 17(2):183-198,
1997] of , which can be seen as a compressed representation of , can
support LCE queries in time,
where is the answer to the query, is the size of the Lempel-Ziv77
(LZ77) factorization of , and is an integer that can be handled
in constant time under the word RAM model. In compressed space, this is the fastest
deterministic LCE data structure in many cases. Moreover, can be
enhanced to support efficient update operations: After processing
in time, we can insert/delete any (sub)string of length
into/from an arbitrary position of in time, where . This yields
the first fully dynamic LCE data structure. We also present efficient
construction algorithms from various types of inputs: We can construct
in time from uncompressed string ; in
time from grammar-compressed string
represented by a straight-line program of size ; and in time from LZ77-compressed string with factors. On top
of the above contributions, we show several applications of our data structures
which improve previous best known results on grammar-compressed string
processing.
Comment: arXiv admin note: text overlap with arXiv:1504.0695
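The LCE query defined in this abstract can be illustrated with a direct character-by-character scan. This is only a baseline sketch on an uncompressed string, with query time proportional to the answer. It is not the signature-encoding structure from the paper. The name `lce` and the 0-based positions are assumptions made for illustration.

```python
def lce(text: str, i: int, j: int) -> int:
    """Length of the longest common prefix of text[i:] and text[j:].

    Naive scan over an uncompressed string; positions are 0-based.
    """
    n = len(text)
    k = 0
    while i + k < n and j + k < n and text[i + k] == text[j + k]:
        k += 1
    return k


text = "abracadabra"
print(lce(text, 0, 7))  # 4: both suffixes start with "abra"
print(lce(text, 0, 3))  # 1: they share only the leading "a"
```

With a plain Python string, an insertion or deletion rebuilds the whole string. That cost is what a fully dynamic compressed structure is designed to avoid.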
Deterministic sub-linear space LCE data structures with efficient construction
Given a string of symbols, a longest common extension query
asks for the length of the longest common prefix of the
th and th suffixes of . LCE queries have several important
applications in string processing, perhaps most notably to suffix sorting.
Recently, Bille et al. (J. Discrete Algorithms 25:42-50, 2014, Proc. CPM 2015:
65-76) described several data structures for answering LCE queries that offer
a space-time trade-off between data structure size and query time. In
particular, for a parameter , their best deterministic
solution is a data structure of size which allows LCE queries to be
answered in time. However, the construction time for all
deterministic versions of their data structure is quadratic in . In this
paper, we propose a deterministic solution that achieves a similar space-time
trade-off of query time using
space, but significantly improves the construction time to
.Comment: updated titl