Optimal Substring-Equality Queries with Applications to Sparse Text Indexing

Author: Prezza Nicola
Publication date: 01/01/2020

We consider the problem of encoding a string of length $n$ from an integer alphabet of size $\sigma$ so that access and substring equality queries (that is, determining the equality of any two substrings) can be answered efficiently. Any uniquely-decodable encoding supporting access must take $n\log\sigma + \Theta(\log (n\log\sigma))$ bits. We describe a new data structure matching this lower bound when $\sigma\leq n^{O(1)}$ while supporting both queries in optimal $O(1)$ time. Furthermore, we show that the string can be overwritten in-place with this structure. The redundancy of $\Theta(\log n)$ bits and the constant query time break exponentially a lower bound that is known to hold in the read-only model. Using our new string representation, we obtain the first in-place subquadratic (indeed, even sublinear in some cases) algorithms for several string-processing problems in the restore model: the input string is rewritable and must be restored before the computation terminates. In particular, we describe the first in-place subquadratic Monte Carlo solutions to the sparse suffix sorting, sparse LCP array construction, and suffix selection problems. With the sole exception of suffix selection, our algorithms are also the first running in sublinear time for small enough sets of input suffixes. Combining these solutions, we obtain the first sublinear-time Monte Carlo algorithm for building the sparse suffix tree in compact space. We also show how to derandomize our algorithms using small space. This leads to the first Las Vegas in-place algorithm computing the full LCP array in $O(n\log n)$ time and to the first Las Vegas in-place algorithms solving the sparse suffix sorting and sparse LCP array construction problems in $O(n^{1.5}\sqrt{\log \sigma})$ time. Running times of these Las Vegas algorithms hold in the worst case with high probability.

Comment: Refactored according to TALG's reviews. New w.h.p. bounds and Las Vegas algorithm
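To make the notion of a substring-equality query concrete, here is a minimal Python sketch based on textbook Karp-Rabin prefix fingerprints. It is not the data structure from the abstract: it stores O(n) extra words instead of overwriting the string in place, and it answers LCP queries by binary search rather than in constant time. The class and function names, the modulus, and the `seed` parameter are all illustrative assumptions. Like the Monte Carlo algorithms mentioned above, it can in principle report two different substrings as equal, with small probability.

```python
import random
from functools import cmp_to_key


class SubstringEquality:
    """Monte Carlo substring-equality queries via Karp-Rabin prefix fingerprints.

    Illustrative only: uses O(n) extra words, unlike the in-place structure
    described in the abstract.
    """

    MOD = (1 << 61) - 1  # a Mersenne prime; an assumed, illustrative modulus

    def __init__(self, text, seed=None):
        rng = random.Random(seed)
        self.text = text
        self.n = len(text)
        self.base = rng.randrange(256, self.MOD - 1)  # random fingerprint base
        self.prefix = [0] * (self.n + 1)  # prefix[k] = fingerprint of text[:k]
        self.power = [1] * (self.n + 1)   # power[k] = base**k mod MOD
        for k, ch in enumerate(text):
            self.prefix[k + 1] = (self.prefix[k] * self.base + ord(ch) + 1) % self.MOD
            self.power[k + 1] = (self.power[k] * self.base) % self.MOD

    def fingerprint(self, i, length):
        """Fingerprint of text[i:i+length], derived from two prefix fingerprints."""
        return (self.prefix[i + length] - self.prefix[i] * self.power[length]) % self.MOD

    def equal(self, i, j, length):
        """True if text[i:i+length] == text[j:j+length] (may err on unequal inputs)."""
        if length < 0 or min(i, j) < 0 or max(i, j) + length > self.n:
            raise IndexError("substring out of range")
        return self.fingerprint(i, length) == self.fingerprint(j, length)

    def lcp(self, i, j):
        """Longest common prefix of suffixes i and j, by binary search on equal()."""
        lo, hi = 0, self.n - max(i, j)
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if self.equal(i, j, mid):
                lo = mid
            else:
                hi = mid - 1
        return lo

    def compare_suffixes(self, i, j):
        """Three-way lexicographic comparison of suffixes i and j."""
        if i == j:
            return 0
        l = self.lcp(i, j)
        a, b = i + l, j + l
        if a == self.n:   # suffix i is a proper prefix of suffix j
            return -1
        if b == self.n:   # suffix j is a proper prefix of suffix i
            return 1
        return -1 if self.text[a] < self.text[b] else 1


def sparse_suffix_sort(text, positions, seed=None):
    """Sort only the suffixes starting at `positions` (sparse suffix sorting)."""
    q = SubstringEquality(text, seed)
    return sorted(positions, key=cmp_to_key(q.compare_suffixes))
```

For example, `sparse_suffix_sort("banana", [0, 2, 4])` orders the suffixes `banana`, `nana` and `na` by comparing each pair through one LCP query and one character comparison, without ever materializing the suffixes.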

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

When a Dollar Makes a BWT

Author: Giuliani Sara
Liptak Zsuzsanna
Rizzi Romeo
Publication date: 01/01/2019

The Burrows-Wheeler Transform (BWT) is a reversible string transformation which plays a central role in text compression and is fundamental in many modern bioinformatics applications. The BWT is a permutation of the characters, which is in general better compressible and allows answering several different query types more efficiently than the original string. It is easy to see that not every string is a BWT image, and exact characterizations of BWT images are known. We investigate a related combinatorial question. In many applications, a sentinel character $ is added to mark the end of the string, and thus the BWT of a string ending with $ contains exactly one $ character. We ask, given a string w, in which positions, if any, can the $-character be inserted to turn w into the BWT image of a word ending with the sentinel character. We show that this depends only on the standard permutation of w and give a combinatorial characterization of such positions via this permutation. We then develop an O(n log n)-time algorithm for identifying all such positions, improving on the naive quadratic time algorithm.
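The naive approach mentioned at the end of the abstract can be sketched as follows; this is a hedged illustration, not the authors' O(n log n) algorithm. It relies on the standard fact that a string containing exactly one sentinel is a BWT image exactly when its standard permutation (the stable-sort, or LF, mapping) is a single cycle. The sentinel is written `$` here, as the title suggests, and is assumed to sort before every other character. All function names are hypothetical. As written, each candidate position is re-sorted from scratch, so the total cost is O(n^2 log n); a counting sort would bring it to quadratic.

```python
def standard_permutation(u):
    """Map each position of u to its rank under a stable sort by character."""
    order = sorted(range(len(u)), key=lambda i: u[i])  # Python's sort is stable
    lf = [0] * len(u)
    for rank, pos in enumerate(order):
        lf[pos] = rank
    return lf


def is_single_cycle(perm):
    """True if the permutation consists of exactly one cycle."""
    steps, i = 0, 0
    while True:
        i = perm[i]
        steps += 1
        if i == 0:  # back at the start; the cycle through 0 is complete
            return steps == len(perm)


def dollar_positions(w):
    """All i such that w[:i] + '$' + w[i:] is the BWT of some word ending in '$'."""
    if "$" in w:
        raise ValueError("w must not already contain the sentinel")
    valid = []
    for i in range(len(w) + 1):
        u = w[:i] + "$" + w[i:]
        if is_single_cycle(standard_permutation(u)):
            valid.append(i)
    return valid


def invert_bwt(u):
    """Recover the word ending in '$' whose BWT is u (u must be a valid image)."""
    lf = standard_permutation(u)
    chars, i = [], 0  # row 0 is the rotation that starts with '$'
    for _ in range(len(u) - 1):
        chars.append(u[i])  # u[i] precedes the first character of row i
        i = lf[i]
    return "".join(reversed(chars)) + "$"


def bwt(word):
    """Reference BWT by sorting all rotations (quadratic space, for checking only)."""
    rotations = sorted(word[k:] + word[:k] for k in range(len(word)))
    return "".join(r[-1] for r in rotations)
```

As a usage example, `dollar_positions("annbaa")` includes position 4, since `annb$aa` is the BWT of `banana$`; for each reported position, `bwt(invert_bwt(u))` should reproduce the string `u` with the sentinel inserted.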

Catalogo dei prodotti della ricerca

Burrows–Wheeler transform and LCP array construction in constant space

Author: Abeliuk
Abouelhoda
Bauer
Bauer
Belazzougui
Belazzougui
Beller
Bingmann
Brisaboa
Burrows
Cox
Crochemore
Dhaliwal
Elias
Fayolle
Felipe A. Louza
Ferragina
Fischer
Fischer
Fischer
Franceschini
Gog
Gog
Gonnet
Guilherme P. Telles
Kasai
Kärkkäinen
Kärkkäinen
Kärkkäinen
Kärkkäinen
Kärkkäinen
Louza
Louza
Léonard
Manber
Manzini
Munro
Mäkinen
Navarro
Navarro
Navarro
Ohlebusch
Ohlebusch
Okanohara
Policriti
Prezza
Puglisi
Sadakane
Sadakane
Tischler
Tischler
Travis Gagie
Weiner
Witten
Publication venue: 'Elsevier BV'

Crossref