Search CORE

5 research outputs found

Algorithms and data structures for grammar-compressed strings

Author: Cording Patrick Hagge
Publication venue: Technical University of Denmark
Publication date: 01/01/2015
Field of study

Finger Search in Grammar-Compressed Strings

Author: Bille Philip
Christiansen Anders Roy
Cording Patrick Hagge
Gørtz Inge Li
Publication venue
Publication date: 01/01/2016
Field of study

Grammar-based compression, where one replaces a long string by a small context-free grammar that generates the string, is a simple and powerful paradigm that captures many popular compression schemes. Given a grammar, the random access problem is to compactly represent the grammar while supporting random access, that is, given a position in the original uncompressed string report the character at that position. In this paper we study the random access problem with the finger search property, that is, the time for a random access query should depend on the distance between a specified index

f

, called the \emph{finger}, and the query index

i

. We consider both a static variant, where we first place a finger and subsequently access indices near the finger efficiently, and a dynamic variant where also moving the finger such that the time depends on the distance moved is supported. Let

n

be the size the grammar, and let

N

be the size of the string. For the static variant we give a linear space representation that supports placing the finger in

O(\log N)

time and subsequently accessing in

O(\log D)

time, where

D

is the distance between the finger and the accessed index. For the dynamic variant we give a linear space representation that supports placing the finger in

O(\log N)

time and accessing and moving the finger in

O(\log D + \log \log N)

time. Compared to the best linear space solution to random access, we improve a

O(\log N)

query bound to

O(\log D)

for the static variant and to

O(\log D + \log \log N)

for the dynamic variant, while maintaining linear space. As an application of our results we obtain an improved solution to the longest common extension problem in grammar compressed strings. To obtain our results, we introduce several new techniques of independent interest, including a novel van Emde Boas style decomposition of grammars

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Online Research Database In Technology

Optimal-Time Queries on BWT-Runs Compressed Indexes

Author: Nishimoto Takaaki
Tabei Yasuo
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)
Publication date: 16/07/2020
Field of study

Indexing highly repetitive strings (i.e., strings with many repetitions) for fast queries has become a central research topic in string processing, because it has a wide variety of applications in bioinformatics and natural language processing. Although a substantial number of indexes for highly repetitive strings have been proposed thus far, developing compressed indexes that support various queries remains a challenge. The run-length Burrows-Wheeler transform (RLBWT) is a lossless data compression by a reversible permutation of an input string and run-length encoding, and it has received interest for indexing highly repetitive strings. LF and ?^{-1} are two key functions for building indexes on RLBWT, and the best previous result computes LF and ?^{-1} in O(log log n) time with O(r) words of space for the string length n and the number r of runs in RLBWT. In this paper, we improve LF and ?^{-1} so that they can be computed in a constant time with O(r) words of space. Subsequently, we present OptBWTR (optimal-time queries on BWT-runs compressed indexes), the first string index that supports various queries including locate, count, extract queries in optimal time and O(r) words of space

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Finger Search in Grammar-Compressed Strings

Author: Bille Philip
Christiansen Anders Roy
Cording Patrick Hagge
Gørtz Inge Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018
Field of study

Crossref

Online Research Database In Technology

Bookmarks in Grammar-Compressed Strings

Author: CG Nevill-Manning
D Belazzougui
DE Knuth
J Ziv
J Ziv
M Charikar
M Miyazaki
P Bille
P Gawrychowski
T Gagie
TA Welch
W Rytter
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

Crossref

Online Research Database In Technology