Search CORE

3,060 research outputs found

A Faster Implementation of Online Run-Length Burrows-Wheeler Transform

Author: A Bowe
D Belazzougui
G Navarro
G Navarro
J Sirén
J Ziv
JI Munro
MA Bender
V Mäkinen
W Hon
Publication venue
Publication date: 14/10/2017
Field of study

Run-length encoding Burrows-Wheeler Transformed strings, resulting in Run-Length BWT (RLBWT), is a powerful tool for processing highly repetitive strings. We propose a new algorithm for online RLBWT working in run-compressed space, which runs in

O(n\lg r)

time and

O(r\lg n)

bits of space, where

n

is the length of input string

S

received so far and

r

is the number of runs in the BWT of the reversed

S

. We improve the state-of-the-art algorithm for online RLBWT in terms of empirical construction time. Adopting the dynamic list for maintaining a total order, we can replace rank queries in a dynamic wavelet tree on a run-length compressed string by the direct comparison of labels in a dynamic list. The empirical result for various benchmarks show the efficiency of our algorithm, especially for highly repetitive strings.Comment: In Proc. IWOCA201

arXiv.org e-Print Archive

Crossref

Simple and Efficient Fully-Functional Succinct Trees

Author: Cordova Joshimar
Navarro Gonzalo
Publication venue
Publication date: 23/03/2016
Field of study

The fully-functional succinct tree representation of Navarro and Sadakane (ACM Transactions on Algorithms, 2014) supports a large number of operations in constant time using

2n+o(n)

bits. However, the full idea is hard to implement. Only a simplified version with

O(\log n)

operation time has been implemented and shown to be practical and competitive. We describe a new variant of the original idea that is much simpler to implement and has worst-case time

O(\log\log n)

for the operations. An implementation based on this version is experimentally shown to be superior to existing implementations

arXiv.org e-Print Archive

Repositorio Académico de la Universidad de Chile

From Theory to Practice: Plug and Play with Succinct Data Structures

Author: F. Claude
G. Navarro
G. Navarro
J.S. Culpepper
K. Sadakane
K. Sadakane
N. Jesper Larsson
R. Grossi
S. Vigna
V. Mäkinen
Publication venue
Publication date: 05/11/2013
Field of study

Engineering efficient implementations of compact and succinct structures is a time-consuming and challenging task, since there is no standard library of easy-to- use, highly optimized, and composable components. One consequence is that measuring the practical impact of new theoretical proposals is a difficult task, since older base- line implementations may not rely on the same basic components, and reimplementing from scratch can be very time-consuming. In this paper we present a framework for experimentation with succinct data structures, providing a large set of configurable components, together with tests, benchmarks, and tools to analyze resource requirements. We demonstrate the functionality of the framework by recomposing succinct solutions for document retrieval.Comment: 10 pages, 4 figures, 3 table

arXiv.org e-Print Archive

CiteSeerX

Crossref

Prospects and limitations of full-text index structures in genome analysis

Author: Dawyndt Peter
De Baets Bernard
Fack Veerle
Vyverman Michaël
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012
Field of study

The combination of incessant advances in sequencing technology producing large amounts of data and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article brings a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, but also their practical limitations, are explained and compared

Ghent University Academic Bibliography

PubMed Central