Prospects and limitations of full-text index structures in genome analysis

Authors: Peter Dawyndt, Bernard De Baets, Veerle Fack, Michaël Vyverman
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012

The combination of incessant advances in sequencing technology producing large amounts of data and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article presents a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, but also their practical limitations, are explained and compared.

Ghent University Academic Bibliography

PubMed Central

Labeling Schemes for Bounded Degree Graphs

Authors: A. Korman, C. Gavoille, C. Nash-Williams, F.R.K. Chung, L. Esperet, N. Bonichon, S. Alstrup, S. Bhatt, S. Butler, Y. Wang
Publication date: 01/01/2014

We investigate adjacency labeling schemes for graphs of bounded degree $\Delta = O(1)$. In particular, we present an optimal (up to an additive constant) $\log n + O(1)$ adjacency labeling scheme for bounded degree trees. The latter scheme is derived from a labeling scheme for bounded degree outerplanar graphs. Our results complement a similar bound recently obtained for bounded depth trees [Fraigniaud and Korman, SODA 10], and may provide new insights for closing the long-standing gap for adjacency in trees [Alstrup and Rauhe, FOCS 02]. We also provide improved labeling schemes for bounded degree planar graphs. Finally, we use combinatorial number systems and present improved adjacency labeling schemes for graphs of bounded degree $\Delta$ with $(e+1)\sqrt{n} < \Delta \leq n/5$.
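
To make the notion of an adjacency labeling scheme concrete, here is a minimal Python sketch of the trivial baseline for bounded degree graphs: each vertex label stores the vertex's own identifier together with the identifiers of its neighbours, so adjacency can be decided from two labels alone. This is not the scheme from the abstract above, which achieves far shorter labels; the function names `make_labels`, `adjacent` and `label_bits` are hypothetical and chosen only for illustration.

```python
from math import ceil, log2


def make_labels(adj):
    """Build one label per vertex: (own id, set of neighbour ids).

    `adj` maps each vertex id to an iterable of its neighbours' ids.
    """
    return {v: (v, frozenset(nbrs)) for v, nbrs in adj.items()}


def adjacent(label_u, label_v):
    """Decide adjacency using only the two labels, not the graph."""
    _, nbrs_u = label_u
    v, _ = label_v
    return v in nbrs_u


def label_bits(n, max_degree):
    """Label length of this baseline: (max_degree + 1) ids of ceil(log2 n) bits."""
    return (max_degree + 1) * ceil(log2(n))


# Usage: a path 0 - 1 - 2 - 3, which has maximum degree 2.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
labels = make_labels(path)
print(adjacent(labels[0], labels[1]), adjacent(labels[0], labels[2]))
```

The baseline spends roughly $(\Delta + 1)\log n$ bits per label, which is the cost that more refined schemes aim to reduce.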

arXiv.org e-Print Archive

CiteSeerX

Crossref

Copenhagen University Research Information System

Efficient pebbling for list traversal synopses

Authors: Yossi Matias, Ely Porat
Publication date: 01/01/2002

We show how to support efficient back traversal in a unidirectional list, using small memory and with essentially no slowdown in forward steps. Using $O(\log n)$ memory for a list of size $n$, the $i$'th back-step from the farthest point reached so far takes $O(\log i)$ time in the worst case, while the overhead per forward step is at most $\epsilon$ for an arbitrarily small constant $\epsilon>0$. An arbitrary sequence of forward and back steps is allowed. A full trade-off between memory usage and time per back-step is presented: $k$ vs. $kn^{1/k}$ and vice versa. Our algorithms are based on a novel pebbling technique which moves pebbles on a virtual binary, or $t$-ary, tree that can only be traversed in a pre-order fashion. The compact data structures used by the pebbling algorithms, called list traversal synopses, extend to general directed graphs, and have other interesting applications, including memory efficient hash-chain implementation. Perhaps the most surprising application is in showing that for any program, arbitrary rollback steps can be efficiently supported with small overhead in memory, and marginal overhead in its ordinary execution. More concretely: Let $P$ be a program that runs for at most $T$ steps, using memory of size $M$. Then, at the cost of recording the input used by the program, and increasing the memory by a factor of $O(\log T)$ to $O(M \log T)$, the program $P$ can be extended to support an arbitrary sequence of forward execution and rollback steps: the $i$'th rollback step takes $O(\log i)$ time in the worst case, while forward steps take $O(1)$ time in the worst case, and $1+\epsilon$ amortized time per step.

Comment: 27 pages
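
The problem setting in the abstract above (stepping backwards in a list that only has forward pointers) can be illustrated with a much simpler checkpointing baseline. The Python sketch below is not the pebbling technique of the paper: it keeps every `stride`-th visited node as a checkpoint and replays forward from the nearest one, so a back-step costs up to `stride` pointer hops and the memory grows with the number of checkpoints. The class and function names (`Node`, `CheckpointedCursor`, `build_list`) are assumptions made for illustration.

```python
class Node:
    """Node of a singly linked list: only a forward pointer is available."""

    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt


def build_list(values):
    """Build a singly linked list from an iterable and return its head."""
    head = None
    for v in reversed(list(values)):
        head = Node(v, head)
    return head


class CheckpointedCursor:
    """Cursor that supports back-steps by replaying from stored checkpoints."""

    def __init__(self, head, stride):
        self.stride = stride
        self.node = head
        self.pos = 0
        # checkpoints[j] is the node at position j * stride.
        self.checkpoints = [head]

    def forward(self):
        if self.node.next is None:
            raise IndexError("end of list")
        self.node = self.node.next
        self.pos += 1
        # Record a checkpoint the first time a multiple of stride is reached.
        if self.pos % self.stride == 0 and self.pos // self.stride == len(self.checkpoints):
            self.checkpoints.append(self.node)
        return self.node.value

    def back(self):
        if self.pos == 0:
            raise IndexError("start of list")
        target = self.pos - 1
        j = target // self.stride
        node = self.checkpoints[j]
        # Replay forward from the nearest checkpoint at or before the target.
        for _ in range(target - j * self.stride):
            node = node.next
        self.node, self.pos = node, target
        return node.value
```

With a list of size $n$ and `stride` near $\sqrt{n}$, this baseline uses about $\sqrt{n}$ checkpoints and about $\sqrt{n}$ hops per back-step, which gives a sense of how much stronger the $O(\log n)$ memory and $O(\log i)$ time bounds stated above are.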

arXiv.org e-Print Archive

CiteSeerX

Succinct Representations of Dynamic Strings

Authors: Meng He, J. Ian Munro
Publication date: 01/01/2010

The rank and select operations over a string of length $n$ from an alphabet of size $\sigma$ have been used widely in the design of succinct data structures. In many applications, the string itself needs to be maintained dynamically, allowing characters of the string to be inserted and deleted. Under the word RAM model with word size $w=\Omega(\lg n)$, we design a succinct representation of dynamic strings using $nH_0 + o(n)\lg\sigma + O(w)$ bits to support rank, select, insert and delete in $O(\frac{\lg n}{\lg\lg n}(\frac{\lg \sigma}{\lg\lg n}+1))$ time. When the alphabet size is small, i.e. when $\sigma = O(\operatorname{polylog}(n))$, including the case in which the string is a bit vector, these operations are supported in $O(\frac{\lg n}{\lg\lg n})$ time. Our data structures are more efficient than previous results on the same problem, and we have applied them to improve results on the design and construction of space-efficient text indexes.
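
As a reference point for the four operations named in the abstract above, the following Python sketch implements them naively on a plain list, with linear time per operation and no compression. It only fixes what rank, select, insert and delete mean; it is not the succinct structure of the paper. The class name `NaiveDynamicString` and the indexing conventions (0-based positions, 1-based occurrence counts in `select`) are assumptions for illustration.

```python
class NaiveDynamicString:
    """Uncompressed dynamic string with linear-time rank/select/insert/delete."""

    def __init__(self, text=""):
        self.chars = list(text)

    def rank(self, c, i):
        """Number of occurrences of character c among the first i characters."""
        return self.chars[:i].count(c)

    def select(self, c, j):
        """0-based position of the j-th occurrence of c, counting from j = 1."""
        seen = 0
        for pos, ch in enumerate(self.chars):
            if ch == c:
                seen += 1
                if seen == j:
                    return pos
        raise ValueError("fewer than j occurrences of the character")

    def insert(self, i, c):
        """Insert character c so that it ends up at position i."""
        self.chars.insert(i, c)

    def delete(self, i):
        """Delete the character at position i."""
        del self.chars[i]


# Usage: query, then update, then query again.
s = NaiveDynamicString("abracadabra")
before = (s.rank("a", 4), s.select("a", 3))
s.insert(0, "a")
after = (s.rank("a", 4), s.select("a", 3))
print(before, after)
```

Every operation here scans or shifts up to $n$ characters, which is the cost the succinct representation replaces with the sublogarithmic bounds quoted above.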

arXiv.org e-Print Archive

CiteSeerX