
682 research outputs found

Pattern matching in Lempel-Ziv compressed strings: fast, simple, and deterministic

Author: Gawrychowski Pawel
Publication date: 01/01/2011

Countless variants of the Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression methods: given an uncompressed pattern s[1..m] and a Lempel-Ziv representation of a string t[1..N], does s occur in t? Farach and Thorup gave a randomized O(n log^2(N/n) + m) time solution for this problem, where n is the size of the compressed representation of t. We improve their result by developing a faster and fully deterministic O(n log(N/n) + m) time algorithm with the same space complexity. Note that for highly compressible texts, log(N/n) might be of order n, so for such inputs the improvement is very significant. A (tiny) fragment of our method can be used to give an asymptotically optimal solution for the substring hashing problem considered by Farach and Muthukrishnan.
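
As a point of reference for the problem statement in this abstract, the following minimal sketch shows the naive decompress-then-search baseline, whose running time grows with the uncompressed length N rather than with the compressed size n. It is not the algorithm from the paper. The phrase encoding (a literal string, or a hypothetical `(start, length)` pair that copies from already decoded text) and the names `lz_decode` and `occurs` are assumptions made for illustration only.

```python
def lz_decode(phrases):
    """Decode an LZ77-style phrase list (assumed, illustrative encoding).

    Each phrase is either a literal string or a (start, length) pair that
    copies `length` characters beginning at position `start` of the text
    decoded so far. Copies may overlap the characters being written.
    """
    out = []
    for phrase in phrases:
        if isinstance(phrase, str):
            out.extend(phrase)  # literal characters
        else:
            start, length = phrase
            for i in range(length):
                # Character-by-character copy supports self-referential phrases.
                out.append(out[start + i])
    return "".join(out)


def occurs(pattern, phrases):
    """Naive baseline: fully decompress the text, then search for the pattern."""
    return pattern in lz_decode(phrases)
```

For example, the phrase list `["a", "b", (0, 2), (1, 3)]` decodes to `ababbab`. The compressed-matching algorithms discussed in the abstract aim to answer the same question without materializing the full text.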

arXiv.org e-Print Archive

CiteSeerX

Do branch lengths help to locate a tree in a phylogenetic network?

Authors: Gambette Philippe, Kelk Steven, Pardi Fabio, Scornavacca Celine, van Iersel Leo
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Phylogenetic networks are increasingly used in evolutionary biology to represent the history of species that have undergone reticulate events such as horizontal gene transfer, hybrid speciation and recombination. One of the most fundamental questions that arise in this context is whether the evolution of a gene with one copy in all species can be explained by a given network. In mathematical terms, this is often translated in the following way: is a given phylogenetic tree contained in a given phylogenetic network? Recently this tree containment problem has been widely investigated from a computational perspective, but most studies have only focused on the topology of the phylogenies, ignoring a piece of information that, in the case of phylogenetic trees, is routinely inferred by evolutionary analyses: branch lengths. These measure the amount of change (e.g., nucleotide substitutions) that has occurred along each branch of the phylogeny. Here, we study a number of versions of the tree containment problem that explicitly account for branch lengths. We show that, although length information has the potential to locate more precisely a tree within a network, the problem is computationally hard in its most general form. On a positive note, for a number of special cases of biological relevance, we provide algorithms that solve this problem efficiently. This includes the case of networks of limited complexity, for which it is possible to recover, among the trees contained by the network with the same topology as the input tree, the closest one in terms of branch lengths.

arXiv.org e-Print Archive

Maastricht University Research Portal

INRIA a CCSD electronic archive server

HAL-IRD

HAL-CIRAD

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Efficient Dynamic Approximate Distance Oracles for Vertex-Labeled Planar Graphs

Authors: BT Wilkinson, DE Willard, M Li, M Thorup, ML Fredman, MR Henzinger, Q-P Gu, R Pagh, RJ Lipton, S Mozes
Publication date: 27/08/2017

Let $G$ be a graph where each vertex is associated with a label. A Vertex-Labeled Approximate Distance Oracle is a data structure that, given a vertex $v$ and a label $\lambda$, returns a $(1+\varepsilon)$-approximation of the distance from $v$ to the closest vertex with label $\lambda$ in $G$. Such an oracle is dynamic if it also supports label changes. In this paper we present three different dynamic approximate vertex-labeled distance oracles for planar graphs, all with polylogarithmic query and update times, and nearly linear space requirements.
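
To make the query and update interface described in this abstract concrete, here is a minimal sketch of a naive exact baseline. It is not one of the oracles from the paper: it assumes an unweighted graph, uses plain breadth-first search (so a query can take time linear in the graph size instead of polylogarithmic), and does not exploit planarity. The class name `NaiveLabeledDistanceOracle` and its methods are hypothetical names chosen for illustration.

```python
from collections import deque


class NaiveLabeledDistanceOracle:
    """Exact, BFS-based baseline for vertex-labeled distance queries.

    Assumes an unweighted graph given as an adjacency dict
    {vertex: iterable of neighbours} and a dict {vertex: label}.
    """

    def __init__(self, adj, labels):
        self.adj = adj
        self.labels = dict(labels)

    def set_label(self, v, label):
        """Dynamic update: change the label of vertex v."""
        self.labels[v] = label

    def query(self, v, label):
        """Return the hop distance from v to the closest vertex carrying
        `label`, or None if no such vertex is reachable."""
        seen = {v}
        queue = deque([(v, 0)])
        while queue:
            u, dist = queue.popleft()
            if self.labels.get(u) == label:
                return dist
            for w in self.adj.get(u, ()):
                if w not in seen:
                    seen.add(w)
                    queue.append((w, dist + 1))
        return None
```

In this baseline a label change is a single dictionary write and every query pays for a fresh search. The oracles in the abstract instead target polylogarithmic time for both operations, at the price of returning a $(1+\varepsilon)$-approximation.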

arXiv.org e-Print Archive

Crossref