117 research outputs found

Towards a Taxonomically Intelligent Phylogenetic Database

Author: Roderic Page
Publication date: 18/09/2007

This note outlines some of the key intellectual obstacles that stand in the way of creating a usable phylogenetic database. These challenges include the need to accommodate multiple taxonomic names and classifications, and the need for tools to query trees in biologically meaningful ways. Until these problems are addressed, and a taxonomically intelligent phylogenetic database created, much of our phylogenetic knowledge will languish in the pages of journals.

Crossref

Nature Precedings

A heuristic approach for multiple restricted multiplication

Author: Cheung PYK
Constantinides GA
Sidahao N
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2005

Published version

Spiral - Imperial College Digital Repository

An O(n^3)-Time Algorithm for Tree Edit Distance

Author: Benjamin Rossman
Erik D. Demaine
Oren Weimann
Shay Mozes
Publication venue: Association for Computing Machinery (ACM)
Publication date: 19/04/2006

The \emph{edit distance} between two ordered trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. In this paper, we present a worst-case $O(n^3)$-time algorithm for this problem, improving the previous best $O(n^3\log n)$-time algorithm~\cite{Klein}. Our result requires a novel adaptive strategy for deciding how a dynamic program divides into subproblems (which is interesting in its own right), together with a deeper understanding of the previous algorithms for the problem. We also prove the optimality of our algorithm among the family of \emph{decomposition strategy} algorithms--which also includes the previous fastest algorithms--by tightening the known lower bound of $\Omega(n^2\log^2 n)$~\cite{Touzet} to $\Omega(n^3)$, matching our algorithm's running time. Furthermore, we obtain matching upper and lower bounds of $\Theta(n m^2 (1 + \log \frac{n}{m}))$ when the two trees have different sizes $m$ and~$n$, where $m < n$.

Comment: 10 pages, 5 figures, 5 .tex files where TED.tex is the main on
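
To make the edit-distance definition in this abstract concrete, here is a minimal Python sketch. It is not the O(n^3) algorithm the paper describes; it is the textbook memoized recursion that always decomposes on the rightmost root. It assumes unit costs for delete, insert and relabel. The tree encoding `(label, children_tuple)` and the function name are illustrative choices, not taken from the paper.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), children being a tuple of trees.
# A forest is a tuple of trees. Unit costs are an assumption of this sketch.

def tree_edit_distance(t1, t2):
    """Edit distance between two ordered labeled trees (unit costs)."""

    @lru_cache(maxsize=None)
    def dist(f, g):
        # f and g are ordered forests; recurse on their rightmost roots.
        if not f and not g:
            return 0
        if not g:
            _, kids = f[-1]
            return dist(f[:-1] + kids, g) + 1      # delete rightmost root of f
        if not f:
            _, kids = g[-1]
            return dist(f, g[:-1] + kids) + 1      # insert rightmost root of g
        (lv, kv), (lw, kw) = f[-1], g[-1]
        return min(
            dist(f[:-1] + kv, g) + 1,              # delete v
            dist(f, g[:-1] + kw) + 1,              # insert w
            # match v with w: solve their child forests and the rest separately
            dist(kv, kw) + dist(f[:-1], g[:-1]) + (lv != lw),
        )

    return dist((t1,), (t2,))


leaf = lambda label: (label, ())
a_bc = ("a", (leaf("b"), leaf("c")))
a_b = ("a", (leaf("b"),))
print(tree_edit_distance(a_bc, a_b))  # 1: delete the leaf "c"
```

Always peeling off the rightmost root is the simplest fixed strategy. The paper's contribution, per the abstract, is an adaptive choice of how the dynamic program divides into subproblems, which this sketch does not attempt.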

arXiv.org e-Print Archive

Crossref

A new balance index for phylogenetic trees

Author: Arnau Mir
Francesc Rosselló
Lucía Rotger
Publication venue: Elsevier BV
Publication date: 06/02/2012

Several indices that measure the degree of balance of a rooted phylogenetic tree have been proposed so far in the literature. In this work we define and study a new index of this kind, which we call the total cophenetic index: the sum, over all pairs of different leaves, of the depth of their least common ancestor. This index makes sense for arbitrary trees, can be computed in linear time and it has a larger range of values and a greater resolution power than other indices like Colless' or Sackin's. We compute its maximum and minimum values for arbitrary and binary trees, as well as exact formulas for its expected value for binary trees under the Yule and the uniform models of evolution. As a byproduct of this study, we obtain an exact formula for the expected value of the Sackin index under the uniform model, a result that seems to be new in the literature.

Comment: 24 pages, 2 figures, preliminary version presented at the JBI 201
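
The definition of the total cophenetic index given in this abstract can be illustrated with a short Python sketch. It relies on one observation: the depth of the least common ancestor of two leaves equals the number of non-root nodes lying above both of them, so the index is the sum of C(k, 2) over non-root nodes, where k is the number of leaves below the node. The nested-tuple tree encoding and the function name are assumptions made for this example, not the authors' notation.

```python
from math import comb

# Assumed encoding: an internal node is a tuple of children;
# anything that is not a tuple is treated as a leaf.

def total_cophenetic_index(tree):
    """Sum, over all pairs of distinct leaves, of the depth of their LCA."""
    total = 0

    def leaves_below(node, is_root):
        nonlocal total
        if not isinstance(node, tuple):
            return 1
        k = sum(leaves_below(child, False) for child in node)
        if not is_root:
            # Every pair of leaves below a non-root node has this node
            # on the path from the root to their LCA, adding 1 to its depth.
            total += comb(k, 2)
        return k

    leaves_below(tree, True)
    return total


caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))
print(total_cophenetic_index(caterpillar))  # 4
print(total_cophenetic_index(balanced))     # 2
```

Each node is visited once, which matches the linear-time claim in the abstract. The recursion is kept for readability and would need an explicit stack for very deep trees.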

arXiv.org e-Print Archive

Crossref

Faster Algorithms for the Maximum Common Subtree Isomorphism Problem

Author: Droschinsky Andre
Kriege Nils M.
Mutzel Petra
Publication date: 01/01/2016

The maximum common subtree isomorphism problem asks for the largest possible isomorphism between subtrees of two given input trees. This problem is a natural restriction of the maximum common subgraph problem, which is $\mathsf{NP}$-hard in general graphs. Confining to trees renders polynomial time algorithms possible and is of fundamental importance for approaches on more general graph classes. Various variants of this problem in trees have been intensively studied. We consider the general case, where trees are neither rooted nor ordered and the isomorphism is maximum w.r.t. a weight function on the mapped vertices and edges. For trees of order $n$ and maximum degree $\Delta$ our algorithm achieves a running time of $\mathcal{O}(n^2\Delta)$ by exploiting the structure of the matching instances arising as subproblems. Thus our algorithm outperforms the best previously known approaches. No faster algorithm is possible for trees of bounded degree and for trees of unbounded degree we show that a further reduction of the running time would directly improve the best known approach to the assignment problem. Combining a polynomial-delay algorithm for the enumeration of all maximum common subtree isomorphisms with central ideas of our new algorithm leads to an improvement of its running time from $\mathcal{O}(n^6+Tn^2)$ to $\mathcal{O}(n^3+Tn\Delta)$, where $n$ is the order of the larger tree, $T$ is the number of different solutions, and $\Delta$ is the minimum of the maximum degrees of the input trees. Our theoretical results are supplemented by an experimental evaluation on synthetic and real-world instances.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server