
On unrooted and root-uncertain variants of several well-known phylogenetic network problems

Author: Boes Olivier
Kelk Steven
Stamoulis Georgios
Stougie Leen
van Iersel Leo
Publication date: 01/01/2016

The hybridization number problem requires us to embed a set of binary rooted phylogenetic trees into a binary rooted phylogenetic network such that the number of nodes with indegree two is minimized. However, from a biological point of view, accurately inferring the root location in a phylogenetic tree is notoriously difficult, and poor root placement can artificially inflate the hybridization number. To this end we study a number of relaxed variants of this problem. We start by showing that the fundamental problem of determining whether an unrooted phylogenetic network displays (i.e. embeds) an unrooted phylogenetic tree is NP-hard. On the positive side, we show that this problem is FPT in the reticulation number. In the rooted case the corresponding FPT result is trivial, but here we require more subtle argumentation. Next we show that the hybridization number problem for unrooted networks (when given two unrooted trees) is equivalent to the problem of computing the Tree Bisection and Reconnect (TBR) distance of the two unrooted trees. In the third part of the paper we consider the "root uncertain" variant of hybridization number. Here we are free to choose the root location in each of a set of unrooted input trees such that the hybridization number of the resulting rooted trees is minimized. On the negative side, we show that this problem is APX-hard. On the positive side, we show that the problem is FPT in the hybridization number, via kernelization, for any number of input trees.
Comment: 28 pages, 8 figures
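The quantity being minimized above can be illustrated with a minimal sketch. It assumes a network is given as a list of (parent, child) arcs; the function names and the toy network are hypothetical and do not come from the paper. The unrooted variant uses the commonly stated definition |E| - (|V| - 1), which is an assumption here since the abstract does not spell it out.

```python
from collections import Counter


def hybridization_number(arcs):
    """Count nodes with indegree two in a binary rooted network.

    `arcs` is a list of (parent, child) pairs (assumed input format).
    """
    indegree = Counter(child for _, child in arcs)
    return sum(1 for degree in indegree.values() if degree == 2)


def unrooted_reticulation_number(num_nodes, num_edges):
    """Assumed definition for a connected unrooted network: |E| - (|V| - 1)."""
    return num_edges - (num_nodes - 1)


# Hypothetical toy network on taxa a, b, c with a single reticulation node h.
toy_arcs = [
    ("root", "u"), ("root", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"),
    ("h", "b"),
]

rooted_count = hybridization_number(toy_arcs)
# The same toy network with directions ignored has 7 nodes and 7 edges.
unrooted_count = unrooted_reticulation_number(7, len(toy_arcs))
```

On this toy input both counts agree, since removing one arc entering h turns the network into a tree.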

arXiv.org e-Print Archive

Maastricht University Research Portal

VU Research Portal

CWI's Institutional Repository

INRIA a CCSD electronic archive server

A simple fixed parameter tractable algorithm for computing the hybridization number of two (not necessarily binary) trees

Author: Kelk Steven
Piovesan Teresa
Publication date: 01/01/2012

Here we present a new fixed parameter tractable algorithm to compute the hybridization number r of two rooted, not necessarily binary phylogenetic trees on taxon set X in time (6^r · r!) · poly(n), where n = |X|. The novelty of this approach is its use of terminals, which are maximal elements of a natural partial order on X, and several insights from the softwired clusters literature. This yields a surprisingly simple and practical bounded-search algorithm and offers an alternative perspective on the underlying combinatorial structure of the hybridization number problem.

arXiv.org e-Print Archive

Maastricht University Research Portal

CiteSeerX

CWI's Institutional Repository

International Migration, Integration and Social Cohesion online publications

On unrooted and root-uncertain variants of several well-known phylogenetic network problems

Author: Boes O. (Olivier)
Iersel L.J.J. (Leo) van
Kelk S.M. (Steven)
Stamoulis G. (Georgios)
Stougie L. (Leen)
Publication date: 02/09/2016

CWI's Institutional Repository

Agreement forests of caterpillar trees: complexity, kernelization and branching

Author: Kelk Steven
Meuwese Ruben
Publication date: 22/07/2023

Given a set X of species, a phylogenetic tree is an unrooted binary tree whose leaves are bijectively labelled by X. Such trees can be used to show the way species evolve over time. One way of understanding how topologically different two phylogenetic trees are is to construct a minimum-size agreement forest: a partition of X into the smallest number of blocks, such that the blocks induce homeomorphic, non-overlapping subtrees in both trees. This comparison yields insight into commonalities and differences in the evolution of X across the two trees. Computing a smallest agreement forest is NP-hard (Hein, Jiang, Wang and Zhang, Discrete Applied Mathematics 71(1-3), 1996). In this work we study the problem on caterpillars, which are path-like phylogenetic trees. We demonstrate that, even if we restrict the input to this highly restricted subclass, the problem remains NP-hard and is in fact APX-hard. Furthermore, we show that for caterpillars two standard reduction rules well known in the literature yield a tight kernel of size at most 7k, compared to 15k for general trees (Kelk and Simone, SIAM Journal on Discrete Mathematics 33(3), 2019). Finally, we demonstrate that we can determine whether two caterpillars have an agreement forest with at most k blocks in O^*(2.49^k) time, compared to O^*(3^k) for general trees (Chen, Fan and Sze, Theoretical Computer Science 562, 2015), where O^*(.) suppresses polynomial factors.
Comment: 31 pages, 15 figures

arXiv.org e-Print Archive

Pattern discovery in structural databases with applications to bioinformatics

Author: Zhang Sen
Publication venue: Digital Commons @ NJIT
Publication date: 31/01/2005

Frequent structure mining (FSM) aims to discover and extract patterns frequently occurring in structural data such as trees and graphs. FSM finds many applications in bioinformatics, XML processing, Web log analysis, and so on. In this thesis, two new FSM techniques are proposed for finding patterns in unordered labeled trees. Such trees can be used to model evolutionary histories of different species, among others. The first FSM technique finds cousin pairs in the trees. A cousin pair is a pair of nodes sharing the same parent, the same grandparent, or the same great-grandparent, etc. Given a tree T, our algorithm finds all interesting cousin pairs of T in O(|T|^2) time, where |T| is the number of nodes in T. Experimental results on synthetic data and phylogenies show the scalability and effectiveness of the proposed technique. This technique has been applied to locating co-occurring patterns in multiple evolutionary trees, evaluating the consensus of equally parsimonious trees, and finding kernel trees of groups of phylogenies. The technique is also extended to undirected acyclic graphs (or free trees). The second FSM technique extends traditional MAST (maximum agreement subtree) algorithms by employing the Apriori data mining technique to find frequent agreement subtrees in multiple phylogenies. The correctness and completeness of the new mining algorithm are presented. The method is also extended to unrooted phylogenetic trees. Both FSM techniques studied in the thesis have been implemented in a toolkit, which is fully operational and accessible on the World Wide Web.
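The cousin-pair notion described above can be sketched as follows. This is a simplified reading of the abstract, not the thesis's algorithm: it only pairs nodes at the same depth and reports how many levels up their first shared ancestor sits (1 for a shared parent, 2 for a shared grandparent, and so on). The `parent` dictionary format, the `max_levels` cut-off, and the toy tree are assumptions made for illustration.

```python
from itertools import combinations


def cousin_pairs(parent, max_levels):
    """Return (u, v, level) for same-depth node pairs whose lowest shared
    ancestor is `level` steps above both, with level <= max_levels.

    `parent` maps each non-root node to its parent (assumed input format).
    """
    def ancestors(node):
        chain = []
        while node in parent:
            node = parent[node]
            chain.append(node)
        return chain  # parent first, root last

    nodes = set(parent) | set(parent.values())
    chains = {node: ancestors(node) for node in nodes}

    pairs = []
    for u, v in combinations(sorted(nodes), 2):
        if len(chains[u]) != len(chains[v]):
            continue  # this sketch only pairs nodes at the same depth
        for level, (a, b) in enumerate(zip(chains[u], chains[v]), start=1):
            if a == b:
                if level <= max_levels:
                    pairs.append((u, v, level))
                break
    return pairs


# Hypothetical toy tree: r has children p and q; p has x and y; q has z.
toy_parent = {"x": "p", "y": "p", "z": "q", "p": "r", "q": "r"}
toy_pairs = cousin_pairs(toy_parent, max_levels=2)
```

Enumerating all node pairs keeps the number of candidate pairs quadratic in |T|, in line with the bound quoted in the abstract, although walking the ancestor chains adds a factor of the tree height in this naive version.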

Digital Commons @ New Jersey Institute of Technology (NJIT)

On Unrooted and Root-Uncertain Variants of Several Well-Known Phylogenetic Network Problems

Author: Olivier Boes
Steven Kelk
Georgios Stamoulis
Leen Stougie
Leo van Iersel
Publication venue: Springer Science and Business Media LLC

Crossref

EvoMiner: Frequent Subtree Mining in Phylogenetic Databases

Author: Deepak Akshay
Fernández-Baca David
McMahon Michelle M.
Sanderson Michael J.
Tirthapura Srikanta
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2011

The problem of mining collections of trees to identify common patterns, called frequent subtrees (FSTs), arises often when trying to make sense of the results of phylogenetic analysis. FST mining generalizes the well-known maximum agreement subtree problem. Here we present EvoMiner, a new algorithm for mining frequent subtrees in collections of phylogenetic trees. EvoMiner is an Apriori-like level-wise method, which uses a novel phylogeny-specific constant-time candidate generation scheme, an efficient fingerprinting-based technique for the downward closure operation, and a lowest common ancestor based support counting step that requires neither costly subtree operations nor database traversal. As a result of these techniques, our algorithm achieves speed-ups of up to 100 times or more over Phylominer, another algorithm for mining phylogenetic trees. EvoMiner can also work in vertical mining mode, to use less memory at the expense of speed.
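To give a feel for lowest common ancestor based support counting, here is a minimal sketch for the smallest interesting pattern, a rooted triple ab|c. It is not EvoMiner's procedure; it only illustrates how LCA depths can decide whether a tree displays a pattern without building subtrees. The parent-map tree format, all function names, and the three toy trees are assumptions.

```python
def path_to_root(parent, node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def lca_depth(parent, a, b):
    """Depth (edges below the root) of the lowest common ancestor of a and b."""
    on_path_a = set(path_to_root(parent, a))
    node = b
    while node not in on_path_a:
        node = parent[node]
    return len(path_to_root(parent, node)) - 1


def displays_triple(parent, a, b, c):
    """True if the tree displays ab|c, i.e. lca(a, b) lies strictly below
    both lca(a, c) and lca(b, c)."""
    depth_ab = lca_depth(parent, a, b)
    return (depth_ab > lca_depth(parent, a, c)
            and depth_ab > lca_depth(parent, b, c))


def triple_support(trees, a, b, c):
    """Fraction of trees in the collection that display ab|c."""
    return sum(displays_triple(tree, a, b, c) for tree in trees) / len(trees)


# Hypothetical collection: ((a,b),c), ((a,c),b) and ((a,b),(c,d)).
tree_1 = {"a": "u", "b": "u", "u": "r", "c": "r"}
tree_2 = {"a": "u", "c": "u", "u": "r", "b": "r"}
tree_3 = {"a": "u", "b": "u", "c": "w", "d": "w", "u": "r", "w": "r"}
support_ab_c = triple_support([tree_1, tree_2, tree_3], "a", "b", "c")
```

In an Apriori-style level-wise search, a pattern like ab|c would only be extended to larger candidates if its support met a chosen threshold; the threshold and the extension step are outside this sketch.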

Digital Repository @ Iowa State University (ISU)

A simple fixed parameter tractable algorithm for computing the hybridization number of two (not necessarily binary) trees

Author: Kelk S.M. (Steven)
Piovesan T. (Teresa)
Publication venue: IEEE
Publication date: 01/01/2013

CWI's Institutional Repository