How tree-based is my network? Proximity measures for unrooted phylogenetic networks
Tree-based networks are a class of phylogenetic networks that attempt to
formally capture what is meant by "tree-like" evolution. A given non-tree-based
phylogenetic network, however, might appear to be very close to being
tree-based, or very far. In this paper, we formalise the notion of proximity to
tree-based for unrooted phylogenetic networks, with a range of proximity
measures. These measures also provide characterisations of tree-based networks.
One measure in particular, related to the nearest neighbour interchange
operation, allows us to define the notion of "tree-based rank". This provides a
subclassification within the tree-based networks themselves, identifying those
networks that are "very" tree-based. Finally, we prove results relating
tree-based networks in the settings of rooted and unrooted phylogenetic
networks, showing effectively that an unrooted network is tree-based if and
only if it can be made a rooted tree-based network by rooting it and orienting
the edges appropriately. This leads to a clarification of the contrasting
decision problems for tree-based networks, which are polynomial in the rooted
case but NP-complete in the unrooted case.
Tree-Based Unrooted Phylogenetic Networks
Phylogenetic networks are a generalization of phylogenetic trees that are used to represent non-tree-like evolutionary histories that arise in organisms such as plants and bacteria, or uncertainty in evolutionary histories. An unrooted phylogenetic network on a non-empty, finite set X of taxa, or network, is a connected, simple graph in which every vertex has degree 1 or 3 and whose leaf set is X. It is called a phylogenetic tree if the underlying graph is a tree. In this paper we consider properties of tree-based networks, that is, networks that can be constructed by adding edges into a phylogenetic tree. We show that although they have some properties in common with their rooted analogues, which have recently drawn much attention in the literature, they have some striking differences in terms of both their structural and computational properties. We expect that our results could eventually have applications to, for example, detecting horizontal gene transfer or hybridization, which are important factors in the evolution of many organisms. Correction available at dx.doi.org/10.1007/s11538-018-0530-
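The definition in this abstract (connected, simple, every vertex of degree 1 or 3, leaf set equal to X) can be checked mechanically. The following is a minimal sketch, not code from the paper: the function names, the edge-list input format, and the two example graphs are assumptions made for illustration, and it only tests the definition, not whether a network is tree-based.

```python
from collections import defaultdict, deque


def is_unrooted_binary_network(edges, taxa):
    """Check the abstract's definition on an undirected edge list.

    edges: list of (u, v) pairs (hypothetical input format).
    taxa:  iterable of leaf labels, the set X.
    """
    taxa = set(taxa)
    if not taxa:
        return False  # X must be non-empty
    adj = defaultdict(set)
    seen = set()
    for u, v in edges:
        key = frozenset((u, v))
        if u == v or key in seen:
            return False  # a loop or a parallel edge: the graph is not simple
        seen.add(key)
        adj[u].add(v)
        adj[v].add(u)
    if not adj:
        return False  # an edge list cannot describe a graph without edges
    # Every vertex must have degree 1 (leaf) or 3 (internal vertex).
    if any(len(nbrs) not in (1, 3) for nbrs in adj.values()):
        return False
    # The degree-1 vertices must be exactly the taxa.
    if {v for v, nbrs in adj.items() if len(nbrs) == 1} != taxa:
        return False
    # Connectivity via breadth-first search from an arbitrary vertex.
    start = next(iter(adj))
    reached = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in reached:
                reached.add(y)
                queue.append(y)
    return len(reached) == len(adj)


def is_phylogenetic_tree(edges, taxa):
    """A network is a tree when the connected graph has |E| = |V| - 1."""
    vertices = {x for edge in edges for x in edge}
    return is_unrooted_binary_network(edges, taxa) and len(edges) == len(vertices) - 1


# Illustrative inputs: a quartet tree, and a network whose internal
# vertices p, q, r, s form a 4-cycle.
quartet_tree = [("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")]
cycle_network = [("p", "q"), ("q", "r"), ("r", "s"), ("s", "p"),
                 ("a", "p"), ("b", "q"), ("c", "r"), ("d", "s")]
```

Both example graphs satisfy the network definition on X = {a, b, c, d}; only the first is a phylogenetic tree, because the second contains a cycle.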
Rearrangement operations on unrooted phylogenetic networks
Rearrangement operations transform a phylogenetic tree into another one and hence induce a metric on the space of phylogenetic trees. Popular operations for unrooted phylogenetic trees are NNI (nearest neighbour interchange), SPR (subtree prune and regraft), and TBR (tree bisection and reconnection). Recently, these operations have been extended to unrooted phylogenetic networks, which are generalisations of phylogenetic trees that can model reticulated evolutionary relationships. Here, we study global and local properties of spaces of phylogenetic networks under these three operations. In particular, we prove connectedness and asymptotic bounds on the diameters of spaces of different classes of phylogenetic networks, including tree-based and level-k networks. We also examine the behaviour of shortest TBR-sequences between two phylogenetic networks in a class, and whether the TBR-distance changes if intermediate networks from other classes are allowed: for example, the space of phylogenetic trees is an isometric subgraph of the space of phylogenetic networks under TBR. Lastly, we show that computing the TBR-distance and the PR-distance of two phylogenetic networks is NP-hard.
On unrooted and root-uncertain variants of several well-known phylogenetic network problems
The hybridization number problem requires us to embed a set of binary rooted
phylogenetic trees into a binary rooted phylogenetic network such that the
number of nodes with indegree two is minimized. However, from a biological
point of view accurately inferring the root location in a phylogenetic tree is
notoriously difficult and poor root placement can artificially inflate the
hybridization number. To this end we study a number of relaxed variants of this
problem. We start by showing that the fundamental problem of determining
whether an unrooted phylogenetic network displays (i.e. embeds) an
unrooted phylogenetic tree is NP-hard. On the positive side we show
that this problem is FPT in reticulation number. In the rooted case the
corresponding FPT result is trivial, but here we require more subtle
argumentation. Next we show that the hybridization number problem for unrooted
networks (when given two unrooted trees) is equivalent to the problem of
computing the Tree Bisection and Reconnect (TBR) distance of the two unrooted
trees. In the third part of the paper we consider the "root uncertain" variant
of hybridization number. Here we are free to choose the root location in each
of a set of unrooted input trees such that the hybridization number of the
resulting rooted trees is minimized. On the negative side we show that this
problem is APX-hard. On the positive side, we show that the problem is FPT in
the hybridization number, via kernelization, for any number of input trees.
Comment: 28 pages, 8 figures
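The quantity minimized in the hybridization number problem is, per this abstract, the number of nodes with indegree two in a binary rooted network. A minimal sketch of that count follows; it is not the paper's algorithm and does not attempt the minimization itself. The function name, the arc-list format (tail, head), and both example graphs are assumptions made for illustration.

```python
from collections import Counter


def count_indegree_two_nodes(arcs):
    """Count nodes with indegree two in a rooted network.

    arcs: list of (tail, head) pairs directed away from the root
          (hypothetical input format).
    """
    indegree = Counter(head for _tail, head in arcs)
    return sum(1 for d in indegree.values() if d == 2)


# Illustrative rooted network: node h has two parents, u and v.
rooted_network = [("root", "u"), ("root", "v"),
                  ("u", "a"), ("u", "h"),
                  ("v", "h"), ("v", "b"),
                  ("h", "c")]

# Illustrative rooted tree on the same leaves: every non-root node has one parent.
rooted_tree = [("root", "u"), ("root", "b"),
               ("u", "a"), ("u", "c")]
```

Here `count_indegree_two_nodes(rooted_network)` counts the single node `h`, while the rooted tree has no such node.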
Inferring phylogenetic networks with maximum pseudolikelihood under incomplete lineage sorting
Phylogenetic networks are necessary to represent the tree of life expanded by
edges to represent events such as horizontal gene transfers, hybridizations or
gene flow. Not all species follow the paradigm of vertical inheritance of their
genetic material. While a great deal of research has flourished into the
inference of phylogenetic trees, statistical methods to infer phylogenetic
networks are still limited and under development. The main disadvantage of
existing methods is a lack of scalability. Here, we present a statistical
method to infer phylogenetic networks from multi-locus genetic data in a
pseudolikelihood framework. Our model accounts for incomplete lineage sorting
through the coalescent model, and for horizontal inheritance of genes through
reticulation nodes in the network. Computation of the pseudolikelihood is fast
and simple, and it avoids the burdensome calculation of the full likelihood
which can be intractable with many species. Moreover, estimation at the
quartet-level has the added computational benefit that it is easily
parallelizable. Simulation studies comparing our method to a full likelihood
approach show that our pseudolikelihood approach is much faster without
compromising accuracy. We applied our method to reconstruct the evolutionary
relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae),
which are characterized by widespread hybridizations.
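To illustrate why quartet-level estimation is cheap and easy to parallelize, here is a minimal sketch of a quartet-based log pseudolikelihood. It is not the authors' implementation, and it covers only the tree-like case with no reticulation, which is a simplifying assumption. It uses the standard coalescent result that, for a four-taxon species tree with internal branch length t in coalescent units, the concordant quartet has probability 1 - (2/3)e^(-t) and each discordant quartet has probability (1/3)e^(-t). The function names and the input format are hypothetical.

```python
import math


def expected_quartet_cfs(t):
    """Expected quartet frequencies on a 4-taxon tree, internal branch length t.

    Returns (concordant, discordant_1, discordant_2) under the coalescent,
    assuming no reticulation on this quartet.
    """
    discordant = math.exp(-t) / 3.0
    return (1.0 - 2.0 * discordant, discordant, discordant)


def log_pseudolikelihood(quartets):
    """Sum multinomial log-likelihood terms over quartets as if independent.

    quartets: iterable of (t, (n_concordant, n_disc1, n_disc2)), where the
              counts are observed gene-tree quartets (hypothetical format).
    """
    total = 0.0
    for t, counts in quartets:
        for n, p in zip(counts, expected_quartet_cfs(t)):
            if n:
                total += n * math.log(p)
    return total


# Each quartet contributes an independent term, so the sum can be split
# across workers; the observed counts below are made up for illustration.
example_quartets = [(1.0, (80, 11, 9)), (0.2, (40, 31, 29))]
example_value = log_pseudolikelihood(example_quartets)
```

Because the total is a plain sum of per-quartet terms, the terms can be evaluated in any order or on separate workers and added afterwards.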