    Neighborhoods of trees in circular orderings

    In phylogenetics, a common strategy used to construct an evolutionary tree for a set of species X is to search in the space of all such trees for one that optimizes some given score function (such as the minimum evolution, parsimony or likelihood score). As this can be computationally intensive, it was recently proposed to restrict such searches to the set of all those trees that are compatible with some circular ordering of the set X. To inform the design of efficient algorithms to perform such searches, it is therefore of interest to find bounds for the number of trees compatible with a fixed ordering in the neighborhood of a tree that is determined by certain tree operations commonly used to search for trees: the nearest neighbor interchange (nni), the subtree prune and regraft (spr) and the tree bisection and reconnection (tbr) operations. We show that the size of such a neighborhood of a binary tree associated with the nni operation is independent of the tree’s topology, but that this is not the case for the spr and tbr operations. We also give tight upper and lower bounds for the size of the neighborhood of a binary tree for the spr and tbr operations and characterize those trees for which these bounds are attained.

    A Note on Encodings of Phylogenetic Networks of Bounded Level

    Driven by the need for better models that allow one to shed light on the question of how life's diversity has evolved, phylogenetic networks have now joined phylogenetic trees in the center of phylogenetics research. Like phylogenetic trees, such networks canonically induce collections of phylogenetic trees, clusters, and triplets, respectively. Thus it is not surprising that many network approaches aim to reconstruct a phylogenetic network from such collections. Related to the well-studied perfect phylogeny problem, the following question is of fundamental importance in this context: When does one of the above collections encode (i.e. uniquely describe) the network that induces it? In this note, we present a complete answer to this question for the special case of a level-1 (phylogenetic) network by characterizing those level-1 networks for which an encoding in terms of one (or equivalently all) of the above collections exists. Given that this type of network forms the first layer of the rich hierarchy of level-k networks, k a non-negative integer, it is natural to wonder whether our arguments could be extended to members of that hierarchy for higher values of k. By giving examples, we show that this is not the case.

    From trees to networks and back

    The evolutionary history of a set of species is commonly represented by a phylogenetic tree. Often, however, the data contain conflicting signals, which can be better represented by a more general structure, namely a phylogenetic network. Such networks allow the display of several alternative evolutionary scenarios simultaneously but this can come at the price of complex visual representations. Using so-called circular split networks reduces this complexity, because this type of network can always be visualized in the plane without any crossing edges. These circular split networks form the core of this thesis. We construct them, use them as a search space for minimum evolution trees and explore their properties. More specifically, we present a new method, called SuperQ, to construct a circular split network summarising a collection of phylogenetic trees that have overlapping leaf sets. Then, we explore the set of phylogenetic trees associated with a fixed circular split network, in particular using it as a search space for optimal trees. This set represents just a tiny fraction of the space of all phylogenetic trees, but we still find trees within it that compare quite favourably with those obtained by a leading heuristic, which uses tree edit operations for searching the whole tree space. In the last part, we advance our understanding of the set of phylogenetic trees associated with a circular split network. Specifically, we investigate the size of the so-called circular tree neighbourhood for the three tree edit operations, tree bisection and reconnection (tbr), subtree prune and regraft (spr) and nearest neighbour interchange (nni).

    Computing Maximum Agreement Forests without Cluster Partitioning is Folly

    Computing a maximum (acyclic) agreement forest (M(A)AF) of a pair of phylogenetic trees is known to be fixed-parameter tractable; the two main techniques are kernelization and depth-bounded search. In theory, kernelization-based algorithms for this problem are not competitive, but they perform remarkably well in practice. We shed light on why this is the case. Our results show that, probably unsurprisingly, the kernel is often much smaller in practice than the theoretical worst case, but not small enough to fully explain the good performance of these algorithms. The key to performance is cluster partitioning, a technique used in almost all fast M(A)AF algorithms. In theory, cluster partitioning does not help: some instances are highly clusterable, others not at all. However, our experiments show that cluster partitioning leads to substantial performance improvements for kernelization-based M(A)AF algorithms. In contrast, kernelizing the individual clusters before solving them using exponential search yields only very modest performance improvements or even hurts performance; for the vast majority of inputs, kernelization leads to no reduction in the maximal cluster size at all. The choice of the algorithm applied to solve individual clusters also significantly impacts performance, even though our limited experiment to evaluate this produced no clear winner; depth-bounded search, exponential search interleaved with kernelization, and an ILP-based algorithm all achieved competitive performance.

    Deep kernelization for the Tree Bisection and Reconnect (TBR) distance in phylogenetics

    We describe a kernel of size 9k-8 for the NP-hard problem of computing the Tree Bisection and Reconnect (TBR) distance k between two unrooted binary phylogenetic trees. We achieve this by extending the existing portfolio of reduction rules with three new reduction rules. Two of the rules are based on the idea of topologically transforming the trees in a distance-preserving way in order to guarantee execution of earlier reduction rules. The third rule extends the local neighbourhood approach introduced in (Kelk and Linz, Annals of Combinatorics 24(3), 2020) to more global structures, allowing new situations to be identified when deletion of a leaf definitely reduces the TBR distance by one. The bound on the kernel size is tight up to an additive term. Our results also apply to the equivalent problem of computing a Maximum Agreement Forest (MAF) between two unrooted binary phylogenetic trees. We anticipate that our results will be more widely applicable for computing agreement-forest based dissimilarity measures. Comment: 38 pages. In this version a figure has been added, some references have been added, some small typos have been fixed and the introduction and conclusion have been slightly extended. Submitted for journal review.

    Barking up the wrong tree: some obstacles to phylogenetic reconstruction

    Phylogenetics is the study of evolutionary relationships between entities, usually biological in nature. The primary aim of such study is to elucidate the structure of these evolutionary histories. Unfortunately, such study can run into a variety of obstacles, both practical and theoretical. In this thesis we explore theoretical obstacles to phylogenetic reconstruction, by examining several scenarios in which distinguishing between similar structures can become quite difficult. In Chapter 2, we consider when metrics on trees and metrics on networks can become indistinguishable, and present several novel results in this area, showing that it is possible for any tree metric to be represented on a non-trivial network, and provide early results on the possible structures of these networks. In Chapter 3, we consider tree-based networks - a phenomenon in which networks have a strong tree-like signal. We present the first findings on these networks in the context of unrooted non-binary networks. We characterise the circumstances under which such networks can become 'saturated' by these signals, and provide some graph theoretical results in this area as well. In Chapter 4 we consider the scenario in which two trees can appear similar due to their hierarchical structure. We present a new metric to quantify this similarity, and use simulations to show several promising properties of the metric and the relative accuracy of a function that gives an upper bound to the metric.

    New algorithms and mathematical tools for phylogenetics beyond trees

    Phylogenetic trees and networks are mathematical structures for representing the evolutionary history of a set of taxa. The need for methods to build such structures from various types of data, as well as the need to understand the story these data may tell, gives rise to exciting new challenges for mathematics and computer science. This thesis presents some recent advances in both these directions. It features new mathematical methodology for reconstructing phylogenetic networks, and new computational tools for inferring complex evolutionary scenarios. These come with a thorough analysis, assessing their attractiveness in terms of their theoretical properties. It expands on previous results, which are themselves briefly reviewed, and concludes with potentially interesting further research questions.