20 research outputs found
Hoe zijn ze verwant?
Hoe kunnen we reconstrueren hoe hedendaagse organismen zijn ontstaan uit verre voorouders door verschillende evolutionaire processen? Dit is het doel van onderzoek op het gebied van fylogenetische netwerken: grafen die evolutionaire verwantschappen beschrijven. Leo van Iersel ontving in 2011 een Veni-beurs van NWO voor onderzoek op dit gebied. Eind 2014 is hij begonnen als tenure track universitair docent aan de TU Delft. In dit artikel legt hij uit wat fylogenetische netwerken precies zijn en welke wiskundige problemen in dit vakgebied naar voren komenDelft Institute of Applied MathematicsElectrical Engineering, Mathematics and Computer Scienc
Nonbinary Tree-Based Phylogenetic Networks
Rooted phylogenetic networks are used to describe evolutionary histories that contain non-treelike evolutionary events such as hybridization and horizontal gene transfer. In some cases, such histories can be described by a phylogenetic base-tree with additional linking arcs, which can for example represent gene transfer events. Such phylogenetic networks are called tree-based. Here, we consider two possible generalizations of this concept to nonbinary networks, which we call tree-based and strictly-tree-based nonbinary phylogenetic networks.We give simple graph-theoretic characterizations of tree-based and strictlytree- based nonbinary phylogenetic networks. Moreover, we show for each of these two classes that it can be decided in polynomial time whether a given network is contained in the class. Our approach also provides a new view on tree-based binary phylogenetic networks. Finally, we discuss two examples of nonbinary phylogenetic networks in biology and show how our results can be applied to them.Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.OptimizationInformation Managemen
Reconstructibility of unrooted level-k phylogenetic networks from distances
A phylogenetic network is a graph-theoretical tool that is used by biologists to represent the evolutionary history of a collection of species. One potential way of constructing such networks is via a distance-based approach, where one is asked to find a phylogenetic network that in some way represents a given distance matrix, which gives information on the evolutionary distances between present-day taxa. Here, we consider the following question. For which k are unrooted level-k networks uniquely determined by their distance matrices? We consider this question for shortest distances as well as for the case that the multisets of all distances is given. We prove that level-1 networks and level-2 networks are reconstructible from their shortest distances and multisets of distances, respectively. Furthermore we show that, in general, networks of level higher than 1 are not reconstructible from shortest distances and that networks of level higher than 2 are not reconstructible from their multisets of distances.Optimizatio
A Third Strike Against Perfect Phylogeny
Perfect phylogenies are fundamental in the study of evolutionary trees because they capture the situation when each evolutionary trait emerges only once in history; if such events are believed to be rare, then by Occam's Razor such parsimonious trees are preferable as a hypothesis of evolution. A classical result states that 2-state characters permit a perfect phylogeny precisely if each subset of 2 characters permits one. More recently, it was shown that for 3-state characters the same property holds but for size-3 subsets. A long-standing open problem asked whether such a constant exists for each number of states. More precisely, it has been conjectured that for any fixed number of states there exists a constant such that a set of -state characters has a perfect phylogeny if and only if every subset of at most characters has a perfect phylogeny. Informally, the conjecture states that checking fixed-size subsets of characters is enough to correctly determine whether input data permits a perfect phylogeny, irrespective of the number of characters in the input. In this article, we show that this conjecture is false. In particular, we show that for any constant , there exists a set of -state characters such that has no perfect phylogeny, but there exists a perfect phylogeny for every subset of at most characters. Moreover, there already exists a perfect phylogeny when ignoring just one of the characters, independent of which character you ignore. This negative result complements the two negative results ("strikes") of Bodlaender et al. (1992,2000). We reflect on the consequences of this third strike, pointing out that while it does close off some routes for efficient algorithm development, many others remain open.greenOptimizatio
Not all phylogenetic networks are leaf-reconstructible
Unrooted phylogenetic networks are graphs used to represent reticulate evolutionary relationships. Accurately reconstructing such networks is of great relevance for evolutionary biology. It has recently been conjectured that all unrooted phylogenetic networks for at least five taxa can be uniquely reconstructed from their subnetworks obtained by deleting a single taxon. Here, we show that this conjecture is false, by presenting a counter-example for each possible number of taxa that is at least 4. Moreover, we show that the conjecture is still false when restricted to binary networks. This means that, even if we are able to reconstruct the unrooted evolutionary history of each proper subset of some taxon set, this still does not give us enough information to reconstruct their full unrooted evolutionary history.greenOptimizatio
A priori TSP in the scenario model
In this paper, we consider the a priori traveling salesman problem in the scenario model. In this problem, we are given a list of subsets of the vertices, called scenarios, along with a probability for each scenario. Given a tour on all vertices, the resulting tour for a given scenario is obtained by restricting the solution to the vertices of the scenario. The goal is to find a tour on all vertices that minimizes the expected length of the resulting restricted tour. We show that this problem is already NP-hard and APX-hard when all scenarios have size four. On the positive side, we show that there exists a constant-factor approximation algorithm in three restricted cases: if the number of scenarios is fixed, if the number of missing vertices per scenario is bounded by a constant, and if the scenarios are nested. Finally, we discuss an elegant relation with an a priori minimum spanning tree problem.Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.Optimizatio
Reconstructing Phylogenetic Networks via Cherry Picking and Machine Learning
Combining a set of phylogenetic trees into a single phylogenetic network that explains all of them is a fundamental challenge in evolutionary studies. In this paper, we apply the recently-introduced theoretical framework of cherry picking to design a class of heuristics that are guaranteed to produce a network containing each of the input trees, for practical-size datasets. The main contribution of this paper is the design and training of a machine learning model that captures essential information on the structure of the input trees and guides the algorithms towards better solutions. This is one of the first applications of machine learning to phylogenetic studies, and we show its promise with a proof-of-concept experimental study conducted on both simulated and real data consisting of binary trees with no missing taxa.Optimizatio
Constructing phylogenetic networks via cherry picking and machine learning
Background: Combining a set of phylogenetic trees into a single phylogenetic network that explains all of them is a fundamental challenge in evolutionary studies. Existing methods are computationally expensive and can either handle only small numbers of phylogenetic trees or are limited to severely restricted classes of networks. Results: In this paper, we apply the recently-introduced theoretical framework of cherry picking to design a class of efficient heuristics that are guaranteed to produce a network containing each of the input trees, for practical-size datasets consisting of binary trees. Some of the heuristics in this framework are based on the design and training of a machine learning model that captures essential information on the structure of the input trees and guides the algorithms towards better solutions. We also propose simple and fast randomised heuristics that prove to be very effective when run multiple times. Conclusions: Unlike the existing exact methods, our heuristics are applicable to datasets of practical size, and the experimental study we conducted on both simulated and real data shows that these solutions are qualitatively good, always within some small constant factor from the optimum. Moreover, our machine-learned heuristics are one of the first applications of machine learning to phylogenetics and show its promise.Discrete Mathematics and Optimizatio
An algorithm for reconstructing level-2 phylogenetic networks from trinets
Evolutionary histories for species that cross with one another or exchange genetic material can be represented by leaf-labelled, directed graphs called phylogenetic networks. A major challenge in the burgeoning area of phylogenetic networks is to develop algorithms for building such networks by amalgamating small networks into a single large network. The level of a phylogenetic network is a measure of its deviation from being a tree; the higher the level of a network, the less treelike it becomes. Various algorithms have been developed for building level-1 networks from small networks. However, level-1 networks may not be able to capture the complexity of some data sets. In this paper, we present a polynomial-time algorithm for constructing a rooted binary level-2 phylogenetic network from a collection of 3-leaf networks or trinets. Moreover, we prove that the algorithm will correctly reconstruct such a network if it is given all of the trinets in the network as input. The algorithm runs in time O(tâ‹…n+n4) with t the number of input trinets and n the number of leaves. We also show that there is a fundamental obstruction to constructing level-3 networks from trinets, and so new approaches will need to be developed for constructing level-3 and higher level-networks.OptimizationElectrical Engineering, Mathematics and Computer Scienc
Orchard Networks are Trees with Additional Horizontal Arcs
Phylogenetic networks are used in biology to represent evolutionary histories. The class of orchard phylogenetic networks was recently introduced for their computational benefits, without any biological justification. Here, we show that orchard networks can be interpreted as trees with additional horizontal arcs. Therefore, they are closely related to tree-based networks, where the difference is that in tree-based networks the additional arcs do not need to be horizontal. Then, we use this new characterization to show that the space of orchard networks on n leaves with k reticulations is connected under the rNNI rearrangement move with diameter O(kn+ nlog (n)).Optimizatio