40 research outputs found

    Consensus Clusters in Robinson-Foulds Reticulation Networks

    Inference of phylogenetic networks - the evolutionary histories of species involving speciation as well as reticulation events - has proved to be an extremely challenging problem, even for smaller datasets that are easily tackled by supertree inference methods. An effective way to boost the scalability of distance-based supertree methods originates from the Pareto (for clusters) property, which is a highly desirable property for phylogenetic consensus methods. In particular, one can employ strict consensus merger algorithms to boost the scalability and accuracy of supertree methods satisfying Pareto; cf. SuperFine. In this work, we establish a Pareto-like property for phylogenetic networks. We then consider the recently introduced RF-Net method, which heuristically solves the so-called RF-Network problem and was demonstrated to be an efficient and effective tool for the inference of hybridization and reassortment networks. As our main result, we provide a constructive proof (entailing an explicit refinement algorithm) that the Pareto property applies to the RF-Network problem when the solution space is restricted to the popular class of tree-child networks. This result implies that strict consensus merger strategies, similar to SuperFine, can be directly applied to boost both the accuracy and the scalability of RF-Net significantly. Finally, we further investigate the optimum solutions to the RF-Network problem; in particular, we describe structural properties of all optimum (tree-child) RF-networks in relation to strict consensus clusters of the input trees.
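The strict consensus clusters mentioned in this abstract can be illustrated with a minimal sketch. It assumes rooted trees over one shared leaf set, encoded as nested Python tuples with string leaf labels; the encoding and the names `clusters` and `strict_consensus_clusters` are hypothetical and are not taken from RF-Net or SuperFine. A cluster is the set of leaves below a node, and the strict consensus clusters are those present in every input tree.

```python
from typing import FrozenSet, Iterable, Set, Tuple, Union

# Assumed encoding: a tree is a leaf label or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the cluster (set of leaf labels) below every node of a rooted tree."""
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            # An internal node's cluster is the union of its children's clusters.
            leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def strict_consensus_clusters(trees: Iterable[Tree]) -> Set[FrozenSet[str]]:
    """Return the clusters that occur in every input tree."""
    cluster_sets = [clusters(t) for t in trees]
    return set.intersection(*cluster_sets)


if __name__ == "__main__":
    t1 = ((("a", "b"), "c"), ("d", "e"))
    t2 = ((("a", "c"), "b"), ("d", "e"))
    shared = strict_consensus_clusters([t1, t2])
    # Print only the non-trivial clusters (neither a single leaf nor the full leaf set).
    for c in sorted(shared, key=sorted):
        if 1 < len(c) < 5:
            print(sorted(c))
```

In this toy input the two trees disagree on how a, b and c are resolved but agree on the clusters {a, b, c} and {d, e}, so only those two non-trivial clusters survive the intersection.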

    Inferring explicit weighted consensus networks to represent alternative evolutionary histories

    Background: The advent of molecular biology techniques and the constant increase in the availability of genetic material have triggered the development of many phylogenetic tree inference methods. However, several reticulate evolution processes, such as horizontal gene transfer and hybridization, have been shown to blur the species' evolutionary history by causing discordance among phylogenies inferred from different genes. Methods: To tackle this problem, we describe a new method for inferring and representing alternative (reticulate) evolutionary histories of species as an explicit weighted consensus network, which can be constructed from a collection of gene trees with or without prior knowledge of the species phylogeny. Results: We provide a way of building a weighted phylogenetic network for each of the following reticulation mechanisms: diploid hybridization, intragenic recombination, and complete or partial horizontal gene transfer. We successfully tested our method on synthetic and real datasets to infer the above-mentioned evolutionary events, which may have influenced the evolution of many species. Conclusions: Our weighted consensus network inference method allows one to infer, visualize, and statistically validate major conflicting signals induced by the mechanisms of reticulate evolution. The results provided by the new method can be used to represent the inferred conflicting signals by means of explicit and easy-to-interpret phylogenetic networks.

    Finding Median Reticulation Network

    Process Planning for Assembly and Hybrid Manufacturing in Smart Environments

    Manufacturers strive to manage efficiently the consequences of product proliferation over the entire product life cycle. New manufacturing trends such as smart manufacturing (Industry 4.0) present a substantial opportunity for managing variety. The main objective of this research is to help manufacturers handle the challenges arising from product variety by utilizing the technological advances of these new manufacturing trends. The research focuses mainly on the process planning phase and aims at developing novel process planning methods that exploit those advances in order to manage product variety. The research has successfully addressed the macro process planning of a product family for two manufacturing domains: assembly and hybrid manufacturing. A new approach was introduced for assembly sequencing based on the notion of soft-wired galled networks used in evolutionary studies in the biological and phylogenetic sciences. A knowledge discovery model was presented that exploits the assembly sequence data records of legacy products in order to extract the knowledge embedded in such data and use it to speed up assembly sequence planning. The new approach overcomes the critical limitation of assembly sequence retrieval methods, which are not able to capture more than one assembly sequence for a given product. A novel genetic algorithm-based model was developed for that purpose. The extracted assembly sequence network represents alternative assembly sequences. These alternative assembly sequences can be used by a smart system whose components are connected through a wireless sensor network, allowing a smart material handling system to change its routing if any disruption occurs.
A novel concept in the field of product variety management is introduced for the first time: generating product family platforms and process plans for customization into different product variants utilizing additive and subtractive processes. A new mathematical programming optimization model is proposed. The model's objective is to provide the optimum selection of features that can form a single product platform, together with the processes needed to customize this platform into different product variants that fall within the same product family, taking into consideration the combination of additive and subtractive manufacturing. For multiple platforms and their associated process plans, a model based on a phylogenetic median-joining network algorithm is used, which can be applied when the demand and the costs are unknown. Furthermore, a novel genetic algorithm-based model is developed for generating multiple platforms and their associated process plans when the demand and the costs are known. The model's objective is to minimize the total manufacturing cost. The developed models were applied to examples of real products for demonstration and validation. Moreover, comparisons with related existing methods were conducted to demonstrate the superiority of the developed models. The outcomes of this research provide efficient and easy-to-implement process planning for managing product variety, benefiting from the technological advances of the new manufacturing trends. The developed models and methods present a package of variety management solutions that can significantly support manufacturers at the process planning stage.

    Barking up the wrong tree: some obstacles to phylogenetic reconstruction

    Phylogenetics is the study of evolutionary relationships between entities, usually biological in nature. The primary aim of such study is to elucidate the structure of these evolutionary histories. Unfortunately, such study can run into a variety of obstacles, both practical and theoretical. In this thesis we explore theoretical obstacles to phylogenetic reconstruction, by examining several scenarios in which distinguishing between similar structures can become quite difficult. In Chapter 2, we consider when metrics on trees and metrics on networks can become indistinguishable, and present several novel results in this area, showing that it is possible for any tree metric to be represented on a non-trivial network, and provide early results on the possible structures of these networks. In Chapter 3, we consider tree-based networks - a phenomenon in which networks have a strong tree-like signal. We present the first findings on these networks in the context of unrooted non-binary networks. We characterise the circumstances under which such networks can become 'saturated' by these signals, and provide some graph theoretical results in this area as well. In Chapter 4, we consider the scenario in which two trees can appear similar due to their hierarchical structure. We present a new metric to quantify this similarity, and use simulations to show several promising properties of the metric and the relative accuracy of a function that gives an upper bound to the metric.

    Post-processing of phylogenetic trees

    PhyloNet: a software package for analyzing and reconstructing reticulate evolutionary relationships

    Background: Phylogenies, i.e., the evolutionary histories of groups of taxa, play a major role in representing the interrelationships among biological entities. Many software tools for reconstructing and evaluating such phylogenies have been proposed, almost all of which assume the underlying evolutionary history to be a tree. While trees give a satisfactory first-order approximation for many families of organisms, other families exhibit evolutionary mechanisms that cannot be represented by trees. Processes such as horizontal gene transfer (HGT), hybrid speciation, and interspecific recombination, collectively referred to as reticulate evolutionary events, result in networks, rather than trees, of relationships. Various software tools have been recently developed to analyze reticulate evolutionary relationships, which include SplitsTree4, LatTrans, EEEP, HorizStory, and T-REX. Results: In this paper, we report on the PhyloNet software package, which is a suite of tools for analyzing reticulate evolutionary relationships, or evolutionary networks, which are rooted, directed, acyclic graphs, leaf-labeled by a set of taxa. These tools can be classified into four categories: (1) evolutionary network representation: reading/writing evolutionary networks in a newly devised compact form; (2) evolutionary network characterization: analyzing evolutionary networks in terms of three basic building blocks: trees, clusters, and tripartitions; (3) evolutionary network comparison: comparing two evolutionary networks in terms of topological dissimilarities, as well as fitness to sequence evolution under a maximum parsimony criterion; and (4) evolutionary network reconstruction: reconstructing an evolutionary network from a species tree and a set of gene trees. Conclusion: The software package, PhyloNet, offers an array of utilities to allow for efficient and accurate analysis of evolutionary networks. The software package will help significantly in analyzing large data sets, as well as in studying the performance of evolutionary network reconstruction methods. Further, the software package supports the proposed eNewick format for compact representation of evolutionary networks, a feature that allows for efficient interoperability of evolutionary network software tools. Currently, all utilities in PhyloNet are invoked on the command line.

    Mismatches between phylogenetic trees in Historical Linguistics

    The field of phylogenetics provides computational methods that can be adapted to (computational) linguistics. Because of the parallels between the two fields, interest in combining them has arisen. The adapted and modified methods can be used to study the history and evolution of languages, and new approaches have emerged as a result. One approach is the comparison of two trees. Up to now, trees were only compared to test different reconstruction methods. This thesis exploits the idea of tree comparison for the detection of mismatches. To discover these mismatches, two types of linguistic trees are compared: so-called language trees and concept trees. The language tree represents the history of languages, whilst a concept tree displays the evolutionary history of a representation of one specific word. The concept and language trees are compared using popular methods from phylogenetics. One of these methods is the computation of the distance between trees. The underlying data for these trees is provided by the ASJP database (Wichmann et al., 2012). Using this data, linguistic reconstruction algorithms such as the dERC (Jäger, 2013) are able to construct proper linguistic trees, which can be compared automatically. The detected mismatches between the trees can be interpreted using linguistic background knowledge to gain insights into the evolutionary history of languages. Within an evolutionary network, these mismatches can be depicted by reticulations.
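The tree comparison described in this abstract can be sketched as follows. The abstract does not say which tree distance the thesis uses; this sketch assumes the rooted Robinson-Foulds distance, a common choice in phylogenetics, which counts the clusters found in exactly one of the two trees. The nested-tuple tree encoding, the function names and the language labels are all hypothetical.

```python
from typing import FrozenSet, Set, Tuple, Union

# Assumed encoding: a tree is a leaf label or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the set of leaf labels below every node of a rooted tree."""
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def rf_distance(first: Tree, second: Tree) -> int:
    """Rooted Robinson-Foulds distance: number of clusters in exactly one tree."""
    return len(clusters(first) ^ clusters(second))


def mismatched_clusters(concept_tree: Tree, language_tree: Tree) -> Set[FrozenSet[str]]:
    """Clusters of the concept tree that the language tree does not contain."""
    return clusters(concept_tree) - clusters(language_tree)


if __name__ == "__main__":
    # Hypothetical toy trees over four language labels.
    language_tree = ((("german", "dutch"), "english"), "french")
    concept_tree = ((("german", "english"), "dutch"), "french")
    print(rf_distance(language_tree, concept_tree))
    print([sorted(c) for c in mismatched_clusters(concept_tree, language_tree)])
```

A grouping reported by `mismatched_clusters` is a candidate mismatch that still needs linguistic interpretation; the distance only quantifies how much the two trees disagree.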

    Inference of parsimonious species phylogenies from multi-locus data

    The main focus of this dissertation is the inference of species phylogenies, i.e., evolutionary histories of species. Species phylogenies allow us to gain insights into the mechanisms of evolution and to hypothesize past evolutionary events. They also find applications in medicine, for example, in understanding antibiotic resistance in bacteria. The reconstruction of species phylogenies is, therefore, of both biological and practical importance. In the traditional method for inferring species trees from genetic data, we sequence a single locus in species genomes, reconstruct a gene tree, and report it as the species tree. Biologists have long acknowledged that a gene tree can be different from a species tree, thus implying that this traditional method might infer the wrong species tree. Moreover, reticulate events such as horizontal gene transfer and hybridization make the evolution of species no longer tree-like. The availability of multi-locus data provides us with excellent opportunities to resolve those long-standing problems. In this dissertation, we present parsimony-based algorithms for reconciling species/gene tree incongruence that is assumed to be due solely to lineage sorting. We also describe a unified framework for detecting hybridization despite lineage sorting. To address the first problem of species/gene tree incongruence caused by lineage sorting, we present three algorithms. In Chapter 3, we present an algorithm based on an integer linear programming (ILP) formulation to infer the species tree's topology and divergence times from multiple gene trees. In Chapter 4, we describe two methods that infer the species tree by minimizing deep coalescences (MDC), a criterion introduced by Maddison in 1997. The first method is also based on an ILP formulation, but it eliminates the phase of the Chapter 3 algorithm that enumerates candidate species trees. The second algorithm further eliminates the dependence on external ILP solvers by employing dynamic programming. We ran those methods on both biological and simulated data, and experimental results demonstrate their high accuracy and speed in species tree inference, which makes them suitable for analyzing multi-locus data. The second problem this dissertation deals with is the detection of reticulation (e.g., horizontal gene transfer, hybridization) despite lineage sorting. The phylogeny-based approach compares the evolutionary histories of different genomic regions and tests them for incongruence that would indicate hybridization. However, since species tree and gene tree incongruence can also be due to lineage sorting, phylogeny-based hybridization methods might overestimate the amount of hybridization. We present in this dissertation a framework that can handle both hybridization and lineage sorting simultaneously. In this framework, we extend the MDC criterion to phylogenetic networks, and use it to propose a heuristic to detect hybridization despite lineage sorting. Empirical results on a simulated and a yeast data set show its promising performance, as well as several directions for future research.
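The MDC criterion named in this abstract can be made concrete with a small sketch that only scores one candidate species tree. It is not the ILP or dynamic programming algorithm of the dissertation. It assumes rooted trees encoded as nested tuples, one sampled allele per species, and identical leaf sets in all trees; the function names are hypothetical. For each species-tree cluster, the sketch counts the maximal gene-tree clades that lie entirely inside the cluster; every such clade beyond the first is counted as one extra lineage on the branch above that cluster.

```python
from typing import FrozenSet, Iterable, Tuple, Union

# Assumed encoding: a tree is a leaf label or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaf_set(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf labels of a tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def lineages_within(gene_tree: Tree, cluster: FrozenSet[str]) -> int:
    """Count the maximal gene-tree clades whose leaves all belong to `cluster`."""
    if leaf_set(gene_tree) <= cluster:
        return 1  # the whole clade fits, so it counts as a single lineage
    if isinstance(gene_tree, str):
        return 0  # a leaf outside the cluster contributes nothing
    return sum(lineages_within(child, cluster) for child in gene_tree)


def extra_lineages(species_tree: Tree, gene_tree: Tree) -> int:
    """Sum, over all species-tree clusters, the lineages beyond the first."""
    total = 0

    def visit(node: Tree) -> None:
        nonlocal total
        total += max(lineages_within(gene_tree, leaf_set(node)) - 1, 0)
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(species_tree)
    return total


def mdc_cost(species_tree: Tree, gene_trees: Iterable[Tree]) -> int:
    """Total number of extra lineages of a candidate species tree over all gene trees."""
    return sum(extra_lineages(species_tree, g) for g in gene_trees)


if __name__ == "__main__":
    species = (("A", "B"), "C")
    genes = [(("A", "B"), "C"), (("A", "C"), "B")]
    print(mdc_cost(species, genes))
```

Under this criterion, the preferred species tree is the candidate with the lowest cost. The sketch recomputes leaf sets at every step for clarity, which is quadratic in the tree size; caching them would be the obvious refinement.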