A cubic-time algorithm for computing the trinet distance between level-1 networks

Author: Cardona
Cardona
Day
Fischer
Gusfield
Huber
Huber
Huber
Huber
Huson
Huson
James Oldman
Jansson
Jansson
Moret
Moulton
Oldman
Pattengale
Robinson
Steel
Taoyang Wu
van Iersel
Vincent Moulton
Wang
Publication venue: 'Elsevier BV'
Publication date: 15/03/2017

In evolutionary biology, phylogenetic networks are constructed to represent the evolution of species in which reticulate events are thought to have occurred, such as recombination and hybridization. It is therefore useful to have efficiently computable metrics with which to systematically compare such networks. Through developing an optimal algorithm to enumerate all trinets displayed by a level-1 network (a type of network that is slightly more general than an evolutionary tree), here we propose a cubic-time algorithm to compute the trinet distance between two level-1 networks. Employing simulations, we also present a comparison between the trinet metric and the so-called Robinson-Foulds phylogenetic network metric restricted to level-1 networks. The algorithms described in this paper have been implemented in Java and are freely available at https://www.uea.ac.uk/computing/TriLoNet.

arXiv.org e-Print Archive

Crossref

University of East Anglia digital repository

Evolutionary Inference via the Poisson Indel Process

Author: Alexandre Bouchard-Côté
Buiculescu
Cox
Dreyer
Hein
Hein
Huelsenbeck
Michael I. Jordan
Miklós
Nelesen
Roshan
Saitou
Searls
Wheeler
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 18/01/2013

We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classical evolutionary process, the TKF91 model, is a continuous-time Markov chain model comprising insertion, deletion and substitution events. Unfortunately, this model gives rise to an intractable computational problem: the computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a new stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The new model is closely related to the TKF91 model, differing only in its treatment of insertions, but the new model has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared to separate inference of phylogenies and alignments. Comment: 33 pages, 6 figures

arXiv.org e-Print Archive

Crossref

Exploring and visualising spaces of tree reconciliations

Author: Huber Katharina T
Moulton Vincent
Sagot Marie-France
Sinaimeri Blerina
Publication venue: 'Oxford University Press (OUP)'
Publication date: 05/11/2018

Tree reconciliation is the mathematical tool that is used to investigate the coevolution of organisms, such as hosts and parasites. A common approach to tree reconciliation involves specifying a model that assigns costs to certain events, such as cospeciation, and then tries to find a mapping between two specified phylogenetic trees which minimises the total cost of the implied events. For such models, it has been shown that there may be a huge number of optimal solutions, or at least solutions that are close to optimal. It is therefore of interest to be able to systematically compare and visualise whole collections of reconciliations between a specified pair of trees. In this paper, we consider various metrics on the set of all possible reconciliations between a pair of trees, some that have been defined before and some new metrics that we propose. We show that the diameter of the resulting spaces of reconciliations can in some cases be determined theoretically, information that we use to normalise and compare properties of the metrics. We also implement the metrics and compare their behaviour on several host-parasite datasets, including the shapes of their distributions. In addition, we show that in combination with multidimensional scaling, the metrics can be useful for visualising large collections of reconciliations, much in the same way as phylogenetic tree metrics can be used to explore collections of phylogenetic trees. Implementations of the metrics can be downloaded from: https://team.inria.fr/erable/en/team-members/blerina-sinaimeri/reconciliation-distances

Crossref

ZENODO

INRIA a CCSD electronic archive server

Dryad Digital Repository (Duke University)

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Electronic Archiving System

Archivio della ricerca- LUISS Libera Università Internazionale degli Studi Sociali Guido Carli di Roma

University of East Anglia digital repository

Hal-Diderot

HAL-Rennes 1

The effect of primer choice and short read sequences on the outcome of 16S rRNA gene based diversity studies

Author: De Vos Paul
Ghyselinck Jonas
Heylen Kim
Pfeiffer Stefan
Sessitsch Angela
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Different regions of the bacterial 16S rRNA gene evolve at different rates. The scientific outcome of short read sequencing studies therefore varies with the gene region sequenced. We wanted to gain insight into the impact of primer choice on the outcome of short read sequencing efforts. All the unknowns associated with sequencing data, i.e. primer coverage rate, phylogeny, OTU-richness and taxonomic assignment, were therefore evaluated in a single study for ten well-established universal primers (338f/r, 518f/r, 799f/r, 926f/r and 1062f/r) targeting dispersed regions of the bacterial 16S rRNA gene. All analyses were performed on nearly full length and in silico generated short read sequence libraries containing 1175 sequences that were carefully chosen to be a representative substitute for the SILVA SSU database. The 518f and 799r primers, targeting the V4 region of the 16S rRNA gene, were found to be particularly suited for short read sequencing studies, while the primer 1062r, targeting V6, seemed to be least reliable. Our results will assist scientists in considering whether the best option for their study is to select the most informative primer, or the primer that excludes interferences by host-organelle DNA. The methodology followed can be extrapolated to other primers, allowing their evaluation prior to the experiment.

CiteSeerX

Ghent University Academic Bibliography

Directory of Open Access Journals

PubMed Central

FigShare
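
The primer coverage rate mentioned in the abstract above can be illustrated with a minimal in silico sketch. It is not the study's method: it assumes exact matching on the forward strand only, with IUPAC degenerate bases expanded, and the function names `primer_matches` and `coverage_rate`, the primer and the sequences are hypothetical toy examples.

```python
# Hypothetical sketch: fraction of sequences containing an exact primer match.
# Assumptions (not from the study): forward strand only, no mismatches allowed.

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_matches(primer, sequence):
    """Return True if the (possibly degenerate) primer matches somewhere in sequence."""
    n = len(primer)
    for start in range(len(sequence) - n + 1):
        window = sequence[start:start + n]
        if all(base in IUPAC[code] for code, base in zip(primer, window)):
            return True
    return False


def coverage_rate(primer, sequences):
    """Return the fraction of sequences in which the primer finds a match."""
    if not sequences:
        return 0.0
    hits = sum(primer_matches(primer, seq) for seq in sequences)
    return hits / len(sequences)


# Toy data: "Y" stands for C or T, so the primer matches ACCG and ACTG only.
toy_primer = "ACYG"
toy_sequences = ["TTACCGTT", "ACTGAA", "ACAGAA", "GGGG"]
print(coverage_rate(toy_primer, toy_sequences))
```

A realistic evaluation would also need to consider the reverse complement for reverse primers and a tolerance for mismatches, both of which are left out here for brevity.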

A generalized Robinson-Foulds distance for labeled trees

Author: Briand S
Dessimoz C
El-Mabrouk N
Lafond M
Lobinska G
Publication venue
Publication date: 18/11/2020

Background: The Robinson-Foulds (RF) distance is a well-established measure between phylogenetic trees. Despite a lack of biological justification, it has the advantages of being a proper metric and being computable in linear time. For phylogenetic applications involving genes, however, a crucial aspect of the trees ignored by the RF metric is the type of the branching event (e.g. speciation, duplication, transfer, etc). Results: We extend RF to trees with labeled internal nodes by including a node flip operation, alongside edge contractions and extensions. We explore properties of this extended RF distance in the case of a binary labeling. In particular, we show that contrary to the unlabeled case, an optimal edit path may require contracting “good” edges, i.e. edges shared between the two trees. Conclusions: We provide a 2-approximation algorithm which is shown to perform well empirically. Looking ahead, computing distances between labeled trees opens up a variety of new algorithmic directions. Implementation and simulations available at https://github.com/DessimozLab/pylabeledrf

UCL Discovery
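
For orientation, the classical (unlabeled) Robinson-Foulds distance that the abstract above starts from can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the rooted, cluster-based variant, represents trees as nested tuples of leaf labels, and is a simple set-based computation rather than the linear-time algorithm. It does not implement the labeled extension proposed in the paper, and the names `clusters` and `rf_distance` are hypothetical, not part of pylabeledrf.

```python
# Hypothetical sketch of the rooted Robinson-Foulds distance on cluster sets.
# A tree is either a leaf label (str) or a tuple of subtrees.


def clusters(tree):
    """Return the non-trivial clusters (leaf sets below internal nodes) of a tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        leafset = frozenset().union(*(leaves(child) for child in node))
        found.add(leafset)
        return leafset

    root = leaves(tree)
    found.discard(root)  # the root cluster is shared by all trees on the same leaves
    return found


def rf_distance(tree1, tree2):
    """Count the clusters present in exactly one of the two trees."""
    return len(clusters(tree1) ^ clusters(tree2))


balanced = (("a", "b"), ("c", "d"))
swapped = (("a", "c"), ("b", "d"))
caterpillar = ((("a", "b"), "c"), "d")

print(rf_distance(balanced, balanced))     # 0: identical trees
print(rf_distance(balanced, swapped))      # 4: no non-trivial cluster is shared
print(rf_distance(balanced, caterpillar))  # 2: only {a, b} is shared
```

Each cluster in the symmetric difference corresponds to one edge contraction or extension, which is the edit view of RF that the paper extends with a node flip operation.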