8,596 research outputs found

Distinguishing regional from within-codon rate heterogeneity in DNA sequence alignments

Author: A. Gelman
A. Webb
C. Tuffley
D. Husmeier
D. Husmeier
G. Casella
J. Felsenstein
J. Felsenstein
J. Felsenstein
M. Hasegawa
M.A. Suchard
R.J. Boys
V.N. Minin
W.K. Hastings
W.P. Lehrach
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

We present an improved phylogenetic factorial hidden Markov model (FHMM) for detecting two types of mosaic structures in DNA sequence alignments, related to (1) recombination and (2) rate heterogeneity. The focus of the present work is on improving the modelling of the latter aspect. Earlier papers have modelled different degrees of rate heterogeneity with separate hidden states of the FHMM. This approach fails to appreciate the intrinsic difference between two types of rate heterogeneity: long-range regional effects, which are potentially related to differences in the selective pressure, and the short-term periodic patterns within the codons, which merely capture the signature of the genetic code. We propose an improved model that explicitly distinguishes between these two effects, and we assess its performance on a set of simulated DNA sequence alignments.

Crossref

University of Strathclyde Institutional Repository

Enlighten

Horseshoe-based Bayesian nonparametric estimation of effective population size trajectories

Author: Carpenter B.
Easton N.A.
Felsenstein J.
Holmes C.E.
Murray I.
Neal R.
Palacios J.A.
R Core Team
Watanabe S.
Publication date: 29/07/2019

Phylodynamics is an area of population genetics that uses genetic sequence data to estimate past population dynamics. Modern state-of-the-art Bayesian nonparametric methods for recovering population size trajectories of unknown form use either change-point models or Gaussian process priors. Change-point models suffer from computational issues when the number of change-points is unknown and needs to be estimated. Gaussian process-based methods lack local adaptivity and cannot accurately recover trajectories that exhibit features such as abrupt changes in trend or varying levels of smoothness. We propose a novel, locally-adaptive approach to Bayesian nonparametric phylodynamic inference that has the flexibility to accommodate a large class of functional behaviors. Local adaptivity results from modeling the log-transformed effective population size a priori as a horseshoe Markov random field, a recently proposed statistical model that blends together the best properties of the change-point and Gaussian process modeling paradigms. We use simulated data to assess model performance, and find that our proposed method results in reduced bias and increased precision when compared to contemporary methods. We also use our models to reconstruct past changes in genetic diversity of human hepatitis C virus in Egypt and to estimate population size changes of ancient and modern steppe bison. These analyses show that our new method captures features of the population size trajectories that were missed by the state-of-the-art methods.

Comment: 36 pages, including supplementary information
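The horseshoe Markov random field prior mentioned in this abstract can be illustrated with a minimal sketch. It assumes a first-order field, in which each increment of the log effective population size is normal with a standard deviation equal to a global scale times a local half-Cauchy scale. The function names, the default `global_scale`, and this exact parametrization are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def half_cauchy(rng, scale):
    """Draw from a half-Cauchy(0, scale) via the inverse-CDF of a Cauchy."""
    return scale * abs(math.tan(math.pi * (rng.random() - 0.5)))


def sample_hsmrf_path(n_points, global_scale=0.1, start=0.0, seed=0):
    """Sample one prior trajectory of log effective population size.

    Assumed first-order form (hypothetical parametrization):
        tau      ~ half-Cauchy(0, global_scale)    global shrinkage
        lambda_i ~ half-Cauchy(0, 1)               local scale per increment
        theta_i - theta_{i-1} ~ Normal(0, (tau * lambda_i)^2)
    """
    rng = random.Random(seed)
    tau = half_cauchy(rng, global_scale)
    path = [start]
    for _ in range(n_points - 1):
        lam = half_cauchy(rng, 1.0)
        # Most increments are shrunk towards zero; the heavy tail of
        # lambda_i occasionally allows an abrupt jump.
        step = rng.gauss(0.0, tau * lam)
        path.append(path[-1] + step)
    return path


if __name__ == "__main__":
    trajectory = sample_hsmrf_path(20, seed=1)
    print([round(x, 3) for x in trajectory])
```

Under these assumptions a typical draw is nearly flat over long stretches with a few sharp changes, which is the locally adaptive behaviour the abstract contrasts with Gaussian process priors.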

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Analytic approach to the evolutionary effects of genetic exchange

Author: A. S. Kondrashov
B. Charlesworth
David A. Kessler
Elisheva Cohen
H. J. Muller
Herbert Levine
J. Felsenstein
N. H. Barton
S. F. Elena
S. P. Otto
Publication venue: 'American Physical Society (APS)'
Publication date: 26/09/2005

We present an approximate analytic study of our previously introduced model of evolution including the effects of genetic exchange. This model is motivated by the process of bacterial transformation. We solve for the velocity, the rate of increase of fitness, as a function of the fixed population size, $N$. We find the velocity increases with $\ln N$, eventually saturating at an $N$ which depends on the strength of the recombination process. The analytical treatment is seen to agree well with direct numerical simulations of our model equations.

arXiv.org e-Print Archive

Crossref

In search of lost introns

Author: Adachi
Aldous
Altschul
Bieri
Blum
Carmel
Collins
Coulombe-Huntington
Csűrös
Csűrös
Devroye
Durbin
Edgar
Felsenstein
Felsenstein
Felsenstein
Friedman
Guindon
Harding
Heard
Hubbard
Igor B. Rogozin
IHBSC
J. Andrew Holey
Jeffares
Kececioglu
Kosakovsky Pond
Larget
Ma
Marchler-Bauer
McDiarmid
McKenzie
Miklós Csűrös
Müller
Nguyen
Nielsen
Nixon
Press
Pruitt
Raible
Rogozin
Rogozin
Rosenberg
Roy
Roy
Roy
Roy
Stamatakis
Steel
Sverdlov
Sverdlov
Tatusov
Vaňácová
Zhang
Publication date: 03/02/2007

Many fundamental questions concerning the emergence and subsequent evolution of eukaryotic exon-intron organization are still unsettled. Genome-scale comparative studies, which can shed light on crucial aspects of eukaryotic evolution, require adequate computational tools. We describe novel computational methods for studying spliceosomal intron evolution. Our goal is to give a reliable characterization of the dynamics of intron evolution. Our algorithmic innovations address the identification of orthologous introns, and the likelihood-based analysis of intron data. We discuss a compression method for the evaluation of the likelihood function, which is noteworthy for phylogenetic likelihood problems in general. We prove that after $O(nL)$ preprocessing time, subsequent evaluations take $O(nL/\log L)$ time almost surely in the Yule-Harding random model of $n$-taxon phylogenies, where $L$ is the input sequence length. We illustrate the practicality of our methods by compiling and analyzing a data set involving 18 eukaryotes, more than in any other study to date. The study yields the surprising result that ancestral eukaryotes were fairly intron-rich. For example, the bilaterian ancestor is estimated to have had more than 90% as many introns as vertebrates do now.

arXiv.org e-Print Archive

Crossref

College of Saint Benedict and Saint John’s University: DigitalCommons@CSB/SJU

Preservation of information in a prebiotic package model

Author: A. Ninjenhuis
A. Oparin
Daniel A. M. M. Silvestre
H. J. Muller
J. Felsenstein
J. M. Hammersley
José F. Fontanari
M. Lynch
M. N. Barber
P. Jagers
T. Czárán
W. Feller
W. J. Ewens
Publication venue: 'American Physical Society (APS)'
Publication date: 19/04/2007

The coexistence between different informational molecules has been the preferred mode to circumvent the limitation posed by imperfect replication on the amount of information stored by each of these molecules. Here we reexamine a classic package model in which distinct information carriers or templates are forced to coexist within vesicles, which in turn can proliferate freely through binary division. The combined dynamics of vesicles and templates is described by a multitype branching process which allows us to write equations for the average number of the different types of vesicles as well as for their extinction probabilities. The threshold phenomenon associated with the extinction of the vesicle population is studied quantitatively using finite-size scaling techniques. We conclude that the resultant coexistence is too frail in the presence of parasites and so confinement of templates in vesicles without an explicit mechanism of cooperation does not resolve the information crisis of prebiotic evolution.

Comment: 9 pages, 8 figures, accepted version, to be published in PR

arXiv.org e-Print Archive

Crossref

MixtureTree: a program for constructing phylogeny

Author: Bruce G Lindsay
DF Robinson
DL Swofford
F Ronquist
J Felsenstein
J Felsenstein
J Felsenstein
J Li
JP Huelsenbeck
Michael S Rosenberg
O Harismendy
RR Hudson
S Geman
S Holmes
S Kumar
SC Chen
Shu-Chuan Chen
T Margush
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: MixtureTree v1.0 is a Linux-based program (written in C++) which implements an algorithm based on mixture models for reconstructing phylogeny from binary sequence data, such as single-nucleotide polymorphisms (SNPs). In addition to the mixture algorithm with three different optimization options, the program also implements a bootstrap procedure with majority-rule consensus. Results: The MixtureTree program, written in C++, is a Linux-based package. The User's Guide and source code will be available at http://math.asu.edu/~scchen/MixtureTree.html. Conclusions: The efficiency of the mixture algorithm is relatively higher than that of some classical methods, such as the Neighbor-Joining, Maximum Parsimony and Maximum Likelihood methods. The shortcomings of the mixture tree algorithm, for example its long running time, can be addressed by implementing other revised Expectation-Maximization (EM) algorithms instead of the traditional EM algorithm.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Asexual and sexual replication in sporulating organisms

Author: Bohyun Lee
C. O. Wilke
C. S. Potten
D. B. Dusenbery
Emmanuel Tannenbaum
G. Bell
G. C. Williams
H. J. Muller
J. Felsenstein
J. Haigh
J. Maynard-Smith
J. R. Merok
K. Negishi
L. Kelly
P. Schuster
R. E. Michod
W. Ebeling
Publication venue: 'American Physical Society (APS)'
Publication date: 18/11/2006

This paper develops models describing asexual and sexual replication in sporulating organisms. Replication via sporulation is the replication strategy for all multicellular life, and may even be observed in unicellular life (such as with budding yeast). We consider diploid populations replicating via one of two possible sporulation mechanisms: (1) Asexual sporulation, whereby adult organisms produce single-celled diploid spores that grow into adults themselves. (2) Sexual sporulation, whereby adult organisms produce single-celled diploid spores that divide into haploid gametes. The haploid gametes enter a haploid "pool", where they may recombine with other haploids to form a diploid spore that then grows into an adult. We consider a haploid fusion rate given by second-order reaction kinetics. We work with a simplified model where the diploid genome consists of only two chromosomes, each of which may be rendered defective with a single point mutation of the wild-type. We find that the asexual strategy is favored when the rate of spore production is high compared to the characteristic growth rate from a spore to a reproducing adult. Conversely, the sexual strategy is favored when the rate of spore production is low compared to the characteristic growth rate from a spore to a reproducing adult. As the characteristic growth time increases, or as the population density increases, the critical ratio of spore production rate to organism growth rate at which the asexual strategy overtakes the sexual one is pushed to higher values. Therefore, the results of this model suggest that, for complex multicellular organisms, sexual replication is favored at high population densities, and low growth and sporulation rates.

Comment: 8 pages, 5 figures, to be submitted to Journal of Theoretical Biology, figures not included in this submission

arXiv.org e-Print Archive

Crossref

Efficient FPT algorithms for (strict) compatibility of unrooted phylogenetic trees

Author: AD Gordon
AV Aho
C Scornavacca
D Bryant
D Lokshtanov
F Delsuc
J Felsenstein
M Frick
M Ng
M Steel
OR Bininda-Emonds
R Diestel
T Kloks
W Maddison
Publication date: 01/01/2016

In phylogenetics, a central problem is to infer the evolutionary relationships between a set of species $X$; these relationships are often depicted via a phylogenetic tree -- a tree having its leaves univocally labeled by elements of $X$ and without degree-2 nodes -- called the "species tree". One common approach for reconstructing a species tree consists in first constructing several phylogenetic trees from primary data (e.g. DNA sequences originating from some species in $X$), and then constructing a single phylogenetic tree maximizing the "concordance" with the input trees. The so-obtained tree is our estimation of the species tree and, when the input trees are defined on overlapping -- but not identical -- sets of labels, is called "supertree". In this paper, we focus on two problems that are central when combining phylogenetic trees into a supertree: the compatibility and the strict compatibility problems for unrooted phylogenetic trees. These problems are strongly related, respectively, to the notions of "containing as a minor" and "containing as a topological minor" in the graph community. Both problems are known to be fixed-parameter tractable in the number of input trees $k$, by using their expressibility in Monadic Second Order Logic and a reduction to graphs of bounded treewidth. Motivated by the fact that the dependency on $k$ of these algorithms is prohibitively large, we give the first explicit dynamic programming algorithms for solving these problems, both running in time $2^{O(k^2)} \cdot n$, where $n$ is the total size of the input.

Comment: 18 pages, 1 figure

arXiv.org e-Print Archive

Nocardia kroppenstedtii sp. nov., a novel actinomycete isolated from a lung transplant patient with a pulmonary infection

Author: Amanda L. Jones
Andrew. J. Fisher
Andrews
Brown-Elliott
Collins
Eoin P. Judge
Everest
Felsenstein
Felsenstein
Gonzalez
Goodfellow
Goodfellow
Goodfellow
Gordon
Hasegawa
Isik
Jannat-Khah
John D. Perry
Jukes
Kate Gould
Kim
Kimberley Boagey
Kluge
Kroppenstedt
Kämpfer
Lechevalier
Margaret M. Hannan
Michael Goodfellow
Minnikin
Minnikin
Moser
Rahul Mahida
Ros Brown
Saitou
Sazak
Staneck
Sutcliffe
Tamura
Uchida
Publication venue: 'Microbiology Society'
Publication date: 01/03/2014

An actinomycete, strain N1286T, isolated from a lung transplant patient with a pulmonary infection, was provisionally assigned to the genus Nocardia. The strain had chemotaxonomic and morphological properties typical of members of the genus Nocardia and formed a distinct phyletic line in the Nocardia 16S rRNA gene tree. It was most closely related to Nocardia farcinica DSM 43665T (99.8% gene similarity) but was distinguished from the latter by a low level of DNA:DNA relatedness. These strains were also distinguished by a broad range of phenotypic properties. On the basis of these data, it is proposed that isolate N1286T (=DSM 45810T = NCTC 13617T) should be classified as the type strain of a new Nocardia species for which the name Nocardia kroppenstedtii is proposed.

Northumbria University Research Portal

Crossref

University of Birmingham Research Portal

Fast computation of distance estimators

Author: A Rambaut
D Swofford
F Barker
H Kishino
I Elias
Isaac Elias
J Felsenstein
J Felsenstein
Jens Lagergren
K Tamura
K Tuplin
L Arvestad
M Kimura
N Saitou
T Jukes
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Some distance methods are among the most commonly used methods for reconstructing phylogenetic trees from sequence data. The input to a distance method is a distance matrix, containing estimated pairwise distances between all pairs of taxa. Distance methods themselves are often fast, e.g., the famous and popular Neighbor Joining (NJ) algorithm reconstructs a phylogeny of n taxa in time O(n^3). Unfortunately, the fastest practical algorithms known for computing the distance matrix, from n sequences of length l, take time proportional to l·n^2. Since the sequence length typically is much larger than the number of taxa, the distance estimation is the bottleneck in phylogeny reconstruction. This bottleneck is especially apparent in reconstruction of large phylogenies or in applications where many trees have to be reconstructed, e.g., bootstrapping and genome-wide applications. RESULTS: We give an advanced algorithm for computing the number of mutational events between DNA sequences which is significantly faster than both Phylip and Paup. Moreover, we give a new method for estimating pairwise distances between sequences which contain ambiguity symbols. This new method is shown to be more accurate as well as faster than earlier methods. CONCLUSION: Our novel algorithm for computing distance estimators provides a valuable tool in phylogeny reconstruction. Since the running time of our distance estimation algorithm is comparable to that of most distance methods, the previous bottleneck is removed. All distance methods, such as NJ, require a distance matrix as input and, hence, our novel algorithm significantly improves the overall running time of all distance methods. In particular, we show for real-world biological applications how the running time of phylogeny reconstruction using NJ is improved from a matter of hours to a matter of seconds.
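To make the l·n^2 cost in this abstract concrete, here is a minimal sketch of the straightforward baseline: every pair of sequences is scanned site by site and the mismatch proportion is converted to a distance. It is not the paper's faster algorithm. The Jukes-Cantor correction, the function names, and the choice to skip sites holding gaps or ambiguity symbols are all assumptions made for illustration.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, ignoring sites that are not A, C, G or T.

    Skipping gaps and ambiguity symbols is a simplification assumed here;
    it is not the ambiguity-aware estimator described in the abstract.
    """
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def jc69_distance(a, b):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3)."""
    p = p_distance(a, b)
    if p >= 0.75:
        return math.inf  # saturated: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(seqs):
    """Naive pairwise matrix: n(n-1)/2 pairs, each scanned in O(l) time."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = jc69_distance(seqs[i], seqs[j])
    return d


if __name__ == "__main__":
    aligned = ["ACGTACGT", "ACGTACGA", "ACGTTCGA"]
    for row in distance_matrix(aligned):
        print([round(v, 4) for v in row])
```

The nested structure (all pairs, then all sites) is where the l·n^2 term comes from; a matrix built this way can be passed to any distance method such as NJ.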

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central