
9 research outputs found

Phylogenetic Stochastic Mapping without Matrix Exponentiation

Author: Irvahn Jan
Minin Vladimir N.
Publication date: 20/03/2014

Phylogenetic stochastic mapping is a method for reconstructing the history of trait changes on a phylogenetic tree relating species/organisms carrying the trait. State-of-the-art methods assume that the trait evolves according to a continuous-time Markov chain (CTMC) and work well for small state spaces. The computations slow down considerably for larger state spaces (e.g., the space of codons), because current methodology relies on exponentiating CTMC infinitesimal rate matrices, an operation whose computational complexity grows as the size of the CTMC state space cubed. In this work, we introduce a new approach, based on a CTMC technique called uniformization, that does not use matrix exponentiation for phylogenetic stochastic mapping. Our method is based on a new Markov chain Monte Carlo (MCMC) algorithm that targets the distribution of trait histories conditional on the trait data observed at the tips of the tree. The computational complexity of our MCMC method grows as the size of the CTMC state space squared. Moreover, in contrast to competing matrix exponentiation methods, if the rate matrix is sparse, we can leverage this sparsity and increase the computational efficiency of our algorithm further. Using simulated data, we illustrate advantages of our MCMC algorithm and investigate how large the state space needs to be for our method to outperform matrix exponentiation approaches. We show that even on the moderately large state space of codons our MCMC method can be significantly faster than currently used matrix exponentiation methods.

Comment: 33 pages, including appendices
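The abstract above names uniformization as the technique that avoids matrix exponentiation. As a rough illustration only (this is not the authors' MCMC sampler), the standard uniformization identity writes the transition probabilities as a Poisson-weighted sum of powers of the stochastic matrix B = I + Q/mu, where mu is at least the largest exit rate. The sketch below computes one row of the transition matrix using only vector-matrix products, each costing on the order of the state-space size squared. The function name `uniformized_row`, the tolerance, and the two-state rate matrix are assumptions made for this example.

```python
import math


def uniformized_row(Q, start, t, tol=1e-12, max_terms=10_000):
    """Approximate row `start` of exp(Q * t) by uniformization.

    Q is a CTMC rate matrix given as a list of lists (rows sum to zero).
    No matrix exponential is formed; only vector-matrix products are used.
    """
    n = len(Q)
    mu = max(-Q[i][i] for i in range(n))  # uniformization rate (largest exit rate)
    row = [1.0 if j == start else 0.0 for j in range(n)]
    if mu == 0.0 or t == 0.0:
        return row
    # Transition matrix of the embedded discrete-time chain: B = I + Q / mu
    B = [[(1.0 if i == j else 0.0) + Q[i][j] / mu for j in range(n)]
         for i in range(n)]
    weight = math.exp(-mu * t)   # Poisson(k; mu * t) at k = 0
    total = weight               # cumulative Poisson mass included so far
    result = [weight * x for x in row]
    for k in range(1, max_terms + 1):
        if 1.0 - total <= tol:   # remaining Poisson mass bounds the truncation error
            break
        # One vector-matrix product: row <- row * B
        row = [sum(row[i] * B[i][j] for i in range(n)) for j in range(n)]
        weight *= mu * t / k     # update the Poisson weight recursively
        total += weight
        result = [r + weight * x for r, x in zip(result, row)]
    return result


# Hypothetical two-state rate matrix, used only for illustration.
Q = [[-1.0, 1.0],
     [2.0, -2.0]]
print(uniformized_row(Q, start=0, t=0.5))
```

For a two-state chain the result can be checked against the closed-form transition probability. When the rate matrix is sparse, the vector-matrix product in the loop is the step that would benefit from a sparse representation.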

arXiv.org e-Print Archive

Crossref

PubMed Central

eScholarship - University of California

rbrothers: R Package for Bayesian Multiple Change-Point Recombination Detection.

Author: Chattopadhyay Sujay
Irvahn Jan
Minin Vladimir N
Sokurenko Evgeni V
Publication venue: eScholarship, University of California
Publication date: 01/01/2013

Phylogenetic recombination detection is a fundamental task in bioinformatics and evolutionary biology. Most of the computational tools developed to attack this important problem are not integrated into the growing suite of R packages for statistical analysis of molecular sequences. Here, we present an R package, rbrothers, that makes a Bayesian multiple change-point model, one of the most sophisticated model-based phylogenetic recombination tools, available to R users. Moreover, we equip the Bayesian change-point model with a set of pre- and post-processing routines that will broaden the application domain of this recombination detection framework. Specifically, we implement an algorithm that forms the set of input trees required by multiple change-point models. We also provide functionality for checking Markov chain Monte Carlo convergence and creating estimation result summaries and graphics. Using rbrothers, we perform a comparative analysis of two Salmonella enterica genes, fimA and fimH, that encode major and adhesive subunits of the type 1 fimbriae, respectively. We believe that rbrothers, available at R-Forge: http://evolmod.r-forge.r-project.org/, will allow researchers to incorporate recombination detection into phylogenetic workflows already implemented in R.

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Phylogenetic Stochastic Mapping

Author: Irvahn Jan
Publication date: 01/12/2015

Thesis (Ph.D.)--University of Washington, 2015-12. Phylogenetic stochastic mapping is a method for reconstructing the history of trait changes on a phylogenetic tree relating species/organisms carrying the trait. State-of-the-art methods assume that the trait evolves according to a continuous-time Markov chain (CTMC) and work well for small state spaces. The computations slow down considerably for larger state spaces (e.g., the space of codons), because current methodology relies on exponentiating CTMC infinitesimal rate matrices, an operation whose computational complexity grows as the size of the CTMC state space cubed. In this work, we introduce a new approach, based on a CTMC technique called uniformization, that does not use matrix exponentiation for phylogenetic stochastic mapping. Our method is based on a new Markov chain Monte Carlo (MCMC) algorithm that targets the distribution of trait histories conditional on the trait data observed at the tips of the tree. The computational complexity of our MCMC method grows as the size of the CTMC state space squared. Moreover, in contrast to competing matrix exponentiation methods, if the rate matrix is sparse, we can leverage this sparsity and increase the computational efficiency of our algorithm further. Using simulated data, we illustrate advantages of our MCMC algorithm and investigate how large the state space needs to be for our method to outperform matrix exponentiation approaches. We show that even on the moderately large state space of codons our MCMC method can be significantly faster than currently used matrix exponentiation methods. We apply our new stochastic mapping technique to two data sets. The first concerns the reproductive parity mode of squamates, and the second concerns the evolution of bioluminescent bacterial photophores in cephalopods.
In both cases there were concerns that the standard CTMC model of trait evolution for the binary morphological traits was insufficient due to rate matrix heterogeneity across the phylogeny. To address these concerns we developed a Markov-modulated Markov process model of trait evolution and integrated this hidden rates model with our matrix-exponentiation-free stochastic mapping technique. We found that the evidence supporting multiple gains of bioluminescence in cephalopods was mildly attenuated by accounting for potential rate matrix heterogeneity. Conversely, we found that accounting for rate matrix heterogeneity on the squamate phylogeny dramatically changed conclusions about the reproductive parity mode of the most recent common ancestor of squamates. The standard two-state CTMC model of trait evolution found insufficient evidence to distinguish between oviparity and viviparity at the root of Squamata, while a variety of hidden rates models found strong evidence that the most recent common ancestor of squamates was oviparous.

DSpace at The University of Washington

rbrothers: R Package for Bayesian Multiple Change-Point Recombination Detection.

Author: Irvahn Jan
Publication date: 16/05/2020

Ezid

Phylogenetic stochastic mapping without matrix exponentiation.

Author: Irvahn Jan
Publication date: 25/05/2018

Ezid


Phylogenetic stochastic mapping without matrix exponentiation.

Author: Irvahn Jan
Minin Vladimir N
Publication venue: eScholarship, University of California
Publication date: 01/09/2014

Phylogenetic stochastic mapping is a method for reconstructing the history of trait changes on a phylogenetic tree relating species/organisms carrying the trait. State-of-the-art methods assume that the trait evolves according to a continuous-time Markov chain (CTMC) and work well for small state spaces. The computations slow down considerably for larger state spaces (e.g., the space of codons), because current methodology relies on exponentiating CTMC infinitesimal rate matrices, an operation whose computational complexity grows as the size of the CTMC state space cubed. In this work, we introduce a new approach, based on a CTMC technique called uniformization, which does not use matrix exponentiation for phylogenetic stochastic mapping. Our method is based on a new Markov chain Monte Carlo (MCMC) algorithm that targets the distribution of trait histories conditional on the trait data observed at the tips of the tree. The computational complexity of our MCMC method grows as the size of the CTMC state space squared. Moreover, in contrast to competing matrix exponentiation methods, if the rate matrix is sparse, we can leverage this sparsity and increase the computational efficiency of our algorithm further. Using simulated data, we illustrate advantages of our MCMC algorithm and investigate how large the state space needs to be for our method to outperform matrix exponentiation approaches. We show that even on the moderately large state space of codons our MCMC method can be significantly faster than currently used matrix exponentiation methods.

eScholarship - University of California

rbrothers: R Package for Bayesian Multiple Change-Point Recombination Detection

Author: Chattopadhyay Sujay
Irvahn Jan
Minin Vladimir N.
Sokurenko Evgeni V.
Publication venue: 'SAGE Publications'
Publication date: 01/01/2013

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Phylogenetic Stochastic Mapping Without Matrix Exponentiation

Author: Jan Irvahn
Vladimir N. Minin
Publication venue: 'Mary Ann Liebert Inc'

Crossref

A Review of Imputation Strategies for Isobaric Labeling-Based Shotgun Proteomics

Author: Bobbie-Jo M. Webb-Robertson
Jan Irvahn
Karin D. Rodland
Lisa M. Bramer
Paul D. Piehowski
Publication venue: 'American Chemical Society (ACS)'

Crossref