Comparing penalization methods for linear models on large observational health data
Objective: This study evaluates regularization variants in logistic regression (L1, L2, ElasticNet, Adaptive L1, Adaptive ElasticNet, Broken adaptive ridge [BAR], and Iterative hard thresholding [IHT]) for discrimination and calibration performance, focusing on both internal and external validation. Materials and Methods: We use data from 5 US claims and electronic health record databases and develop models for various outcomes in a major depressive disorder patient population. We externally validate all models in the other databases. We use a train-test split of 75%/25% and evaluate performance with discrimination and calibration. Statistical analysis for difference in performance uses Friedman's test and critical difference diagrams. Results: Of the 840 models we develop, L1 and ElasticNet emerge as superior in both internal and external discrimination, with a notable AUC difference. BAR and IHT show the best internal calibration, without a clear external calibration leader. ElasticNet typically has larger model sizes than L1. Methods like IHT and BAR, while slightly less discriminative, significantly reduce model complexity. Conclusion: L1 and ElasticNet offer the best discriminative performance in logistic regression for healthcare predictions, maintaining robustness across validations. For simpler, more interpretable models, L0-based methods (IHT and BAR) are advantageous, providing greater parsimony and calibration with fewer features. This study aids in selecting suitable regularization techniques for healthcare prediction models, balancing performance, complexity, and interpretability.
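As a rough illustration of the comparison described above (not the study's actual pipeline, which runs on claims/EHR databases), the sketch below fits L1- and elastic-net-penalised logistic regression on synthetic data with a 75%/25% split and reports AUC and model size; all data, regularization strengths, and settings here are invented:

```python
# Sketch: comparing L1 vs elastic-net penalised logistic regression.
# Synthetic data stands in for the study's observational health data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=50, n_informative=10,
                           random_state=0)
# 75%/25% train-test split, as in the study design
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "L1": LogisticRegression(penalty="l1", solver="saga", C=0.1, max_iter=5000),
    "ElasticNet": LogisticRegression(penalty="elasticnet", solver="saga",
                                     l1_ratio=0.5, C=0.1, max_iter=5000),
}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, m.predict_proba(X_te)[:, 1])
    size = np.count_nonzero(m.coef_)  # model size = non-zero coefficients
    print(f"{name}: AUC={auc:.3f}, non-zero coefficients={size}")
```

Elastic-net mixes the L1 and L2 penalties (here `l1_ratio=0.5`), which tends to keep more correlated features than pure L1, consistent with the larger model sizes reported above.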
Distinguishing regional from within-codon rate heterogeneity in DNA sequence alignments
We present an improved phylogenetic factorial hidden Markov model (FHMM) for detecting two types of mosaic structures in DNA sequence alignments, related to (1) recombination and (2) rate heterogeneity. The focus of the present work is on improving the modelling of the latter aspect. Earlier papers have modelled different degrees of rate heterogeneity with separate hidden states of the FHMM. This approach fails to appreciate the intrinsic difference between two types of rate heterogeneity: long-range regional effects, which are potentially related to differences in the selective pressure, and the short-term periodic patterns within the codons, which merely capture the signature of the genetic code. We propose an improved model that explicitly distinguishes between these two effects, and we assess its performance on a set of simulated DNA sequence alignments.
Fully Bayesian tests of neutrality using genealogical summary statistics
Background: Many data summary statistics have been developed to detect departures from neutral expectations of evolutionary models. However, questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequently, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome. Results: Here we harness posterior predictive simulation to exploit summary statistics of both the data and model parameters to test the goodness-of-fit of standard models of evolution. We apply the method to test the selective neutrality of molecular evolution in non-recombining gene genealogies and we demonstrate the utility of our method on four real data sets, identifying significant departures of neutrality in human influenza A virus, even after controlling for variation in population size. Conclusion: Importantly, by employing a full model-based Bayesian analysis, our method separates the effects of demography from the effects of selection. The method also allows multiple summary statistics to be used in concert, thus potentially increasing sensitivity. Furthermore, our method remains useful in situations where analytical expectations and variances of summary statistics are not available. This aspect has great potential for the analysis of temporally spaced data, an expanding area previously ignored for limited availability of theory and methods.
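The posterior predictive logic described here can be sketched generically (this is not the authors' software; the Poisson model, summary statistic, and approximate posterior below are invented stand-ins): draw parameters from the posterior, simulate replicate data sets, compute a summary statistic for each, and compare the observed statistic with the predictive distribution via a tail probability.

```python
# Sketch of posterior predictive simulation with a data summary statistic.
import numpy as np

rng = np.random.default_rng(0)
observed = rng.poisson(9.0, 40)       # stand-in "observed" data
obs_stat = observed.var()             # summary statistic of the data

# Stand-in posterior for the Poisson rate (conjugate-style gamma posterior)
posterior_rates = rng.gamma(observed.sum() + 1, 1 / (len(observed) + 1), 2000)

# One replicate data set per posterior draw, summarised by the same statistic
pred_stats = np.array([rng.poisson(lam, 40).var() for lam in posterior_rates])
ppp = (pred_stats >= obs_stat).mean()  # posterior predictive p-value
print(f"posterior predictive p = {ppp:.3f}")
```

An extreme tail probability (near 0 or 1) would flag a lack of fit; because the parameters are drawn from the full posterior, demographic uncertainty is integrated out rather than fixed, which is the separation of demography from selection the abstract emphasises.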
Determinants of dengue virus dispersal in the Americas
Dengue viruses (DENVs) are classified into four serotypes, each of which contains multiple genotypes. DENV genotypes introduced into the Americas over the past five decades have exhibited different rates and patterns of spatial dispersal. In order to understand factors underlying these patterns, we utilized a statistical framework that allows for the integration of ecological, socioeconomic, and air transport mobility data as predictors of viral diffusion while inferring the phylogeographic history. Predictors describing spatial diffusion based on several covariates were compared using a generalized linear model approach, where the support for each scenario and its contribution is estimated simultaneously from the data set. Although different predictors were identified for different serotypes, our analysis suggests that overall diffusion of DENV-1, -2, and -3 in the Americas was associated with airline traffic. The other significant predictors included human population size, the geographical distance between countries and between urban centers, and the density of people living in urban environments.
Unifying the spatial epidemiology and molecular evolution of emerging epidemics
We introduce a conceptual bridge between the previously unlinked fields of phylogenetics and mathematical spatial ecology, which enables the spatial parameters of an emerging epidemic to be directly estimated from sampled pathogen genome sequences. By using phylogenetic history to correct for spatial autocorrelation, we illustrate how a fundamental spatial variable, the diffusion coefficient, can be estimated using robust nonparametric statistics, and how heterogeneity in dispersal can be readily quantified. We apply this framework to the spread of the West Nile virus across North America, an important recent instance of spatial invasion by an emerging infectious disease. We demonstrate that the dispersal of West Nile virus is greater and far more variable than previously measured, such that its dissemination was critically determined by rare, long-range movements that are unlikely to be discerned during field observations. Our results indicate that, by ignoring this heterogeneity, previous models of the epidemic have substantially overestimated its basic reproductive number. More generally, our approach demonstrates that easily obtainable genetic data can be used to measure the spatial dynamics of natural populations that are otherwise difficult or costly to quantify.
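The basic estimator behind this idea can be sketched on simulated data (a stand-in for phylogenetically reconstructed branch displacements; the true diffusion coefficient, units, and summaries below are illustrative assumptions, not the paper's values). For 2-D Brownian motion, a branch of duration t with displacement d gives the per-branch estimate D = d^2 / (4t):

```python
# Sketch: per-branch diffusion coefficient estimates from displacements,
# with the spread across branches quantifying dispersal heterogeneity.
import numpy as np

rng = np.random.default_rng(1)
true_D = 100.0                        # illustrative units, e.g. km^2 / year
t = rng.uniform(0.1, 2.0, 500)        # branch durations (years)
# 2-D Brownian displacement: each coordinate has variance 2 * D * t
dx = rng.normal(0.0, np.sqrt(2 * true_D * t))
dy = rng.normal(0.0, np.sqrt(2 * true_D * t))

per_branch_D = (dx**2 + dy**2) / (4 * t)  # unbiased per-branch estimates
D_hat = per_branch_D.mean()
spread = per_branch_D.std() / D_hat       # heterogeneity in dispersal
print(f"estimated D = {D_hat:.1f}, coefficient of variation = {spread:.2f}")
```

In real data the per-branch estimates are far more heavy-tailed than in this homogeneous simulation; it is exactly that excess spread, driven by rare long-range movements, that the abstract reports for West Nile virus.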
TreeFlow: probabilistic programming and automatic differentiation for phylogenetics
Probabilistic programming frameworks are powerful tools for statistical modelling and inference. They are not immediately generalisable to phylogenetic problems due to the particular computational properties of the phylogenetic tree object. TreeFlow is a software library for probabilistic programming and automatic differentiation with phylogenetic trees. It implements inference algorithms for phylogenetic tree times and model parameters, given a tree topology. We demonstrate how TreeFlow can be used to quickly implement and assess new models. We also show that it provides reasonable performance for gradient-based inference algorithms compared to specialized computational libraries for phylogenetics. The data processing pipeline can be found at https://github.com/christiaanjs/treeflow-paper
Tree topologies are inferred using RAxML 8.2.12
Tree topologies are rooted using LSD 0.2
BEAST analyses are performed using BEAST 2.6.7
Variational inference analyses are performed using TreeFlow 0.0.1beta
Sequences have been removed from the H3N2 BEAST XML as a result of license conflicts. The complete version of this file is generated by the above pipeline.
Funding provided by: University of Auckland
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001537
Award Number:
Carnivores sequence alignment accessed from the benchmark in BEAST examples
H3N2 sequence alignment taken from Vaughan TG, Kühnert D, Popinga A, Welch D, Drummond AJ. Efficient Bayesian inference under the structured coalescent. Bioinformatics. 2014 Aug 15;30(16):2272-9. doi: 10.1093/bioinformatics/btu20
Leptospira interrogans Endostatin-Like Outer Membrane Proteins Bind Host Fibronectin, Laminin and Regulators of Complement
The pathogenic spirochete Leptospira interrogans disseminates throughout its hosts via the bloodstream, then invades and colonizes a variety of host tissues. Infectious leptospires are resistant to killing by their hosts' alternative pathway of complement-mediated killing, and interact with various host extracellular matrix (ECM) components. The LenA outer surface protein (formerly called LfhA and Lsa24) was previously shown to bind the host ECM component laminin and the complement regulators factor H and factor H-related protein-1. We now demonstrate that infectious L. interrogans contain five additional paralogs of lenA, which we designated lenB, lenC, lenD, lenE and lenF. All six genes encode domains predicted to bear structural and functional similarities with mammalian endostatins. Sequence analyses of genes from seven infectious L. interrogans serovars indicated development of sequence diversity through recombination and intragenic duplication. LenB was found to bind human factor H, and all of the newly-described Len proteins bound laminin. In addition, LenB, LenC, LenD, LenE and LenF all exhibited affinities for fibronectin, a distinct host extracellular matrix protein. These characteristics suggest that Len proteins together facilitate invasion and colonization of host tissues, and protect against host immune responses during mammalian infection.
An Adaptive Interacting Wang-Landau Algorithm for Automatic Density Exploration
While statisticians are well-accustomed to performing exploratory analysis in the modeling stage of an analysis, the notion of conducting preliminary general-purpose exploratory analysis in the Monte Carlo stage (or, more generally, the model-fitting stage) of an analysis is an area which we feel deserves much further attention. Towards this aim, this paper proposes a general-purpose algorithm for automatic density exploration. The proposed exploration algorithm combines and expands upon components from various adaptive Markov chain Monte Carlo methods, with the Wang-Landau algorithm at its heart. Additionally, the algorithm is run on interacting parallel chains, a feature which both decreases computational cost and stabilizes the algorithm, improving its ability to explore the density. Performance is studied in several applications. Through a Bayesian variable selection example, the authors demonstrate the convergence gains obtained with interacting chains. The ability of the algorithm's adaptive proposal to induce mode-jumping is illustrated through a trimodal density and a Bayesian mixture modeling application. Lastly, through a 2D Ising model, the authors demonstrate the ability of the algorithm to overcome the high correlations encountered in spatial models.
Comment: 33 pages, 20 figures (the supplementary materials are included as appendices)
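The Wang-Landau update at the heart of the proposed algorithm can be sketched in isolation (single chain, fixed proposal, none of the paper's interaction or adaptation): a toy 10-site binary system whose exact density of states over the energy E = number of "up" sites is the binomial coefficient C(10, E). The flatness threshold and modification-factor schedule below are conventional choices, not the paper's.

```python
# Sketch of the core Wang-Landau flat-histogram update.
import math
import random

random.seed(0)
N = 10
state = [0] * N
E = 0                              # energy = number of up sites
ln_g = [0.0] * (N + 1)             # running log density-of-states estimate
hist = [0] * (N + 1)               # visit histogram over energy levels
ln_f = 1.0                         # modification factor, reduced over time

while ln_f > 1e-4:
    for _ in range(20000):
        i = random.randrange(N)    # propose flipping one site
        E_new = E + (1 if state[i] == 0 else -1)
        # accept with probability min(1, g(E)/g(E_new)): penalise
        # well-visited energies so the walk flattens in E
        if math.log(random.random()) < ln_g[E] - ln_g[E_new]:
            state[i] ^= 1
            E = E_new
        ln_g[E] += ln_f            # boost the current energy's estimate
        hist[E] += 1
    # once the histogram is roughly flat, refine: reset and shrink ln_f
    if min(hist) > 0.8 * (sum(hist) / len(hist)):
        hist = [0] * (N + 1)
        ln_f /= 2.0

# normalise so g(0) = 1 and compare with the exact answer C(10, E)
est = [math.exp(lg - ln_g[0]) for lg in ln_g]
for e in range(N + 1):
    print(e, round(est[e], 1), math.comb(N, e))
```

The bias of each estimate shrinks with ln_f, which is why the schedule keeps halving it; the paper's contribution is to run many such walkers in parallel and let them share information, which this single-chain sketch omits.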
Sequence-based prediction for vaccine strain selection and identification of antigenic variability in foot-and-mouth disease virus
Identifying when past exposure to an infectious disease will protect against newly emerging strains is central to understanding the spread and the severity of epidemics, but the prediction of viral cross-protection remains an important unsolved problem. For foot-and-mouth disease virus (FMDV) research in particular, improved methods for predicting this cross-protection are critical for predicting the severity of outbreaks within endemic settings where multiple serotypes and subtypes commonly co-circulate, as well as for deciding whether appropriate vaccine(s) exist and how much they could mitigate the effects of any outbreak. To identify antigenic relationships and their predictors, we used linear mixed effects models to account for variation in pairwise cross-neutralization titres using only viral sequences and structural data. We identified those substitutions in surface-exposed structural proteins that are correlates of loss of cross-reactivity. These allowed prediction of both the best vaccine match for any single virus and the breadth of coverage of new vaccine candidates from their capsid sequences as effectively as or better than serology. Sub-sequences chosen by the model-building process all contained sites that are known epitopes on other serotypes. Furthermore, for the SAT1 serotype, for which epitopes have never previously been identified, we provide strong evidence - by controlling for phylogenetic structure - for the presence of three epitopes across a panel of viruses and quantify the relative significance of some individual residues in determining cross-neutralization. Identifying and quantifying the importance of sites that predict viral strain cross-reactivity not just for single viruses but across entire serotypes can help in the design of vaccines with better targeting and broader coverage. These techniques can be generalized to any infectious agents where cross-reactivity assays have been carried out. 
As the parameterization uses pre-existing datasets, this approach quickly and cheaply increases both our understanding of antigenic relationships and our power to control disease.
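The modelling idea (linear mixed effects models relating cross-neutralization titres to candidate substitutions) can be sketched on synthetic data; the site names, effect sizes, and random-intercept structure below are invented for illustration, not taken from the study:

```python
# Sketch: mixed-effects model of (synthetic) log2 titres on amino-acid
# mismatches at candidate surface-exposed sites, with a random intercept
# per challenge virus to absorb virus-level reactivity differences.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "virus": rng.integers(0, 20, n).astype(str),  # challenge virus id
    "site_A": rng.integers(0, 2, n),              # 1 = mismatch at site A
    "site_B": rng.integers(0, 2, n),              # 1 = mismatch at site B
})
virus_effect = {v: rng.normal(0, 0.5) for v in df["virus"].unique()}
df["log_titre"] = (6.0 - 1.2 * df["site_A"] - 0.6 * df["site_B"]
                   + df["virus"].map(virus_effect) + rng.normal(0, 0.3, n))

# Mismatches at antigenically important sites should carry negative
# fixed-effect estimates (loss of cross-reactivity).
fit = smf.mixedlm("log_titre ~ site_A + site_B", df, groups=df["virus"]).fit()
print(fit.params[["site_A", "site_B"]])
```

In the study, model selection over such site terms is what identifies which substitutions predict loss of cross-reactivity; predicted titres for a candidate vaccine against a panel of field strains then give its expected breadth of coverage.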