532 research outputs found

Distinguishing regional from within-codon rate heterogeneity in DNA sequence alignments

Author: A. Gelman
A. Webb
C. Tuffley
D. Husmeier
G. Casella
J. Felsenstein
M. Hasegawa
M.A. Suchard
R.J. Boys
V.N. Minin
W.K. Hastings
W.P. Lehrach
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

We present an improved phylogenetic factorial hidden Markov model (FHMM) for detecting two types of mosaic structures in DNA sequence alignments, related to (1) recombination and (2) rate heterogeneity. The focus of the present work is on improving the modelling of the latter aspect. Earlier papers have modelled different degrees of rate heterogeneity with separate hidden states of the FHMM. This approach fails to appreciate the intrinsic difference between two types of rate heterogeneity: long-range regional effects, which are potentially related to differences in the selective pressure, and the short-term periodic patterns within the codons, which merely capture the signature of the genetic code. We propose an improved model that explicitly distinguishes between these two effects, and we assess its performance on a set of simulated DNA sequence alignments.

Crossref

University of Strathclyde Institutional Repository

Enlighten

Fully Bayesian tests of neutrality using genealogical summary statistics

Author: A Drummond
A Eyre-Walker
A Gelman
A McKenzie
A O'Hagan
AJ Drummond
Alexei J Drummond
B Grenfell
C Edwards
C Strobeck
D Aldous
D Colless
D Rubin
DJ Begun
F Tajima
G Box
G McVean
H Innan
H Li
I Barnes
J Avise
J Bollback
J Fay
J Kingman
J McDonald
JK Kelly
K Lange
K Zlateva
M Hasegawa
M Kirkpatrick
M Newton
M Przeworski
M Slatkin
M Suchard
Marc A Suchard
MW Hahn
N Ferguson
N Metropolis
P Haddrill
R Hudson
R Kass
R Nielsen
S Bennett
S Mousset
S Ramos-Onsins
S Williamson
W Fitch
W Hastings
XL Meng
Y Benjamini
Y Fu
YX Fu
Z Yang
Publication venue: BioMed Central
Publication date: 01/10/2008

Background: Many data summary statistics have been developed to detect departures from neutral expectations of evolutionary models. However, questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequently, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome. Results: Here we harness posterior predictive simulation to exploit summary statistics of both the data and model parameters to test the goodness-of-fit of standard models of evolution. We apply the method to test the selective neutrality of molecular evolution in non-recombining gene genealogies and we demonstrate the utility of our method on four real data sets, identifying significant departures from neutrality in human influenza A virus, even after controlling for variation in population size. Conclusion: Importantly, by employing a full model-based Bayesian analysis, our method separates the effects of demography from the effects of selection. The method also allows multiple summary statistics to be used in concert, thus potentially increasing sensitivity. Furthermore, our method remains useful in situations where analytical expectations and variances of summary statistics are not available. This aspect has great potential for the analysis of temporally spaced data, an expanding area previously ignored for limited availability of theory and methods.
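
The posterior predictive check described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the one-sided tail convention, and the toy replicate values are all assumptions made here for illustration. The idea shown is only the final comparison step, where a summary statistic computed on the observed data is ranked against the same statistic computed on data sets simulated from the posterior.

```python
from typing import Sequence


def posterior_predictive_p_value(observed: float, replicated: Sequence[float]) -> float:
    """Return the fraction of replicated statistics at least as large as the observed one.

    `replicated` holds the summary statistic computed on data sets simulated
    from posterior draws (hypothetical input; how they are simulated is
    outside the scope of this sketch).
    """
    if not replicated:
        raise ValueError("need at least one replicated statistic")
    exceed = sum(1 for r in replicated if r >= observed)
    return exceed / len(replicated)


# Toy values, invented for illustration only.
replicates = [0.1, 0.4, -0.3, 1.2, 0.8, -0.9, 0.0, 0.5]
print(posterior_predictive_p_value(0.8, replicates))  # 0.25
```

A value close to 0 or 1 would suggest that the fitted model rarely reproduces the observed statistic; in practice far more replicates than the eight used here would be needed.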

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Determinants of dengue virus dispersal in the Americas

Author: Allicock Orchid M
Auguste Albert J
Carrington Christine V F
Lemey Philippe
Rambaut Andrew
Sahadeo Nikita
Suchard Marc A.
Publication venue: Oxford University Press (OUP)
Publication date: 01/07/2020

Dengue viruses (DENVs) are classified into four serotypes, each of which contains multiple genotypes. DENV genotypes introduced into the Americas over the past five decades have exhibited different rates and patterns of spatial dispersal. To understand the factors underlying these patterns, we utilized a statistical framework that allows for the integration of ecological, socioeconomic, and air transport mobility data as predictors of viral diffusion while inferring the phylogeographic history. Predictors describing spatial diffusion based on several covariates were compared using a generalized linear model approach, where the support for each scenario and its contribution is estimated simultaneously from the data set. Although different predictors were identified for different serotypes, our analysis suggests that overall diffusion of DENV-1, -2, and -3 in the Americas was associated with airline traffic. The other significant predictors included human population size, the geographical distance between countries and between urban centers, and the density of people living in urban environments.

Edinburgh Research Explorer

eScholarship - University of California

Unifying the spatial epidemiology and molecular evolution of emerging epidemics

Author: A. Rambaut
Bourhy
Bowman
Busch
Cruz-Pacheco
Drummond
E. L. Delwart
F. J. Bernardin
F. W. Crawford
Fitch
Grenfell
LaDeau
Lanciotti
Lewis
Liu
M. A. Suchard
M. P. Busch
Magori
Maidana
Melbourne
Mundt
Murray
N. Arinaminpathy
Noble
O. G. Pybus
P. Lemey
Pybus
R. R. Gray
Rappole
Reed
S. L. Stramer
SKELLAM
Suchard
Wonham
Yiannakoulias
Publication venue: Proceedings of the National Academy of Sciences
Publication date: 01/01/2012

We introduce a conceptual bridge between the previously unlinked fields of phylogenetics and mathematical spatial ecology, which enables the spatial parameters of an emerging epidemic to be directly estimated from sampled pathogen genome sequences. By using phylogenetic history to correct for spatial autocorrelation, we illustrate how a fundamental spatial variable, the diffusion coefficient, can be estimated using robust nonparametric statistics, and how heterogeneity in dispersal can be readily quantified. We apply this framework to the spread of the West Nile virus across North America, an important recent instance of spatial invasion by an emerging infectious disease. We demonstrate that the dispersal of West Nile virus is greater and far more variable than previously measured, such that its dissemination was critically determined by rare, long-range movements that are unlikely to be discerned during field observations. Our results indicate that, by ignoring this heterogeneity, previous models of the epidemic have substantially overestimated its basic reproductive number. More generally, our approach demonstrates that easily obtainable genetic data can be used to measure the spatial dynamics of natural populations that are otherwise difficult or costly to quantify.
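
As a rough illustration of estimating a diffusion coefficient from phylogenetic branches, the sketch below uses the standard two-dimensional Brownian relation E[r^2] = 4Dt to compute one coefficient per branch and then summarizes them with a median. This is an assumption-laden toy, not the estimator used in the paper: the input layout (displacements in km, durations in years), the function name, and the choice of the median as the robust summary are all hypothetical.

```python
from statistics import median
from typing import Iterable, List, Tuple


def branch_diffusion_coefficients(
    branches: Iterable[Tuple[float, float, float]]
) -> List[float]:
    """Per-branch diffusion coefficients from (dx_km, dy_km, duration_years).

    Uses D = (dx^2 + dy^2) / (4 * t), which follows from E[r^2] = 4 D t
    for two-dimensional Brownian motion.
    """
    coeffs = []
    for dx, dy, t in branches:
        if t <= 0:
            raise ValueError("branch duration must be positive")
        coeffs.append((dx * dx + dy * dy) / (4.0 * t))
    return coeffs


# Invented branch data: (east-west km, north-south km, years).
toy_branches = [(3.0, 4.0, 1.0), (6.0, 8.0, 2.0), (0.0, 10.0, 5.0)]
coeffs = branch_diffusion_coefficients(toy_branches)
summary = median(coeffs)
print(coeffs)   # [6.25, 12.5, 5.0]
print(summary)  # 6.25 (km^2 per year)
```

The spread of the per-branch values gives a crude view of dispersal heterogeneity; a median is used here only because it is insensitive to a few very long-range branches.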

Lirias

Crossref

PubMed Central

Edinburgh Research Explorer

Oxford University Research Archive

Leptospira interrogans Endostatin-Like Outer Membrane Proteins Bind Host Fibronectin, Laminin and Regulators of Complement

Author: Brissette Catherine A.
Choy Henry A.
Cooley Anne E.
Creamer Trevor P.
DeMoll Edward
Haake David A.
Kraiczy Peter
Miller M. Clarke
Pinne Marija
Rotondi Matthew L.
Stevenson Brian
Suchard Marc A.
Verma Ashutosh
Publication venue: Public Library of Science
Publication date: 01/11/2007

The pathogenic spirochete Leptospira interrogans disseminates throughout its hosts via the bloodstream, then invades and colonizes a variety of host tissues. Infectious leptospires are resistant to killing by their hosts' alternative pathway of complement-mediated killing, and interact with various host extracellular matrix (ECM) components. The LenA outer surface protein (formerly called LfhA and Lsa24) was previously shown to bind the host ECM component laminin and the complement regulators factor H and factor H-related protein-1. We now demonstrate that infectious L. interrogans contain five additional paralogs of lenA, which we designated lenB, lenC, lenD, lenE and lenF. All six genes encode domains predicted to bear structural and functional similarities with mammalian endostatins. Sequence analyses of genes from seven infectious L. interrogans serovars indicated development of sequence diversity through recombination and intragenic duplication. LenB was found to bind human factor H, and all of the newly described Len proteins bound laminin. In addition, LenB, LenC, LenD, LenE and LenF all exhibited affinities for fibronectin, a distinct host extracellular matrix protein. These characteristics suggest that Len proteins together facilitate invasion and colonization of host tissues, and protect against host immune responses during mammalian infection.

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

University of Kentucky

eScholarship - University of California

Hochschulschriftenserver - Universität Frankfurt am Main

An Adaptive Interacting Wang-Landau Algorithm for Automatic Density Exploration

While statisticians are well accustomed to performing exploratory analysis in the modeling stage of an analysis, the notion of conducting preliminary general-purpose exploratory analysis in the Monte Carlo stage (or more generally, the model-fitting stage) of an analysis is an area which we feel deserves much further attention. Towards this aim, this paper proposes a general-purpose algorithm for automatic density exploration. The proposed exploration algorithm combines and expands upon components from various adaptive Markov chain Monte Carlo methods, with the Wang-Landau algorithm at its heart. Additionally, the algorithm is run on interacting parallel chains, a feature that both decreases computational cost and stabilizes the algorithm, improving its ability to explore the density. Performance is studied in several applications. Through a Bayesian variable selection example, the authors demonstrate the convergence gains obtained with interacting chains. The ability of the algorithm's adaptive proposal to induce mode-jumping is illustrated through a trimodal density and a Bayesian mixture modeling application. Lastly, through a 2D Ising model, the authors demonstrate the ability of the algorithm to overcome the high correlations encountered in spatial models.

arXiv.org e-Print Archive

Base de publications de l'université Paris-Dauphine

Crossref

INRIA a CCSD electronic archive server

Oxford University Research Archive

HAL-Polytechnique

Oskar Bordeaux

Sequence-based prediction for vaccine strain selection and identification of antigenic variability in foot-and-mouth disease virus

Author: A Bastos
A Samuel
A Thomas
AD Bastos
ADS Bastos
AFY Poon
AJ Drummond
B Baxt
B Shapiro
Belinda Blignaut
C Bolwell
D Paton
Daniel T. Haydon
DJ Smith
E Beck
Elizabeth E. Fry
Elizabeth Rieder
F Yates
Francois F. Maree
Hester G. O'Neill
HG van Rensburg
J Crowther
J Felsenstein
J Holland
J Kitson
Jacques Theron
Jan J. Esterhuysen
JC Saiz
Louise Matthews
M Lee
M Rweyemamu
M Suchard
M-S Lee
Mark M. Tanaka
MG Mateu
N Knowles
N Mattion
P Barnett
Pamela Opperman
R Boom
R Garten
RA Fisher
Richard Reeve
S Holm
S Lea
S Parida
Tjaart A. P. de Beer
W Vosloo
Wilna Vosloo
Y-C Liao
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2010

Identifying when past exposure to an infectious disease will protect against newly emerging strains is central to understanding the spread and the severity of epidemics, but the prediction of viral cross-protection remains an important unsolved problem. For foot-and-mouth disease virus (FMDV) research in particular, improved methods for predicting this cross-protection are critical for predicting the severity of outbreaks within endemic settings where multiple serotypes and subtypes commonly co-circulate, as well as for deciding whether appropriate vaccine(s) exist and how much they could mitigate the effects of any outbreak. To identify antigenic relationships and their predictors, we used linear mixed effects models to account for variation in pairwise cross-neutralization titres using only viral sequences and structural data. We identified those substitutions in surface-exposed structural proteins that are correlates of loss of cross-reactivity. These allowed prediction of both the best vaccine match for any single virus and the breadth of coverage of new vaccine candidates from their capsid sequences as effectively as or better than serology. Sub-sequences chosen by the model-building process all contained sites that are known epitopes on other serotypes. Furthermore, for the SAT1 serotype, for which epitopes have never previously been identified, we provide strong evidence - by controlling for phylogenetic structure - for the presence of three epitopes across a panel of viruses and quantify the relative significance of some individual residues in determining cross-neutralization. Identifying and quantifying the importance of sites that predict viral strain cross-reactivity not just for single viruses but across entire serotypes can help in the design of vaccines with better targeting and broader coverage. These techniques can be generalized to any infectious agents where cross-reactivity assays have been carried out. As the parameterization uses pre-existing datasets, this approach quickly and cheaply increases both our understanding of antigenic relationships and our power to control disease.

Public Library of Science (PLOS)

Crossref

North-West University Institutional Repository

Directory of Open Access Journals

PubMed Central

Enlighten

UPSpace at the University of Pretoria

Assessing the role of live poultry trade in community-structured transmission of avian influenza in China

Author: Bi Y
Lemey P
Liu D
Pybus O G
Qi W
Shi W
Stenseth N C
Suchard M A
Tian H
Yang Q
Zhang G
Zhao X
Publication venue: Proceedings of the National Academy of Sciences
Publication date: 01/01/2020

The live poultry trade is thought to play an important role in the spread and maintenance of highly pathogenic avian influenza A viruses (HP AIVs) in Asia. Despite an abundance of small-scale observational studies, the role of the poultry trade in disseminating AIV over large geographic areas is still unclear, especially for developing countries with complex poultry production systems. Here we combine virus genomes and reconstructed poultry transportation data to measure and compare the spatial spread in China of three key subtypes of AIV: H5N1, H7N9, and H5N6. Although it is difficult to disentangle the contribution of confounding factors, such as bird migration and spatial distance, we find evidence that the dissemination of these subtypes among domestic poultry is geographically continuous and likely associated with the intensity of the live poultry trade in China. Using two independent data sources and network analysis methods, we report a regional-scale community structure in China that might explain the spread of AIV subtypes in the country. The identification of this structure has the potential to inform more targeted strategies for the prevention and control of AIV in China.

Lirias

eScholarship - University of California

Oxford University Research Archive

NORA - Norwegian Open Research Archives

Evolutionary distances in the twilight zone -- a rational kernel approach

Author: A Keller
A Löytynoja
A Stamatakis
B Chor
B Schölkopf
Benjamin Merget
C Cortes
C Daskalakis
CB Do
E Rivas
F Bemm
Florian Markowetz
Frank Förster
G Talavera
HH Otu
I Ulitsky
J Felsenstein
J Friedrich
J Hein
JL Thorne
Jörg Schultz
KM Wong
LS Wang
M Höhl
M Mohri
M Wolf
MA Buchheim
MA Suchard
Matthias Wolf
MJ Bishop
MK Kuhner
MS Waterman
N Goldman
N Higham
R Durbin
RC Edgar
RF Doolittle
Roland F. Schwarz
S Roch
S Whelan
SR Eddy
T Mailund
T Müller
TH Ogden
V Levenshtein
W Fletcher
Wayne Delport
William Fletcher
Publication venue: Public Library of Science (PLoS)
Publication date: 23/11/2010

Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

MDC Repository