
Harvard University - DASH

HIV-Specific Probabilistic Models of Protein Evolution

Author: D Jones
D Posada
David C. Nickle
DC Nickle
DF Feng
DG George
H Tang
J Adachi
J Felsenstein
James I. Mullins
JT Herbeck
K Tamura
KP Burnham
L Stanfel
Laura Heath
LM Mansky
LY Yampolsky
M Hasegawa
Mark A. Jensen
MO Dayhoff
MW Dimmic
MW Nachman
N Goldman
N Saitou
N Sugiura
Oliver Pybus
Peter B. Gilbert
R Shankarappa
S Henikoff
S Karlin
S Whelan
Sergei L. Kosakovsky Pond
SL Kosakovsky Pond
SLK Pond
SV Muse
T Leitner
TC Friedrich
TM Allen
Y Liu
Z Yang
Publication venue: Public Library of Science
Publication date: 01/06/2007

Comparative sequence analyses, including such fundamental bioinformatics techniques as similarity searching, sequence alignment and phylogenetic inference, have become a mainstay for researchers studying type 1 Human Immunodeficiency Virus (HIV-1) genome structure and evolution. Implicit in comparative analyses is an underlying model of evolution, and the chosen model can significantly affect the results. In general, evolutionary models describe the probabilities of replacing one amino acid character with another over a period of time. Most widely used evolutionary models for protein sequences have been derived from curated alignments of hundreds of proteins, usually based on mammalian genomes. It is unclear to what extent these empirical models are generalizable to a very different organism, such as HIV-1, the most extensively sequenced organism in existence. We developed a maximum likelihood model-fitting procedure, applied it to a collection of HIV-1 alignments sampled from different viral genes, and inferred two empirical substitution models, suitable for describing between- and within-host evolution. Our procedure pools the information from multiple sequence alignments, and the provided software implementation can be run efficiently in parallel on a computer cluster. We describe how the inferred substitution models can be used to generate scoring matrices suitable for alignment and similarity searches. Our models had a consistently superior fit relative to the best existing models and to parameter-rich data-driven models when benchmarked on independent HIV-1 alignments, demonstrating evolutionary biases in amino-acid substitution that are unique to HIV and that are not captured by the existing models. The scoring matrices derived from the models showed a marked difference from common amino-acid scoring matrices. The use of an appropriate evolutionary model recovered a known viral transmission history, whereas a poorly chosen model introduced phylogenetic error. We argue that our model derivation procedure is immediately applicable to other organisms with extensive sequence data available, such as Hepatitis C and Influenza A viruses.
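The abstract mentions turning a substitution model into a scoring matrix for alignment and similarity searches. The sketch below is one common way to do that, and the paper's exact procedure may differ. It builds a reversible rate matrix from exchangeabilities and equilibrium frequencies, computes the transition probabilities P(t) = exp(Qt), and scores each pair as log2(P_ij(t) / pi_j). The three-letter alphabet, the values in `EXCH` and `FREQS`, the divergence `t=0.5`, and all function names are illustrative assumptions, not values or code from the HIV models.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(m, squarings=10, terms=12):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    n = len(m)
    scale = 2 ** squarings
    a = [[x / scale for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def rate_matrix(exch, freqs):
    """Reversible rate matrix Q_ij = r_ij * pi_j, scaled to one expected change per unit time."""
    n = len(freqs)
    q = [[exch[i][j] * freqs[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])  # each row of Q sums to zero
    mean_rate = -sum(freqs[i] * q[i][i] for i in range(n))
    return [[x / mean_rate for x in row] for row in q]

def log_odds_scores(exch, freqs, t):
    """Log-odds scores S_ij = log2(P_ij(t) / pi_j) at divergence t."""
    q = rate_matrix(exch, freqs)
    p = expm([[x * t for x in row] for row in q])
    n = len(freqs)
    return [[math.log2(p[i][j] / freqs[j]) for j in range(n)] for i in range(n)]

# Hypothetical toy model on a three-letter alphabet (not HIV-derived values).
EXCH = [[0.0, 2.0, 0.5],
        [2.0, 0.0, 1.0],
        [0.5, 1.0, 0.0]]
FREQS = [0.5, 0.3, 0.2]
SCORES = log_odds_scores(EXCH, FREQS, t=0.5)
```

Because the toy model is reversible, the resulting score matrix is symmetric, which is the property alignment tools expect. Varying `t` gives a family of matrices tuned to different divergence levels, in the same spirit as the PAM series.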

Proceedings - University of Groningen

Differential lung tissue gene expression in males and females: implications for the susceptibility to develop COPD

Author: Bossé Yohan
Brandsma Corry-Anke
Colombo Francesca
de Vries Maaike
Dragani Tommaso A
Faiz Alen
Hao Ke
Laviolette Michel
Nickle David C
Obeidat Ma'en
Paré Peter D
Postma Dirkje S
Rathnayake Senani N H
Sin Don D
Timens Wim
van den Berge Maarten
Publication venue: European Respiratory Society (ERS)
Publication date: 01/07/2019

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Nematode endoparasites do not codiversify with their stick insect hosts.

Author: Brooks D. R.
Clarke B.
Colbo M. H.
Cribb T. H.
Galtier N.
Hafner M. S.
Kaiser H.
Kennedy C. R.
Mebrahtu Y.
Nickle W. R.
Nieberding C.
Nikdel M.
Noble E. R.
Page R. D.
Poinar G. O.
Poinar G. O. Jr.
Price P. W.
Sandoval C. P.
Schmid‐Hempel P.
Tao N.
Publication venue: Wiley
Publication date: 01/01/2016

Host-parasite coevolution stems from reciprocal selection on host resistance and parasite infectivity, and can generate some of the strongest selective pressures known in nature. It is widely seen as a major driver of diversification, the most extreme case being parallel speciation in hosts and their associated parasites. Here, we report on endoparasitic nematodes, most likely members of the mermithid family, infecting different Timema stick insect species throughout California. The nematodes develop in the hemolymph of their insect host and kill it upon emergence, completely impeding host reproduction. Given the direct exposure of the endoparasites to the host's immune system in the hemolymph, and the consequences of infection on host fitness, we predicted that divergence among hosts may drive parallel divergence in the endoparasites. Our phylogenetic analyses suggested the presence of two differentiated endoparasite lineages. However, regardless of whether the two lineages were considered separately or jointly, we found a complete lack of codivergence between the endoparasitic nematodes and their hosts, in spite of extensive genetic variation among hosts and among parasites. Instead, there was strong isolation by distance among the endoparasitic nematodes, indicating that geography plays a more important role than host-related adaptations in driving parasite diversification in this system. The accumulating evidence for lack of codiversification between parasites and their hosts at macroevolutionary scales contrasts with the overwhelming evidence for coevolution within populations, and calls for studies linking micro- and macroevolutionary dynamics in host-parasite interactions.

Serveur académique lausannois

Non-Negative Matrix Factorization for Learning Alignment-Specific Models of Protein Evolution

Author: Ben Murrell
C Kosiol
D Posada
D Robinson
Daniel Kaliski
DC Nickle
DD Lee
DJ Lipman
DT Jones
F Abascal
Gerdus Benade
J Adachi
J Felsenstein
Jan Buys
K Devarajan
Konrad Scheffler
KP Burnham
L Stanfel
Lise du Buisson
MO Dayhoff
MW Dimmic
N Goldman
N Lartillot
Robert Ketteringham
S Whelan
S Zoller
SA Guindon
Sasha Moola
SL Kosakovsky Pond
SQ Le
Thomas Mailund
Thomas Weighill
Tristan Hands
W Delport
Y Cao
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2011

Models of protein evolution currently come in two flavors: generalist and specialist. Generalist models (e.g. PAM, JTT, WAG) adopt a one-size-fits-all approach, where a single model is estimated from a number of different protein alignments. Specialist models (e.g. mtREV, rtREV, HIVbetween) can be estimated when a large quantity of data is available for a single organism or gene, and are intended for use on that organism or gene only. Unsurprisingly, specialist models outperform generalist models, but in most instances there simply are not enough data available to estimate them. We propose a method for estimating alignment-specific models of protein evolution in which the complexity of the model is adapted to suit the richness of the data. Our method uses non-negative matrix factorization (NNMF) to learn a set of basis matrices from a general dataset containing a large number of alignments of different proteins, thus capturing the dimensions of important variation. It then learns a set of weights that are specific to the organism or gene of interest, for which only a smaller dataset is available. The alignment-specific model is thus obtained as a weighted sum of the basis matrices. Having been constrained to vary along only as many dimensions as the data justify, the model has far fewer parameters than would be required to estimate a specialist model. We show that our NNMF procedure produces models that outperform existing methods on all but one of 50 test alignments. The basis matrices we obtain confirm the expectation that amino acid properties tend to be conserved, and allow us to quantify, on specific alignments, how the strength of conservation varies across different properties. We also apply our new models to phylogeny inference and show that the resulting phylogenies are different from, and have improved likelihood over, those inferred under standard models.
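The "weighted sum of basis matrices" idea can be illustrated with a minimal sketch. It assumes the basis matrices are already known and flattened into vectors, and it fits non-negative weights by least squares using the multiplicative update rule from the NNMF literature. This is a swapped-in simplification: the abstract does not state how the weights are fitted, and a likelihood-based fit on the alignment would be a different objective. The toy `BASIS` vectors, `TRUE_WEIGHTS`, and the function names are hypothetical.

```python
def combine(basis, weights):
    """Weighted sum of basis vectors (each a flattened non-negative rate matrix)."""
    length = len(basis[0])
    return [sum(w * b[i] for w, b in zip(weights, basis)) for i in range(length)]

def fit_weights(basis, target, iterations=2000, eps=1e-12):
    """Fit non-negative weights so that combine(basis, weights) approximates target.

    Uses the multiplicative update w <- w * (B^T v) / (B^T B w), which keeps
    every weight non-negative when the basis and target are non-negative.
    """
    k = len(basis)
    weights = [1.0] * k
    # B^T v and B^T B do not change between iterations, so compute them once.
    btv = [sum(x * y for x, y in zip(b, target)) for b in basis]
    btb = [[sum(x * y for x, y in zip(bi, bj)) for bj in basis] for bi in basis]
    for _ in range(iterations):
        denom = [sum(btb[a][c] * weights[c] for c in range(k)) for a in range(k)]
        weights = [w * n / (d + eps) for w, n, d in zip(weights, btv, denom)]
    return weights

# Hypothetical toy data: two basis vectors and a target built from known weights.
BASIS = [[1.0, 0.0, 2.0, 1.0],
         [0.0, 3.0, 1.0, 1.0]]
TRUE_WEIGHTS = [0.7, 0.2]
TARGET = combine(BASIS, TRUE_WEIGHTS)
FITTED = fit_weights(BASIS, TARGET)
```

With only as many free parameters as there are basis matrices, the fitted model is far smaller than a full amino acid rate matrix, which is the trade-off the abstract describes.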

Cape Town University OpenUCT

Stellenbosch University SUNScholar Repository

CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences

Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of the amino acids being exchanged. Recent developments have shown that models which allow amino acid pairs to have independent rates of substitution offer improved fit over single-rate models. However, these approaches have been limited by the need for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into rate classes, dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of rate classes, where the number of classes is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths, are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution, and thus gene- and organism-specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes.

Cronfa at Swansea University

Stellenbosch University SUNScholar Repository

Modeling HIV-1 Drug Resistance as Episodic Directional Selection

The evolution of substitutions conferring drug resistance to HIV-1 is both episodic, occurring when patients are on antiretroviral therapy, and strongly directional, with site-specific resistant residues increasing in frequency over time. While methods exist to detect episodic diversifying selection and continuous directional selection, no evolutionary model combining these two properties has been proposed. We present two models of episodic directional selection (MEDS and EDEPS) which allow the a priori specification of lineages expected to have undergone directional selection. The models infer the sites and target residues that were likely subject to directional selection, using either codon or protein sequences. Compared to its null model of episodic diversifying selection, MEDS provides a superior fit to most sites known to be involved in drug resistance. Neither a test for episodic diversifying selection nor a test for constant directional selection is able to detect as many true positives as MEDS and EDEPS while maintaining acceptable levels of false positives. This suggests that episodic directional selection is a better description of the process driving the evolution of drug resistance.