
    CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences

    Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of the amino acids being exchanged. Recent developments have shown that models which allow amino acid pairs to have independent rates of substitution offer improved fit over single-rate models. However, these approaches have been limited by the need for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into rate classes, with the number of classes dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of rate classes, where the number of classes is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths, are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene- and organism-specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes.

    Modeling HIV-1 Drug Resistance as Episodic Directional Selection

    The evolution of substitutions conferring drug resistance to HIV-1 is both episodic, occurring when patients are on antiretroviral therapy, and strongly directional, with site-specific resistant residues increasing in frequency over time. While methods exist to detect episodic diversifying selection and continuous directional selection, no evolutionary model combining these two properties has been proposed. We present two models of episodic directional selection (MEDS and EDEPS) which allow the a priori specification of lineages expected to have undergone directional selection. The models infer the sites and target residues that were likely subject to directional selection, using either codon or protein sequences. Compared to its null model of episodic diversifying selection, MEDS provides a superior fit to most sites known to be involved in drug resistance. Neither a test for episodic diversifying selection nor a test for constant directional selection is able to detect as many true positives as MEDS and EDEPS while maintaining acceptable levels of false positives. This suggests that episodic directional selection is a better description of the process driving the evolution of drug resistance.

    HIV-Specific Probabilistic Models of Protein Evolution

    Comparative sequence analyses, including such fundamental bioinformatics techniques as similarity searching, sequence alignment and phylogenetic inference, have become a mainstay for researchers studying type 1 Human Immunodeficiency Virus (HIV-1) genome structure and evolution. Implicit in comparative analyses is an underlying model of evolution, and the chosen model can significantly affect the results. In general, evolutionary models describe the probabilities of replacing one amino acid character with another over a period of time. Most widely used evolutionary models for protein sequences have been derived from curated alignments of hundreds of proteins, usually based on mammalian genomes. It is unclear to what extent these empirical models are generalizable to a very different organism, such as HIV-1, the most extensively sequenced organism in existence. We developed a maximum likelihood model fitting procedure, applied it to a collection of HIV-1 alignments sampled from different viral genes, and inferred two empirical substitution models, suitable for describing between- and within-host evolution. Our procedure pools the information from multiple sequence alignments, and the provided software implementation can be run efficiently in parallel on a computer cluster. We describe how the inferred substitution models can be used to generate scoring matrices suitable for alignment and similarity searches. Our models had a consistently superior fit relative to the best existing models and to parameter-rich data-driven models when benchmarked on independent HIV-1 alignments, demonstrating evolutionary biases in amino-acid substitution that are unique to HIV and that are not captured by the existing models. The scoring matrices derived from the models showed a marked difference from common amino-acid scoring matrices. The use of an appropriate evolutionary model recovered a known viral transmission history, whereas a poorly chosen model introduced phylogenetic error. We argue that our model derivation procedure is immediately applicable to other organisms with extensive sequence data available, such as Hepatitis C and Influenza A viruses.

    Evolutionary Dynamics and Emergence of Panzootic H5N1 Influenza Viruses

    The highly pathogenic avian influenza (HPAI) H5N1 virus lineage has undergone extensive genetic reassortment with viruses from different sources to produce numerous H5N1 genotypes, and also developed into multiple genetically distinct sublineages in China. From there, the virus has spread to over 60 countries. The ecological success of this virus in diverse species of both poultry and wild birds with frequent introduction to humans suggests that it is a likely source of the next human pandemic. Therefore, the evolutionary and ecological characteristics of its emergence from wild birds into poultry are of considerable interest. Here, we apply the latest analytical techniques to infer the early evolutionary dynamics of H5N1 virus in the population from which it emerged (wild birds and domestic poultry). By estimating the time of the most recent common ancestor of each gene segment, we show that the H5N1 prototype virus was likely introduced from wild birds into poultry as a non-reassortant low pathogenic avian influenza H5N1 virus and was not generated by reassortment in poultry. In contrast, more recent H5N1 genotypes were generated locally in aquatic poultry after the introduction of the prototype virus (A/goose/Guangdong/1/96), i.e., they were not a result of additional emergence from wild birds. We show that the H5N1 virus was introduced into Indonesia and Vietnam 3–6 months prior to detection of the first outbreaks in those countries. Population dynamics analyses revealed a rapid increase in the genetic diversity of A/goose/Guangdong/1/96 lineage viruses from mid-1999 to early 2000. Our results suggest that the transmission of reassortant viruses through the mixed poultry population in farms and markets in China has selected HPAI H5N1 viruses that are well adapted to multiple hosts and reduced the interspecies transmission barrier of those viruses.

    Phylodynamic Reconstruction Reveals Norovirus GII.4 Epidemic Expansions and their Molecular Determinants

    Noroviruses are the most common cause of viral gastroenteritis. An increase in the number of globally reported norovirus outbreaks was seen in the past decade, especially for outbreaks caused by successive genogroup II genotype 4 (GII.4) variants. Whether this observed increase was due to an upswing in the number of infections, or to a surveillance artifact caused by heightened awareness and concomitant improved reporting, remained unclear. Therefore, we set out to study the population structure, and changes thereof, of GII.4 strains detected through systematic outbreak surveillance since the early 1990s. We collected 1383 partial polymerase and 194 full capsid GII.4 sequences. A Bayesian MCMC coalescent analysis revealed an increase in the number of GII.4 infections during the last decade. The GII.4 strains included in our analyses evolved at a rate of 4.3–9.0 × 10⁻³ mutations per site per year, and share a most recent common ancestor in the early 1980s. Determinants of adaptation in the capsid protein were studied using different maximum likelihood approaches to identify sites subject to diversifying or directional selection and sites that co-evolved. While a number of the computationally determined adaptively evolving sites were on the surface of the capsid and possibly subject to immune selection, we also detected sites that were subject to constrained or compensatory evolution due to secondary RNA structures relevant in virus replication. We highlight codons that may prove useful in identifying emerging novel variants and, using these, indicate that the novel 2008 variant is more likely to cause a future epidemic than the 2007 variant. While norovirus infections are generally mild and self-limiting, more severe outcomes of infection frequently occur in elderly and immunocompromised people, and no treatment is available. The observed pattern of continually emerging novel variants of GII.4, causing elevated numbers of infections, is therefore a cause for concern.

    Understanding the molecular determinants driving the immunological specificity of the protective pilus 2a backbone protein of Group B Streptococcus

    The pilus 2a backbone protein (BP-2a) is one of the most structurally and functionally characterized components of a potential vaccine formulation against Group B Streptococcus. It is characterized by six main immunologically distinct allelic variants, each inducing variant-specific protection. To investigate the molecular determinants driving the variant immunogenic specificity of BP-2a, in terms of single residue contributions, we generated six monoclonal antibodies against a specific protein variant based on their capability to recognize the polymerized pili structure on the bacterial surface. Three mAbs were also able to induce complement-dependent opsonophagocytosis killing of live GBS and target the same linear epitope present in the structurally defined and immunodominant domain D3 of the protein. Molecular docking between the modelled scFv antibody sequences and the BP-2a crystal structure revealed the potential role at the binding interface of some non-conserved antigen residues. Mutagenesis analysis confirmed the necessity of a perfect balance between charges, size and polarity at the binding interface to obtain specific binding of mAbs to the protein antigen for a neutralizing response

    Active Methamphetamine Use is Associated with Transmitted Drug Resistance to Non-Nucleoside Reverse Transcriptase Inhibitors in Individuals with HIV Infection of Unknown Duration

    Background: Frequent methamphetamine use among recently HIV infected individuals is associated with transmitted drug resistance (TDR) to non-nucleoside reverse transcriptase inhibitors (NNRTI); however, the reversion time of TDR to drug susceptible HIV may exceed 3 years. We assessed whether recreational substance use is associated with detectable TDR among individuals newly diagnosed with HIV infection of unknown duration. Design: Cross-sectional analysis. Methods: Subjects were enrolled at the University of California, San Diego Early Intervention Program. Demographic, clinical and substance use data were collected using structured interviews. Genotypic resistance testing was performed using GeneSeq, Monogram Biosciences. We analyzed the association between substance use and TDR using bivariate analyses and the corresponding transmission networks using phylogenetic models. Results: Between April 2004 and July 2006, 115 individuals with genotype data were enrolled. The prevalence of alcohol, marijuana and methamphetamine use was 98%, 71% and 64%, respectively. Only active methamphetamine use in the 30 days prior to HIV diagnosis was independently associated with TDR to NNRTI (OR: 6.6; p=0.002). Conclusion: Despite not knowing the duration of their HIV infection, individuals reporting active methamphetamine use in the 30 days prior to HIV diagnosis are at an increased risk of having HIV strains that are resistant to NNRTI.

    Adaptive Evolution in the Glucose Transporter 4 Gene Slc2a4 in Old World Fruit Bats (Family: Pteropodidae)

    Frugivorous and nectarivorous bats are able to ingest large quantities of sugar in a short time span while avoiding the potentially adverse side-effects of elevated blood glucose. The glucose transporter 4 protein (GLUT4) encoded by the Slc2a4 gene plays a critical role in transmembrane skeletal muscle glucose uptake and thus glucose homeostasis. To test whether the Slc2a4 gene has undergone adaptive evolution in bats with carbohydrate-rich diets in relation to their insect-eating sister taxa, we sequenced the coding region of the Slc2a4 gene in a number of bat species, including four Old World fruit bats (Pteropodidae) and three New World fruit bats (Phyllostomidae). Our molecular evolutionary analyses revealed evidence that Slc2a4 has undergone a change in selection pressure in Old World fruit bats, with 11 amino acid substitutions detected on the ancestral branch, whereas no positive selection was detected in the New World fruit bats. We noted that in the former group, amino acid replacements were biased towards either Serine or Isoleucine and that, of the 11 changes, six were specific to Old World fruit bats (A133S, A164S, V377F, V386I, V441I and G459S). Our study presents preliminary evidence that the Slc2a4 gene has undergone adaptive changes in Old World fruit bats in relation to their ability to meet the demands of a high sugar diet.

    Phylogeography of Japanese encephalitis virus: genotype is associated with climate

    The circulation of vector-borne zoonotic viruses is largely determined by the overlap in the geographical distributions of virus-competent vectors and reservoir hosts. What is less clear are the factors influencing the distribution of virus-specific lineages. Japanese encephalitis virus (JEV) is the most important etiologic agent of epidemic encephalitis worldwide, and is primarily maintained between vertebrate reservoir hosts (avian and swine) and culicine mosquitoes. There are five genotypes of JEV: GI-V. In recent years, GI has displaced GIII as the dominant JEV genotype and GV has re-emerged after almost 60 years of undetected virus circulation. JEV is found throughout most of Asia, extending from maritime Siberia in the north to Australia in the south, and as far as Pakistan to the west and Saipan to the east. Transmission of JEV in temperate zones is epidemic with the majority of cases occurring in summer months, while transmission in tropical zones is endemic and occurs year-round at lower rates. To test the hypothesis that viruses circulating in these two geographical zones are genetically distinct, we applied Bayesian phylogeographic, categorical data analysis and phylogeny-trait association test techniques to the largest JEV dataset compiled to date, representing the envelope (E) gene of 487 isolates collected from 12 countries over 75 years. We demonstrated that GIII and the recently emerged GI-b are temperate genotypes likely maintained year-round in northern latitudes, while GI-a and GII are tropical genotypes likely maintained primarily through mosquito-avian and mosquito-swine transmission cycles. This study represents a new paradigm directly linking viral molecular evolution and climate