
    Evaluating purifying selection in the mitochondrial DNA of various mammalian species

    Mitochondrial DNA (mtDNA), the circular DNA molecule inside the mitochondria of all eukaryotic cells, has been shown to be under purifying selection in several species. Traditional tests of purifying selection have been based simply on ratios of nonsynonymous to synonymous mutations, without considering the relative age of each mutation, which can be determined by phylogenetic analysis of this non-recombining molecule. Incorporating a mutation time-ordering from phylogeny and predicted pathogenicity scores for nonsynonymous mutations allows a quantitative evaluation of the effects of purifying selection in human mtDNA. Here, by using this additional information, we show that purifying selection undoubtedly acts upon the mtDNA of other mammalian species/genera, namely Bos sp., Canis lupus, Mus musculus, Orcinus orca, Pan sp. and Sus scrofa. The effects of purifying selection were comparable in all species, leading to a significantly higher proportion of nonsynonymous variants with higher pathogenicity scores in the younger branches of the tree. We also derive recalibrated mutation rates for age estimates of ancestors of these various species and propose a correction curve to take into account the effects of selection. Understanding this selection is fundamental to evolutionary studies and to the identification of deleterious mutations.

    γ-MYN: a new algorithm for estimating Ka and Ks with consideration of variable substitution rates

    Background: Over the past two decades, several approximate methods that adopt different mutation models have been used to estimate nonsynonymous and synonymous substitution rates (Ka and Ks) from protein-coding sequences across species or even across different evolutionary lineages. Among them, the MYN method (a Modified version of the Yang-Nielsen method) considers three major dynamic features of evolving DNA sequences (bias in transition/transversion rate, nucleotide frequency, and unequal transitional substitution) but leaves out another important feature: unequal substitution rates among different sites or nucleotide positions. Results: We incorporated a new feature for analyzing evolving DNA sequences, unequal substitution rates among different sites, into the MYN method and proposed a modified version, γ-MYN (gamma-MYN), based on the assumption that the evolutionary rate at each site follows a γ-distribution. We applied γ-MYN to analyze the key estimator of selective pressure, ω (Ka/Ks), and other relevant parameters in comparison with two related methods, YN and MYN, and found that neglecting the variation of substitution rates among different sites may lead to biased estimates of ω. Our new method appears to have minimal deviations when relevant parameters vary within normal ranges defined by empirical data. Conclusion: Our results indicate that unequal substitution rates among different sites have variable influences on ω under different evolutionary rates, while both the transition/transversion rate ratio and unequal nucleotide frequencies affect Ka and Ks and thus the selective pressure ω. Reviewers: This paper was reviewed by Kateryna Makova, David A. Liberles (nominated by David H Ardell), Zhaolei Zhang (nominated by Mark Gerstein), and Shamil Sunyaev.
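
    The effect of ignoring among-site rate variation can be illustrated with a much simpler model than γ-MYN. The sketch below is not the γ-MYN algorithm: it compares the standard Jukes-Cantor distance with its gamma-corrected form. The proportion of differing sites `p` and the shape parameter `alpha` are hypothetical values chosen only for illustration.

```python
import math


def jc_distance(p):
    """Jukes-Cantor distance, assuming every site evolves at the same rate."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates among sites.

    alpha is the gamma shape parameter; smaller alpha means stronger
    rate heterogeneity among sites.
    """
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)


p = 0.30  # hypothetical observed proportion of differing sites
d_equal = jc_distance(p)
d_gamma = jc_gamma_distance(p, alpha=0.5)  # hypothetical shape parameter

print(f"equal rates: {d_equal:.4f}")
print(f"gamma rates: {d_gamma:.4f}")
```

    For the same observed difference, the gamma-corrected distance is larger, so a method that assumes equal rates underestimates the number of substitutions. As `alpha` grows, the gamma-corrected distance approaches the equal-rates value.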

    A generalized mechanistic codon model.

    Models of codon evolution have attracted particular interest because of their unique capabilities to detect selection forces and their high fit when applied to sequence evolution. We describe here a novel approach for modeling codon evolution, which is based on the Kronecker product of matrices. The 61 × 61 codon substitution rate matrix is created using the Kronecker product of three 4 × 4 nucleotide substitution matrices, the equilibrium frequency of codons, and the selection rate parameter. The entries of the nucleotide substitution matrices and the selection rate are considered as parameters of the model, which are optimized by maximum likelihood. Our fully mechanistic model allows the instantaneous substitution matrix between codons to be fully estimated with only 19 parameters instead of 3,721, by using the biological interdependence existing between positions within codons. We illustrate the properties of our model using computer simulations and assess its relevance by comparing the AICc measures of our model and other models of codon evolution on simulations and a large range of empirical data sets. We show that our model fits most biological data better than current codon models. Furthermore, the parameters in our model can be interpreted in a similar way as the exchangeability rates found in empirical codon models.
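
    The dimension bookkeeping behind this construction can be sketched as follows. This is only an illustration under stated assumptions, not the paper's parameterization: the three 4 × 4 matrices are hypothetical toy matrices, codon frequencies and the selection parameter are omitted, and the stop codons are those of the standard genetic code.

```python
from itertools import product

BASES = "TCAG"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def toy_matrix(off):
    """Hypothetical 4 x 4 nucleotide matrix: `off` off the diagonal, rows sum to 1."""
    return [[1.0 - 3.0 * off if i == j else off for j in range(4)] for i in range(4)]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


# One toy matrix per codon position (values are illustrative only).
P1, P2, P3 = toy_matrix(0.10), toy_matrix(0.05), toy_matrix(0.20)

# Three 4 x 4 matrices combine into one 64 x 64 matrix over all codons.
K = kron(kron(P1, P2), P3)

# Codon order matches the Kronecker index: first position varies slowest.
codons = ["".join(c) for c in product(BASES, repeat=3)]
keep = [i for i, c in enumerate(codons) if c not in STOP_CODONS]
sense = [codons[i] for i in keep]

# Dropping stop-codon rows and columns leaves the 61 x 61 sense-codon block.
Q = [[K[i][j] for j in keep] for i in keep]

print(len(K), len(Q), len(Q) * len(Q[0]))
```

    The last printed number is the count of entries in a 61 × 61 matrix, which is the 3,721 figure the abstract contrasts with the 19 parameters of the mechanistic model.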

    Assessing the Evolutionary Impact of Amino Acid Mutations in the Human Genome

    Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using Single Nucleotide Polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein-coding genes in 35 individuals (20 European Americans and 15 African Americans) allows us to assess the relative contribution of demographic and selective effects to patterning amino acid variation in the human genome. We find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. After accounting for these demographic effects, we find strong evidence for great variability in the selective effects of new amino acid-replacing mutations. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict 27–29% of amino acid-changing (nonsynonymous) mutations are neutral or nearly neutral (|s|<0.01%), 30–42% are moderately deleterious (0.01%<|s|<1%), and nearly all the remainder are highly deleterious or lethal (|s|>1%). Our results are consistent with 10–20% of amino acid differences between humans and chimpanzees having been fixed by positive selection with the remainder of differences being neutral or nearly neutral. Our analysis also predicts that many of the alleles identified via whole-genome association mapping may be selectively neutral or (formerly) positively selected, implying that deleterious genetic variation affecting disease phenotype may be missed by this widely used approach for mapping genes underlying complex traits.

    Comparative genomics approaches accurately predict deleterious variants in plants

    Recent advances in genome resequencing have led to increased interest in prediction of the functional consequences of genetic variants. Variants at phylogenetically conserved sites are of particular interest, because they are more likely than variants at phylogenetically variable sites to have deleterious effects on fitness and contribute to phenotypic variation. Numerous comparative genomic approaches have been developed to predict deleterious variants, but the approaches are nearly always assessed based on their ability to identify known disease-causing mutations in humans. Determining the accuracy of deleterious variant predictions in nonhuman species is important to understanding evolution, domestication, and potentially to improving crop quality and yield. To examine our ability to predict deleterious variants in plants we generated a curated database of 2,910 Arabidopsis thaliana mutants with known phenotypes. We evaluated seven approaches and found that while all performed well, their relative ranking differed from prior benchmarks in humans. We conclude that deleterious mutations can be reliably predicted in A. thaliana and likely other plant species, but that the relative performance of various approaches does not necessarily translate from one species to another

    Nonsynonymous substitution rate (Ka) is a relatively consistent parameter for defining fast-evolving and slow-evolving protein-coding genes

    BACKGROUND: Mammalian genome sequence data are being acquired in large quantities and at enormous speeds. We now have a tremendous opportunity to better understand which genes are the most variable or conserved, and what their particular functions and evolutionary dynamics are, through comparative genomics. RESULTS: We chose genome data from human and eleven other high-coverage mammals, as well as an avian genome as an outgroup, to analyze orthologous protein-coding genes using nonsynonymous (Ka) and synonymous (Ks) substitution rates. After evaluating eight commonly used methods of Ka and Ks calculation, we observed that these methods yielded a nearly uniform result when estimating Ka, but not Ks (or Ka/Ks). When sorting genes based on Ka, we noticed that fast-evolving and slow-evolving genes often belonged to different functional classes, with respect to species-specificity and lineage-specificity. In particular, we identified two functional classes of genes in the acquired immune system. Fast-evolving genes coded for signal-transducing proteins, such as receptors, ligands, cytokines, and CDs (cluster of differentiation, mostly surface proteins), whereas the slow-evolving genes coded for function-modulating proteins, such as kinases and adaptor proteins. In addition, among slow-evolving genes that had functions related to the central nervous system, neurodegenerative disease-related pathways were enriched significantly in most mammalian species. We also confirmed that gene expression was negatively correlated with evolution rate, i.e. slow-evolving genes were expressed at higher levels than fast-evolving genes. Our results indicated that the functional specializations of the three major mammalian clades were: sensory perception and oncogenesis in primates, reproduction and hormone regulation in large mammals, and immunity and angiotensin in rodents.
CONCLUSION: Our study suggests that Ka calculation, which is less biased compared to Ks and Ka/Ks, can be used as a parameter to sort genes by evolution rate and can also provide a way to categorize common protein functions and define their interaction networks, either pair-wise or in defined lineages or subgroups. Evaluating gene evolution based on Ka and Ks calculations can be done with large datasets, such as mammalian genomes. REVIEWERS: This article has been reviewed by Drs. Anamaria Necsulea (nominated by Nicolas Galtier), Subhajyoti De (nominated by Sarah Teichmann) and Claus O. Wilke

    Multiple Mechanisms Promote the Retained Expression of Gene Duplicates in the Tetraploid Frog Xenopus laevis

    Gene duplication provides a window of opportunity for biological variants to persist under the protection of a co-expressed copy with similar or redundant function. Duplication catalyzes innovation (neofunctionalization), subfunction degeneration (subfunctionalization), and genetic buffering (redundancy), and the genetic survival of each paralog is triggered by mechanisms that add, compromise, or do not alter protein function. We tested the applicability of three types of mechanisms for promoting the retained expression of duplicated genes in 290 expressed paralogs of the tetraploid clawed frog, Xenopus laevis. Tests were based on explicit expectations concerning the ka/ks ratio, and the number and location of nonsynonymous substitutions after duplication. Functional constraints on the majority of paralogs are not significantly different from a singleton ortholog. However, we recover strong support that some of them have an asymmetric rate of nonsynonymous substitution: 6% match predictions of the neofunctionalization hypothesis in that (1) each paralog accumulated nonsynonymous substitutions at a significantly different rate and (2) the one that evolves faster has a higher ka/ks ratio than the other paralog and than a singleton ortholog. Fewer paralogs (3%) exhibit a complementary pattern of substitution at the protein level that is predicted by enhancement or degradation of different functional domains, and the remaining 13% have a higher average ka/ks ratio in both paralogs that is consistent with altered functional constraints, diversifying selection, or activity-reducing mutations after duplication. We estimate that these paralogs have been retained since they originated by genome duplication between 21 and 41 million years ago. Multiple mechanisms operate to promote the retained expression of duplicates in the same genome, in genes in the same functional class, over the same period of time following duplication, and sometimes in the same pair of paralogs. 
None of these paralogs are superfluous; degradation or enhancement of different protein subfunctions and neofunctionalization are plausible hypotheses for the retained expression of some of them. Evolution of most X. laevis paralogs, however, is consistent with retained expression via mechanisms that do not radically alter functional constraints, such as selection to preserve post-duplication stoichiometry or temporal, quantitative, or spatial subfunctionalization

    Rev Variation during Persistent Lentivirus Infection

    The ability of lentiviruses to continually evolve and escape immune control is the central impediment in developing an effective vaccine for HIV-1 and other lentiviruses. Equine infectious anemia virus (EIAV) is considered a useful model for immune control of lentivirus infection. Virus-specific cytotoxic T lymphocytes (CTL) and broadly neutralizing antibody effectively control EIAV replication during inapparent stages of disease, but after years of low-level replication, the virus is still able to produce evasion genotypes that lead to late re-emergence of disease. There is a high rate of genetic variation in the EIAV surface envelope glycoprotein (SU) and in the region of the transmembrane protein (TM) overlapped by the major exon of Rev. This review examines genetic and phenotypic variation in Rev during EIAV disease and a possible role for Rev in immune evasion and virus persistence

    Inferring nonneutral evolution from contrasting patterns of polymorphisms and divergences in different protein coding regions of enterovirus 71 circulating in Taiwan during 1998-2003

    Background: Enterovirus (EV) 71 is one of the common causative agents of hand, foot, and mouth disease (HFMD). In recent years, the virus has caused several outbreaks with high numbers of deaths and severe neurological complications. Despite the importance of these epidemics, several aspects of the evolutionary and epidemiological dynamics, including viral nucleotide variations within and between different outbreaks, rates of change in immune-related structural regions vs. non-structural regions, and forces driving the evolution of EV71, are still not clear. Results: We sequenced four genomic segments, i.e., the 5' untranslated region (UTR), VP1, 2A, and 3C, of 395 EV71 viral strains collected from 1998 to 2003 in Taiwan. The phylogenies derived from different genomic segments revealed different relationships, indicating frequent sequence recombinations as previously noted. In addition to simple recombinations, exchanges of the P1 domain between different species/genotypes of human enterovirus species (HEV)-A were repeatedly observed. Contrasting patterns of polymorphisms and divergences were found between structural (VP1) and non-structural segments (2A and 3C), i.e., the former was less polymorphic within an outbreak but more divergent between different HEV-A species than the latter two. Our computer simulation demonstrated a significant excess of amino acid replacements in the VP1 region, implying its possible role in adaptive evolution. Between different epidemic seasons, we observed high viral diversity in the epidemic peaks followed by severe reductions in diversity.
Viruses sampled in successive epidemic seasons were not sister to each other, indicating that the annual outbreaks of EV71 were due to genetically distinct lineages. Conclusions: Based on observations of accelerated amino acid changes and frequent exchanges of the P1 domain, we propose that positive selection and subsequent frequent domain shuffling are two important mechanisms for generating new genotypes of HEV-A. Our viral dynamics analysis suggested that the importation of EV71 from surrounding areas likely contributes to local EV71 outbreaks.

    A genome-wide view of Caenorhabditis elegans base-substitution mutation processes

    Knowledge of mutation processes is central to understanding virtually all evolutionary phenomena and the underlying nature of genetic disorders and cancers. However, the limitations of standard molecular mutation detection methods have historically precluded a genome-wide understanding of mutation rates and spectra in the nuclear genomes of multicellular organisms. We applied two high-throughput DNA sequencing technologies to identify and characterize hundreds of spontaneously arising base-substitution mutations in 10 Caenorhabditis elegans mutation-accumulation (MA)-line nuclear genomes. C. elegans mutation rate estimates were similar to previous calculations based on smaller numbers of mutations. Mutations were distributed uniformly within and among chromosomes and were not associated with recombination rate variation in the MA lines, suggesting that intragenomic variation in genetic hitchhiking and/or background selection are primarily responsible for the chromosomal distribution patterns of polymorphic nucleotides in C. elegans natural populations. A strong mutational bias from G/C to A/T nucleotides was detected in the MA lines, implicating oxidative DNA damage as a major endogenous mutagenic force in C. elegans. The observed mutational bias also suggests that the C. elegans nuclear genome cannot be at equilibrium because of mutation alone. Transversions dominate the spectrum of spontaneous mutations observed here, whereas transitions dominate patterns of allegedly neutral polymorphism in natural populations of C. elegans and many other animal species; this observation challenges the assumption that natural patterns of molecular variation in noncoding regions of the nuclear genome accurately reflect underlying mutation processes
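
    The link between a mutational bias and equilibrium base composition can be sketched with a simple two-state model (G/C versus A/T). This is an illustration under stated assumptions, not the study's analysis: `u` (G/C to A/T) and `v` (A/T to G/C) are hypothetical per-site, per-generation rates, and selection, drift, and recombination are ignored.

```python
def equilibrium_gc(u, v):
    """GC fraction at which G/C losses (rate u) balance G/C gains (rate v)."""
    return v / (u + v)


def gc_after(gc0, u, v, generations):
    """Iterate the two-state mutation model from a starting GC fraction."""
    gc = gc0
    for _ in range(generations):
        gc = gc * (1.0 - u) + (1.0 - gc) * v
    return gc


# Hypothetical rates with a twofold bias toward A/T.
u, v = 0.02, 0.01

print(equilibrium_gc(u, v))
print(gc_after(0.5, u, v, generations=2000))
```

    If the GC content observed in a genome differs from the value such a model predicts from the measured mutation spectrum, mutation alone cannot account for the current base composition, which is the kind of comparison the abstract alludes to.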