344 research outputs found

    Evidence for Pervasive Adaptive Protein Evolution in Wild Mice

    The relative contribution of neutral and adaptive substitutions to molecular evolution has been one of the most controversial issues in evolutionary biology for more than 40 years. The analysis of within-species nucleotide polymorphism and between-species divergence data supports a widespread role for adaptive protein evolution in certain taxa. For example, estimates of the proportion of adaptive amino acid substitutions (alpha) are 50% or more in enteric bacteria and Drosophila. In contrast, recent estimates of alpha for hominids have been at most 13%. Here, we estimate alpha for protein sequences of murid rodents based on two kinds of data: nucleotide polymorphism from multiple genes in a population of the house mouse subspecies Mus musculus castaneus, which inhabits the ancestral range of the Mus species complex, and nucleotide divergence between M. m. castaneus and M. famulus or the rat. We estimate that 57% of amino acid substitutions in murids have been driven by positive selection. Hominids, therefore, are exceptional in having low apparent levels of adaptive protein evolution. The high frequency of adaptive amino acid substitutions in wild mice is consistent with their large effective population size, leading to effective natural selection at the molecular level. Effective natural selection also manifests itself as a paucity of effectively neutral nonsynonymous mutations in M. m. castaneus compared to humans.
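
As an illustration of how alpha can be obtained from polymorphism and divergence counts, here is a minimal sketch of the classic McDonald-Kreitman-style estimator, alpha = 1 - (Ds * Pn) / (Dn * Ps). This is an assumption made for illustration: the abstract does not say which estimator the study used, and the function name and all counts below are hypothetical.

```python
def alpha_mk(dn, ds, pn, ps):
    """Estimate the proportion of adaptive nonsynonymous substitutions.

    dn, ds: nonsynonymous and synonymous divergence counts (between species)
    pn, ps: nonsynonymous and synonymous polymorphism counts (within species)
    Uses the McDonald-Kreitman-style estimator 1 - (ds * pn) / (dn * ps).
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive to form the ratio")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, not taken from the study.
alpha = alpha_mk(dn=100, ds=200, pn=20, ps=80)
print(f"alpha = {alpha:.2f}")
```

With these made-up counts, nonsynonymous polymorphism is scarce relative to nonsynonymous divergence, so the estimate is positive. Slightly deleterious polymorphisms bias this simple estimator downward, which is one reason published analyses often use more elaborate methods.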

    The Effect of Transposable Element Insertions on Gene Expression Evolution in Rodents

    Background: Many genomes contain a substantial number of transposable elements (TEs), a few of which are known to be involved in regulating gene expression. However, recent observations suggest that TEs may have played a very important role in the evolution of gene expression because many conserved non-genic sequences, some of which are known to be involved in gene regulation, resemble TEs. Results: Here we investigate whether new TE insertions affect gene expression profiles by testing whether gene expression divergence between mouse and rat is correlated with the numbers of new transposable elements inserted near genes. We show that expression divergence is significantly correlated with the number of new LTR and SINE elements, but not with the numbers of LINEs. We also show that expression divergence is not significantly correlated with the numbers of ancestral TEs in most cases, which suggests that the correlations between expression divergence and the numbers of new TEs are causal in nature. We quantify the effect and estimate that TE insertion has accounted for ~20% (95% confidence interval: 12% to 26%) of all expression profile divergence in rodents. Conclusions: We conclude that TE insertions may have had a major impact on the evolution of gene expression levels in rodents.

    A Selection Index for Gene Expression Evolution and Its Application to the Divergence between Humans and Chimpanzees

    The importance of gene regulation in animal evolution is a matter of long-standing interest, but measuring the impact of selection on gene expression has proven a challenge. Here, we propose a selection index of gene expression as a straightforward method for assessing the mode and strength of selection operating on gene expression levels. The index is based on the widely used McDonald-Kreitman test and requires the estimation of four quantities: the within-species and between-species expression variances as well as the sequence heterozygosity and divergence of neutrally evolving sequences. We apply the method to data from human and chimpanzee lymphoblastoid cell lines and show that gene expression is in general under strong stabilizing selection. We also demonstrate how the same framework can be used to estimate the proportion of adaptive gene expression evolution.
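
The four quantities named above can be combined by analogy with the McDonald-Kreitman ratio of ratios. The sketch below is one plausible form, not the definition given in the paper, which the abstract does not state: it divides the between-to-within ratio of expression variance by the divergence-to-heterozygosity ratio of neutral sequence. The function name and all input values are hypothetical.

```python
def expression_selection_index(var_between, var_within, divergence, heterozygosity):
    """Ratio-of-ratios index, assumed here by analogy with the MK test.

    var_between / var_within: expression variance between and within species
    divergence / heterozygosity: neutral sequence divergence and diversity
    Under this assumed form, a value near 1 is the neutral expectation.
    """
    for name, value in (("var_within", var_within),
                        ("divergence", divergence),
                        ("heterozygosity", heterozygosity)):
        if value <= 0:
            raise ValueError(f"{name} must be positive")
    expression_ratio = var_between / var_within
    neutral_ratio = divergence / heterozygosity
    return expression_ratio / neutral_ratio


# Hypothetical inputs, chosen only to exercise the function.
index = expression_selection_index(
    var_between=0.02, var_within=0.01, divergence=0.012, heterozygosity=0.001
)
print(f"selection index = {index:.3f}")
```

Under this assumed form, an index well below 1 means expression has diverged less between species than neutral sequence would predict, the pattern expected under stabilizing selection.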

    Pervasive Adaptive Protein Evolution Apparent in Diversity Patterns around Amino Acid Substitutions in Drosophila simulans

    In Drosophila, multiple lines of evidence converge in suggesting that beneficial substitutions to the genome may be common. All suffer from confounding factors, however, such that the interpretation of the evidence (in particular, conclusions about the rate and strength of beneficial substitutions) remains tentative. Here, we use genome-wide polymorphism data in D. simulans and sequenced genomes of its close relatives to construct a readily interpretable characterization of the effects of positive selection: the shape of average neutral diversity around amino acid substitutions. As expected under recurrent selective sweeps, we find a trough in diversity levels around amino acid but not around synonymous substitutions, a distinctive pattern that is not expected under alternative models. This characterization is richer than previous approaches, which relied on limited summaries of the data (e.g., the slope of a scatter plot), and relates to underlying selection parameters in a straightforward way, allowing us to make more reliable inferences about the prevalence and strength of adaptation. Specifically, we develop a coalescent-based model for the shape of the entire curve and use it to infer adaptive parameters by maximum likelihood. Our inference suggests that ~13% of amino acid substitutions cause selective sweeps. Interestingly, it reveals two classes of beneficial fixations: a minority (approximately 3%) that appears to have had large selective effects and accounts for most of the reduction in diversity, and the remaining 10%, which seem to have had very weak selective effects. These estimates therefore help to reconcile the apparent conflict among previously published estimates of the strength of selection. More generally, our findings provide unequivocal evidence for strongly beneficial substitutions in Drosophila and illustrate how the rapidly accumulating genome-wide data can be leveraged to address enduring questions about the genetic basis of adaptation.

    Functional Categories Associated with Clusters of Genes That Are Co-Expressed across the NCI-60 Cancer Cell Lines

    The NCI-60 is a panel of 60 diverse human cancer cell lines used by the U.S. National Cancer Institute to screen compounds for anticancer activity. In the current study, gene expression levels from five platforms were integrated to yield a single composite transcriptome profile. The comprehensive and reliable nature of that dataset allows us to study gene co-expression across cancer cell lines. Hierarchical clustering revealed numerous clusters of genes in which the genes co-vary across the NCI-60. To determine functional categorization associated with each cluster, we used the Gene Ontology (GO) Consortium database and the GoMiner tool. GO maps genes to hierarchically-organized biological process categories. GoMiner can leverage GO to perform ontological analyses of gene expression studies, generating a list of significant functional categories. GoMiner analysis revealed many clusters of coregulated genes that are associated with functional groupings of GO biological process categories. Notably, those categories arising from coherent co-expression groupings reflect cancer-related themes such as adhesion, cell migration, RNA splicing, immune response and signal transduction. Thus, these clusters demonstrate transcriptional coregulation of functionally-related genes.

    The surprising negative correlation of gene length and optimal codon use - disentangling translational selection from GC-biased gene conversion in yeast

    Background: Surprisingly, in several multi-cellular eukaryotes optimal codon use correlates negatively with gene length. This contrasts with the expectation under selection for translational accuracy. While suggested explanations focus on variation in strength and efficiency of translational selection, it has rarely been noticed that the negative correlation is reported only in organisms whose optimal codons are biased towards codons that end with G or C (-GC). This raises the question of whether forces that affect base composition, such as GC-biased gene conversion, contribute to the negative correlation between optimal codon use and gene length. Results: Yeast is a good organism in which to study this, as equal numbers of optimal codons end in -GC and -AT, and one may hence compare frequencies of optimal GC-ending with optimal AT-ending codons to disentangle the forces. The results of this study demonstrate that in yeast the frequencies of GC-ending codons (both optimal and non-optimal) decrease with gene length and increase with recombination. A decrease of GC-ending codons along genes contributes to the negative correlation with gene length. Correlations with recombination and gene expression differentiate between GC-ending and optimal codons, and substitution patterns also support effects of GC-biased gene conversion. Conclusion: While the general effect of GC-biased gene conversion is well known, the negative correlation of optimal codon use with gene length has not been considered in this context before. Initiation of gene conversion events in promoter regions and the presence of a gene conversion gradient most likely explain the observed decrease of GC-ending codons with gene length and gene position.

    Does codon bias have an evolutionary origin?

    Background: There is a 3-fold redundancy in the Genetic Code; most amino acids are encoded by more than one codon. These synonymous codons are not used equally; there is a Codon Usage Bias (CUB). This article will provide novel information about the origin and evolution of this bias. Results: Codon Usage Bias (CUB, defined here as deviation from equal usage of synonymous codons) was studied in 113 species. The average CUB was 29.3 ± 1.1% (S.E.M., n = 113) of the theoretical maximum and declined progressively with evolution and increasing genome complexity. A Pan-Genomic Codon Usage Frequency (CUF) Table was constructed to describe genome-wide relationships among codons. Significant correlations were found between the number of synonymous codons and (i) the frequency of the respective amino acids and (ii) the size of CUB. Numerous, statistically highly significant, internal correlations were found among codons and the nucleic acids they comprise. These strong correlations made it possible to predict missing synonymous codons (wobble bases) reliably from the remaining codons or codon residues. Conclusion: The results put the concept of "codon bias" into a novel perspective. The internal connectivity of codons indicates that all synonymous codons might be integrated parts of the Genetic Code with equal importance in maintaining its functional integrity.
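
To make "deviation from equal usage of synonymous codons, as a fraction of the theoretical maximum" concrete, here is a minimal sketch of one way such a score could be computed. The scoring rule (summed absolute deviation per codon family, scaled by its largest possible value) is an assumption for illustration and may differ from the article's measure. The function name and the codon counts are hypothetical.

```python
def codon_usage_bias(codon_counts, synonymous_families):
    """Mean deviation from equal synonymous codon usage, scaled to [0, 1].

    codon_counts: dict mapping codon -> observed count
    synonymous_families: list of codon lists, one list per amino acid
    Families with a single codon or no observations are skipped.
    """
    scores = []
    for family in synonymous_families:
        k = len(family)
        total = sum(codon_counts.get(codon, 0) for codon in family)
        if k < 2 or total == 0:
            continue
        expected = 1.0 / k  # equal usage of all k synonymous codons
        deviation = sum(
            abs(codon_counts.get(codon, 0) / total - expected) for codon in family
        )
        # Largest possible deviation: all usage concentrated on one codon.
        max_deviation = 2.0 * (1.0 - expected)
        scores.append(deviation / max_deviation)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical counts for two two-codon families (Lys and Phe).
families = [["AAA", "AAG"], ["TTT", "TTC"]]
counts = {"AAA": 30, "AAG": 10, "TTT": 5, "TTC": 5}
print(f"CUB = {codon_usage_bias(counts, families):.2%} of the maximum")
```

In this toy input the lysine codons are used unevenly and the phenylalanine codons evenly, so the averaged score falls between the two extremes of 0 (equal usage) and 1 (one codon used exclusively).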

    Generalized Connective Tissue Disease in Crtap-/- Mouse

    Mutations in CRTAP (coding for cartilage-associated protein), LEPRE1 (coding for prolyl 3-hydroxylase 1 [P3H1]) or PPIB (coding for Cyclophilin B [CYPB]) cause recessive forms of osteogenesis imperfecta and loss or decrease of type I collagen prolyl 3-hydroxylation. A comprehensive analysis of the phenotype of the Crtap-/- mice revealed multiple abnormalities of connective tissue, including in the lungs, kidneys, and skin, consistent with systemic dysregulation of collagen homeostasis within the extracellular matrix. Both Crtap-/- lung and kidney glomeruli showed increased cellular proliferation. Histologically, the lungs showed increased alveolar spacing, while the kidneys showed evidence of segmental glomerulosclerosis, with abnormal collagen deposition. The Crtap-/- skin had decreased mechanical integrity. In addition to the expected loss of proline 986 3-hydroxylation in α1(I) and α1(II) chains, there was also loss of 3Hyp at proline 986 in α2(V) chains. In contrast, at two of the known 3Hyp sites in α1(IV) chains from Crtap-/- kidneys there were normal levels of 3-hydroxylation. On a cellular level, loss of CRTAP in human OI fibroblasts led to a secondary loss of P3H1, and vice versa. These data suggest that both CRTAP and P3H1 are required to maintain a stable complex that 3-hydroxylates canonical proline sites within clade A (types I, II, and V) collagen chains. Loss of this activity leads to a multi-systemic connective tissue disease that affects bone, cartilage, lung, kidney, and skin.