
    Correlated Evolution of Nearby Residues in Drosophilid Proteins

    Here we investigate the correlations between coding sequence substitutions as a function of their separation along the protein sequence. We consider both substitutions between the reference genomes of several Drosophilids as well as polymorphisms in a population sample of Zimbabwean Drosophila melanogaster. We find that amino acid substitutions are “clustered” along the protein sequence, that is, the frequency of additional substitutions is strongly enhanced within ≈10 residues of a first such substitution. No such clustering is observed for synonymous substitutions, supporting a “correlation length” associated with selection on proteins as the causative mechanism. Clustering is stronger between substitutions that arose in the same lineage than it is between substitutions that arose in different lineages. We consider several possible origins of clustering, concluding that epistasis (interactions between amino acids within a protein that affect function) and positional heterogeneity in the strength of purifying selection are primarily responsible. The role of epistasis is directly supported by the tendency of nearby substitutions that arose on the same lineage to preserve the total charge of the residues within the correlation length and by the preferential cosegregation of neighboring derived alleles in our population sample. We interpret the observed length scale of clustering as a statistical reflection of the functional locality (or modularity) of proteins: amino acids that are near each other on the protein backbone are more likely to contribute to, and collaborate toward, a common subfunction.
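
The clustering described in this abstract can be illustrated with a toy statistic: count pairs of substitutions at each separation and compare against what uniform random placement would give. The sketch below is not the authors' analysis; the function name `separation_enrichment`, the input layout, and the uniform-placement null are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb


def separation_enrichment(proteins, max_sep):
    """Observed/expected counts of substitution pairs by separation.

    proteins: list of (length, positions), where positions are distinct
    0-based residue indices of substitutions in one protein (hypothetical
    input layout). The null assumes substitutions fall uniformly at random
    on distinct sites of each protein.
    """
    observed = Counter()
    expected = Counter()
    for length, positions in proteins:
        n_pairs = comb(len(positions), 2)
        # Observed: separation of every pair of substitutions in this protein.
        for a, b in combinations(sorted(positions), 2):
            d = b - a
            if d <= max_sep:
                observed[d] += 1
        # Expected: a protein of this length has (length - d) site pairs at
        # separation d out of comb(length, 2) site pairs in total.
        total_site_pairs = comb(length, 2)
        for d in range(1, min(max_sep, length - 1) + 1):
            expected[d] += n_pairs * (length - d) / total_site_pairs
    return {
        d: observed[d] / expected[d]
        for d in range(1, max_sep + 1)
        if expected[d] > 0
    }


if __name__ == "__main__":
    # Two adjacent substitutions in a 10-residue protein.
    print(separation_enrichment([(10, [0, 1])], max_sep=3))
```

Values above 1 at small separations would indicate clustering under this null. Real data would also need to account for the positional heterogeneity in constraint that the abstract discusses, which this sketch ignores.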

    The use of information theory in evolutionary biology

    Information is a key concept in evolutionary biology. Information is stored in the genomes of biological organisms, and used to generate the organism as well as to maintain and control it. Information is also "that which evolves". When a population adapts to a local environment, information about this environment is fixed in a representative genome. However, when an environment changes, information can be lost. At the same time, information is processed by animal brains to survive in complex environments, and the capacity for information processing also evolves. Here I review applications of information theory to the evolution of proteins as well as to the evolution of information processing in simulated agents that adapt to perform a complex task. Comment: 25 pages, 7 figures. To appear in "The Year in Evolutionary Biology", of the Annals of the NY Academy of Science.
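
One common way to put a number on the information stored in a protein family is to sum, over alignment columns, the gap between the maximum possible entropy and the observed per-column Shannon entropy. The sketch below assumes that estimator; the function names, the 20-letter alphabet default, and the toy alignment are illustrative and not taken from the review.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_content(alignment, alphabet_size=20):
    """Sum over columns of (maximum entropy - observed entropy), in bits.

    alignment: list of equal-length sequences (hypothetical toy input).
    This ignores gaps, finite-sample bias, and correlations between sites.
    """
    h_max = log2(alphabet_size)
    return sum(h_max - column_entropy(col) for col in zip(*alignment))


if __name__ == "__main__":
    # First column is fully conserved, second column is split evenly.
    print(information_content(["AC", "AD"]))
```

A fully conserved column contributes the full log2(20) bits, while a column that varies freely contributes close to nothing, which matches the intuition that adaptation fixes information and relaxed constraint loses it.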

    A genomic map of the effects of linked selection in Drosophila

    Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of 'linked selection' on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1 Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTRs). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in UTRs. Our inference further suggests a substantial effect of linked selection from non-classic sweeps. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.
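
The composite-likelihood step in this abstract can be sketched in miniature: treat each neutral site as an independent Bernoulli observation (polymorphic or not), let its probability depend on a local reduction factor, and sum the log-probabilities across sites. This is a deliberately simplified stand-in, not the paper's model: the single parameter `pi0`, the per-site `reduction` factor, and the grid search are all assumptions for illustration.

```python
from math import log


def composite_log_likelihood(pi0, sites):
    """Sum of per-site log-probabilities, treating sites as independent.

    sites: iterable of (reduction, is_polymorphic), where reduction in (0, 1]
    is a hypothetical local factor by which linked selection lowers the
    baseline polymorphism probability pi0 (assumed to lie in (0, 1)).
    """
    total = 0.0
    for reduction, is_polymorphic in sites:
        p = pi0 * reduction  # probability that this neutral site is polymorphic
        total += log(p) if is_polymorphic else log(1.0 - p)
    return total


def fit_pi0(sites, grid):
    """Pick the grid value of pi0 that maximizes the composite likelihood."""
    return max(grid, key=lambda pi0: composite_log_likelihood(pi0, sites))


if __name__ == "__main__":
    # Ten sites with no local reduction, one of them polymorphic.
    sites = [(1.0, True)] + [(1.0, False)] * 9
    print(fit_pi0(sites, grid=[0.05, 0.1, 0.2]))
```

In the approach the abstract describes, the per-site probability would instead be computed from annotations, substitution patterns, and recombination rates, with several selection parameters estimated jointly. The sketch only shows how evidence is pooled across sites.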

    Evolution of genes related to temperature adaptation in Drosophila melanogaster as revealed by QTL and population genetics analyses

    The fixation of beneficial variants leaves genomic footprints characterized by a reduction of genetic variation at linked neutral sites and strong, localized allele frequency differentiation among subpopulations. In contrast, for phenotypic evolution the effect of adaptation on the genes controlling the trait is poorly understood. Theoretical work on polygenic selection suggests that fixations of beneficial alleles (causing selective sweeps) are less likely than small-to-moderate allele frequency shifts among subpopulations. This thesis encompasses three projects in which we have experimentally addressed the issue of selective sweeps vs. allele frequency shifts in the context of polygenic adaptation. We studied three X-linked QTL underlying variation in chill coma recovery time (CCRT), a proxy for cold tolerance, in Drosophila melanogaster from temperate (European) and tropical (African) environments. The analysis of these QTL was performed by means of selective sweep mapping and quantitative complementation tests coupled with expression assays. While the results of the selective sweep mapping approach identified a gene (CG4491) that is unlikely to be affecting CCRT, quantitative and gene expression analyses revealed two linked candidate genes (brk and CG1677) that appear to differ in their evolutionary histories. We found that the difference in expression of the gene brk between populations affects CCRT variation. Cold-tolerant flies from the temperate zone have a lower expression of this gene than cold-sensitive flies from the tropics. We found that a likely cause of this difference is variation in a cis-regulatory element in the brk 5’ enhancer region. Sequence variants in this element exhibit moderate frequency differences between populations from temperate and tropical environments, forming two latitudinal clines: one from the equator to the north and another in the opposite direction to the south. In contrast, the other gene within the same QTL (CG1677), which is linked to brk, showed no measurable effect on cold tolerance but is a likely target of strong positive selection leading to a selective sweep in the European population. These results are consistent with the aforementioned theoretical predictions about footprints of selection in polygenic adaptation. They are also proof of the conceptual bias incurred when identifying candidate genes within a QTL via selective sweep mapping, at least in naturally evolving populations. The challenge for the evolutionary genetics community in the coming years is to develop statistical tools, as powerful and robust as those already available for mapping selective sweeps, that identify sites in the genome where allele frequency shifts have occurred due to adaptive evolution at the phenotypic level. Finally, the last section of the results is a report of a new population genetics dataset. It consists of a collection of 80 inbred lines from a natural D. melanogaster population in Sweden and 19 full genome sequences derived from this sample. We hope this material will provide us with further insight into the processes underlying adaptation to novel and stressful environments.