32,616 research outputs found

    A Solvable Sequence Evolution Model and Genomic Correlations

    Full text link
    We study a minimal model for genome evolution whose elementary processes are single site mutation, duplication and deletion of sequence regions and insertion of random segments. These processes are found to generate long-range correlations in the composition of letters as long as the sequence length is growing, i.e., the combined rates of duplications and insertions are higher than the deletion rate. For constant sequence length, on the other hand, all initial correlations decay exponentially. These results are obtained analytically and by simulations. They are compared with the long-range correlations observed in genomic DNA, and the implications for genome evolution are discussed.Comment: 4 pages, 4 figure

    Rate and cost of adaptation in the Drosophila Genome

    Full text link
    Recent studies have consistently inferred high rates of adaptive molecular evolution between Drosophila species. At the same time, the Drosophila genome evolves under different rates of recombination, which results in partial genetic linkage between alleles at neighboring genomic loci. Here we analyze how linkage correlations affect adaptive evolution. We develop a new inference method for adaptation that takes into account the effect on an allele at a focal site caused by neighboring deleterious alleles (background selection) and by neighboring adaptive substitutions (hitchhiking). Using complete genome sequence data and fine-scale recombination maps, we infer a highly heterogeneous scenario of adaptation in Drosophila. In high-recombining regions, about 50% of all amino acid substitutions are adaptive, together with about 20% of all substitutions in proximal intergenic regions. In low-recombining regions, only a small fraction of the amino acid substitutions are adaptive, while hitchhiking accounts for the majority of these changes. Hitchhiking of deleterious alleles generates a substantial collateral cost of adaptation, leading to a fitness decline of about 30/2N per gene and per million years in the lowest-recombining regions. Our results show how recombination shapes rate and efficacy of the adaptive dynamics in eukaryotic genomes

    A Model-Based Analysis of GC-Biased Gene Conversion in the Human and Chimpanzee Genomes

    Get PDF
    GC-biased gene conversion (gBGC) is a recombination-associated process that favors the fixation of G/C alleles over A/T alleles. In mammals, gBGC is hypothesized to contribute to variation in GC content, rapidly evolving sequences, and the fixation of deleterious mutations, but its prevalence and general functional consequences remain poorly understood. gBGC is difficult to incorporate into models of molecular evolution and so far has primarily been studied using summary statistics from genomic comparisons. Here, we introduce a new probabilistic model that captures the joint effects of natural selection and gBGC on nucleotide substitution patterns, while allowing for correlations along the genome in these effects. We implemented our model in a computer program, called phastBias, that can accurately detect gBGC tracts about 1 kilobase or longer in simulated sequence alignments. When applied to real primate genome sequences, phastBias predicts gBGC tracts that cover roughly 0.3% of the human and chimpanzee genomes and account for 1.2% of human-chimpanzee nucleotide differences. These tracts fall in clusters, particularly in subtelomeric regions; they are enriched for recombination hotspots and fast-evolving sequences; and they display an ongoing fixation preference for G and C alleles. They are also significantly enriched for disease-associated polymorphisms, suggesting that they contribute to the fixation of deleterious alleles. The gBGC tracts provide a unique window into historical recombination processes along the human and chimpanzee lineages. They supply additional evidence of long-term conservation of megabase-scale recombination rates accompanied by rapid turnover of hotspots. Together, these findings shed new light on the evolutionary, functional, and disease implications of gBGC. The phastBias program and our predicted tracts are freely available. © 2013 Capra et al

    Extensive divergence of transcription factor binding in Drosophila embryos with highly conserved gene expression

    Get PDF
    Extensive divergence of transcription factor binding in Drosophila embryos with highly conserved gene expressionComment: 7 figures, 20 supplementary figures, 6 supplementary tables Paris M, Kaplan T, Li XY, Villalta JE, Lott SE, et al. (2013) Extensive Divergence of Transcription Factor Binding in Drosophila Embryos with Highly Conserved Gene Expression. PLoS Genet 9(9): e1003748. doi:10.1371/journal.pgen.100374

    Substantial regional variation in substitution rates in the human genome: importance of GC content, gene density and telomere-specific effects

    Full text link
    This study presents the first global, 1 Mbp level analysis of patterns of nucleotide substitutions along the human lineage. The study is based on the analysis of a large amount of repetitive elements deposited into the human genome since the mammalian radiation, yielding a number of results that would have been difficult to obtain using the more conventional comparative method of analysis. This analysis revealed substantial and consistent variability of rates of substitution, with the variability ranging up to 2-fold among different regions. The rates of substitutions of C or G nucleotides with A or T nucleotides vary much more sharply than the reverse rates suggesting that much of that variation is due to differences in mutation rates rather than in the probabilities of fixation of C/G vs. A/T nucleotides across the genome. For all types of substitution we observe substantially more hotspots than coldspots, with hotspots showing substantial clustering over tens of Mbp's. Our analysis revealed that GC-content of surrounding sequences is the best predictor of the rates of substitution. The pattern of substitution appears very different near telomeres compared to the rest of the genome and cannot be explained by the genome-wide correlations of the substitution rates with GC content or exon density. The telomere pattern of substitution is consistent with natural selection or biased gene conversion acting to increase the GC-content of the sequences that are within 10-15 Mbp away from the telomere.Comment: 35 pages, 6 figure
    • …
    corecore