A Solvable Sequence Evolution Model and Genomic Correlations
We study a minimal model for genome evolution whose elementary processes are
single site mutation, duplication and deletion of sequence regions and
insertion of random segments. These processes are found to generate long-range
correlations in the composition of letters as long as the sequence length is
growing, i.e., the combined rates of duplications and insertions are higher
than the deletion rate. For constant sequence length, on the other hand, all
initial correlations decay exponentially. These results are obtained
analytically and by simulations. They are compared with the long-range
correlations observed in genomic DNA, and the implications for genome evolution
are discussed. Comment: 4 pages, 4 figures
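The four elementary processes named in this abstract can be sketched as a small simulation. This is only an illustration under stated assumptions, not the authors' model definition: the function names, the uniform segment-length distribution, the `max_segment` parameter, and the choice to place a duplicate directly next to its source are all hypothetical, since the abstract does not specify them.

```python
import random

ALPHABET = "ACGT"

def evolve(seq, steps, rates, max_segment=10, rng=None):
    """Apply `steps` random elementary events to a sequence.

    `rates` maps event names ("mutation", "duplication", "deletion",
    "insertion") to relative weights. Segment lengths are drawn uniformly
    from 1..max_segment (an assumption made for this sketch).
    """
    rng = rng or random.Random()
    seq = list(seq)
    kinds = list(rates)
    weights = [rates[k] for k in kinds]
    for _ in range(steps):
        kind = rng.choices(kinds, weights=weights)[0]
        pos = rng.randrange(len(seq))
        length = rng.randint(1, max_segment)
        if kind == "mutation":
            # Single-site mutation: redraw one letter at random.
            seq[pos] = rng.choice(ALPHABET)
        elif kind == "duplication":
            # Copy a region and insert the copy next to the original.
            seq[pos:pos] = seq[pos:pos + length]
        elif kind == "deletion":
            # Remove a region, but never empty the sequence.
            if len(seq) > length:
                del seq[pos:pos + length]
        elif kind == "insertion":
            # Insert a segment of random letters.
            seq[pos:pos] = [rng.choice(ALPHABET) for _ in range(length)]
    return "".join(seq)

def composition_correlation(seq, r):
    """Probability that letters at distance r match, minus the value
    expected from the letter frequencies alone."""
    n = len(seq) - r
    same = sum(seq[i] == seq[i + r] for i in range(n)) / n
    freqs = [seq.count(a) / len(seq) for a in ALPHABET]
    return same - sum(f * f for f in freqs)
```

Setting the duplication and insertion weights above the deletion weight gives a growing sequence, the regime in which the abstract reports long-range correlations; `composition_correlation` can then be evaluated at increasing `r` to inspect how the correlation decays.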
Rate and cost of adaptation in the Drosophila Genome
Recent studies have consistently inferred high rates of adaptive molecular
evolution between Drosophila species. At the same time, the Drosophila genome
evolves under different rates of recombination, which results in partial
genetic linkage between alleles at neighboring genomic loci. Here we analyze
how linkage correlations affect adaptive evolution. We develop a new inference
method for adaptation that takes into account the effect on an allele at a
focal site caused by neighboring deleterious alleles (background selection) and
by neighboring adaptive substitutions (hitchhiking). Using complete genome
sequence data and fine-scale recombination maps, we infer a highly
heterogeneous scenario of adaptation in Drosophila. In high-recombining
regions, about 50% of all amino acid substitutions are adaptive, together with
about 20% of all substitutions in proximal intergenic regions. In
low-recombining regions, only a small fraction of the amino acid substitutions
are adaptive, while hitchhiking accounts for the majority of these changes.
Hitchhiking of deleterious alleles generates a substantial collateral cost of
adaptation, leading to a fitness decline of about 30/2N per gene and per
million years in the lowest-recombining regions. Our results show how
recombination shapes rate and efficacy of the adaptive dynamics in eukaryotic
genomes.
A Model-Based Analysis of GC-Biased Gene Conversion in the Human and Chimpanzee Genomes
GC-biased gene conversion (gBGC) is a recombination-associated process that favors the fixation of G/C alleles over A/T alleles. In mammals, gBGC is hypothesized to contribute to variation in GC content, rapidly evolving sequences, and the fixation of deleterious mutations, but its prevalence and general functional consequences remain poorly understood. gBGC is difficult to incorporate into models of molecular evolution and so far has primarily been studied using summary statistics from genomic comparisons. Here, we introduce a new probabilistic model that captures the joint effects of natural selection and gBGC on nucleotide substitution patterns, while allowing for correlations along the genome in these effects. We implemented our model in a computer program, called phastBias, that can accurately detect gBGC tracts about 1 kilobase or longer in simulated sequence alignments. When applied to real primate genome sequences, phastBias predicts gBGC tracts that cover roughly 0.3% of the human and chimpanzee genomes and account for 1.2% of human-chimpanzee nucleotide differences. These tracts fall in clusters, particularly in subtelomeric regions; they are enriched for recombination hotspots and fast-evolving sequences; and they display an ongoing fixation preference for G and C alleles. They are also significantly enriched for disease-associated polymorphisms, suggesting that they contribute to the fixation of deleterious alleles. The gBGC tracts provide a unique window into historical recombination processes along the human and chimpanzee lineages. They supply additional evidence of long-term conservation of megabase-scale recombination rates accompanied by rapid turnover of hotspots. Together, these findings shed new light on the evolutionary, functional, and disease implications of gBGC. The phastBias program and our predicted tracts are freely available. © 2013 Capra et al
Extensive divergence of transcription factor binding in Drosophila embryos with highly conserved gene expression
Comment: 7 figures, 20 supplementary figures, 6 supplementary tables. Paris M,
Kaplan T, Li XY, Villalta JE, Lott SE, et al. (2013) Extensive Divergence of
Transcription Factor Binding in Drosophila Embryos with Highly Conserved Gene
Expression. PLoS Genet 9(9): e1003748. doi:10.1371/journal.pgen.100374
Substantial regional variation in substitution rates in the human genome: importance of GC content, gene density and telomere-specific effects
This study presents the first global, 1 Mbp level analysis of patterns of
nucleotide substitutions along the human lineage. The study is based on the
analysis of a large number of repetitive elements deposited into the human
genome since the mammalian radiation, yielding a number of results that would
have been difficult to obtain using the more conventional comparative method of
analysis. This analysis revealed substantial and consistent variability of
rates of substitution, with the variability ranging up to 2-fold among
different regions. The rates of substitutions of C or G nucleotides with A or T
nucleotides vary much more sharply than the reverse rates suggesting that much
of that variation is due to differences in mutation rates rather than in the
probabilities of fixation of C/G vs. A/T nucleotides across the genome. For all
types of substitution we observe substantially more hotspots than coldspots,
with hotspots showing substantial clustering over tens of Mbp. Our analysis
revealed that GC-content of surrounding sequences is the best predictor of the
rates of substitution. The pattern of substitution appears very different near
telomeres compared to the rest of the genome and cannot be explained by the
genome-wide correlations of the substitution rates with GC content or exon
density. The telomere pattern of substitution is consistent with natural
selection or biased gene conversion acting to increase the GC-content of the
sequences that are within 10-15 Mbp of the telomere. Comment: 35 pages, 6 figures
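Since this abstract names the GC content of surrounding sequence as the best predictor of substitution rate, a windowed GC-content calculation may help make that quantity concrete. This is a minimal sketch and not the authors' pipeline: the function name and the non-overlapping window scheme are assumptions, and the study's own windows are at the 1 Mbp scale.

```python
def gc_content_by_window(seq, window):
    """Return the G+C fraction of each full, non-overlapping window.

    A trailing partial window is dropped (a simplification for this sketch).
    """
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        fractions.append(gc / window)
    return fractions
```

For a real chromosome one would call this with `window=1_000_000` and compare the resulting values against per-window substitution rates.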