8 research outputs found

    Forces Shaping the Fastest Evolving Regions in the Human Genome

    Comparative genomics allows us to search the human genome for segments that were extensively changed in the last ~5 million years since divergence from our common ancestor with chimpanzee, but are highly conserved in other species and thus are likely to be functional. We found 202 genomic elements that are highly conserved in vertebrates but show evidence of significantly accelerated substitution rates in human. These are mostly in non-coding DNA, often near genes associated with transcription and DNA binding. Resequencing confirmed that the five most accelerated elements are dramatically changed in human but not in other primates, with seven times more substitutions in human than in chimp. The accelerated elements, and in particular the top five, show a strong bias for adenine and thymine to guanine and cytosine nucleotide changes and are disproportionately located in high recombination and high guanine and cytosine content environments near telomeres, suggesting either biased gene conversion or isochore selection. In addition, there is some evidence of directional selection in the regions containing the two most accelerated elements. A combination of evolutionary forces has contributed to accelerated evolution of the fastest evolving elements in the human genome.

    Substitution Bias and Acceleration

    W→S substitutions (red) increase with acceleration, while S→W substitutions (blue) do not.
    (A) Proportion of all bases that have W→S and S→W changes versus acceleration in our genome-wide scan of 34,498 elements. The mean proportion of each type of substitution is plotted for four groups based on the amount of acceleration as quantified by the LRT: extreme (p < 4.5e−4), high (4.5e−4 ≤ p < 0.05), medium (0.05 ≤ p < 0.1), and low (p ≥ 0.1). These groups correspond to HAR1–HAR5, HAR6–HAR49, HAR50–HAR202, and the remaining ~34,000 conserved elements. The normal 95% confidence interval for each mean is shown with dotted lines. These are estimates of the unconditional probability P(human = S, ancestor = W) that a base is strong in human and weak in the ancestral consensus sequence, and vice versa. The differences between substitution types are statistically significant in the extreme and high groups.
    (B) The same plot, but dividing by the proportion of ancestral bases that are weak or strong. These are estimates of the conditional probability P(human = S | ancestor = W) that a base is strong in human, given that the ancestral base is weak, and vice versa. The differences between substitution types are significant in the extreme group only.
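The distinction between the unconditional proportion in panel (A) and the conditional proportion in panel (B) can be illustrated with a minimal sketch. It assumes a gap-free pairwise alignment of an ancestral consensus sequence and the human sequence, treats A/T as weak (W) and G/C as strong (S), and skips any other character. The function name `substitution_proportions` and the toy sequences are hypothetical and are not taken from the study; the confidence intervals and LRT grouping described in the caption are not reproduced here.

```python
WEAK = frozenset("AT")    # weak (W) bases: two hydrogen bonds
STRONG = frozenset("GC")  # strong (S) bases: three hydrogen bonds


def substitution_proportions(ancestor: str, human: str) -> dict:
    """Estimate W->S and S->W proportions from two aligned sequences.

    Returns the unconditional proportions (as in panel A), e.g.
    P(human = S, ancestor = W), and the conditional proportions
    (as in panel B), e.g. P(human = S | ancestor = W).
    """
    if len(ancestor) != len(human):
        raise ValueError("sequences must be aligned to the same length")

    n = weak_anc = strong_anc = w_to_s = s_to_w = 0
    for a, h in zip(ancestor.upper(), human.upper()):
        # Skip gaps and ambiguous characters in either sequence.
        if a not in WEAK | STRONG or h not in WEAK | STRONG:
            continue
        n += 1
        if a in WEAK:
            weak_anc += 1
            if h in STRONG:
                w_to_s += 1
        else:
            strong_anc += 1
            if h in WEAK:
                s_to_w += 1

    def ratio(num: int, den: int) -> float:
        # Report NaN rather than raising when a class is empty.
        return num / den if den else float("nan")

    return {
        "joint_W_to_S": ratio(w_to_s, n),
        "joint_S_to_W": ratio(s_to_w, n),
        "cond_W_to_S": ratio(w_to_s, weak_anc),
        "cond_S_to_W": ratio(s_to_w, strong_anc),
    }


if __name__ == "__main__":
    # Toy AT-rich ancestor: 6 weak and 2 strong ancestral bases.
    result = substitution_proportions("AATTTTGC", "GGTTTTGA")
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

In this toy alignment the unconditional W→S proportion (2 of 8 bases) exceeds the S→W proportion (1 of 8), yet the conditional S→W proportion (1 of 2 strong ancestral bases) exceeds the conditional W→S proportion (2 of 6 weak ancestral bases). Normalizing by ancestral base composition, as panel (B) does, can therefore change which substitution type appears more frequent.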

    Comparison of Substitution Rates in HAR1–HAR5

    For each HAR element, the estimated substitution rate is indicated by a circle for the human lineage and by a triangle for the chimp lineage. As a benchmark, background human-chimp substitution rates estimated from 4d sites in ENCODE regions [39] are marked with vertical lines, solid red for the genome-wide neutral rate, and dotted blue for the neutral rate in final chromosome bands. The chimp rates in all five elements fall well below the human rates, which exceed the background rates by as much as an order of magnitude. H, human; C, chimp.

    Perspectives on ENCODE

    The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression [1]. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.

    Expanded encyclopaedias of DNA elements in the human and mouse genomes

    The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE [1] and Roadmap Epigenomics [2] data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9% and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.