4 research outputs found

    The landscape of tolerated genetic variation in humans and primates

    Identification of constrained sequence elements across 239 primate genomes

    Noncoding DNA is central to our understanding of human gene regulation and complex diseases [1,2], and measuring the evolutionary sequence constraint can establish the functional relevance of putative regulatory elements in the human genome [3–9]. Identifying the genomic elements that have become constrained specifically in primates has been hampered by the faster evolution of noncoding DNA compared to protein-coding DNA [10], the relatively short timescales separating primate species [11], and the previously limited availability of whole-genome sequences [12]. Here we construct a whole-genome alignment of 239 species, representing nearly half of all extant species in the primate order. Using this resource, we identified human regulatory elements that are under selective constraint across primates and other mammals at a 5% false discovery rate. We detected 111,318 DNase I hypersensitivity sites and 267,410 transcription factor binding sites that are constrained specifically in primates but not across other placental mammals and validate their cis-regulatory effects on gene expression. These regulatory elements are enriched for human genetic variants that affect gene expression and complex traits and diseases. Our results highlight the important role of recent evolution in regulatory sequence elements differentiating primates, including humans, from other placental mammals.

    The landscape of tolerated genetic variation in humans and primates.

    Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole-genome sequencing data for 809 individuals from 233 primate species and identified 4.3 million common protein-altering variants with orthologs in humans. We show that these variants can be inferred to have nondeleterious effects in humans based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.

    Identification of constrained sequence elements across 239 primate genomes

    Noncoding DNA is central to our understanding of human gene regulation and complex diseases, and measuring the evolutionary sequence constraint can establish the functional relevance of putative regulatory elements in the human genome. Identifying the genomic elements that have become constrained specifically in primates has remained largely elusive due to the faster evolution of noncoding DNA compared to protein-coding DNA, the relatively short timescales separating primate species, and the previously limited availability of whole-genome sequences. Here, we construct a whole-genome alignment of 239 species, representing nearly half of all extant species in the primate order. Using this resource, we identified human regulatory elements under selective constraint across primates and other mammals at a 5% false discovery rate. We detect 111,318 DNase I hypersensitivity sites and 267,410 transcription factor binding sites that are constrained specifically in primates but not across other placental mammals and validate their cis-regulatory effects on gene expression. These regulatory elements are enriched for human genetic variants affecting gene expression and complex traits and diseases. Our results highlight the important role of recent evolution in regulatory sequence elements differentiating primates, including humans, from other placental mammals.