Expanded encyclopaedias of DNA elements in the human and mouse genomes
All data are available on the ENCODE data portal: www.encodeproject.org. All code is available on GitHub from the links provided in the methods section. Code related to the Registry of cCREs can be found at https://github.com/weng-lab/ENCODE-cCREs. Code related to SCREEN can be found at https://github.com/weng-lab/SCREEN. © The Author(s) 2020. The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9% and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource.
Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes. This work was supported by grants from the NIH under U01HG007019, U01HG007033, U01HG007036, U01HG007037, U41HG006992, U41HG006993, U41HG006994, U41HG006995, U41HG006996, U41HG006997, U41HG006998, U41HG006999, U41HG007000, U41HG007001, U41HG007002, U41HG007003, U54HG006991, U54HG006997, U54HG006998, U54HG007004, U54HG007005, U54HG007010 and UM1HG009442.
Allele-specific control of replication timing and genome organization during development
DNA replication occurs in a defined temporal order known as the replication-timing (RT) program. RT is regulated during development in discrete chromosomal units, coordinated with transcriptional activity and 3D genome organization. Here, we derived distinct cell types from F1 hybrid musculus × castaneus mouse crosses and exploited the high single-nucleotide polymorphism (SNP) density to characterize allelic differences in RT (Repli-seq), genome organization (Hi-C and promoter-capture Hi-C), gene expression (total nuclear RNA-seq), and chromatin accessibility (ATAC-seq). We also present HARP, a new computational tool for sorting SNPs in phased genomes to efficiently measure allele-specific genome-wide data. Analysis of six different hybrid mESC clones with different genomes (C57BL/6, 129/sv, and CAST/Ei), parental configurations, and gender revealed significant RT asynchrony between alleles across ∼12% of the autosomal genome linked to subspecies genomes but not to parental origin, growth conditions, or gender. RT asynchrony in mESCs strongly correlated with changes in Hi-C compartments between alleles but not as strongly with SNP density, gene expression, imprinting, or chromatin accessibility. We then tracked mESC RT asynchronous regions during development by analyzing differentiated cell types, including extraembryonic endoderm stem (XEN) cells, four male and female primary mouse embryonic fibroblasts (MEFs), and neural precursor cells (NPCs) differentiated in vitro from mESCs with opposite parental configurations. We found that RT asynchrony and allelic discordance in Hi-C compartments seen in mESCs were largely lost in all differentiated cell types, accompanied by novel sites of allelic asynchrony at a considerably smaller proportion of the genome, suggesting that genome organization of homologs converges to similar folding patterns during cell fate commitment.
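To illustrate the general idea of measuring allele-specific signal from SNPs in a phased genome, the following is a minimal sketch only. It is not HARP's algorithm or interface, which the abstract does not describe; the SNP table, the function name `assign_allele`, the allele labels, and the assumption of ungapped reads with 0-based coordinates are all hypothetical.

```python
from collections import Counter

# Hypothetical phased SNP table (an assumption for illustration):
# (chromosome, 0-based position) -> (base on allele 1, base on allele 2).
SNPS = {
    ("chr1", 100): ("A", "G"),
    ("chr1", 150): ("C", "T"),
}


def assign_allele(chrom, start, seq, snps=SNPS):
    """Assign an ungapped read to an allele from the SNPs it overlaps.

    Returns 'allele_1', 'allele_2', 'ambiguous' (SNPs disagree),
    or 'unassigned' (no informative SNP covered).
    """
    votes = Counter()
    for offset, base in enumerate(seq):
        alleles = snps.get((chrom, start + offset))
        if alleles is None:
            continue  # position is not a known SNP
        if base == alleles[0]:
            votes["allele_1"] += 1
        elif base == alleles[1]:
            votes["allele_2"] += 1
        # any other base (e.g. a sequencing error) casts no vote
    if not votes:
        return "unassigned"
    if len(votes) == 2:
        return "ambiguous"
    return next(iter(votes))
```

Reads labelled this way could then be counted per allele and per genomic window before comparing any signal, such as replication timing, between the two homologs.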
Transcription-dependent regulation of replication dynamics modulates genome stability
Common fragile sites (CFSs) are loci that are hypersensitive to replication stress and hotspots for chromosomal rearrangements in cancers. CFSs replicate late in S phase, are cell-type specific and nest in large genes. The relative impact of transcription-replication conflicts versus a low density in initiation events on fragility is currently debated. Here we addressed the relationships between transcription, replication, and instability by manipulating the transcription of endogenous large genes in chicken and human cells. We found that inducing low transcription with a weak promoter destabilized large genes, whereas stimulating their transcription with strong promoters alleviated instability. Notably, strong promoters triggered a switch to an earlier replication timing, supporting a model in which high transcription levels give cells more time to complete replication before mitosis. Transcription could therefore contribute to maintaining genome integrity, challenging the dominant view that it is exclusively a threat.
Integrative detection and analysis of structural variation in cancer genomes
Structural variants (SVs) can contribute to oncogenesis through a variety of mechanisms. Despite their importance, the identification of SVs in cancer genomes remains challenging. Here, we present a framework that integrates optical mapping, high-throughput chromosome conformation capture (Hi-C), and whole-genome sequencing to systematically detect SVs in a variety of normal or cancer samples and cell lines. We identify the unique strengths of each method and demonstrate that only integrative approaches can comprehensively identify SVs in the genome. By combining Hi-C and optical mapping, we resolve complex SVs and phase multiple SV events to a single haplotype. Furthermore, we observe widespread structural variation events affecting the functions of noncoding sequences, including the deletion of distal regulatory sequences, alteration of DNA replication timing, and the creation of novel three-dimensional chromatin structural domains. Our results indicate that noncoding SVs may be underappreciated mutational drivers in cancer genomes.