    A neural network based model effectively predicts enhancers from clinical ATAC-seq samples.

    Enhancers are cis-acting sequences that regulate transcription rates of their target genes in a cell-specific manner and harbor disease-associated sequence variants in cognate cell types. Many complex diseases are associated with enhancer malfunction, necessitating the discovery and study of enhancers from clinical samples. Assay for Transposase Accessible Chromatin (ATAC-seq) technology can interrogate chromatin accessibility from small cell numbers and facilitate studying enhancers in pathologies. However, on average, only ~35% of open chromatin regions (OCRs) from ATAC-seq samples map to enhancers. We developed a neural network-based model, Predicting Enhancers from ATAC-Seq data (PEAS), to effectively infer enhancers from clinical ATAC-seq samples by extracting ATAC-seq data features and integrating these with sequence-related features (e.g., GC ratio). PEAS recapitulated ChromHMM-defined enhancers in CD14+ monocytes, CD4+ T cells, GM12878, peripheral blood mononuclear cells, and pancreatic islets. PEAS models trained on these five cell types effectively predicted enhancers in four cell types that were not used in model training (EndoC-βH1, naïve CD8+ T, MCF7, and K562 cells). Finally, PEAS inferred individual-specific enhancers from 19 islet ATAC-seq samples and revealed variability in enhancer activity across individuals, including variability driven by genetic differences. PEAS is an easy-to-use tool developed to study enhancers in pathologies by taking advantage of the increasing number of clinical epigenomes.
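The abstract names GC ratio as one sequence-related feature. As a minimal sketch of what such a feature could look like, the following computes the fraction of G and C bases in a region's sequence. The function name `gc_ratio` and the choice to ignore ambiguous bases such as N are assumptions made for illustration; the abstract does not describe how PEAS computes this feature.

```python
def gc_ratio(sequence):
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C.

    Ambiguous bases such as N are ignored (an assumption for this sketch).
    Returns 0.0 when the sequence contains no unambiguous bases.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        return 0.0
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


if __name__ == "__main__":
    # Hypothetical open chromatin region sequence, for illustration only.
    print(gc_ratio("GGCCAATT"))
```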

    CoRE-ATAC: A deep learning model for the functional classification of regulatory elements from single cell and bulk ATAC-seq data.

    Cis-Regulatory elements (cis-REs) include promoters, enhancers, and insulators that regulate gene expression programs via binding of transcription factors. ATAC-seq technology effectively identifies active cis-REs in a given cell type (including from single cells) by mapping accessible chromatin at base-pair resolution. However, these maps are not immediately useful for inferring specific functions of cis-REs. For this purpose, we developed a deep learning framework (CoRE-ATAC) with novel data encoders that integrate DNA sequence (reference or personal genotypes) with ATAC-seq cut sites and read pileups. CoRE-ATAC was trained on 4 cell types (n = 6 samples/replicates) and accurately predicted known cis-RE functions from 7 cell types (n = 40 samples) that were not used in model training (mean average precision = 0.80, mean F1 score = 0.70). CoRE-ATAC enhancer predictions from 19 human islet samples coincided with genetically modulated gain/loss of enhancer activity, which was confirmed by massively parallel reporter assays (MPRAs). Finally, CoRE-ATAC effectively inferred cis-RE function from aggregate single nucleus ATAC-seq (snATAC) data from human blood-derived immune cells that overlapped with known functional annotations in sorted immune cells, which established the efficacy of these models to study cis-RE functions of rare cells without the need for cell sorting. ATAC-seq maps from primary human cells reveal individual- and cell-specific variation in cis-RE activity. CoRE-ATAC increases the functional resolution of these maps, a critical step for studying regulatory disruptions behind diseases

    Concomitant inhibition of PPARγ and mTORC1 induces the differentiation of human monocytes into highly immunogenic dendritic cells.

    Monocytes can differentiate into macrophages (Mo-Macs) or dendritic cells (Mo-DCs). The cytokine granulocyte-macrophage colony-stimulating factor (GM-CSF) induces the differentiation of monocytes into Mo-Macs, while the combination of GM-CSF/interleukin (IL)-4 is widely used to generate Mo-DCs for clinical applications and to study human DC biology. Here, we report that pharmacological inhibition of the nuclear receptor peroxisome proliferator-activated receptor gamma (PPARγ) in the presence of GM-CSF and the absence of IL-4 induces monocyte differentiation into Mo-DCs. Remarkably, we find that simultaneous inhibition of PPARγ and the nutrient sensor mammalian target of rapamycin complex 1 (mTORC1) induces the differentiation of Mo-DCs with stronger phenotypic stability, superior immunogenicity, and a transcriptional profile characterized by a strong type I interferon (IFN) signature, a lower expression of a large set of tolerogenic genes, and the differential expression of several transcription factors compared with GM-CSF/IL-4 Mo-DCs. Our findings uncover a pathway that tailors Mo-DC differentiation with potential implications in the fields of DC vaccination and cancer immunotherapy

    AMULET: a novel read count-based method for effective multiplet detection from single nucleus ATAC-seq data.

    Detecting multiplets in single nucleus (sn)ATAC-seq data is challenging due to data sparsity and limited dynamic range. AMULET (ATAC-seq MULtiplet Estimation Tool) enumerates regions with greater than two uniquely aligned reads across the genome to effectively detect multiplets. We evaluate the method by generating snATAC-seq data in human blood and pancreatic islet samples. Compared to alternatives, AMULET has high precision, estimated via donor-based multiplexing, and high recall, estimated via simulated multiplets. It identifies multiplets most effectively when a read depth of 25K median valid reads per nucleus is achieved.
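The counting idea in this abstract can be illustrated with a small sweep over one nucleus's fragments. The sketch below counts the regions where more than two fragments overlap. It is not the AMULET implementation: the function name, the `(chrom, start, end)` half-open input format, and the absence of any statistical test on the resulting count are all assumptions made for illustration.

```python
from collections import defaultdict


def count_regions_over_depth(fragments, max_depth=2):
    """Count maximal regions covered by more than `max_depth` fragments.

    fragments: iterable of (chrom, start, end) half-open intervals that
    belong to a single nucleus (hypothetical input format).
    """
    events = defaultdict(list)
    for chrom, start, end in fragments:
        events[chrom].append((start, 1))   # fragment opens
        events[chrom].append((end, -1))    # fragment closes

    regions = 0
    for chrom_events in events.values():
        # Tuples sort by position, then by delta, so at the same position
        # a close (-1) is processed before an open (+1). Abutting
        # fragments therefore do not count as overlapping.
        chrom_events.sort()
        depth = 0
        for _, delta in chrom_events:
            previous = depth
            depth += delta
            # A new region starts each time depth rises above max_depth.
            if previous <= max_depth < depth:
                regions += 1
    return regions


if __name__ == "__main__":
    nucleus = [
        ("chr1", 100, 200), ("chr1", 150, 250), ("chr1", 180, 300),
        ("chr1", 1000, 1100), ("chr1", 1050, 1150),
    ]
    print(count_regions_over_depth(nucleus))
```

In this toy input only the first three fragments overlap at a common position, so one region exceeds a depth of two. A nucleus with many such regions would be a multiplet candidate under the reasoning in the abstract, since a single nucleus is expected to contribute few regions with more than two overlapping reads.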

    Multiomic Profiling Identifies cis-Regulatory Networks Underlying Human Pancreatic β Cell Identity and Function.

    EndoC-βH1 is emerging as a critical human β cell model to study the genetic and environmental etiologies of β cell (dys)function and diabetes. Comprehensive knowledge of its molecular landscape is lacking, yet required, for effective use of this model. Here, we report chromosomal (spectral karyotyping), genetic (genotyping), epigenomic (ChIP-seq and ATAC-seq), chromatin interaction (Hi-C and Pol2 ChIA-PET), and transcriptomic (RNA-seq and miRNA-seq) maps of EndoC-βH1. Analyses of these maps define known (e.g., PDX1 and ISL1) and putative (e.g., PCSK1 and mir-375) β cell-specific transcriptional cis-regulatory networks and identify allelic effects on cis-regulatory element use. Importantly, comparison with maps generated in primary human islets and/or β cells indicates preservation of chromatin looping but also highlights chromosomal aberrations and fetal genomic signatures in EndoC-βH1. Together, these maps, and a web application we created for their exploration, provide important tools for the design of experiments to probe and manipulate the genetic programs governing β cell identity and (dys)function in diabetes

    Integrative Machine Learning and Network Mining Models for the Inference of Regulatory Elements and Interactions in Human Cells

    With the increase in diverse genome profiling technologies and publicly available ontology databases ranging from open chromatin profiles to the 3D structure of the genome, it is imperative to build novel computational methods that take full advantage of these diverse datasets to uncover the regulatory mechanisms behind cellular functions. Integrating these datasets offers the opportunity to identify regulatory elements (e.g., promoters and enhancers) and interactions critical for cell-type-specific functions. Here, the goal is twofold: 1) inference of regulatory interactions and networks from 3D chromatin interaction datasets and 2) inference of cell-specific and non-specific regulatory elements such as enhancers (regulatory elements that target gene promoters and regulate their expression). To address the first goal, two software tools were developed: (1) a web-accessible application, Querying and visualizing chromatin Interaction Network (QuIN), and (2) a pathway analysis prioritization tool, Triangulation of Perturbation Origins and Identification of Non-Coding Targets (TriPOINT). QuIN enables users to easily mine chromatin interaction datasets and integrate them with other sources such as SNPs and epigenetic marks, build networks from them, query and visualize these networks in downstream analyses, and prioritize genomic loci (e.g., disease-causing variants). Similarly, TriPOINT uses pathways in conjunction with chromatin interaction networks to identify perturbed genes in treatment vs. control cases. It implements pathway topology-based approaches for identifying inconsistencies in pathways and incorporates the capabilities of QuIN to integrate non-coding regulators that target genes in these pathways through chromatin interaction data. The second goal was achieved using two approaches. First, support vector machines were trained on features obtained from network mining to assess their predictive power in identifying cell-type-specific promoters (broad domains) and enhancers (super enhancers) from chromatin interaction networks. Network signatures were mined in three cell lines (MCF-7, K562, and GM12878) using QuIN across multiple chromatin interaction assays (ChIA-PET, Hi-C, and HiChIP), and network-related features were found to effectively discriminate typical promoters and enhancers from cell-type-specific ones. Second, features from Assay for Transposase Accessible Chromatin (ATAC-seq) data were profiled to identify enhancers from accessible chromatin using neural network models. The models were highly predictive of enhancers, making them useful in individual-specific and clinical sample settings.

    Chromatin interaction networks revealed unique connectivity patterns of broad H3K4me3 domains and super enhancers in 3D chromatin.

    Broad domain promoters and super enhancers are regulatory elements that govern cell-specific functions and harbor disease-associated sequence variants. These elements are characterized by distinct epigenomic profiles, such as expanded deposition of histone marks H3K27ac for super enhancers and H3K4me3 for broad domains; however, little is known about how they interact with each other and the rest of the genome in three-dimensional chromatin space. Using network theory methods, we studied chromatin interactions between broad domains and super enhancers in three ENCODE cell lines (K562, MCF7, GM12878) obtained via ChIA-PET, Hi-C, and HiChIP assays. In these networks, broad domains and super enhancers interact more frequently with each other than their typical counterparts do. Network measures and graphlets revealed distinct connectivity patterns associated with these regulatory elements that are robust across cell types and alternative assays. Machine learning models showed that these connectivity patterns could effectively discriminate broad domains from typical promoters and super enhancers from typical enhancers. Finally, targets of broad domains in these networks were enriched in disease-causing SNPs of cognate cell types. Taken together, these results suggest a robust and unique organization of the chromatin around broad domains and super enhancers: loci critical for pathologies and cell-specific functions. Sci Rep 2017 Oct 31; 7(1):1446

    QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks

    Recent studies of the human genome have indicated that regulatory elements (e.g., promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and Hi-C, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: 1) building and visualizing chromatin interaction networks, 2) annotating networks with user-provided private and publicly available functional genomics and interaction datasets, 3) querying network components based on gene name or chromosome location, and 4) utilizing network based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions. AVAILABILITY: QuIN’s web server is available at http://quin.jax.org. QuIN is developed in Java and JavaScript, utilizing an Apache Tomcat web server and MySQL database, and the source code is available under the GPLv3 license on GitHub: https://github.com/UcarLab/QuIN/.
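The network representation described in this abstract can be sketched in a few lines. The example below builds an undirected network from pairs of interacting anchors, ranks nodes by degree as one simple network-based measure, and collects direct and indirect interaction partners within a given number of hops. This is not QuIN's code, which the abstract says is written in Java and JavaScript; the function names, node labels, and input format are hypothetical.

```python
from collections import defaultdict, deque


def build_network(interactions):
    """Build an undirected adjacency map from (anchor_a, anchor_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # skip self-interactions
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def rank_by_degree(adjacency):
    """Return nodes sorted by decreasing degree, ties broken by name."""
    return sorted(adjacency, key=lambda node: (-len(adjacency[node]), node))


def partners_within(adjacency, source, max_hops):
    """Map each node reachable within `max_hops` to its hop distance.

    Distance 1 means a direct interaction; larger values are indirect.
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[source]
    return distance


if __name__ == "__main__":
    # Hypothetical promoter (P) and enhancer (E) anchors.
    pairs = [("P1", "E1"), ("P1", "E2"), ("P1", "E3"), ("E2", "P2")]
    network = build_network(pairs)
    print(rank_by_degree(network)[0])
    print(partners_within(network, "P1", 2))
```

Degree is only one of many possible measures; it is used here because it needs no external library.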