908 research outputs found

    Improving ChIP-seq peak-calling for functional co-regulator binding by integrating multiple sources of biological information

    Background: Chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq) is increasingly being applied to study genome-wide binding sites of transcription factors. There is growing interest in understanding the mechanism of action of co-regulator proteins, which do not bind DNA directly but exert their effects by binding to transcription factors such as the estrogen receptor (ER). However, because the protein-DNA interaction being detected is indirect, ChIP-seq signals from co-regulators can be relatively weak, and biologically meaningful interactions therefore remain difficult to identify.

    INFORMATION INTEGRATION APPROACHES FOR INVESTIGATING ESTROGEN RECEPTOR MEDIATED TRANSCRIPTION

    Estrogen plays essential roles in normal physiology and in disease. Its effects are mainly mediated through two intracellular estrogen receptors, ERα and ERβ, which belong to a family of nuclear receptors (NRs) functioning as transcription regulators. In the first part of this thesis, we aim to derive a holistic view of the transcription machinery at estrogen-responsive genes and, further, to reveal different mechanisms of estrogen-mediated transcription regulation. To achieve this, we integrated and systematically dissected a variety of genome-wide high-throughput datasets, including gene expression arrays, ChIP-seq, GRO-seq, and ChIA-PET. Our analyses led to the following novel findings. In the absence of the ligand, most of the estrogen-responsive genes assumed a high-order chromatin configuration that involved Pol II, ERα and ERα-pioneer factors. Without the ligand, estrogen-induced genes showed active transcription at promoters but failed to elongate into gene bodies, and this pause was lifted after estrogen treatment. However, the estrogen-repressed genes showed coordinated transcription at promoters and gene bodies in both the absence and presence of estrogen. Through information integration, we inferred that, for estrogen-repressed genes, the majority of the high-order chromatin complexes containing actively transcribed genes were disrupted after estrogen treatment. The analyses led to the hypothesis that one mechanism for estrogen-mediated repression is the disruption of the original transcription-favoring chromatin structures. Further, nuclear receptors such as ERs interact with co-regulators to regulate gene transcription. Understanding the mechanism of action of co-regulator proteins, which do not bind DNA directly but exert their effects by binding to transcription factors, is important for the study of normal physiology as well as diseased conditions.
However, because the protein-DNA interaction being detected is indirect, ChIP-seq signals from co-regulators can be relatively weak, and biologically meaningful interactions therefore remain difficult to identify. In the second part of this thesis, we investigated and compared different machine learning approaches to integrate multiple types of genomic and transcriptomic information derived from our experiments and from public databases. This helped us to overcome the difficulty of identifying functional DNA binding sites of the co-regulator SRC-1 in the context of estrogen response. Our results indicate that supervised learning with the naïve Bayes algorithm significantly enhanced the peak calling of weak ChIP-seq signals and outperformed other machine learning algorithms. Our integrative approach revealed many potential ERα/SRC-1 DNA binding sites that would otherwise be missed by conventional peak calling algorithms with default settings.

    Deciphering a gene regulation network in normal mouse pancreas through a multiomic integrative approach

    Pancreatic acinar cells make up around 85% of the exocrine component of the pancreas, which constitutes the vast majority of the tissue. Genetically Engineered Mouse Models (GEMMs) provide evidence that pancreatic ductal adenocarcinoma (PDAC) can efficiently arise from acinar cells through a transdifferentiation process called acinar-to-ductal metaplasia (ADM), suggesting the loss of acinar cell identity as the predominant origin of PDAC. Here, we present a comprehensive multi-omic integrative approach to generate a network-based resource for interrogating the transcriptional regulation underlying acinar cell identity in wild type (WT) mouse pancreas. As a proof of concept, we examine the regulatory activity of several acinar-expressed transcription factors (TFs) involved in pancreas regulation and validate it by comparison with experimental ChIP-seq analysis, obtaining consistent results. We consider that this approach represents a valuable resource for performing a priori analyses that can be experimentally validated, providing new knowledge to the field. Moreover, the presented methodology will be further explored to determine the optimal parameters for detecting different regulatory events, and will be applied to GEMMs displaying different conditions, as well as to other organisms such as human, to cross-validate the results and the usefulness of our resource.

    Transcriptional and epigenetic regulation of differentially activated macrophages

    Macrophages are an indispensable part of the innate immune system and mediate various functions including host defense against pathogens, metabolism, tissue homeostasis and even developmental processes. These extremely heterogeneous cells can adapt their transcriptional program in response to a plethora of stimulatory cues and thus exist in different activation states to facilitate their diverse roles in the body. The corresponding transcriptional changes are established by, among others, transcriptional regulators (TRs) with diverse functions or by complex epigenetic alterations. Next generation sequencing technologies provide experimental methods such as ChIP- or RNA-sequencing, with which one can analyze genome-wide enrichment properties of DNA binding proteins or the transcriptional activity of genes to elucidate in detail the activation of macrophages at the transcriptional level. Integrating the KO implemented normalization method (KOIN) into the standard peak calling procedure revealed multiple enhancements for ChIP-seq data analysis. A large share of false-positive signals can be eliminated, while signal-to-noise ratios are increased in both low and high quality ChIP-seq data sets. Besides the identification and removal of a recently identified special type of false-positive signal called “hyper-ChIPable regions”, the biological interpretation can profoundly benefit from KOIN. Overall, the KOIN method demonstrated its value as a possible new gold-standard control with various advantages compared to the currently established Input chromatin and IgG ChIP-seq controls. Furthermore, by detecting different covalent posttranslational histone modifications (HM), such as acetylation or methylation, the ChIP-seq technology allows the definition of 1) different activity states for promoters or cis-regulatory regions and 2) important regulators in the establishment and maintenance of the transcriptional landscape.
Four differentially activated primary human macrophages demonstrated a common epigenetic core program, maintained by various promoter sites. At the same time, activation-state-specific epigenetic differences at promoters, super-enhancer regions and especially at enhancer sites could mediate their specialization in response to the stimulatory signals applied. Finally, despite the detected epigenetic differences, an astonishing fraction of genomic loci was defined by accessible promoter and enhancer markings across macrophage activation states. This was especially demonstrated in co-regulation networks for TRs and revealed an uncoupling of epigenetic and transcriptional control in monocyte-derived activated macrophages, associated with cellular plasticity in response to microenvironmental signals. Additional levels of regulation such as enhancer RNAs, repressor proteins or the cross-talk between HMs could play an important role in fine-tuning macrophage transcription. In particular, the cooperative binding of pioneer transcription factors (TF) such as PU.1 with secondary TFs such as STAT proteins to these open genomic macrophage loci could represent an additional important switch in macrophage transcription in concert with HMs.

    Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene-induced Signaling

    Cellular signal transduction generally involves cascades of post-translational protein modifications that rapidly catalyze changes in protein-DNA interactions and gene expression. High-throughput measurements are improving our ability to study each of these stages individually, but do not capture the connections between them. Here we present an approach for building a network of physical links among these data that can be used to prioritize targets for pharmacological intervention. Our method recovers the critical missing links between proteomic and transcriptional data by relating changes in chromatin accessibility to changes in expression and then uses these links to connect proteomic and transcriptome data. We applied our approach to integrate epigenomic, phosphoproteomic and transcriptome changes induced by the variant III mutation of the epidermal growth factor receptor (EGFRvIII) in a cell line model of glioblastoma multiforme (GBM). To test the relevance of the network, we used small molecules to target highly connected nodes implicated by the network model that were not detected by the experimental data in isolation and we found that a large fraction of these agents alter cell viability. Among these are two compounds, ICG-001, targeting CREB binding protein (CREBBP), and PKF118–310, targeting β-catenin (CTNNB1), which have not been tested previously for effectiveness against GBM. At the level of transcriptional regulation, we used chromatin immunoprecipitation sequencing (ChIP-Seq) to experimentally determine the genome-wide binding locations of p300, a transcriptional co-regulator highly connected in the network. Analysis of p300 target genes suggested its role in tumorigenesis. 
We propose that this general method, in which experimental measurements are used as constraints for building regulatory networks from the interactome while taking into account noise and missing data, should be applicable to a wide range of high-throughput datasets. National Science Foundation (U.S.) (DB1-0821391); National Institutes of Health (U.S.) (Grant U54-CA112967); National Institutes of Health (U.S.) (Grant R01-GM089903); National Institutes of Health (U.S.) (P30-ES002109)

    Mapping Transcription Factor Networks and Elucidating Their Biological Determinants

    A central goal in systems biology is to accurately map the transcription factor (TF) network of a cell. Such a network map is a key component for many downstream applications, from developmental biology to transcriptome engineering, and from disease modeling to drug discovery. Building a reliable network map requires a wide range of data sources including TF binding locations and gene expression data after direct TF perturbations. However, we face two roadblocks. First, rich resources are available only for a few well-studied systems and cannot be easily replicated for new organisms or cell types. Second, when TF binding and TF-perturbation response data are available, they rarely converge on a common set of direct and functional targets for a TF. This dissertation explores and validates the best combination of experimental and analytic techniques to map TF networks. First, we introduce an unsupervised inference algorithm that maps TF networks by exploiting only gene expression and genome sequence data. We show that our “data light” method is more accurate at identifying direct targets of TFs than other similar methods. Second, we develop an optimization method to search for a convergent set of target genes that are independently identified by binding locations and perturbation responses of each TF. Combining this method with network inference greatly expanded the high-confidence network maps, especially when applied to datasets obtained using recently developed experimental methods. Third, we describe a framework for predicting each gene’s responsiveness to a TF perturbation from genomic features. Using this framework, we identified properties of each gene that are independent of the perturbed TF as the major determinants of TF-perturbation responsiveness. This may lead to improvements in network mapping algorithms that exploit TF perturbation responses.
Overall, this dissertation provides a scalable framework for mapping high-quality TF networks for a variety of organisms and cell types.

    Studying the regulatory landscape of flowering plants


    Recent advances in functional annotation and prediction of the epitranscriptome

    RNA modifications, in particular N6-methyladenosine (m6A), participate in every stage of RNA metabolism and play diverse roles in essential biological processes and disease pathogenesis. Thanks to advances in sequencing technology, tens of thousands of RNA modification sites can be identified in a typical high-throughput experiment; however, it remains a major challenge to decipher the functional relevance of these sites, such as their effects on alternative splicing, their roles in the regulatory circuits of essential biological processes, or their association with disease. As the focus of RNA epigenetics gradually shifts from site discovery to functional studies, we review here recent progress in functional annotation and prediction of RNA modification sites from a bioinformatics perspective. The review covers naïve annotation with associated biological events, e.g., single nucleotide polymorphism (SNP), RNA binding protein (RBP) and alternative splicing; prediction of key sites and their regulatory functions; inference of disease association; and mining the diagnostic and prognostic value of RNA modification regulators. We further discuss the limitations of existing approaches and some future perspectives.