21 research outputs found
MBASED: allele-specific expression detection in cancer tissues and cell lines
Allele-specific gene expression (ASE) is an important aspect of gene regulation. We developed a novel method, MBASED (meta-analysis based allele-specific expression detection), for ASE detection using RNA-seq data. MBASED aggregates information across multiple single nucleotide variation loci to obtain a gene-level measure of ASE, even when prior phasing information is unavailable. MBASED is capable of one-sample and two-sample analyses and performs well in simulations. We applied MBASED to a panel of cancer cell lines and paired tumor-normal tissue samples, and observed extensive ASE in cancer, but not normal, samples, mainly driven by genomic copy number alterations. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-014-0405-3) contains supplementary material, which is available to authorized users.
Genome-Wide Analysis of Glucocorticoid Receptor Binding Regions in Adipocytes Reveal Gene Network Involved in Triglyceride Homeostasis
Glucocorticoids play important roles in the regulation of distinct aspects of adipocyte biology. Excess glucocorticoids in adipocytes are associated with metabolic disorders, including central obesity, insulin resistance and dyslipidemia. To understand the mechanisms underlying glucocorticoid action in adipocytes, we used chromatin immunoprecipitation sequencing to isolate genome-wide glucocorticoid receptor (GR) binding regions (GBRs) in 3T3-L1 adipocytes. Furthermore, gene expression analyses were used to identify genes that were regulated by glucocorticoids. Overall, 274 glucocorticoid-regulated genes contain or are located near a GBR. We found that many GBRs were located in or near genes involved in triglyceride (TG) synthesis (Scd-1, 2, 3, GPAT3, GPAT4, Agpat2, Lpin1), lipolysis (Lipe, Mgll), lipid transport (Cd36, Lrp-1, Vldlr, Slc27a2) and storage (S3-12). Gene expression analysis showed that, except for Scd-3, the other 13 genes were induced in mouse inguinal fat upon 4-day glucocorticoid treatment. Reporter gene assays showed that, except for Agpat2, the other 12 glucocorticoid-regulated genes contain at least one GBR that can mediate the hormone response. In agreement with the fact that glucocorticoids activated genes in both TG biosynthetic and lipolytic pathways, we confirmed that 4-day glucocorticoid treatment increased TG synthesis and lipolysis concomitantly in inguinal fat. Notably, we found that 9 of these 12 genes were induced in transgenic mice that have constantly elevated plasma glucocorticoid levels. These results suggested that a similar mechanism was used to regulate TG homeostasis during chronic glucocorticoid treatment. In summary, our studies have identified molecular components in a glucocorticoid-controlled gene network involved in the regulation of TG homeostasis in adipocytes. Understanding the regulation of this gene network should provide important insight for future therapeutic developments for metabolic diseases.
Statistical Aspects of ChIP-Seq Data Analysis
ChIP-Seq experiments combine the recently developed next-generation sequencing technology with the established chromatin immunoprecipitation assays to study the interactions between various classes of proteins and DNA in the cell nucleus. The experiments consist of isolating the protein-DNA complexes from the nucleus, enriching the pool of DNA fragments for those bound to the protein of interest, and sequencing the resulting pool of fragments, producing millions of short reads that can be aligned to the genome. Despite the fact that the ChIP-Seq technology has been developed very recently, a great number of studies have been carried out on the DNA binding of a variety of transcription factors in different species and tissue types. ChIP-Seq approaches have also been used to study cellular epigenomic states such as histone modifications. As with any nascent technology, a number of methodological issues need to be addressed before a proper data analysis pipeline for ChIP-Seq can be established. Some of the issues that need to be addressed are image processing and analysis, alignment of the reads to a genome or a subset of it, and identifying the signal sites along the genome. This work focuses on the issue of signal identification, the problem known as peak-finding in the literature. We describe the data-generating process for ChIP-Seq experiments and review properties of the data and various sources of biases in Chapter 1. We then review various approaches to peak-finding in Chapter 2. We provide a detailed overview of some common strategies, their relative advantages and disadvantages, and describe the statistical models used by some popular peak-finding tools. We formalize the conceptual framework of peak-finding by introducing the notions of enrichment measures and enrichment statistics and categorize various peak-finders in terms of this framework.
We discuss in some detail the different kinds of control samples used in ChIP-Seq experiments, and how they are incorporated into the peak-finding procedure. We also address the important issue of validation in the context of ChIP-Seq experiments and the shortcomings of the currently available validation approaches. In Chapter 3 we propose a novel peak-finding strategy for experiments involving transcription factor binding that lack appropriate control samples (so-called one-sample experiments). Our approach accounts for genomic sequence biases in the data, namely the GC and mappability effects, and utilizes the knowledge of the shape of the read density profile in the vicinity of the true binding sites. We use deduced sets of true positive and true negative enriched regions to demonstrate that our approach is better at removing non-specifically enriched regions from the set of identified binding sites than other one-sample approaches and provides a superior spatial resolution to most examined peak-finders. Finally, in Chapter 4 we discuss the important issue of combining data from replicate samples. We discuss different kinds of replicates common in the ChIP-Seq literature and the standard approaches used to integrate data across replicates. We develop several diagnostic plots for assessing whether the standard assumption of Poisson variance holds and observe that the assumption can break down even for technical replicates due to flow cell-specific sequence composition effects.
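One simple way to probe a Poisson variance assumption of the kind mentioned in the abstract is to compare, bin by bin, the variance of read counts across replicates with their mean: under a Poisson model the ratio should hover around 1. The Python sketch below is illustrative only and is not the diagnostic developed in the thesis. The function name `dispersion_indices`, the idea of pre-binned counts, and the example numbers are all assumptions, and the sketch presumes replicates of comparable sequencing depth.

```python
from statistics import mean, variance

def dispersion_indices(counts_by_bin):
    """Return (mean, variance / mean) for every genomic bin with a positive mean.

    counts_by_bin: iterable of sequences, each holding the read counts of one
    bin across replicates (at least two replicates per bin).
    Under Poisson variance the second value should be close to 1.
    """
    results = []
    for counts in counts_by_bin:
        bin_mean = mean(counts)
        if bin_mean > 0:  # the ratio is undefined for empty bins
            results.append((bin_mean, variance(counts) / bin_mean))
    return results

# Hypothetical counts for three bins across three replicates (not real data).
example_bins = [(2, 4, 6), (0, 0, 0), (10, 30, 50)]
for bin_mean, ratio in dispersion_indices(example_bins):
    print(f"mean={bin_mean:.1f}  variance/mean={ratio:.2f}")
```

Plotting the ratio against the mean for many bins gives a quick visual check; ratios well above 1 at higher means would suggest overdispersion relative to the Poisson model.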
A Quartet of PIF bHLH Factors Provides a Transcriptionally Centered Signaling Hub That Regulates Seedling Morphogenesis through Differential Expression-Patterning of Shared Target Genes in Arabidopsis
Dark-grown seedlings exhibit skotomorphogenic development. Genetic and molecular evidence indicates that a quartet of Arabidopsis Phytochrome (phy)-Interacting bHLH Factors (PIF1, 3, 4, and 5) are critically necessary to maintaining this developmental state and that light activation of phy induces a switch to photomorphogenic development by inducing rapid degradation of the PIFs. Here, using integrated ChIP-seq and RNA-seq analyses, we have identified genes that are direct targets of PIF3 transcriptional regulation, exerted by sequence-specific binding to G-box (CACGTG) or PBE-box (CACATG) motifs in the target promoters genome-wide. In addition, expression analysis of selected genes in this set, in all triple pif-mutant combinations, provides evidence that the PIF quartet members collaborate to generate an expression pattern that is the product of a mosaic of differential transcriptional responsiveness of individual genes to the different PIFs and of differential regulatory activity of individual PIFs toward the different genes. Together with prior evidence that all four PIFs can bind to G-boxes, the data suggest that this collective activity may be exerted via shared occupancy of binding sites in target promoters.
Differential regulation of PIF3 direct-target genes by individual PIF-quartet proteins.
(A) Individual PIF-quartet members display diverse patterns of shared regulatory activity toward genes defined as direct targets of PIF3 transcriptional activation. Expression levels in the indicated pifq and pif-triple mutants were determined by RT-qPCR, normalized to an internal PP2AA3 control, and presented relative to WT levels set at unity. Data are represented as the mean of biological triplicates +/- SEM. (B) Matrix of relative contributions from individual PIF proteins toward the shared transcriptional activation of individual, potentially-shared direct-target genes. Percent contribution is calculated as the proportion of the total differential expression between pifq and WT that is contributed by the differential expression between pifq and each pif-triple mutant. (C) In planta PIL1 promoter activity requires both G-box motifs and PIF-quartet members. Left: Schematic of pPIL1:LUC constructs expressed transgenically in either WT or pifq plants, as indicated. Yellow and red stripes represent the locations of three native (pPIL1) and mutated (mpPIL1) G-box motifs, respectively, in variants of the PIL1 promoter, as shown by the DNA sequences displayed below each construct. A contiguous 35S-promoter driven RLUC reporter was included as an internal control in each construct. Right: Mean expression of the LUC reporter gene is shown as LUC enzyme activity normalized to the RLUC control in the same transgenic plant. Data represent the means of 6 or 7 independent transgenic lines +/- SEM.
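The percent-contribution calculation described for panel (B) can be written out as a short Python sketch. This is one reading of the caption's wording, not code from the paper: the function name `percent_contribution`, the mutant labels, and the expression values are hypothetical, and expression is assumed to be given relative to WT set at 1.0.

```python
def percent_contribution(wt, pifq, triple):
    """Percent of the pifq-vs-WT expression difference that is accounted for
    by the difference between pifq and one pif-triple mutant."""
    total_difference = wt - pifq
    if total_difference == 0:
        raise ValueError("WT and pifq expression are equal; contribution is undefined")
    return 100.0 * (triple - pifq) / total_difference

# Hypothetical relative expression levels for one gene (WT = 1.0); not data from the paper.
wt_level = 1.0
pifq_level = 0.25
triple_levels = {"triple_mutant_1": 0.625, "triple_mutant_2": 0.4375}

contributions = {
    name: percent_contribution(wt_level, pifq_level, level)
    for name, level in triple_levels.items()
}
for name, value in contributions.items():
    print(f"{name}: {value:.1f}% of the pifq-WT difference")
```

Because each triple mutant is compared with the same pifq baseline, the resulting percentages need not sum to 100 for a given gene.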
Direct-target genes of PIF3-induced transcription (Class Z and YZ1.5 genes).
a: PIF3 loss-of-function (L.O.F.);
b: PIF3 gain-of-function (G.O.F.);
c: PIF1/4/5-trio gain-of-function (G.O.F.);
d: See Figure 4 (http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003244#pgen-1003244-g004) and Figure S3 (http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003244#pgen.1003244.s003);
e: See [17] (http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003244#pgen.1003244-Leivar4);
f: See [11] (http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003244#pgen.1003244-Leivar3);
g: Gene not represented on the Affymetrix ATH1 array.
Genome-wide identification of PIF3-binding sites and motifs.
(A) Venn diagram depicting total numbers (parentheses) and reproducible presence (overlapping sectors) of statistically significant PIF3-binding peaks in ChIP-seq analysis of four biological replicates (Venn ovals) of dark-grown seedlings. (B) Relative binding-peak distribution across genomic regions. (C) MEME motif search identifies two dominant PIF3-binding motifs, defined as G-box (CACGTG) and PBE-box (CACATG) motifs. (D) Percentage of PIF3 binding sites containing designated motifs. Other E-box: Variants of E-box (CANNTG) motif other than G- or PBE-box. Unknown: Unknown and/or non-statistically-overrepresented motif. (E) Distribution of the G- and PBE-box motifs in the 1 kb regions surrounding the PIF3-binding peak-summits. (F) G- and PBE-box-motif coincidence with PIF3-binding peaks (% within 201, 101, and 51 bp centered at the peak-summits) is significantly higher than in other random genomic regions of the same size. Internal numbers indicate the relative fold motif-enrichment at PIF3 binding-sites. Error bars represent the standard deviation of 100 random simulations. (G) DPI-ELISA assays of in vitro binding of recombinant GST-PIF3 to the G- and PBE-box motifs. Binding activity (Relative Absorbance) for each DNA probe is expressed as a percentage of each reaction relative to GST-PIF3 binding to the PIL1a WT probe. Data represent the mean of independent duplicates +/- SEM. WT: wild-type competitor probes. mut: competitor probes mutated at the G-box and PBE-box motifs. GST: GST negative-control binding to the biotinylated WT probes.
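The motif-coincidence comparison described for panel (F) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the genome is treated as a single in-memory string, and the function names (`contains_motif`, `coincidence`, `random_background`), the seed handling, and the toy sequence are all hypothetical. Only the motif sequences, the window widths, and the use of 100 random simulations come from the caption.

```python
import random

MOTIFS = {"G-box": "CACGTG", "PBE-box": "CACATG"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]

def contains_motif(seq, motif):
    """True if the motif occurs on either strand of seq."""
    seq = seq.upper()
    return motif in seq or reverse_complement(motif) in seq

def window(genome, center, width):
    """Slice of `width` bp centered on `center` (truncated at sequence ends)."""
    half = width // 2
    return genome[max(0, center - half):center + half + 1]

def coincidence(genome, centers, motif, width):
    """Percent of windows around the given centers that contain the motif."""
    hits = sum(contains_motif(window(genome, c, width), motif) for c in centers)
    return 100.0 * hits / len(centers)

def random_background(genome, n_sites, motif, width, n_sim=100, seed=0):
    """Coincidence values for n_sim sets of randomly placed windows."""
    rng = random.Random(seed)
    half = width // 2
    values = []
    for _ in range(n_sim):
        centers = [rng.randrange(half, len(genome) - half) for _ in range(n_sites)]
        values.append(coincidence(genome, centers, motif, width))
    return values

# Toy sequence with one G-box; a hypothetical stand-in for a real genome.
toy_genome = "A" * 100 + MOTIFS["G-box"] + "A" * 100
toy_summits = [102]
for width in (201, 101, 51):
    print(width, coincidence(toy_genome, toy_summits, MOTIFS["G-box"], width))
```

Dividing the observed coincidence at peak summits by the mean of the random-background values gives a fold enrichment, and the spread of the background values corresponds to the error bars mentioned in the caption. Scanning both strands matters for the PBE-box, which, unlike the G-box, is not palindromic.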