2,774 research outputs found

    Inferring activity changes of transcription factors by binding association with sorted expression profiles

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The identification of transcription factors (TFs) associated with a biological process is fundamental to understanding its regulatory mechanisms. From microarray data, however, the activity changes of TFs often cannot be directly observed due to their relatively low expression levels, post-transcriptional modifications, and other complications. Several approaches have been proposed to infer TF activity changes from microarray data. In some models, a linear relationship between gene expression and TF-gene binding strength is assumed. In some other models, the target genes of a TF are first determined by a significance cutoff to binding affinity scores, and then expression differentiation is checked between the target and other genes.</p> <p>Results</p> <p>We propose a novel method, referred to as BASE (binding association with sorted expression), to infer TF activity changes from microarray expression profiles with the help of binding affinity data. It searches the maximum association between bind affinity profile of a TF and expression change profile along the direction of sorted differentiation. The method does not make hard target gene selection, rather, the significances of TF activity changes are evaluated by permutation tests of binding association at the end. To show the effectiveness of this method, we apply it to three typical examples using different kinds of binding affinity data, namely, ChIP-chip data, motif discovery data, and positional weighted matrix scanning data, respectively. The implications obtained from all three examples are consistent with established biological results. Moreover, the inferences suggest new and biological meaningful hypotheses for further investigation.</p> <p>Conclusion</p> <p>The proposed method makes transcription inference from profiles of expression and binding affinity. The same machinery can be used to deal with various kinds of binding affinity data. The method does not require a linear assumption, and has the desirable property of scale-invariance with respect to TF-specific binding affinity. This method is easy to implement and can be routinely applied for transcriptional inferences in microarray studies.</p

    Inferring TF activities and activity regulators from gene expression data with constraints from TF perturbation data

    Get PDF
    MOTIVATION: The activity of a transcription factor (TF) in a sample of cells is the extent to which it is exerting its regulatory potential. Many methods of inferring TF activity from gene expression data have been described, but due to the lack of appropriate large-scale datasets, systematic and objective validation has not been possible until now. RESULTS: We systematically evaluate and optimize the approach to TF activity inference in which a gene expression matrix is factored into a condition-independent matrix of control strengths and a condition-dependent matrix of TF activity levels. We find that expression data in which the activities of individual TFs have been perturbed are both necessary and sufficient for obtaining good performance. To a considerable extent, control strengths inferred using expression data from one growth condition carry over to other conditions, so the control strength matrices derived here can be used by others. Finally, we apply these methods to gain insight into the upstream factors that regulate the activities of yeast TFs Gcr2, Gln3, Gcn4 and Msn2. AVAILABILITY AND IMPLEMENTATION: Evaluation code and data are available at https://doi.org/10.5281/zenodo.4050573. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online

    A framework to identify epigenome and transcription factor crosstalk

    Get PDF
    While changes in chromatin are integral to transcriptional reprogramming during cellular differentiation, it is currently unclear how chromatin modifications are targeted to specific loci. To systematically identify transcription factors (TFs) that can direct chromatin changes during cell fate decisions, we model the genome-wide dynamics of chromatin marks in terms of computationally predicted TF binding sites. By applying this computational approach to a time course of Polycomb-mediated H3K27me3 marks during neuronal differentiation of murine stem cells, we identify several motifs that likely regulate dynamics of this chromatin mark. Among these, the motifs bound by REST and by the SNAIL family of TFs are predicted to transiently recruit H3K27me3 in neuronal progenitors. We validate these predictions experimentally and show that absence of REST indeed causes loss of H3K27me3 at target promoters in trans, specifically at the neuronal progenitor state. Moreover, using targeted transgenic insertion, we show that promoter fragments containing REST or SNAIL binding sites are sufficient to recruit H3K27me3 in cis, while deletion of these sites results in loss of H3K27me3. These findings illustrate that the occurrence of TF binding sites can determine chromatin dynamics. Local determination of Polycomb activity by Rest and Snail motifs exemplifies such TF based regulation of chromatin. Furthermore, our results show that key TFs can be identified ab initio through computational modeling of epigenome datasets using a modeling approach that we make readily accessible

    Discovering transcriptional modules by Bayesian data integration

    Get PDF
    Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets. Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs

    Inferring MicroRNA Activities by Combining Gene Expression with MicroRNA Target Prediction

    Get PDF
    MicroRNAs (miRNAs) play crucial roles in a variety of biological processes via regulating expression of their target genes at the mRNA level. A number of computational approaches regarding miRNAs have been proposed, but most of them focus on miRNA gene finding or target predictions. Little computational work has been done to investigate the effective regulation of miRNAs.We propose a method to infer the effective regulatory activities of miRNAs by integrating microarray expression data with miRNA target predictions. The method is based on the idea that regulatory activity changes of miRNAs could be reflected by the expression changes of their target transcripts measured by microarray. To validate this method, we apply it to the microarray data sets that measure gene expression changes in cell lines after transfection or inhibition of several specific miRNAs. The results indicate that our method can detect activity enhancement of the transfected miRNAs as well as activity reduction of the inhibited miRNAs with high sensitivity and specificity. Furthermore, we show that our inference is robust with respect to false positives of target prediction.A huge amount of gene expression data sets are available in the literature, but miRNA regulation underlying these data sets is largely unknown. The method is easy to be implemented and can be used to investigate the miRNA effective regulation underlying the expression change profiles obtained from microarray experiments

    Inference of RNA decay rate from transcriptional profiling highlights the regulatory programs of Alzheimer's disease.

    Get PDF
    The abundance of mRNA is mainly determined by the rates of RNA transcription and decay. Here, we present a method for unbiased estimation of differential mRNA decay rate from RNA-sequencing data by modeling the kinetics of mRNA metabolism. We show that in all primary human tissues tested, and particularly in the central nervous system, many pathways are regulated at the mRNA stability level. We present a parsimonious regulatory model consisting of two RNA-binding proteins and four microRNAs that modulate the mRNA stability landscape of the brain, which suggests a new link between RBFOX proteins and Alzheimer's disease. We show that downregulation of RBFOX1 leads to destabilization of mRNAs encoding for synaptic transmission proteins, which may contribute to the loss of synaptic function in Alzheimer's disease. RBFOX1 downregulation is more likely to occur in older and female individuals, consistent with the association of Alzheimer's disease with age and gender."mRNA abundance is determined by the rates of transcription and decay. Here, the authors propose a method for estimating the rate of differential mRNA decay from RNA-seq data and model mRNA stability in the brain, suggesting a link between mRNA stability and Alzheimer's disease.

    Systematic identification of cell cycle regulated transcription factors from microarray time series data

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The cell cycle has long been an important model to study the genome-wide transcriptional regulation. Although several methods have been introduced to identify cell cycle regulated genes from microarray data, they can not be directly used to investigate cell cycle regulated transcription factors (CCRTFs), because for many transcription factors (TFs) it is their activities instead of expressions that are periodically regulated across the cell cycle. To overcome this problem, it is useful to infer TF activities across the cell cycle by integrating microarray expression data with ChIP-chip data, and then examine the periodicity of the inferred activities. For most species, however, large-scale ChIP-chip data are still not available.</p> <p>Results</p> <p>We propose a two-step method to identify the CCRTFs by integrating microarray cell cycle data with ChIP-chip data or motif discovery data. In <it>S. cerevisiae</it>, we identify 42 CCRTFs, among which 23 have been verified experimentally. The cell cycle related behaviors (e.g. at which cell cycle phase a TF achieves the highest activity) predicted by our method are consistent with the well established knowledge about them. We also find that the periodical activity fluctuation of some TFs can be perturbed by the cell synchronization treatment. Moreover, by integrating expression data with in-silico motif discovery data, we identify 8 cell cycle associated regulatory motifs, among which 7 are binding sites for well-known cell cycle related TFs.</p> <p>Conclusion</p> <p>Our method is effective to identify CCRTFs by integrating microarray cell cycle data with TF-gene binding information. In <it>S. cerevisiae</it>, the TF-gene binding information is provided by the systematic ChIP-chip experiments. In other species where systematic ChIP-chip data is not available, in-silico motif discovery and analysis provide us with an alternative method. Therefore, our method is ready to be implemented to the microarray cell cycle data sets from different species. The C++ program for AC score calculation is available for download from URL <url>http://leili-lab.cmb.usc.edu/yeastaging/projects/project-base/</url>.</p

    DPRP: A Database of Phenotype-Specific Regulatory Programs Derived from Transcription Factor Binding Data

    Get PDF
    Gene expression profiling has been extensively used in the past decades, resulting in an enormous amount of expression data available in public databases. These data sets are informative in elucidating transcriptional regulation of genes underlying various biological and clinical conditions. However, it is usually difficult to identify transcription factors (TFs) responsible for gene expression changes directly from their own expression, as TF activity is often regulated at the posttranscriptional level. In recent years, technical advances have made it possible to systematically determine the target genes of TFs by ChIP-seq experiments. To identify the regulatory programs underlying gene expression profiles, we constructed a database of phenotype-specific regulatory programs (DPRP, http://syslab.nchu.edu.tw/DPRP/) derived from the integrative analysis of TF binding data and gene expression data. DPRP provides three methods: the Fisher’s Exact Test, the Kolmogorov–Smirnov test and the BASE algorithm to facilitate the application of gene expression data for generating new hypotheses on transcriptional regulatory programs in biological and clinical studies

    ChIP-on-chip significance analysis reveals large-scale binding and regulation by human transcription factor oncogenes

    Get PDF
    ChIP-on-chip has emerged as a powerful tool to dissect the complex network of regulatory interactions between transcription factors and their targets. However, most ChIP-on-chip analysis methods use conservative approaches aimed to minimize false-positive transcription factor targets. We present a model with improved sensitivity in detecting binding events from ChIP-on-chip data. Biochemically validated analysis in human T-cells reveals that three transcription factor oncogenes, NOTCH1, MYC, and HES1, bind one order of magnitude more promoters than previously thought. Gene expression profiling upon NOTCH1 inhibition shows broad-scale functional regulation across the entire range of predicted target genes, establishing a closer link between occupancy and regulation. Finally, the resolution of a more complete map of transcriptional targets reveals that MYC binds nearly all promoters bound by NOTCH1. Overall, these results suggest an unappreciated complexity of transcriptional regulatory networks and highlight the fundamental importance of genome-scale analysis to represent transcriptional programs
    corecore