3,145 research outputs found

    CompMoby: Comparative MobyDick for detection of cis-regulatory motifs

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The regulation of gene expression is complex and occurs at many levels, including transcriptional and post-transcriptional, in metazoans. Transcriptional regulation is mainly determined by sequence elements within the promoter regions of genes while sequence elements within the 3' untranslated regions of mRNAs play important roles in post-transcriptional regulation such as mRNA stability and translation efficiency. Identifying cis-regulatory elements, or motifs, in multicellular eukaryotes is more difficult compared to unicellular eukaryotes due to the larger intergenic sequence space and the increased complexity in regulation. Experimental techniques for discovering functional elements are often time consuming and not easily applied on a genome level. Consequently, computational methods are advantageous for genome-wide cis-regulatory motif detection. To decrease the search space in metazoans, many algorithms use cross-species alignment, although studies have demonstrated that a large portion of the binding sites for the same trans-acting factor do not reside in alignable regions. Therefore, a computational algorithm should account for both conserved and nonconserved cis-regulatory elements in metazoans.</p> <p>Results</p> <p>We present CompMoby (Comparative MobyDick), software developed to identify cis-regulatory binding sites at both the transcriptional and post-transcriptional levels in metazoans without prior knowledge of the trans-acting factors. The CompMoby algorithm was previously shown to identify cis-regulatory binding sites in upstream regions of genes co-regulated in embryonic stem cells. In this paper, we extend the software to identify putative cis-regulatory motifs in 3' UTR sequences and verify our results using experimentally validated data sets in mouse and human. We also detail the implementation of CompMoby into a user-friendly tool that includes a web interface to a streamlined analysis. Our software allows detection of motifs in the following three categories: one, those that are alignable and conserved; two, those that are conserved but not alignable; three, those that are species specific. One of the output files from CompMoby gives the user the option to decide what category of cis-regulatory element to experimentally pursue based on their biological problem. Using experimentally validated biological datasets, we demonstrate that CompMoby is successful in detecting cis-regulatory target sites of known and novel trans-acting factors at the transcriptional and post-transcriptional levels.</p> <p>Conclusion</p> <p>CompMoby is a powerful software tool for systematic <it>de novo </it>discovery of evolutionarily conserved and nonconserved cis-regulatory sequences involved in transcriptional or post-transcriptional regulation in metazoans. This software is freely available to users at <url>http://genome.ucsf.edu/compmoby/</url>.</p

    Allegro: Analyzing expression and sequence in concert to discover regulatory programs

    Get PDF
    A major goal of system biology is the characterization of transcription factors and microRNAs (miRNAs) and the transcriptional programs they regulate. We present Allegro, a method for de-novo discovery of cis-regulatory transcriptional programs through joint analysis of genome-wide expression data and promoter or 3′ UTR sequences. The algorithm uses a novel log-likelihood-based, non-parametric model to describe the expression pattern shared by a group of co-regulated genes. We show that Allegro is more accurate and sensitive than existing techniques, and can simultaneously analyze multiple expression datasets with more than 100 conditions. We apply Allegro on datasets from several species and report on the transcriptional modules it uncovers. Our analysis reveals a novel motif over-represented in the promoters of genes highly expressed in murine oocytes, and several new motifs related to fly development. Finally, using stem-cell expression profiles, we identify three miRNA families with pivotal roles in human embryogenesis

    Sustained-input switches for transcription factors and microRNAs are central building blocks of eukaryotic gene circuits

    Get PDF
    WaRSwap is a randomization algorithm that for the first time provides a practical network motif discovery method for large multi-layer networks, for example those that include transcription factors, microRNAs, and non-regulatory protein coding genes. The algorithm is applicable to systems with tens of thousands of genes, while accounting for critical aspects of biological networks, including self-loops, large hubs, and target rearrangements. We validate WaRSwap on a newly inferred regulatory network from Arabidopsis thaliana, and compare outcomes on published Drosophila and human networks. Specifically, sustained input switches are among the few over-represented circuits across this diverse set of eukaryotes

    Unsupervised learning of transcriptional regulatory networks via latent tree graphical models

    Get PDF
    Gene expression is a readily-observed quantification of transcriptional activity and cellular state that enables the recovery of the relationships between regulators and their target genes. Reconstructing transcriptional regulatory networks from gene expression data is a problem that has attracted much attention, but previous work often makes the simplifying (but unrealistic) assumption that regulator activity is represented by mRNA levels. We use a latent tree graphical model to analyze gene expression without relying on transcription factor expression as a proxy for regulator activity. The latent tree model is a type of Markov random field that includes both observed gene variables and latent (hidden) variables, which factorize on a Markov tree. Through efficient unsupervised learning approaches, we determine which groups of genes are co-regulated by hidden regulators and the activity levels of those regulators. Post-processing annotates many of these discovered latent variables as specific transcription factors or groups of transcription factors. Other latent variables do not necessarily represent physical regulators but instead reveal hidden structure in the gene expression such as shared biological function. We apply the latent tree graphical model to a yeast stress response dataset. In addition to novel predictions, such as condition-specific binding of the transcription factor Msn4, our model recovers many known aspects of the yeast regulatory network. These include groups of co-regulated genes, condition-specific regulator activity, and combinatorial regulation among transcription factors. The latent tree graphical model is a general approach for analyzing gene expression data that requires no prior knowledge of which possible regulators exist, regulator activity, or where transcription factors physically bind

    Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data

    Get PDF
    Background: MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein coding genes. miRNAs play a very important role in diverse biological processes and various diseases. Many algorithms are able to predict miRNA genes and their targets, but their transcription regulation is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein coding genes) are co-transcribed with their host genes and most intergenic miRNAs transcribed from their own RNA polymerase II (Pol II) promoter. However, the length of the primary transcripts and promoter organization is currently unknown. Methodology: We performed Pol II chromatin immunoprecipitation (ChIP)-chip using a custom array surrounding regions of known miRNA genes. To identify the true core transcription start sites of the miRNA genes we developed a new tool (CPPP). We showed that miRNA genes can be transcribed from promoters located several kilobases away and that their promoters share the same general features as those of protein coding genes. Finally, we found evidence that as many as 26% of the intragenic miRNAs may be transcribed from their own unique promoters. Conclusion: miRNA promoters have similar features to those of protein coding genes, but miRNA transcript organization is more complex. © 2009 Corcoran et al
    corecore