Predicting Combinatorial Binding of Transcription Factors to Regulatory Elements in the Human Genome by Association Rule Mining

Author: Iyer Vishwanath R.
Miranker Daniel P.
Morgan Xochitl C.
Ni Shulin
Publication date: 01/01/2007

Cis-acting transcriptional regulatory elements in mammalian genomes typically contain specific combinations of binding sites for various transcription factors. Although some cis-regulatory elements have been well studied, the combinations of transcription factors that regulate normal expression levels for the vast majority of the 20,000 genes in the human genome are unknown. We hypothesized that it should be possible to discover transcription factor combinations that regulate gene expression in concert by identifying over-represented combinations of sequence motifs that occur together in the genome. To detect combinations of transcription factor binding motifs, we developed a data mining approach based on association rules, which are typically used in market basket analysis. We scored each segment of the genome for the presence or absence of each of 83 transcription factor binding motifs, then used association rule mining algorithms to mine this dataset, thus identifying frequently occurring pairs of distinct motifs within a segment. Results: Support for most pairs of transcription factor binding motifs was highly correlated across different chromosomes, although pair significance varied. Known true positive motif pairs showed higher association rule support, confidence, and significance than background. Our subsets of high-confidence, high-significance mined pairs of transcription factors showed enrichment for co-citation in PubMed abstracts relative to all pairs, and the predicted associations were often readily verifiable in the literature.
Conclusion: Functional elements in the genome where transcription factors bind to regulate expression in a combinatorial manner are more likely to be predicted by identifying statistically and biologically significant combinations of transcription factor binding motifs than by simply scanning the genome for the occurrence of binding sites for a single transcription factor. NIAAA Alcohol Training Grant; National Science Foundation; Cellular and Molecular Biolog
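
The support and confidence measures named in this abstract can be illustrated with a minimal Python sketch. It assumes each genome segment has already been reduced to the set of motifs present in it; the motif names (M1, M2, M3), the toy segments, and the function name mine_motif_pairs are hypothetical and are not taken from the paper, whose actual mining algorithm and significance test are not given in the abstract.

```python
from collections import Counter
from itertools import combinations


def mine_motif_pairs(segments, min_support=0.0):
    """Compute support and confidence for every co-occurring motif pair.

    segments: list of sets, each holding the motifs present in one segment.
    Returns a dict keyed by alphabetically ordered motif pairs.
    """
    n = len(segments)
    single = Counter()  # number of segments containing each motif
    pair = Counter()    # number of segments containing both motifs of a pair
    for motifs in segments:
        single.update(motifs)
        pair.update(combinations(sorted(motifs), 2))

    rules = {}
    for (a, b), count in pair.items():
        support = count / n  # fraction of segments containing both motifs
        if support < min_support:
            continue
        rules[(a, b)] = {
            "support": support,
            # confidence of the rule a -> b: P(b present | a present)
            "confidence_a_to_b": count / single[a],
            # confidence of the rule b -> a: P(a present | b present)
            "confidence_b_to_a": count / single[b],
        }
    return rules


# Hypothetical toy data: four segments scored for three motifs.
segments = [
    {"M1", "M2"},
    {"M1", "M2", "M3"},
    {"M1"},
    {"M3"},
]
rules = mine_motif_pairs(segments)
for pair_key, stats in sorted(rules.items()):
    print(pair_key, stats)
```

Counting every pair directly is only practical for a small motif vocabulary such as the 83 motifs mentioned above; larger item sets would normally call for a pruning algorithm such as Apriori.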

Crossref

PubMed Central

Texas ScholarWorks

GeneReg: integration of experimental data on the DNA transcription process

Author: Cortés-Calabuig Álvaro
De Moor Bart
Denecker Marc
Lemmens Karen
Marchal Kathleen
Pastor David
Publication date: 01/01/2007

Ghent University Academic Bibliography

From data towards knowledge: Revealing the architecture of signaling systems by unifying knowledge mining and data mining of systematic perturbation data

Author: Cowart Ashley
Jin Bo
Lu Songjian
Lu Xinghua
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 22/02/2013

Genetic and pharmacological perturbation experiments, such as deleting a gene and monitoring gene expression responses, are powerful tools for studying cellular signal transduction pathways. However, it remains a challenge to automatically derive knowledge of a cellular signaling system at a conceptual level from systematic perturbation-response data. In this study, we explored a framework that unifies knowledge mining and data mining approaches towards this goal. The framework consists of the following automated processes: 1) applying an ontology-driven knowledge mining approach to identify functional modules among the genes responding to a perturbation in order to reveal potential signals affected by the perturbation; 2) applying a graph-based data mining approach to search for perturbations that affect a common signal with respect to a functional module; and 3) revealing the architecture of a signaling system by organizing signaling units into a hierarchy based on their relationships. Applying this framework to a compendium of yeast perturbation-response data, we have successfully recovered many well-known signal transduction pathways; in addition, our analysis has led to many hypotheses regarding the yeast signal transduction system; finally, our analysis automatically organized perturbed genes as a graph reflecting the architecture of the yeast signaling system. Importantly, this framework transformed molecular findings from a gene level to a conceptual level, which can readily be translated into computable knowledge in the form of rules regarding the yeast signaling system, such as "if genes involved in MAPK signaling are perturbed, genes involved in pheromone responses will be differentially expressed".

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

A compendium of Caenorhabditis elegans regulatory transcription factors: a resource for mapping transcription regulatory networks

Author: Deplancke B.
Grove C.A.
Hope I.A.
Reece-Hoyes J.S.
Shingles J.
Walhout J.M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/12/2005

Background: Transcription regulatory networks are composed of interactions between transcription factors and their target genes. Whereas unicellular networks have been studied extensively, metazoan transcription regulatory networks remain largely unexplored. Caenorhabditis elegans provides a powerful model to study such metazoan networks because its genome is completely sequenced and many functional genomic tools are available. While C. elegans gene predictions have undergone continuous refinement, this is not true for the annotation of functional transcription factors. The comprehensive identification of transcription factors is essential for the systematic mapping of transcription regulatory networks because it enables the creation of physical transcription factor resources that can be used in assays to map interactions between transcription factors and their target genes. Results: By computational searches and extensive manual curation, we have identified a compendium of 934 transcription factor genes (referred to as wTF2.0). We find that manual curation drastically reduces the number of both false positive and false negative transcription factor predictions. We discuss how transcription factor splice variants and dimer formation may affect the total number of functional transcription factors. In contrast to mouse transcription factor genes, we find that C. elegans transcription factor genes do not undergo significantly more splicing than other genes. This difference may contribute to differences in organism complexity. We identify candidate redundant worm transcription factor genes and orthologous worm and human transcription factor pairs. Finally, we discuss how wTF2.0 can be used together with physical transcription factor clone resources to facilitate the systematic mapping of C. elegans transcription regulatory networks. Conclusion: wTF2.0 provides a starting point to decipher the transcription regulatory networks that control metazoan development and function.

Infoscience - École polytechnique fédérale de Lausanne

Springer - Publisher Connector

PubMed Central

White Rose Research Online

Application of regulatory sequence analysis and metabolic network analysis to the interpretation of gene expression data

Author: A. Brazma
A.J. Enright
D. Gilbert
D. Thomas
E. Wingender
E.M. Marcotte
G. Reinert
H. Salgado
J. van Helden
J.H. Graber
J.L. DeRisi
M. Kanehisa
M. Pellegrini
M.B. Eisen
P. Tamayo
P.D. Karp
P.O. Brown
P.T. Spellman
Publication venue: JOBIM
Publication date: 01/01/2000

We present two complementary approaches for the interpretation of clusters of co-regulated genes, such as those obtained from DNA chips and related methods. Starting from a cluster of genes with similar expression profiles, two basic questions can be asked: 1. Which mechanism is responsible for the coordinated transcriptional response of the genes? This question is approached by extracting motifs that are shared between the upstream sequences of these genes. The extracted motifs are putative cis-acting regulatory elements. 2. What is the physiological significance for the cell of expressing these genes together? One way to answer this question is to search for potential metabolic pathways that could be catalyzed by the products of the genes. This can be done by selecting the genes from the cluster that code for enzymes, and trying to assemble the catalyzed reactions into metabolic pathways. We present tools to answer these two questions, and we illustrate their use with selected examples in the yeast Saccharomyces cerevisiae. The tools are available on the web (http://ucmb.ulb.ac.be/bioinformatics/rsa-tools/; http://www.ebi.ac.uk/research/pfbp/; http://www.soi.city.ac.uk/~msch/).
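
The second question, assembling catalyzed reactions into candidate pathways, can be sketched in Python by linking any two reactions where a product of one is a substrate of the other. This is only an illustration under simple assumptions: the reaction names (R1, R2, R3), the compound names, and the function link_reactions are hypothetical, and the tools cited in the abstract may use a different procedure.

```python
def link_reactions(reactions):
    """Return directed edges (r1, r2) where a product of r1 feeds r2.

    reactions: dict mapping a reaction name to a (substrates, products)
    pair of sets of compound names.
    """
    edges = []
    for name_a, (_, products_a) in reactions.items():
        for name_b, (substrates_b, _) in reactions.items():
            # Link two distinct reactions that share an intermediate compound.
            if name_a != name_b and products_a & substrates_b:
                edges.append((name_a, name_b))
    return sorted(edges)


# Hypothetical reactions catalyzed by enzymes found in one gene cluster.
reactions = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"X"}, {"Y"}),
}
edges = link_reactions(reactions)
print(edges)
```

Chains of linked reactions (here R1 followed by R2) are candidate pathway fragments; an unlinked reaction such as R3 suggests an enzyme whose pathway partners lie outside the cluster. In practice, ubiquitous compounds such as water or ATP would need to be excluded before linking, since they connect almost every reaction.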

CiteSeerX

Crossref

DI-fusion

Brunel University Research Archive

Computational identification of transcription factor binding sites by functional analysis of sets of genes sharing overrepresented upstream motifs

Author: Caselle Michele
Cora' Davide
Di Cunto Ferdinando
Provero Paolo
Silengo Lorenzo
Publication date: 01/01/2004

BACKGROUND: Transcriptional regulation is a key mechanism in the functioning of the cell, and is mostly effected through transcription factors binding to specific recognition motifs located upstream of the coding region of the regulated gene. The computational identification of such motifs is made easier by the fact that they often appear several times in the upstream region of the regulated genes, so that the number of occurrences of relevant motifs is often significantly larger than expected by pure chance. RESULTS: To exploit this fact, we construct sets of genes characterized by the statistical overrepresentation of a certain motif in their upstream regions. Then we study the functional characterization of these sets by analyzing their annotation to Gene Ontology terms. For the sets showing a statistically significant specific functional characterization, we conjecture that the upstream motif characterizing the set is a binding site for a transcription factor involved in the regulation of the genes in the set. CONCLUSIONS: The method we propose is able to identify many known binding sites in S. cerevisiae and new candidate targets of regulation by known transcription factors. Its application to less well studied organisms is likely to be valuable in the exploration of their regulatory interaction network. Comment: 19 pages, 1 figure. Published version with several improvements. Supplementary material available from the author.
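
The idea of a motif occurring more often than expected by chance can be sketched in Python with a binomial upper-tail probability. This is a simplified illustration, not the statistic used in the paper, which the abstract does not specify. It assumes uniform base frequencies and treats window positions as independent, which overlapping windows are not; the upstream sequence, the motif, and the function names are hypothetical.

```python
from math import comb


def count_occurrences(sequence, motif):
    """Count occurrences of motif in sequence, allowing overlaps."""
    width = len(motif)
    return sum(
        1
        for i in range(len(sequence) - width + 1)
        if sequence[i:i + width] == motif
    )


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical upstream region and candidate motif.
upstream = "ACGTACGTTTACGT"
motif = "ACGT"

k = count_occurrences(upstream, motif)  # observed occurrences
n = len(upstream) - len(motif) + 1      # number of candidate positions
p = 0.25 ** len(motif)                  # chance match under uniform bases
p_value = binomial_upper_tail(k, n, p)
print(k, n, p_value)
```

A small p_value marks the motif as overrepresented in that upstream region. Genes whose upstream regions pass a chosen threshold would form the set that is then examined for shared Gene Ontology annotation, and testing many motifs would require a multiple-testing correction.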

arXiv.org e-Print Archive

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

Institutional Research Information System University of Turin

Eukaryotic transcriptional regulation: from data mining to transcriptional profiling

Author: Morgan Xochitl Chamorro
Publication date: 01/12/2008

Survival of cells and organisms requires that each of thousands of genes is expressed at the correct time in development, in the correct tissue, and under the correct conditions. Transcription is the primary point of gene regulation. Genes are activated and repressed by transcription factors, which are proteins that become active through signaling, bind, sometimes cooperatively, to regulatory regions of DNA, and interact with other proteins such as chromatin remodelers. Yeast has nearly six thousand genes, several hundred of which are transcription factors; transcription factors comprise around 2000 of the 22,000 genes in the human genome. When and how these transcription factors are activated, as well as which subsets of genes they regulate, is an active area of research essential to understanding the transcriptional regulatory programs of organisms. We approached this problem in two divergent ways: first, an in silico study of human transcription factor combinations, and second, an experimental study of the transcriptional response of yeast mutants deficient in DNA repair. First, in order to better understand the combinatorial nature of transcription factor binding, we developed a data mining approach to assess whether transcription factors whose binding motifs were frequently proximal in the human genome were more likely to interact. We found many instances in the literature in which over-represented transcription factor pairs co-regulated the same gene, so we used co-citation to assess the utility of this method on a larger scale. We determined that over-represented pairs were more likely to be co-cited than would be expected by chance.
Because proper repair of DNA is an essential and highly conserved process in all eukaryotes, we next used cDNA microarrays to measure differentially expressed genes in eighteen yeast deletion strains with sensitivity to the DNA cross-linking agent methyl methane sulfonate (MMS); many of these mutants were transcription factors or DNA-binding proteins. Combining these data with tools such as chromatin immunoprecipitation, gene ontology analysis, expression profile similarity, and motif analysis allowed us to propose a model for the roles of Iki3 and of YML081W, a poorly characterized gene, in DNA repair. Institute for Cellular and Molecular Biolog

Texas ScholarWorks