Search CORE

13 research outputs found

Extended Sunflower Hidden Markov Models for the recognition of homotypic cis-regulatory modules

Author: Eggeling Ralf
Grosse Ivo
Lemnian Ioana M.
Publication venue: OASIcs - OpenAccess Series in Informatics. German Conference on Bioinformatics 2013
Publication date: 01/01/2013

The transcription of genes is often regulated not only by transcription factors binding at single sites per promoter, but by the interplay of multiple copies of one or more transcription factors binding at multiple sites forming a cis-regulatory module. The computational recognition of cis-regulatory modules from ChIP-seq or other high-throughput data is crucial in modern life and medical sciences. A common type of cis-regulatory module is the homotypic cluster of binding sites, i.e., a cluster of binding sites of one transcription factor. For the recognition of such clusters, the homotypic Sunflower Hidden Markov Model is a promising statistical model. However, this model neglects statistical dependences among nucleotides within binding sites and flanking regions, which makes it poorly suited for de-novo motif discovery. Here, we propose an extension of this model that allows statistical dependences within binding sites, their reverse complements, and flanking regions. We study the efficacy of this extended homotypic Sunflower Hidden Markov Model based on ChIP-seq data from the Human ENCODE Project and find that it often outperforms the traditional homotypic Sunflower Hidden Markov Model.

Dagstuhl Research Online Publication Server

Unveiling combinatorial regulation through the combination of ChIP information and in silico cis-regulatory module detection

Author: Fierro Ana Carolina
Guns Tias
Marchal Kathleen
Nijssen Siegfried
Sun Hong
Thorrez Lieven
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2012

Computationally retrieving biologically relevant cis-regulatory modules (CRMs) is not straightforward. Because of the large number of candidates and the imperfection of the screening methods, many spurious CRMs are detected that score as highly as the biologically true ones. Using ChIP information not only reduces the regions in which the binding sites of the assayed transcription factor (TF) should be located, but also restricts the valid CRMs to those that contain the assayed TF (here referred to as applying CRM detection in a query-based mode). In this study, we show that exploiting ChIP information in a query-based way makes in silico CRM detection a much more feasible endeavor. To handle the large datasets, the query-based setting, and other specifics of CRM detection on ChIP-Seq-based data, we developed a novel, powerful CRM detection method, 'CPModule'. By applying it to a well-studied ChIP-Seq data set involved in self-renewal of mouse embryonic stem cells, we demonstrate how our tool can recover combinatorial regulation of five known TFs that are key in the self-renewal of mouse embryonic stem cells. Additionally, we make a number of new predictions on combinatorial regulation of these five key TFs with other TFs documented in TRANSFAC.

Ghent University Academic Bibliography

PubMed Central

Erroneous attribution of relevant transcription factor binding sites despite successful prediction of cis-regulatory modules

Author: Elizabeth R Brennan
Marc S Halfon
Qianqian Zhu
Yiyun Zhou
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Cis-regulatory modules are bound by transcription factors to regulate gene expression. Characterizing these DNA sequences is central to understanding gene regulatory networks and gaining insight into mechanisms of transcriptional regulation, but genome-scale regulatory module discovery remains a challenge. One popular approach is to scan the genome for clusters of transcription factor binding sites, especially those conserved in related species. When such approaches are successful, it is typically assumed that the activity of the modules is mediated by the identified binding sites and their cognate transcription factors. However, the validity of this assumption is often not assessed. Results: We successfully predicted five new cis-regulatory modules by combining binding site identification with sequence conservation and compared these to unsuccessful predictions from a related approach not utilizing sequence conservation. Despite greatly improved predictive success, the positive set had similar degrees of sequence and binding site conservation as the negative set. We explored the reasons for this by mutagenizing putative binding sites in three cis-regulatory modules. A large proportion of the tested sites had little or no demonstrable role in mediating regulatory element activity. Examination of loss-of-function mutants also showed that some transcription factors supposedly binding to the modules are not required for their function. Conclusions: Our results raise important questions about interpreting regulatory module predictions obtained by finding clusters of conserved binding sites. Attribution of function to these sites and their cognate transcription factors may be incorrect even when modules are successfully identified. Our study underscores the importance of empirical validation of computational results even when these results are in line with expectation.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ReLA, a local alignment search tool for the identification of distal and proximal gene regulatory regions and their conserved transcription factor binding sites

Author: Alex Ramirez
Bàrbara Montserrat-Sentís
David Torrents
Enrique Blanco
Friman Sánchez
Montserrat Puiggròs
Santi González
Publication venue: Oxford University Press

Motivation: The prediction and annotation of the genomic regions involved in gene expression have been extensively explored. Most of the effort has been devoted to the development of approaches that detect transcription start sites, leaving the identification of regulatory regions and their functional transcription factor binding sites (TFBSs) largely unexplored and with important quantitative and qualitative methodological gaps.

Crossref

PubMed Central

Emerging applications of read profiles towards the functional annotation of the genome

Author: Gorodkin Jan
Poirazi Panayiota
Pundhir Sachin
Publication venue: Frontiers Media SA
Publication date: 01/01/2015

Functional annotation of the genome in various species is important to understand their phenotypic complexity. The road towards functional annotation involves several challenges ranging from experiments on individual molecules to large-scale analysis of high-throughput sequencing (HTS) data. HTS data is typically a result of the protocol designed to address specific research questions. Sequencing produces reads which, when mapped to a reference genome, often form distinct patterns (read profiles). Interpretation of these read profiles is essential for the analysis in relation to the research question addressed. Several strategies have been employed at varying levels of abstraction, ranging from a somewhat ad hoc to a more systematic analysis of read profiles. These include methods that can compare read profiles, e.g. from direct (non-sequence-based) alignments to classification of patterns into functional groups. In this review, we highlight the emerging applications of read profiles for the annotation of non-coding RNA and cis-regulatory regions such as enhancers and promoters. We also discuss the biological rationale behind their formation.

Directory of Open Access Journals

Copenhagen University Research Information System

Frontiers - Publisher Connector

PubMed Central

Conserved elements associated with ribosomal genes and their trans-splice acceptor sites in Caenorhabditis elegans

Author: Allan K. Mah
David L. Baillie
Monica C. Sleumer
Steven J. M. Jones
Publication venue: Oxford University Press

The recent publication of the Caenorhabditis elegans cisRED database has provided an extensive catalog of upstream elements that are conserved between nematode genomes. We have performed a secondary analysis to determine which subsequences of the cisRED motifs are found in multiple locations throughout the C. elegans genome. We used the word-counting motif discovery algorithm DME to form the motifs into groups based on sequence similarity. We then examined the genes associated with each motif group using DAVID and Ontologizer to determine which groups are associated with genes that also have significant functional associations in the Gene Ontology and other gene annotation sources. Of the 3265 motif groups formed, 612 (19%) had significant functional associations with respect to GO terms. Eight of the first 20 motif groups based on frequent dodecamers among the cisRED motif sequences were specifically associated with ribosomal protein genes; two of these were similar to mouse EBP-45, rat HNF3-family and Drosophila Zeste transcription factor binding sites. Additionally, seven motif groups were extensions of the canonical C. elegans trans-splice acceptor site. One motif group was tested for regulatory function in a series of green fluorescent protein expression experiments and was shown to be involved in pharyngeal expression.

Crossref

PubMed Central