Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns

Author: A Atfi, A Coppe, A Fadda, A Sandelin, A Subramanian, AB Georges, C Dieterich, C Huttenhower, D Boffelli, D Cora, DA Tagle, DJ Reiss, E Davidson, E Eskin, E Wingender, G Kreiman, G Robertson, G Thijs, GL Hager, GZ Hertz, H Le Pabic, HK Lee, JD Thompson, JM Vaquerizas, JS Michaloski, Jérémy Gruel, K Quandt, L Marino-Ramirez, M Blanchette, M Endoh, M Kanehisa, M Kazemian, M Rebeiz, M Tompa, MC Frith, Michel LeBorgne, MM Babu, Nathalie Théret, Nolwenn LeMeur, O Hallikas, Q Zhou, RW Hamming, S Falcon, S Hannenhalli, T Knittel, TA Down, TGO Consortium, TL Bailey, VK Mootha, W Thompson, Y Halperin, YH Grad
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results: Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search for and count SSM in pairs of genes. An exceptionally high number of SSM is considered to indicate a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally, we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions: Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.
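The abstract describes SSM only at a high level (motifs defined by length and similarity, counted in pairs of genes). The following Python sketch illustrates one way such a pairwise count could look. It is not the authors' algorithm: the use of Hamming distance as the similarity measure, the function names, the default parameters, and the toy sequences are all assumptions made for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def words(seq, length):
    """Set of distinct overlapping words of the given length in seq."""
    return {seq[i:i + length] for i in range(len(seq) - length + 1)}


def shared_motif_count(region_a, region_b, length=6, max_mismatch=1):
    """Count distinct words of region_a that have a similar word in region_b.

    "Similar" here means at most max_mismatch differing positions; this
    criterion is an assumption, not the definition used in the paper.
    """
    words_b = words(region_b, length)
    return sum(
        any(hamming(w, v) <= max_mismatch for v in words_b)
        for w in words(region_a, length)
    )


# Hypothetical conserved promoter regions for two genes (toy data).
gene_a = "TTGACGTCATT"
gene_b = "CCGACGTCAGG"
print(shared_motif_count(gene_a, gene_b))
```

Note that this count is taken from the point of view of the first region and is not guaranteed to be symmetric. Deciding whether a count is "exceptional" would additionally require a background distribution, which the sketch does not attempt.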

HAL-CentraleSupelec

Springer - Publisher Connector

Directory of Open Access Journals

INRIA a CCSD electronic archive server

HAL-Rennes 1

Transcription factor co-localization patterns affect human cell type-specific gene expression.

Author: Ouwehand Willem, Rendon Augusto, Wang Dennis, Wernisch Lorenz
Publication venue: BMC Genomics
Publication date: 21/06/2012

BACKGROUND: Cellular development requires the precise control of gene expression states. Transcription factors are involved in this regulatory process through their combinatorial binding with DNA. Information about transcription factor binding sites can help determine which combinations of factors work together to regulate a gene, but it is unclear how far the binding data from one cell type can inform about regulation in other cell types. RESULTS: By integrating data on co-localized transcription factor binding sites in the K562 cell line with expression data across 38 distinct hematopoietic cell types, we developed regression models to describe the relationship between the expression of target genes and the transcription factors that co-localize nearby. With K562 binding sites identifying the predictors, the proportion of expression explained by the models is statistically significant only for monocytic cells (p-value < 0.001), which are closely related to K562. That is, cell type-specific binding patterns are crucial for choosing the correct transcription factors for the model. Comparison of predictors obtained from binding sites in the GM12878 cell line with those from K562 shows that the amount of difference between binding patterns is directly related to the quality of the prediction. By identifying individual genes whose expression is predicted accurately by the binding sites, we are able to link transcription factors FOS, TAF1 and YY1 to a sparsely studied gene, LRIG2. We also find that the activity of a transcription factor may differ depending on the cell type and the identity of other co-localized factors.
CONCLUSION: Our approach shows that gene expression can be explained by a modest number of co-localized transcription factors; however, information on cell type-specific binding is crucial for understanding combinatorial gene regulation.
RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief, you may copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

Springer - Publisher Connector

KIRMES: kernel-based identification of regulatory modules in euchromatic sequences

Author: Bailey, Ben-Hur, Boser, Busch, Frith, Giardine, Gordân, Gunnar Rätsch, Gupta, Harbison, Jan U. Lohmann, Joachims, Lawrence, Leibfried, Leslie, Matys, Meinicke, Mikolajczyk, Müller, Noble, Nowak, Oliver Kohlbacher, Redman, Rätsch, Sandelin, Schneider, Schölkopf, Sebastian J. Schultheiss, Segal, Sinha, Smith, Sonnenburg, Stormo, Swarbreck, Thijs, Wolfgang Busch, Yada, Zien
Publication venue: Oxford University Press
Publication date: 01/01/2009

Motivation: Understanding transcriptional regulation is one of the main challenges in computational biology. An important problem is the identification of transcription factor (TF) binding sites in promoter regions of potential TF target genes. It is typically approached with position weight matrix-based motif identification algorithms that use Gibbs sampling or heuristics to extend seed oligos. Such algorithms succeed in identifying single, relatively well-conserved binding sites, but tend to fail when it comes to identifying combinations of several degenerate binding sites, such as those often found in cis-regulatory modules.

CiteSeerX

Springer - Publisher Connector

Directory of Open Access Journals

Computational identification of transcriptional regulatory elements in DNA sequence

Author: GuhaThakurta Debraj
Publication venue: Oxford University Press
Publication date: 01/01/2006

Identification and annotation of all the functional elements in the genome, including genes and regulatory sequences, is a fundamental challenge in genomics and computational biology. Since regulatory elements are frequently short and variable, their identification and discovery using computational algorithms is difficult. However, significant advances have been made in the computational methods for modeling and detection of DNA regulatory elements. The availability of complete genome sequences from multiple organisms, as well as mRNA profiling and high-throughput experimental methods for mapping protein-binding sites in DNA, has contributed to the development of methods that use these auxiliary data to inform the detection of transcriptional regulatory elements. Progress is also being made in the identification of cis-regulatory modules and higher-order structures of the regulatory sequences, which is essential to the understanding of transcription regulation in metazoan genomes. This article reviews the computational approaches for modeling and identification of genomic regulatory elements, with an emphasis on recent developments and current challenges.

CiteSeerX

Large-scale motif discovery using DNA Gray code and equiprobable oligomers

Author: Alligood, Bailey, Das, Er, Gray, Lawrence, Natsuhiro Ichinose, Osamu Gotoh, Pavesi, Robertson, Sandelin, Sandve, Schneider, Suzuki, Tetsushi Yada, Tompa, Wakaguri, Wingender
Publication venue: Oxford University Press
Publication date: 03/11/2011

Motivation: Finding motifs in genome-scale functional sequences, such as all the promoters in a genome, is a challenging problem. Word-based methods count the occurrences of oligomers to detect over-represented ones. This approach is known to be fast and accurate compared with other methods. However, two problems have hampered the application of such methods to large-scale data. One is the computational cost of clustering similar oligomers, and the other is the bias in the frequency of fixed-length oligomers, which complicates the detection of significant words.
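As a minimal sketch of the generic word-based idea mentioned in the abstract (counting oligomer occurrences and flagging frequent ones), the Python below counts overlapping k-mers and compares them against a uniform background. It does not implement the paper's Gray code or equiprobable-oligomer techniques; the function names, the `min_ratio` threshold, and the toy sequences are assumptions. The uniform background is a deliberate simplification and is exactly the kind of frequency bias the abstract identifies as a problem.

```python
from collections import Counter


def count_oligomers(sequences, k):
    """Count every overlapping k-mer across a collection of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(word) <= set("ACGT"):
                counts[word] += 1
    return counts


def overrepresented(counts, k, min_ratio=2.0):
    """Return k-mers whose count is at least min_ratio times the uniform expectation."""
    total = sum(counts.values())
    expected = total / 4 ** k  # uniform background: an illustrative assumption
    return {w: c for w, c in counts.items() if c >= min_ratio * expected}


# Toy "promoter" sequences, invented for illustration.
promoters = ["ACGTACGTAC", "TTACGTAA", "GGACGTCC"]
counts = count_oligomers(promoters, 4)
print(counts.most_common(3))
```

On realistic data, the uniform expectation would normally be replaced by a background model estimated from the sequences themselves, and similar words would be clustered before reporting.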

From condition-specific interactions towards the differential complexome of proteins

Author: Will Thorsten
Publication venue: Walter de Gruyter GmbH
Publication date: 01/01/2020

While capturing the transcriptomic state of a cell is a comparably simple effort with modern sequencing techniques, mapping protein interactomes and complexomes in a sample-specific manner is currently not feasible on a large scale. To understand crucial biological processes, however, knowledge of the physical interplay between proteins can be more interesting than their mere expression. In this thesis, we present and demonstrate four software tools that unlock the cellular wiring in a condition-specific manner and promise a deeper understanding of what happens upon cell fate transitions. PPIXpress makes it possible to exploit the abundance of existing expression data to generate specific interactomes, which can even consider alternative splicing events when protein isoforms can be related to the presence of causative protein domain interactions of an underlying model. As an addition to this work, we developed the convenient differential analysis tool PPICompare to determine rewiring events and their causes within the inferred interaction networks between grouped samples. Furthermore, we present a new implementation of the combinatorial protein complex prediction algorithm DACO that features a significantly reduced runtime. This improvement facilitates an application of the method to a large number of samples, and the resulting sample-specific complexes can ultimately be assessed quantitatively with our novel differential protein complex analysis tool CompleXChange.

The transcriptome of a cell is comparatively easy to capture with modern sequencing techniques. Determining protein interactions and complexes, in contrast, is currently not possible on a large scale. To understand important biological processes, however, the interplay of proteins can be considerably more interesting than their expression alone. In this thesis, we present four software tools that make it possible to examine such interactions in a condition-specific manner and thus promise a deeper understanding of what happens in the cell when changes occur. PPIXpress makes it possible to use existing expression data to determine the active interactions in a biological context. When protein variants can be linked to interactions of protein domains, even alternative splicing can be taken into account. As a complement, we developed the convenient differential analysis tool PPICompare, which can determine changes in the interactome and their causes between grouped samples. In addition, we present a new implementation of the protein complex prediction algorithm DACO with a significantly reduced runtime. This improvement enables the method to be applied to a large number of samples. The sample-specific complexes determined in this way can finally be assessed quantitatively with our novel differential analysis tool CompleXChange.