Search CORE

1,356 research outputs found

Transcription factor site dependencies in human, mouse and rat genomes

Author: A Di Cara
A Gyenesei
A Sandelin
A Sandelin
A Tomovic
A Tomovic
AG Jegga
AH Brivanlou
AJ Walhout
Andrija Tomovic
AV Morozov
B Lenhard
C Kunsch
CC Liu
D Choi
D GuhaThakurta
DC King
DE Schones
DH Crouch
Edward J Oakeley
G Caretti
G Robertson
G Zhao
H Klein
H Wang
IJ Donaldson
IJ Donaldson
J Carabana
J Karlseder
L Narlikar
L Narlikar
M Blanchette
M Defrance
Michael Stadler
O Puig
PR van Ginkel
R Sharan
R Sharan
S Impey
S Mahony
SJ Ho Sui
SM Kielbasa
T Mahmoudi
V Ferretti
W Thompson
WB Alkema
WW Wasserman
X Yan
X Zhang
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: It is known that transcription factors frequently act together to regulate gene expression in eukaryotes. In this paper we describe a computational analysis of transcription factor site dependencies in human, mouse and rat genomes. Results: Our approach for quantifying tendencies of transcription factor binding sites to co-occur is based on a binding site scoring function that incorporates dependencies between positions and uses information about the structural class of each transcription factor (major/minor groove binder); we also considered the possible implications of varying GC content of the sequences. Significant tendencies (dependencies) were detected by non-parametric statistical methodology (permutation tests). The results were evaluated in several ways: reports from the literature (many of the significant dependencies between transcription factors have previously been confirmed experimentally); a check that dependencies between transcription factors are not biased by similarities in their DNA-binding sites; the finding that the number of dependent transcription factors belonging to the same functional and structural class is significantly higher than would be expected by chance; and supporting evidence from GO clustering of target genes. Based on dependencies between two transcription factor binding sites (second-order dependencies), it is possible to construct higher-order dependencies (networks). Moreover, results about transcription factor binding site dependencies can be used to predict groups of dependent transcription factors on a given promoter sequence. Our results, as well as a scanning tool for predicting groups of dependent transcription factor binding sites, are available on the Internet. Conclusion: We show that the computational analysis of transcription factor site dependencies is a valuable complement to experimental approaches for discovering transcription regulatory interactions and networks. Scanning promoter sequences with dependent groups of transcription factor binding sites improves the quality of transcription factor predictions.
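
The abstract above names permutation tests as the way significant co-occurrence was detected. The following is a minimal sketch of that general idea only, not the authors' pipeline: it assumes each promoter has already been reduced to a yes/no flag per transcription factor, and the function names, the one-sided statistic, and the add-one correction are all illustrative choices that do not come from the paper.

```python
import random


def cooccurrence_count(hits_a, hits_b):
    """Number of promoters in which both factors have a predicted site."""
    return sum(1 for a, b in zip(hits_a, hits_b) if a and b)


def permutation_p_value(hits_a, hits_b, n_permutations=2000, seed=0):
    """One-sided permutation p-value for co-occurrence of two site types.

    hits_a and hits_b are equal-length lists of booleans, one entry per
    promoter (an assumed input format, chosen for illustration).
    """
    rng = random.Random(seed)
    observed = cooccurrence_count(hits_a, hits_b)
    shuffled = list(hits_b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffling breaks any link between the two factors while
        # keeping the number of promoters hit by each factor fixed.
        rng.shuffle(shuffled)
        if cooccurrence_count(hits_a, shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

A small p-value here only says that the two flags co-occur more often than shuffled flags do; it does not account for GC content or motif similarity, which the abstract reports handling separately.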

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The Novartis Repository

Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site

Author: A Ambesi-Impiombato
A Blais
A Eto
A Subramanian
AE Kel
AG Clark
AL Lam
AM McGuire
Anat Reiner
Assif Yitzhaky
B Ren
C Kimura-Yoshida
C Plessy
C Yang
CT Harbison
D Pfeifer
D Wang
DB Allison
E Emberly
E Segal
Eytan Domany
FP Roth
GC Pipes
GC Yuan
GQ Yao
GZ Hertz
H Li
H Lodish
J Zheng
JD Hughes
JL DeRisi
JQ Ling
K Frech
K Quandt
KD MacIsaac
L Amir-Zilberstein
L Elnitski
L Marino-Ramirez
L McCue
M Ashburner
M Kellis
M Milyavsky
MA Nobrega
Mark Koudritsky
MC Frith
ML Howard
ML Whitfield
N Rajewsky
Or Zuk
P Carninci
P Carninci
P Cliften
PM Haverty
PR Buckland
R Elkon
R Liu
R Sharan
Ran Brosh
S Aerts
S Rashi-Elkeles
S Tavazoie
SJ Cooper
SJ Ho Sui
Sui Huang
U Gerland
Varda Rotter
WW Wasserman
X Xie
Y Barash
Y Benjamini
Y Benjamini
Y Tabach
Yossi Buganim
Yuval Tabach
Z Wang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2007

We introduce a novel method to screen the promoters of a set of genes with shared biological function, against a precompiled library of motifs, and find those motifs which are statistically over-represented in the gene set. The gene sets were obtained from the functional Gene Ontology (GO) classification; for each set and motif we optimized the sequence similarity score threshold, independently for every location window (measured with respect to the TSS), taking into account the location dependent nucleotide heterogeneity along the promoters of the target genes. We performed a high throughput analysis, searching the promoters (from 200bp downstream to 1000bp upstream of the TSS), of more than 8000 human and 23,000 mouse genes, for 134 functional Gene Ontology classes and for 412 known DNA motifs. When combined with binding site and location conservation between human and mouse, the method identifies with high probability functional binding sites that regulate groups of biologically related genes. We found many location-sensitive functional binding events and showed that they clustered close to the TSS. Our method and findings were put to several experimental tests. By allowing a "flexible" threshold and combining our functional class and location specific search method with conservation between human and mouse, we are able to reliably identify functional TF binding sites. This is an essential step towards constructing regulatory networks and elucidating the design principles that govern transcriptional regulation of expression. The promoter region proximal to the TSS appears to be of central importance for regulation of transcription in human and mouse, just as it is in bacteria and yeast.

Comment: 31 pages, including Supplementary Information and figure

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The Alternative Choice of Constitutive Exons throughout Evolution

Author: Adi Doron-Faigenboim
Amir Goren
Eddo Kim
Galit Lev-Maor
Gil Ast
Hadas Keren
Lisa Stubbs
Noa Sela
Shelly Leibman-Barak
Tal Pupko
Publication venue
Publication date: 01/11/2007

Alternative cassette exons are known to originate from two processes: exonization of intronic sequences and exon shuffling. Herein, we suggest an additional mechanism by which constitutively spliced exons become alternative cassette exons during evolution. We compiled a dataset of orthologous exons from human and mouse that are constitutively spliced in one species but alternatively spliced in the other. Examination of these exons suggests that the common ancestors were constitutively spliced. We show that relaxation of the 5′ splice site during evolution is one of the molecular mechanisms by which exons shift from constitutive to alternative splicing. This shift is associated with the fixation of exonic splicing regulatory sequences (ESRs) that are essential for exon definition and control the inclusion level only after the transition to alternative splicing. The effect of each ESR on splicing and the combinatorial effects between two ESRs are conserved from fish to human. Our results uncover an evolutionary pathway that increases transcriptome diversity by shifting exons from constitutive to alternative splicing.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

A flexible integrative approach based on random forest improves prediction of transcription factor binding sites

Author: Abeel
Afflerbach
Angarica
Bailey
Bart Hooghe
Bauer
Benos
Breiman
Bulyk
Burden
Calladine
Camenisch
Chen
Cho
Cordell
Davis
Dickerson
Ehret
Ernst
Frans van Roy
Friedel
Fujii
Fulton
Gama-Castro
Gardiner
Gartenberg
Gershenzon
Goodsell
Gorin
Gowrisankar
Greenbaum
Gunewardena
Hall
Hendrickson
Hu
Juo
Kajimura
Kaplan
Karas
Kel
Kim
Lavery
Lewis
Liu
Liu
Liu
Long
Lu
Lu
Lu
Lunetta
Man
Marco
Marinescu
Martinez-Hackert
Matys
Medina-Rivera
Meysman
Michel
Mokry
Morozov
Narang
Naughton
O'Flanagan
Olson
Paillard
Pan
Parker
Parvin
Pieter De Bleser
Ponomarenko
Portales-Casamar
Powell
Pudimat
Ramsey
Rohs
Rohs
Rohs
Ruiz
Satchwell
Schneider
Shakked
Sharon
Shi
Spolar
Stefan Broos
Stormo
Svozil
Thayer
Tomovic
Toro-Roman
Travers
Tullius
Wunderlich
Zhang
Zhang
Zhu
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012

Transcription factor binding sites (TFBSs) are DNA sequences of 6-15 base pairs. Interaction of these TFBSs with transcription factors (TFs) is largely responsible for most spatiotemporal gene expression patterns. Here, we evaluate to what extent sequence-based prediction of TFBSs can be improved by taking into account the positional dependencies of nucleotides (NPDs) and the nucleotide sequence-dependent structure of DNA. We make use of the random forest algorithm to flexibly exploit both types of information. Results in this study show that both the structural method and the NPD method can be valuable for the prediction of TFBSs. Moreover, their predictive values seem to be complementary, even to the widely used position weight matrix (PWM) method. This led us to combine all three methods. Results obtained for five eukaryotic TFs with different DNA-binding domains show that our method improves classification accuracy for all five eukaryotic TFs compared with other approaches. Additionally, we contrast the results of seven smaller prokaryotic sets with high-quality data and show that with the use of high-quality data we can significantly improve prediction performance. Models developed in this study can be of great use for gaining insight into the mechanisms of TF binding
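
The abstract above treats the position weight matrix (PWM) as the widely used baseline that the other methods complement. For orientation, here is a minimal sketch of a log-odds PWM scan. It is not the paper's random forest model; the pseudocount, the uniform background of 0.25, and the function names `build_pwm` and `scan` are assumptions made for illustration.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length aligned sites."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        total = len(column) + 4 * pseudocount
        # One log2(frequency / background) score per base at this position.
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Return (offset, score) for every window of the sequence."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]
```

Because each position contributes its score independently, a PWM of this kind cannot express the nucleotide positional dependencies or DNA structural features that the abstract says were added on top of it.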

Crossref

Ghent University Academic Bibliography

PubMed Central

Sequence information gain based motif analysis

Author: Marco Santiago
Maynou Fernández Joan
Pairó Erola
Perera Lluna Alexandre
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Background: The detection of regulatory regions in candidate sequences is essential for the understanding of the regulation of a particular gene and the mechanisms involved. This paper proposes a novel methodology based on information theoretic metrics for finding regulatory sequences in promoter regions. Results: This methodology (SIGMA) has been tested on genomic sequence data for Homo sapiens and Mus musculus. SIGMA has been compared with different publicly available alternatives for motif detection, such as MEME/MAST, Biostrings (Bioconductor package), MotifRegressor, and previous work such as Qresiduals projections or information theoretic based detectors. Comparative results, in the form of Receiver Operating Characteristic curves, show how, in 70 % of the studied Transcription Factor Binding Sites, the SIGMA detector has a better performance and behaves more robustly than the methods compared, while having a similar computational time. The performance of SIGMA can be explained by its parametric simplicity in the modelling of the non-linear co-variability in the binding motif positions. Conclusions: Sequence Information Gain based Motif Analysis is a generalisation of a non-linear model of the cis-regulatory sequences detection based on Information Theory. This generalisation allows us to detect transcription factor binding sites with maximum performance disregarding the covariability observed in the positions of the training set of sequences. SIGMA is freely available to the public at http://b2slab.upc.edu.

Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Springer - Publisher Connector

PubMed Central

A ChIP-Seq Benchmark Shows That Sequence Conservation Mainly Improves Detection of Strong Transcription Factor Binding Sites

Author: A Moses
A Siepel
A Stark
BT Naughton
D Boffelli
D Karolchik
DT Odom
E Birney
Finn Drabløs
G Badis
G Sandve
J Bryne
J Ernst
J Hawkins
JA Hanley
K Klepper
L Elnitski
M Rye
M Tompa
Morten Beck Rye
P D'haeseleer
P Kheradpour
PJ Park
Pål Sætrom
R Jothi
Sridhar Hannenhalli
T Vavouri
Tony Håndstad
V Matys
WW Wasserman
X Xie
Y Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2011

Transcription factors are important controllers of gene expression and mapping transcription factor binding sites (TFBS) is key to inferring transcription factor regulatory networks. Several methods for predicting TFBS exist, but there are no standard genome-wide datasets on which to assess the performance of these prediction methods. Also, it is believed that information about sequence conservation across different genomes can generally improve accuracy of motif-based predictors, but it is not clear under what circumstances use of conservation is most beneficial. Here we use published ChIP-seq data and an improved peak detection method to create comprehensive benchmark datasets for prediction methods which use known descriptors or binding motifs to detect TFBS in genomic sequences. We use this benchmark to assess the performance of five different prediction methods and find that the methods that use information about sequence conservation generally perform better than simpler motif-scanning methods. The difference is greater on high-affinity peaks and when using short and information-poor motifs. However, if the motifs are specific and information-rich, we find that simple motif-scanning methods can perform better than conservation-based methods. Our benchmark provides a comprehensive test that can be used to rank the relative performance of transcription factor binding site prediction methods. Moreover, our results show that, contrary to previous reports, sequence conservation is better suited for predicting strong than weak transcription factor binding sites

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

NORA - Norwegian Open Research Archives

Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes

Author: et al
Hillier LaDeana W.
Spieth John
Wilson Richard K.
Publication venue: Digital Commons@Becker
Publication date: 01/01/2005

Digital Commons@Becker

Genome-wide functional analysis of human 5' untranslated region introns

Author: Berriz Gabriel F
Cenik Can
Derti Adnan
Mellor Joseph C
Roth Frederick P
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Genes with short 5'UTR introns have higher expression than genes with no or long 5'UTR introns. Complex evolutionary forces act on these introns

Crossref

Springer - Publisher Connector

PubMed Central


MAPPER: a search engine for the computational identification of putative transcription factor binding sites in multiple genomes

Author: Kohane Isaac S
Marinescu Voichita D
Riva Alberto
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Cis-regulatory modules are combinations of regulatory elements occurring in close proximity to each other that control the spatial and temporal expression of genes. The ability to identify them in a genome-wide manner depends on the availability of accurate models and of search methods able to detect putative regulatory elements with enhanced sensitivity and specificity. RESULTS: We describe the implementation of a search method for putative transcription factor binding sites (TFBSs) based on hidden Markov models built from alignments of known sites. We built 1,079 models of TFBSs using experimentally determined sequence alignments of sites provided by the TRANSFAC and JASPAR databases and used them to scan sequences of the human, mouse, fly, worm and yeast genomes. In several of the cases tested, the method correctly identified experimentally characterized sites, with better specificity and sensitivity than other similar computational methods. Moreover, a large-scale comparison using synthetic data showed that in the majority of cases our method performed significantly better than a nucleotide weight matrix-based method. CONCLUSION: The search engine, available at , allows the identification, visualization and selection of putative TFBSs occurring in the promoter or other regions of a gene from the human, mouse, fly, worm and yeast genomes. In addition, it allows the user to upload a sequence to query and to build a model by supplying a multiple sequence alignment of binding sites for a transcription factor of interest. Due to its extensive database of models, powerful search engine and flexible interface, MAPPER represents an effective resource for the large-scale computational analysis of transcriptional regulation

Harvard University - DASH

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Network Inference Algorithms Elucidate Nrf2 Regulation of Mouse Lung Oxidative Stress

Author: A Jacquier
A Otomo
AA Margolin
AA Margolin
AK Jaiswal
AK Jaiswal
B Biteau
C-C Chang
CJ Reed
CM Clements
CO Daub
D Giustarini
Deepti Malhotra
DJ Moore
EY Park
George Acquaah-Mensah
GK Acquaah-Mensah
H Cai
H Ohkawa
I Nagano
I Priness
I Rahman
IH Witten
J Choi
JJ Faith
K Basso
K Itoh
K Iwasaki
M Kanehisa
M Matsuoka
M Singhal
MM Gallogly
Mudita Singhal
N Christianni
N Slonim
N Watanabe
P Shannon
PL Whitney
R Venugopal
R Venugopal
RA Irizarry
RC Taylor
RC Taylor
RG Will
RK Thimmulappa
Ronald C. Taylor
Ruth Nussinov
S Hadano
S Mead
SE Keene
Shyam Biswal
T Rangasamy
TM Cover
U Alon
U Alon
V Bonifati
VJ Findlay
W Droge
W Zhou
WW Wasserman
XL Chen
Y El-Manzalawy
Y Katoh
Y Li
Publication venue: Public Library of Science
Publication date: 01/08/2008

A variety of cardiovascular, neurological, and neoplastic conditions have been associated with oxidative stress, i.e., conditions under which levels of reactive oxygen species (ROS) are elevated over significant periods. Nuclear factor erythroid 2-related factor (Nrf2) regulates the transcription of several gene products involved in the protective response to oxidative stress. The transcriptional regulatory and signaling relationships linking gene products involved in the response to oxidative stress are, currently, only partially resolved. Microarray data constitute RNA abundance measures representing gene expression patterns. In some cases, these patterns can identify the molecular interactions of gene products. They can be, in effect, proxies for protein–protein and protein–DNA interactions. Traditional techniques used for clustering coregulated genes on high-throughput gene arrays are rarely capable of distinguishing between direct transcriptional regulatory interactions and indirect ones. In this study, newly developed information-theoretic algorithms that employ the concept of mutual information were used: the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE), and Context Likelihood of Relatedness (CLR). These algorithms captured dependencies in the gene expression profiles of the mouse lung, allowing the regulatory effect of Nrf2 in response to oxidative stress to be determined more precisely. In addition, a characterization of promoter sequences of Nrf2 regulatory targets was conducted using a Support Vector Machine classification algorithm to corroborate ARACNE and CLR predictions. Inferred networks were analyzed, compared, and integrated using the Collective Analysis of Biological Interaction Networks (CABIN) plug-in of Cytoscape. Using the two network inference algorithms and one machine learning algorithm, a number of both previously known and novel targets of Nrf2 transcriptional activation were identified. 
Genes predicted as novel Nrf2 targets include Atf1, Srxn1, Prnp, Sod2, Als2, Nfkbib, and Ppp1r15b. Furthermore, microarray and quantitative RT-PCR experiments following cigarette-smoke-induced oxidative stress in Nrf2+/+ and Nrf2−/− mouse lung affirmed many of the predictions made. Several new potential feed-forward regulatory loops involving Nrf2, Nqo1, Srxn1, Prdx1, Als2, Atf1, Sod1, and Park7 were predicted. This work shows the promise of network inference algorithms operating on high-throughput gene expression data in identifying transcriptional regulatory and other signaling relationships implicated in mammalian disease
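
Both inference algorithms named in the abstract above rest on mutual information between gene expression profiles. The sketch below computes only that pairwise quantity for already-discretized profiles. It is not an implementation of ARACNE or CLR, each of which adds its own estimation and filtering steps. Discretizing expression values into labels and the function name `mutual_information` are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (xi, yi), c in count_xy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_x[xi] * count_y[yi]))
    return mi
```

For example, two profiles that are always in the same state share one full bit when the states are balanced, while two profiles whose states are independent share none. A high value alone cannot separate direct from indirect regulation, which is the gap the abstract says the network inference algorithms were used to address.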

Crossref

Directory of Open Access Journals

PubMed Central