Fine-Tuning Enhancer Models to Predict Transcriptional Targets across Multiple Genomes

Author: A Ochoa-Espinosa, A Siepel, A Stark, AA Philippakis, AM Moses, AP Lifanov, B Adryan, BA Hassan, Bassem A. Hassan, BY Chan, CM Frith, D Karolchik, DC King, DM Schroeder, E Emberly, E Segal, EH Davidson, G Thijs, Guillaume Bourque, GZ Hertz, IE Boyle, J van Helden, Jacques van Helden, JE Ostrin, JM Stuart, LW Chang, M Blanchette, M Brudno, M Markstein, M Pritsker, M Rebeiz, M Tompa, MC Bergman, MS Halfon, N Rajewsky, NV Taverner, O Johansson, Olivier Sand, PB Berman, PI zur Lage, R Siddharthan, S Aerts, S Kurtz, S Sinha, SB Montgomery, SM Gallo, SR Eddy, Stein Aerts, T Zhang, TL Bailey, WJ Kent, WW Wasserman, Y Sun
Publication venue: Public Library of Science
Publication date: 01/01/2007

Networks of regulatory relations between transcription factors (TFs) and their target genes (TGs), implemented through TF binding sites (TFBSs), are key features of biology. An idealized approach to solving such networks consists of starting from a consensus TFBS or a position weight matrix (PWM) to generate a high-accuracy list of candidate TGs for biological validation. Developing and evaluating such approaches remains a formidable challenge in regulatory bioinformatics. We perform a benchmark study on 34 Drosophila TFs to assess existing TFBS and cis-regulatory module (CRM) detection methods, with a strong focus on the use of multiple genomes. In particular, for CRM modelling, we investigate the addition of orthologous sites to a known PWM to construct phyloPWMs, and we assess the added value of phylogenetic footprinting for predicting contextual motifs around known TFBSs. For CRM prediction, we compare motif conservation with network-level conservation approaches across multiple genomes. Choosing the optimal training and scoring strategies strongly enhances the performance of TG prediction for more than half of the tested TFs. Finally, we analyse a 35th TF, namely Eyeless, and find a significant overlap between predicted TGs and candidate TGs identified by microarray expression studies. In summary, we identify several ways to optimize TF-specific TG predictions, some of which can be applied to all TFs, and others that can be applied only to particular TFs. The ability to model known TF-TG relations, together with the use of multiple genomes, results in a significant step forward in solving the architecture of gene regulatory networks.
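As a rough illustration of the starting point the abstract mentions (scanning sequence with a PWM to find candidate binding sites), the sketch below builds a log-odds matrix from a count matrix and slides it along a sequence. The count matrix, the uniform background, the pseudocount, the threshold, and the function names `build_pwm` and `scan` are all assumptions made for this example; none of them come from the paper, and the benchmarked tools use more elaborate scoring.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
# Assumed uniform background; real genomes are usually not uniform.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def build_pwm(counts, background, pseudocount=1.0):
    """Convert base counts into log2-odds scores against the background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


PWM = build_pwm(COUNTS, BACKGROUND)
HITS = scan("TTACGTGGACGTAA", PWM, threshold=4.0)
for start, window, score in HITS:
    print(start, window, round(score, 2))
```

The sketch scans one strand only and assumes the sequence contains only A, C, G and T; a practical scanner would also score the reverse complement and handle ambiguous bases.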

Lirias

HAL AMU

HAL: Hyper Article en Ligne

DI-fusion

The Francis Crick Institute

The Drosophila Gap Gene Network Is Composed of Two Parallel Toggle Switches

Author: A Pisarev, AB Owen, AD Lander, AM Berezhkovskii, AP Lifanov, C Sample, C Schulz, D Lebrecht, D Papatsenko, D Stanojevic, D Wilson, DE Clyde, DM Holloway, Dmitri Papatsenko, DS Burz, E Mjolsness, E Poustelnikova, EH Davidson, FJ Lopes, G Struhl, GK Ackers, H Bolouri, H Janssens, J Gertz, J Jaeger, JA Langeland, JS Margolis, K Fujimoto, L Bintu, M Hulskamp, M Kazemian, M Ptashne, Manu, MD Librizzi, Michael Levine, P Zuo, R Dilao, R Kraut, R Milo, R Rivera-Pomar, RP Zinzen, S Ishihara, S Ladame, T Gregor, TJ Perkins, Vladimir N. Uversky, W Driever, WK Hastings
Publication venue: Public Library of Science
Publication date: 01/01/2011

Drosophila “gap” genes provide the first response to maternal gradients in the early fly embryo. Gap genes are expressed in a series of broad bands across the embryo during the first hours of development. The gene network controlling the gap gene expression patterns includes inputs from maternal gradients and mutual repression between the gap genes themselves. In this study we propose a modular design for the gap gene network, involving two relatively independent network domains. The core of each network domain includes a toggle switch corresponding to a pair of mutually repressive gap genes, operated in space by maternal inputs. The toggle switches present in the gap network are evocative of the phage lambda switch, but they are operated positionally (in space) by the maternal gradients, so the synthesis rates for the competing components change along the embryo's anterior-posterior axis. A dynamic model constructed on the proposed principle, with elements of fractional site occupancy, required 5–7 parameters to fit quantitative spatial expression data for gap gradients. The identified model solutions (parameter combinations) reproduced major dynamic features of the gap gradient system and explained gap expression in a variety of segmentation mutants.
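The idea of a toggle switch operated in space can be sketched with a generic pair of mutually repressing genes whose synthesis rates vary with position. The code below is only a toy: it uses Hill-type repression and linear synthesis gradients, not the fractional-site-occupancy model or the fitted parameters of the paper, and every name and value in it (`simulate_toggle`, `profile`, `max_rate`, the Hill coefficient) is an assumption chosen for illustration.

```python
def simulate_toggle(s_a, s_b, k=1.0, n=2, dt=0.01, steps=5000):
    """Euler-integrate two mutually repressing genes to an approximate steady state.

    da/dt = s_a / (1 + (b/k)^n) - a
    db/dt = s_b / (1 + (a/k)^n) - b
    """
    a = b = 0.1  # arbitrary small initial levels
    for _ in range(steps):
        da = s_a / (1.0 + (b / k) ** n) - a
        db = s_b / (1.0 + (a / k) ** n) - b
        a += dt * da
        b += dt * db
    return a, b


def profile(num_positions=11, max_rate=4.0):
    """Run the toggle at evenly spaced positions x in [0, 1] along one axis.

    Assumed inputs: synthesis of gene A falls linearly with x and synthesis
    of gene B rises linearly with x, standing in for maternal gradients.
    """
    results = []
    for i in range(num_positions):
        x = i / (num_positions - 1)
        s_a = max_rate * (1.0 - x)
        s_b = max_rate * x
        results.append((x, *simulate_toggle(s_a, s_b)))
    return results


PROFILE = profile()
for x, a, b in PROFILE:
    print(f"x={x:.1f}  A={a:.3f}  B={b:.3f}")
```

In this toy, gene A dominates at one end of the axis and gene B at the other, which is the qualitative behaviour the abstract attributes to a positionally operated switch.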

CiteSeerX

Public Library of Science (PLOS)

Statistical significance of cis-regulatory modules

Author: A Kel, A Klingenhoff, A Sandelin, A Sosinsky, A Wagner, A Webber, AA Philippakis, Andrew D Smith, AP Lifanov, BP Berman, D GuhaThakurta, DS Johnson, Dustin E Schones, E Eskin, EM McCreight, F Tronche, G Hertz, GD Stormo, J van Helden, JM Claverie, JS Liu, K Struhl, M Beckstette, M Blanchette, M Gupta, MA Beer, MC Frith, Michael Q Zhang, N Munshi, N Nagarajan, N Rajewsky, O Johansson, P Leighton, Q Zhou, R Hoberman, R Staden, RR Sokal, S Aerts, S Rahmann, S Sinha, TD Schneider, TL Bailey, TL Baily, V Matys, W Kent, W Thompson, WB Alkema, WW Wasserman, YH Grad, Z Xuan
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: It is becoming increasingly important for researchers to be able to scan through large genomic regions for transcription factor binding sites or clusters of binding sites forming cis-regulatory modules. Correspondingly, there has been a push to develop algorithms for the rapid detection and assessment of cis-regulatory modules. While various algorithms for this purpose have been introduced, most are not well suited for rapid, genome-scale scanning. RESULTS: We introduce methods designed for the detection and statistical evaluation of cis-regulatory modules, modeled as either clusters of individual binding sites or as combinations of sites with constrained organization. In order to determine the statistical significance of module sites, we first need a method to determine the statistical significance of single transcription factor binding site matches. We introduce a straightforward method of estimating the statistical significance of single site matches, using a database of known promoters to produce data structures from which p-values for binding site matches can be estimated. We next introduce a technique to calculate the statistical significance of the arrangement of binding sites within a module using a max-gap model. If the module scanned for has defined organizational parameters, the probability of the module is corrected to account for organizational constraints. The statistical significance of single site matches and the architecture of sites within the module can be combined to provide an overall estimation of statistical significance of cis-regulatory module sites. CONCLUSION: The methods introduced in this paper allow for the detection and statistical evaluation of single transcription factor binding sites and cis-regulatory modules. The features described are implemented in the Search Tool for Occurrences of Regulatory Motifs (STORM) and MODSTORM software.
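Two ingredients named in the abstract can be sketched in a few lines: an empirical p-value for a single site score, read off a sorted list of background scores, and a max-gap rule that groups site positions into candidate modules. This is a simplified stand-in, not the STORM or MODSTORM implementation; the background scores, positions, add-one smoothing, and the function names `empirical_p_value` and `max_gap_clusters` are assumptions for illustration, and the paper's significance calculation for the arrangement of sites is not reproduced here.

```python
from bisect import bisect_left


def empirical_p_value(score, sorted_background_scores):
    """Fraction of background scores >= score, with add-one smoothing."""
    n = len(sorted_background_scores)
    at_least = n - bisect_left(sorted_background_scores, score)
    return (at_least + 1) / (n + 1)


def max_gap_clusters(positions, max_gap, min_sites=2):
    """Group positions so that neighbours within a cluster are <= max_gap apart."""
    clusters, current = [], []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current cluster.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_sites:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_sites:
        clusters.append(current)
    return clusters


# Hypothetical background scores, e.g. from scanning a promoter set.
BACKGROUND = sorted([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5])
P_VALUE = empirical_p_value(4.0, BACKGROUND)

# Hypothetical site start positions on one sequence.
CLUSTERS = max_gap_clusters([10, 40, 55, 300, 320, 900], max_gap=50)
print(P_VALUE, CLUSTERS)
```

Keeping the background scores sorted lets each p-value lookup run as a binary search, which matters when many matches are evaluated during a genome-scale scan.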

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

Public Library of Science (PLOS)

Thermodynamics-Based Models of Transcriptional Regulation by Enhancers: The Roles of Synergistic Activation, Cooperative Binding and Short-Range Repression

Author: A Krumm, A La Rosee-Borggreve, A Tanay, AM Moses, AP Lifanov, AR Borneman, AV Morozov, CC Fowlkes, Charles Blatti, CM Bergman, D Lebrecht, D Zenklusen, DC Bauer, DN Arnosti, DS Burz, DS Homsi, E Birney, E Segal, EH Davidson, ET Dermitzakis, F Gao, F Sauer, G Jimenez, GD Stormo, H Janssens, H Nakanishi, H Zhu, J Gertz, J Reinitz, JK Joung, K Struhl, LP Andrioli, M Carey, M Hoch, M Ptashne, MA Beer, MA Shea, MB Noyes, Md. Abul Hassan Samee, MM Kulkarni, MR Green, MZ Ludwig, NE Buchler, OG Berg, P Ray, PV Benos, R Hermsen, R Yan, RA Veitia, RP Zinzen, RW Lusk, S Gray, S Small, SA Keller, Saurabh Sinha, SJ Maerkl, T Chi, T Wasson, Uwe Ohler, VB Teif, VJ Makeev, WD Fakhouri, X Ma, Xin He, Y Nibu, Z Hu
Publication venue: Public Library of Science
Publication date: 01/09/2010

Quantitative models of cis-regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled, or on heuristic approximations of the underlying regulatory mechanisms. We have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence, as a function of transcription factor concentrations and their DNA-binding specificities. It uses statistical thermodynamics theory to model not only protein-DNA interaction, but also the effect of DNA-bound activators and repressors on gene expression. In addition, the model incorporates mechanistic features such as the synergistic effect of multiple activators, short-range repression, and cooperativity in transcription factor-DNA binding, allowing us to systematically evaluate the significance of these features in the context of available expression data. Using this model on segmentation-related enhancers in Drosophila, we find that transcriptional synergy due to the simultaneous action of multiple activators helps explain the data beyond what can be explained by cooperative DNA binding alone. We find clear support for the phenomenon of short-range repression, where repressors do not directly interact with the basal transcriptional machinery. We also find that the binding sites contributing to an enhancer's function may not be conserved during evolution, and a noticeable fraction of these undergo lineage-specific changes. Our implementation of the model, called GEMSTAT, is the first publicly available program for simultaneously modeling the regulatory activities of a given set of sequences.
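The statistical-thermodynamics view of binding that the abstract refers to can be illustrated with the textbook two-site case: each bound configuration gets a statistical weight, and probabilities are weights divided by the partition function. The sketch below is not GEMSTAT and does not model activation or repression of transcription; the function names `occupancy` and `both_bound_probability`, the cooperativity factor `omega`, and all numeric values are assumptions for illustration.

```python
def occupancy(concentration, affinity):
    """Fractional occupancy of a single site: q / (1 + q), with q = affinity * concentration."""
    q = concentration * affinity
    return q / (1.0 + q)


def both_bound_probability(q1, q2, omega=1.0):
    """Probability that two sites are bound simultaneously.

    Configurations and weights: empty (1), site 1 only (q1), site 2 only (q2),
    both (omega * q1 * q2). omega > 1 models cooperative binding; omega = 1
    means the sites bind independently.
    """
    z = 1.0 + q1 + q2 + omega * q1 * q2  # partition function
    return omega * q1 * q2 / z


SINGLE = occupancy(concentration=1.0, affinity=1.0)
INDEPENDENT = both_bound_probability(1.0, 1.0, omega=1.0)
COOPERATIVE = both_bound_probability(1.0, 1.0, omega=10.0)
print(SINGLE, INDEPENDENT, COOPERATIVE)
```

With these toy weights, raising `omega` from 1 to 10 increases the probability of joint occupancy from 1/4 to 10/13, which shows how cooperativity alone can sharpen a response to transcription factor concentration.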