TRANSFAC(®) and its module TRANSCompel(®): transcriptional gene regulation in eukaryotes

Author: Barre-Dirrie A.
Chekmenev D.
Fricke E.
Hornischer K.
Kel A. E.
Kel-Margoulis O. V.
Krull M.
Land S.
Lewicki-Potapov B.
Liebich I.
Matys V.
Reuter I.
Saxel H.
Stegmaier P.
Voss N.
Wingender E.
Publication venue: Oxford University Press
Publication date: 28/12/2005

The TRANSFAC(®) database on transcription factors, their binding sites, nucleotide distribution matrices and regulated genes, as well as the complementing database TRANSCompel(®) on composite elements, have been further enhanced on various levels. A new web interface with different search options and integrated versions of Match™ and Patch™ provides increased functionality for TRANSFAC(®). The list of databases linked to the common GENE table of TRANSFAC(®) and TRANSCompel(®) has been extended by Ensembl, UniGene, EntrezGene, HumanPSD™ and TRANSPRO™. Standard gene names from HGNC, MGI and RGD are included for human, mouse and rat genes, respectively. With the help of InterProScan, Pfam, SMART and PROSITE domains are assigned automatically to the protein sequences of the transcription factors. In addition to the COMPEL table, TRANSCompel(®) now contains a separate table for detailed information on the experimental EVIDENCE on which the composite elements are based. Finally, regarding data growth in TRANSFAC(®), two gains stand out: Drosophila transcription factor binding sites (by courtesy of the Drosophila DNase I footprint database) and Arabidopsis factors (by courtesy of DATF, the Database of Arabidopsis Transcription Factors). The public releases described here, TRANSFAC(®) 7.0 and TRANSCompel(®) 7.0, are accessible under

CiteSeerX

Advanced Computational Biology Methods Identify Molecular Switches for Malignancy in an EGF Mouse Model of Liver Cancer

Author: A Hosui
A Kel
A Kel
A Sala
A Seth
AE Kel
AE Kel
Alexander Kel
AM Waterhouse
AP Feinberg
C Desbois-Mouthon
C Yang
CD Schmid
CM Kendziorski
CM Shea
DL Galson
E Wingender
EC Lopes
Edgar Wingender
FM van Roy
G Otaegi
H Michael
HE Jones
J Borlak
J Jiang
J Riedemann
JA Figueroa
JJ Shah
Juergen Borlak
K Hayashida
K Katoh
KH Ventii
LC Yeh
M Ashburner
M Krull
M Mietus-Snyder
MH Tai
Michael Polymenis
Nico Voss
P Carninci
P Nioi
Philip Stegmaier
R Yamashita
RC Gentleman
S Morin
S Rahmann
S Zhang
T Pham-Gia
Tatiana Meier
TJP Hubbard
TM DeChiara
V Matys
VX Fu
Y Babaie
Y Fu
Y Guo
Publication venue: Public Library of Science
Publication date: 01/01/2011

The molecular causes by which the epidermal growth factor receptor tyrosine kinase induces malignant transformation are largely unknown. To better understand EGF's transforming capacity, whole genome scans were applied to a transgenic mouse model of liver cancer and subjected to advanced methods of computational analysis to construct de novo gene regulatory networks based on a combination of sequence analysis and entrained graph-topological algorithms. Here we identified transcription factors, processes, key nodes and molecules to connect as yet unknown interacting partners at the level of protein-DNA interaction. Many of those could be confirmed by electromobility band shift assay at recognition sites of gene specific promoters and by western blotting of nuclear proteins. A novel cellular regulatory circuitry could therefore be proposed that connects cell cycle regulated genes with components of the EGF signaling pathway. Promoter analysis of differentially expressed genes suggested that the majority of regulated transcription factors display specificity to either the pre-tumor or the tumor state. Subsequent search for signal transduction key nodes upstream of the identified transcription factors and their targets suggested that the insulin-like growth factor pathway renders the tumor cells independent of EGF receptor activity. Notably, expression of IGF2, in addition to many components of this pathway, was highly upregulated in tumors. Together, we propose a switch in autocrine signaling to foster tumor growth that was initially triggered by EGF and demonstrate the knowledge gain from promoter analysis combined with upstream key node identification

Fraunhofer-ePrints

Leading-effect vs. Risk-taking in Dynamic Tournaments: Evidence from a Real-life Randomized Experiment

Author: B A Taylor
C C Moul
C Ferrall
C Genakos
C Grund
C R Knoeber
D A Malueg
D G Pope
E P Lazear
E P Lazear
F Carmichael
Frank Mueller-Langer
G P Baker
H K Hvide
H K Hvide
J Apesteguia
J Chevalier
J G Lynch
J González-Díaz
J Greenhough
J Taylor
K A Konrad
K A Konrad
L M B Cabral
L Page
M Kocher
M Kräkel
N Neave
P Nieken
P Oyer
P.-A Chiappori
Patrick Andreoli Versbach
R Pollard
S R Clarke
S Szymanski
T J Dohmen
T Klumpp
Publication date: 01/01/2013

Two 'order effects' may emerge in dynamic tournaments with information feedback. First, participants adjust effort across stages, which could advantage the leading participant who faces a larger 'effective prize' after an initial victory (leading-effect). Second, participants lagging behind may increase risk at the final stage as they have 'nothing to lose' (risk-taking). We use a randomized natural experiment in professional two-game soccer tournaments where the treatment (order of a stage-specific advantage) and team characteristics, e.g. ability, are independent. We develop an identification strategy to test for leading-effects controlling for risk-taking. We find no evidence of leading-effects and negligible risk-taking effects

Open Access LMU

MPG.PuRe

Musculoskeletal pain is associated with a long-term increased risk of cancer and cardiovascular-related mortality

Author: A. J. Silman
Al-Allaf
D. P. Symmons
Dreyer
Frank
G. J. Macfarlane
Gursoy
Gürsoy
J. McBeth
Kennedy
MacFarlane
Macfarlane
McBeth
McBeth
Mäkelä
Penrod
R. Webb
Smith
T. Allison
T. Brammah
Urwin
Wolfe
Publication venue: Oxford University Press
Publication date: 01/01/2009

Objectives. To test the hypothesis that individuals with regional and widespread pain disorders have an increased risk of mortality

The University of Manchester - Institutional Repository

Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site

Author: A Ambesi-Impiombato
A Blais
A Eto
A Subramanian
AE Kel
AG Clark
AL Lam
AM McGuire
Anat Reiner
Assif Yitzhaky
B Ren
C Kimura-Yoshida
C Plessy
C Yang
CT Harbison
D Pfeifer
D Wang
DB Allison
E Emberly
E Segal
Eytan Domany
FP Roth
GC Pipes
GC Yuan
GQ Yao
GZ Hertz
H Li
H Lodish
J Zheng
JD Hughes
JL DeRisi
JQ Ling
K Frech
K Quandt
KD MacIsaac
L Amir-Zilberstein
L Elnitski
L Marino-Ramirez
L McCue
M Ashburner
M Kellis
M Milyavsky
MA Nobrega
Mark Koudritsky
MC Frith
ML Howard
ML Whitfield
N Rajewsky
Or Zuk
P Carninci
P Carninci
P Cliften
PM Haverty
PR Buckland
R Elkon
R Liu
R Sharan
Ran Brosh
S Aerts
S Rashi-Elkeles
S Tavazoie
SJ Cooper
SJ Ho Sui
Sui Huang
U Gerland
Varda Rotter
WW Wasserman
X Xie
Y Barash
Y Benjamini
Y Benjamini
Y Tabach
Yossi Buganim
Yuval Tabach
Z Wang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2007

We introduce a novel method to screen the promoters of a set of genes with shared biological function against a precompiled library of motifs, and to find those motifs which are statistically over-represented in the gene set. The gene sets were obtained from the functional Gene Ontology (GO) classification; for each set and motif we optimized the sequence similarity score threshold, independently for every location window (measured with respect to the TSS), taking into account the location-dependent nucleotide heterogeneity along the promoters of the target genes. We performed a high-throughput analysis, searching the promoters (from 200bp downstream to 1000bp upstream of the TSS) of more than 8000 human and 23,000 mouse genes, for 134 functional Gene Ontology classes and for 412 known DNA motifs. When combined with binding site and location conservation between human and mouse, the method identifies with high probability functional binding sites that regulate groups of biologically related genes. We found many location-sensitive functional binding events and showed that they clustered close to the TSS. Our method and findings were put to several experimental tests. By allowing a "flexible" threshold and combining our functional class and location specific search method with conservation between human and mouse, we are able to reliably identify functional TF binding sites. This is an essential step towards constructing regulatory networks and elucidating the design principles that govern transcriptional regulation of expression. The promoter region proximal to the TSS appears to be of central importance for regulation of transcription in human and mouse, just as it is in bacteria and yeast. Comment: 31 pages, including Supplementary Information and figure
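The abstract above does not say which statistic is used to call a motif over-represented. As a minimal sketch of one common choice, assuming an upper-tail hypergeometric test (an assumption, not the authors' stated method), the function below asks how surprising it is to see a given number of motif-containing promoters in a gene set. The function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(hits_in_set, set_size, hits_in_background, background_size):
    """Upper-tail hypergeometric probability P(X >= hits_in_set).

    Assumption for illustration: the gene set is treated as a random draw
    of `set_size` promoters from `background_size` promoters, of which
    `hits_in_background` contain the motif.
    """
    total = comb(background_size, set_size)
    upper = min(set_size, hits_in_background)
    favourable = sum(
        comb(hits_in_background, i)
        * comb(background_size - hits_in_background, set_size - i)
        for i in range(hits_in_set, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 1000 promoters overall, 100 contain the motif,
# and 12 of the 40 promoters in the gene set contain it.
p = enrichment_p_value(12, 40, 100, 1000)
print(f"p = {p:.3g}")
```

In a screen over many gene sets and motifs, such p-values would still need a multiple-testing correction before any motif is reported.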

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Fast index based algorithms and software for matching position specific scoring matrices

Author: A Kel
A Sandelin
B Dorohonceanu
D Weeks
G Castillo
H Gonnet
J Henikoff
J Henikoff
J Kärkkäinen
K Quandt
L Goldstein
LR Murphy
M Abouelhoda
M Beckstette
M Beckstette
M Gribskov
Michael Beckstette
N de Bruijn
N Hulo
P Embrechts
P Haverty
P Scordis
R Giegerich
R Staden
R Tatusov
Robert Giegerich
Robert Homann
S Kurtz
S Kurtz
S Rahmann
S Rajasekaran
Stefan Kurtz
T Kasai
T Li
T Wu
T Wu
TK Attwood
V Freschi
V Matys
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: In biological sequence analysis, position specific scoring matrices (PSSMs) are widely used to represent sequence motifs in nucleotide as well as amino acid sequences. Searching with PSSMs in complete genomes or large sequence databases is a common, but computationally expensive task. RESULTS: We present a new non-heuristic algorithm, called ESAsearch, to efficiently find matches of PSSMs in large databases. Our approach preprocesses the search space, e.g., a complete genome or a set of protein sequences, and builds an enhanced suffix array that is stored on file. This allows the searching of a database with a PSSM in sublinear expected time. Since ESAsearch benefits from small alphabets, we present a variant operating on sequences recoded according to a reduced alphabet. We also address the problem of non-comparable PSSM-scores by developing a method which allows the efficient computation of a matrix similarity threshold for a PSSM, given an E-value or a p-value. Our method is based on dynamic programming and, in contrast to other methods, it employs lazy evaluation of the dynamic programming matrix. We evaluated algorithm ESAsearch with nucleotide PSSMs and with amino acid PSSMs. Compared to the best previous methods, ESAsearch shows speedups of a factor between 17 and 275 for nucleotide PSSMs, and speedups up to factor 1.8 for amino acid PSSMs. Comparisons with the most widely used programs even show speedups by a factor of at least 3.8. Alphabet reduction yields an additional speedup factor of 2 on amino acid sequences compared to results achieved with the 20 symbol standard alphabet. The lazy evaluation method is also much faster than previous methods, with speedups of a factor between 3 and 330. 
CONCLUSION: Our analysis of ESAsearch reveals sublinear runtime in the expected case, and linear runtime in the worst case for sequences not shorter than $|\mathcal{A}|^m + m - 1$, where $m$ is the length of the PSSM and $\mathcal{A}$ a finite alphabet. In practice, ESAsearch shows superior performance over the most widely used programs, especially for DNA sequences. The new algorithm for accurate on-the-fly calculations of thresholds has the potential to replace formerly used approximation approaches. Beyond the algorithmic contributions, we provide a robust, well documented, and easy to use software package, implementing the ideas and algorithms presented in this manuscript
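To make the two problems in this abstract concrete, here is a minimal sketch of the baseline they improve on: a plain sliding-window PSSM scan, and a dynamic-programming computation of the exact score distribution used to turn a p-value into a score threshold. It is not ESAsearch (no enhanced suffix array) and it does not use the paper's lazy evaluation. The function names, the integer-valued scores and the i.i.d. background model are assumptions made for illustration.

```python
from collections import defaultdict

ALPHABET = "ACGT"


def scan(pssm, seq, threshold):
    """Naive O(n*m) scan: return (position, score) for windows scoring >= threshold."""
    m = len(pssm)
    hits = []
    for i in range(len(seq) - m + 1):
        score = sum(pssm[j][seq[i + j]] for j in range(m))
        if score >= threshold:
            hits.append((i, score))
    return hits


def score_distribution(pssm, background):
    """Exact probability of each total score under an i.i.d. background.

    Processes the matrix column by column, keeping a map from partial
    score to probability (assumes integer scores so the map stays small).
    """
    dist = {0: 1.0}
    for column in pssm:
        nxt = defaultdict(float)
        for partial, prob in dist.items():
            for symbol in ALPHABET:
                nxt[partial + column[symbol]] += prob * background[symbol]
        dist = dict(nxt)
    return dist


def threshold_for_pvalue(pssm, background, p):
    """Smallest achievable score t with P(score >= t) <= p."""
    dist = score_distribution(pssm, background)
    tail = 0.0
    best = max(dist) + 1  # unreachable score, tail probability 0
    for s in sorted(dist, reverse=True):
        tail += dist[s]
        if tail > p:
            break
        best = s
    return best


# Hypothetical two-column matrix that rewards the dinucleotide "AC".
pssm = [
    {"A": 2, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 2, "G": 0, "T": 0},
]
uniform = {s: 0.25 for s in ALPHABET}
t = threshold_for_pvalue(pssm, uniform, 0.1)
print(t, scan(pssm, "GGACTT", t))
```

The full distribution is computed eagerly here, which keeps the sketch short but is the cost the abstract's lazy evaluation is meant to reduce.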

Springer - Publisher Connector

Publications at Bielefeld University

Analysis of Gene Regulatory Networks in the Mammalian Circadian Rhythm

Author: AE Kel
AI Su
B Kornmann
BH Miller
BR Zeeberg
Chunxuan Shao
FO James
G Thijs
GZ Hertz
H Wakaguri
Haifang Wang
HR Ueda
HR Ueda
Jeffrey M. Gimble
Jun Yan
K Bozek
KD Pruitt
L Yin
M Rakhshandehroo
P Carninci
S Aerts
S Panda
S Rahmann
SM Reppert
V Porterfield
Y Suzuki
Yuting Liu
Publication venue: Public Library of Science
Publication date: 01/10/2008

Circadian rhythm is fundamental in regulating a wide range of cellular, metabolic, physiological, and behavioral activities in mammals. Although a small number of key circadian genes have been identified through extensive molecular and genetic studies in the past, the existence of other key circadian genes and how they drive the genome-wide circadian oscillation of gene expression in different tissues remain unknown. Here we try to address these questions by integrating all available circadian microarray data in mammals. We identified 41 common circadian genes that showed circadian oscillation in a wide range of mouse tissues with a remarkable consistency of circadian phases across tissues. Comparisons across mouse, rat, rhesus macaque, and human showed that the circadian phases of known key circadian genes were delayed by 4–5 hours in rat compared to mouse and by 8–12 hours in macaque and human compared to mouse. A systematic gene regulatory network for the mouse circadian rhythm was constructed after incorporating promoter analysis and transcription factor knockout or mutant microarray data. We observed a significant association of the cis-regulatory elements EBOX, DBOX, RRE, and HSE with the different phases of circadian oscillating genes. The analysis of the network structure revealed the paths through which light, food, and heat can entrain the circadian clock and identified NR3C1 and FKBP/HSP90 complexes as central to the control of circadian genes through diverse environmental signals. Our study improves our understanding of the structure, design principle, and evolution of gene regulatory networks involved in the mammalian circadian rhythm

Public Library of Science (PLOS)

Effective transcription factor binding site prediction using a combination of optimization, a genetic algorithm and discriminant analysis to capture distant interactions

Author: A Hoglund
AE Kel
AE Kel
AE Vinogradov
B Efron
B Jaruga
BJ Deroo
C Burge
CD Schmid
CR Calladine
D Cai
D GuhaThakurta
DM Graunke
E Fayard
Elena A Ananko
Elena V Ignatieva
FA Wright
GD Stormo
HP Ko
I Abnizova
I Ben-Gal
IA Udalova
Igor I Turnaev
J Duarte
J Hu
JV Ponomarenko
K Ellrott
K Morohashi
K Quandt
KJ Campbell
L Quintana-Murci
LC Platanias
LG Cowell
M Beato
M Blanchette
M Costantini
M Ganapathi
M Lohoff
M Stepanova
M-LT Lee
ML Bulyk
MP Ponomarenko
MQ Zhang
MQ Zhang
NA Kolchanov
NI Gershenzon
Nikolay A Kolchanov
NV Klimova
O Kel-Margoulis
OA Podkolodnaia
OD King
OG Berg
P Val
PV Benos
Q Zhou
R Castelo
R Kiyama
R Osada
R Pudimat
RV Davuluri
S Kamalakaran
Tatyana I Merkulova
TC Hodgman
TK Man
TM Chen
TV Busygina
VG Levitskii
VG Levitsky
VG Levitsky
VG Levitsky
VG Levitsky
Victor G Levitsky
VV Solovyev
W Huang
WH Shen
WW Wasserman
X Xie
Y Barash
Publication venue: BioMed Central
Publication date: 01/12/2007

Background: Reliable transcription factor binding site (TFBS) prediction methods are essential for computer annotation of large amounts of genome sequence data. However, current methods to predict TFBSs are hampered by the high false-positive rates that occur when only sequence conservation at the core binding sites is considered. Results: To improve this situation, we have quantified the performance of several Position Weight Matrix (PWM) algorithms, using exhaustive approaches to find their optimal length and position. We applied these approaches to bio-medically important TFBSs involved in the regulation of cell growth and proliferation as well as in inflammatory, immune, and antiviral responses (NF-κB, ISGF3, IRF1, STAT1), obesity and lipid metabolism (PPAR, SREBP, HNF4), and regulation of steroidogenic (SF-1) and cell cycle (E2F) gene expression. We have also gained extra specificity using a method, entitled SiteGA, which takes into account structural interactions within TFBS core and flanking regions, using a genetic algorithm (GA) with a discriminant function of locally positioned dinucleotide (LPD) frequencies. To ensure higher confidence in our approach, we applied resampling-jackknife and bootstrap tests for the comparison; it appears that optimized PWM and SiteGA have shown similar recognition performances. Then we applied SiteGA and optimized PWMs (both separately and together) to sequences in the Eukaryotic Promoter Database (EPD). The resulting SiteGA recognition models can now be used to search sequences for BSs using the web tool, SiteGA. Analysis of dependencies between close and distant LPDs revealed by SiteGA models has shown that the most significant correlations are between close LPDs, and are generally located in the core (footprint) region. A greater number of less significant correlations are mainly between distant LPDs, which spanned both core and flanking regions.
When SiteGA and optimized PWM models were applied together, this substantially reduced false positives, at least at higher stringencies. Conclusion: Based on this analysis, SiteGA adds substantial specificity even to optimized PWMs and may be considered for large-scale genome analysis. It adds to the range of techniques available for TFBS prediction, and EPD analysis has led to a list of genes which appear to be regulated by the above TFs.

Springer - Publisher Connector

Lund University Publications

Analysis of promoter regions of co-expressed genes identified by microarray analysis

Author: AE Kel
B Lenhard
DL Latchman
DR Rhodes
G Thijs
HY Chang
JD Leib
K Cartharius
K Ohtani
L Bullinger
LA Pennacchio
LJ Heyer
M Long
Mattias Höglund
ND Trinklein
O Troyanskaya
P Pavlidis
Q Lu
S Aerts
S Hannenhalli
S Karanam
S Levy
S Levy
Srinivas Veerla
WW Wasserman
Y Suzuki
Z Zhu
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The use of global gene expression profiling to identify sets of genes with similar expression patterns is rapidly becoming a widespread approach for understanding biological processes. A logical and systematic approach to study co-expressed genes is to analyze their promoter sequences to identify transcription factors that may be involved in establishing specific profiles and that may be experimentally investigated. RESULTS: We introduce promoter clustering, i.e. grouping of promoters with respect to their high-scoring motif content, and show that this approach greatly enhances the identification of common and significant transcription factor binding sites (TFBS) in co-expressed genes. We apply this method to two different datasets, one consisting of microarray data from 108 leukemias (AMLs) and a second from a time series experiment, and show that biologically relevant promoter patterns may be obtained using phylogenetic footprinting methodology. In addition, we also found that 15% of the analyzed promoter regions contained transcription start sites for additional genes transcribed in the opposite direction. CONCLUSION: Promoter clustering based on global promoter features greatly improves the identification of shared TFBS in co-expressed genes. We believe that the outlined approach may be a useful first step to identify transcription factors that contribute to specific features of gene expression profiles

Springer - Publisher Connector