An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs

Author: A Sandelin
A Sandelin
A Sharov
A Tomovic
Adrian J Shepherd
Armando Blanco
C Lawrence
D Denning
E Baker
E Szmidt
E Wingender
F Garcia
F Lam
F Lopez
F Offner
F Zare-Mirakabad
Fernando Garcia-Alcalde
G Chamilos
G Diop
G Hertz
J Hanley
J Hughes
J Sainz
J Van Helden
J Zhao
K Atanassov
K Atanassov
K Atanassov
K Atanassov
K Won
L Liang
L Zadeh
M Bulyk
M Das
M Eisen
N Dror
N Kim
P Benos
P Bochud
P Schling
R Gordan
S De
T Bailey
T Fawcett
T Hehlgans
T Tamura
T Tamura
V Khatibi
W Hung
W Wasserman
X Chen
Y Haudry
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Background: Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). The identification of TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in a given DNA sequence. It has previously been shown that, when scoring matches to known TFBS motifs, interdependencies between positions within a motif should be taken into account. However, this remains a challenging task owing to the fact that sequences similar to those of known TFBSs can occur by chance with a relatively high frequency. Here we present a new method for matching sequences to TFBS motifs based on intuitionistic fuzzy sets (IFS) theory, an approach that has been shown to be particularly appropriate for tackling problems that embody a high degree of uncertainty. Results: We propose SCintuit, a new scoring method for measuring sequence-motif affinity based on IFS theory. Unlike existing methods that consider dependencies between positions, SCintuit is designed to prevent overestimation of less conserved positions of TFBSs. For a given pair of bases, SCintuit is computed not only as a function of their combined probability of occurrence, but also taking into account the individual importance of each single base at its corresponding position. We used SCintuit to identify known TFBSs in DNA sequences. Our method provides excellent results when dealing with both synthetic and real data, outperforming the sensitivity and the specificity of two existing methods in all the experiments we performed. Conclusions: The results show that SCintuit improves the prediction quality for TFs of the existing approaches without compromising sensitivity. In addition, we show how SCintuit can be successfully applied to real research problems. In this study the reliability of the IFS theory for motif discovery tasks is proven.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UCL Discovery

Repositorio Institucional Universidad de Granada

Birkbeck Institutional Research Online

A cAMP-binding ectoprotein in the yeast Saccharomyces cerevisiae

Author: Achstetter T.
Baroni M. D.
Behrens M. M.
Bülow R.
Bregman D. B.
Brunton L. L.
Cannon J. F.
Caras I. W.
Caras I. W.
Carr S. A.
Clegg C.
Conzelmann A.
Conzelmann A.
Corbin J. D.
Cross G. A. M.
Dallner G.
Davitz M. A.
Davitz M. A.
DeCamilli P.
Doering T. L.
Edelman A. M.
Ferguson M. A. J.
Ferguson M. A. J.
Ferguson R.
Fersht A.
Flockart D. A.
Gruber W.
Hixson C. S.
Ishihara M.
Jaynes P. K.
Johnson K. E.
Kamps M. P.
Kübler D.
Korc-Grodzicki B.
Kunisawa R.
Lang B.
Lohmann S. M.
Lohmann S. M.
Low M. G.
Low M. G.
Matsumoto K.
Matsumoto K.
Matsumoto K.
Matsumoto K.
Matsumoto K.
Merino A.
Mitts M. R.
Mostov K. E.
Muller G.
Muller G.
Muller G.
Muller G.
Nairn A. C.
Nigam S. K.
Nigg E. A.
Nigg E. A.
Olson S.
Pall G.
Rhee T.
Rodel G.
Sakar D.
Salomon Y.
Smith M. E.
Srere P. A.
Stieger A.
Takami N.
Thorner J.
Toda T.
Trams E. G.
Uno I.
Vai M.
Wen T. C.
Wingender-Drissen R.
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/1991


CiteSeerX

Crossref

Open Access LMU

A study of the distribution of phylogenetically conserved blocks within clusters of mammalian homeobox genes

Author: Ahituv N
Bentley JL
Brudno M
Bryne JC
Cobb J
Duboule D
Duret L
Flicek P
Foronda D
Ganley AR
Gumucio D
Hardison R
Horan G
International Human Genome Sequencing Consortium
Karolchik D
King DC
Li J
Li-Kroeger D
Lynch VJ
Margulies E
Miller W
Miller W
Nikola Stojanovic
Rijnkels M
Rosenbloom K
Ruzzo W
Sabarinadh C
Sankoff D
Schwartz S
Sharpe J
Stojanovic N
Stojanovic N
Stojanovic N
Thomas JW
Wingender E
Yekta S
Zody MC
Publication venue: Sociedade Brasileira de Genética
Publication date: 01/01/2009

Genome sequencing efforts of the last decade have produced a large amount of data, which has enabled whole-genome comparative analyses in order to locate potentially functional elements and study the overall patterns of phylogenetic conservation. In this paper we present a statistically based method for the characterization of these patterns in mammalian DNA sequences. We have applied this approach to the study of exceptionally well conserved homeobox gene clusters (Hox), based on an alignment of six species, and we have constructed a map of Hox cataloguing the conserved fragments, along with their locations in relation to the genes and other landmarks, sometimes showing unexpected layouts.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

PubMed Central

rMotifGen: random motif generator for DNA and protein sequences

Author: A Bairoch
A Bairoch
A Bairoch
A Rambaut
AF Neuwald
AV Favorov
C Timothy Hardin
CE Lawrence
CT Hardin
CT Workman
E Coward
E Eskin
E Wingender
EP Xing
Eric C Rouchka
G Pavesi
G Thijs
GZ Hertz
H Matsuda
HJ van
J Hu
J Liu
JD Hughes
L Stein
M Tompa
MC Frith
ML Engle
PA Pevzner
RM Schwartz
S Sinha
TL Bailey
W Ao
W Thompson
WN Grundy
X Liu
Y Ponty
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Detection of short, subtle conserved motif regions within a set of related DNA or amino acid sequences can lead to discoveries about important regulatory domains such as transcription factor and DNA binding sites as well as conserved protein domains. In order to help assess motif detection algorithms on motifs with varying properties and levels of conservation, we have developed a computational tool, rMotifGen, with the sole purpose of generating a number of random DNA or protein sequences containing short sequence motifs. Each motif consensus can be user-defined, randomly generated, or created from a position-specific scoring matrix (PSSM). Insertions and mutations within these motifs are created according to user-defined parameters and substitution matrices. The resulting sequences can be helpful in mutational simulations and in testing the limits of motif detection algorithms. Results: Two implementations of rMotifGen have been created, one providing a graphical user interface (GUI) for random motif construction, and the other serving as a command line interface. The second implementation has the added advantages of platform independence and being able to be called in a batch mode. rMotifGen was used to construct sample sets of sequences containing DNA motifs and amino acid motifs that were then tested against the Gibbs sampler and MEME packages. Conclusion: rMotifGen provides an efficient and convenient method for creating random DNA or amino acid sequences with a variable number of motifs, where the instance of each motif can be incorporated using a position-specific scoring matrix (PSSM) or by creating an instance mutated from its corresponding consensus using an evolutionary model based on substitution matrices. rMotifGen is freely available at: http://bioinformatics.louisville.edu/brg/rMotifGen/.
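
The idea in the abstract above of planting PSSM-derived motif instances in random sequences can be illustrated with a minimal Python sketch. This is not rMotifGen's code or interface: the function names, the uniform background, the example matrix, and the choice of exactly one motif per sequence are all assumptions made for illustration, and insertions and substitution-matrix mutations are left out.

```python
import random

def sample_from_pssm(pssm, rng):
    """Draw one motif instance; `pssm` is a list of {base: probability} columns."""
    instance = []
    for column in pssm:
        bases = list(column.keys())
        weights = list(column.values())
        instance.append(rng.choices(bases, weights=weights, k=1)[0])
    return "".join(instance)

def plant_motif(length, pssm, rng, alphabet="ACGT"):
    """Return (sequence, start): a uniform random background with one planted instance."""
    width = len(pssm)
    if width > length:
        raise ValueError("motif is longer than the sequence")
    background = [rng.choice(alphabet) for _ in range(length)]
    start = rng.randrange(length - width + 1)
    # Overwrite the background at the chosen offset with a sampled instance.
    background[start:start + width] = sample_from_pssm(pssm, rng)
    return "".join(background), start

# Hypothetical 4-column PSSM (each column sums to 1); not taken from the paper.
EXAMPLE_PSSM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]

rng = random.Random(0)  # fixed seed so the generated test set is reproducible
dataset = [plant_motif(50, EXAMPLE_PSSM, rng) for _ in range(5)]
```

Recording the planted start position alongside each sequence gives the ground truth needed to score a motif finder on the generated set.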

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Population Differences in Transcript-Regulator Expression Quantitative Trait Loci

Author: A Mortazavi
A Schwartzman
A Siepel
A Subramanian
A Vinuela
Ahsan Huda
AL Price
B Langmead
BE Stranger
BE Stranger
BH McArdle
C Trapnell
C Trapnell
C Ye
D Lv
Daniel J. Kliebenstein
DC Guo
DJ Kliebenstein
DL Nicolae
DM Ruden
DO Kennedy
E Choy
E Grundberg
E Wingender
E Wingender
EE Schadt
EE Schadt
ER Gamazon
ER Gamazon
G Yvert
GA Heap
GJ Bates
J Coulombe-Huntington
J Ding
JC Schisler
JE Wigginton
JK Pickrell
JL McCauley
JL Min
JM Akey
JM Bhasin
Jun Lu
L Liu
L Liu
L Parts
L Raskin
LA Hindorff
Liwen Liu
M Holden
M Krull
M Morley
MA Zapala
MG Naylor
N Hubner
Oliver Hofmann
PC Bennetta
Pierre R. Bushel
Q Jiang
R Breitling
R Edgar
RA Irizarry
Ray McGovern
RE Tiedemann
RS Spielman
S Duan
S Kim
S Li
SB Montgomery
SK Sarkar
T Barrett
T Breslin
T Kwan
T Zuo
W Jin
W Zhang
W Zou
Winston Hide
Xihong Lin
Y Benjamini
Y Idaghdour
Y Xu
Publication venue: Public Library of Science
Publication date: 27/03/2012

Gene expression quantitative trait loci (eQTL) are useful for identifying single nucleotide polymorphisms (SNPs) associated with diseases. At times, a genetic variant may be associated with a master regulator involved in the manifestation of a disease. The downstream target genes of the master regulator are typically co-expressed and share biological function. Therefore, it is practical to screen for eQTLs by identifying SNPs associated with the targets of a transcript-regulator (TR). We used a multivariate regression with the gene expression of known targets of TRs and SNPs to identify TReQTLs in European (CEU) and African (YRI) HapMap populations. A nominal p-value of <1×10⁻⁶ revealed 234 SNPs in CEU and 154 in YRI as TReQTLs. These represent 36 independent (tag) SNPs in CEU and 39 in YRI affecting the downstream targets of 25 and 36 TRs respectively. At a false discovery rate (FDR) = 45%, one cis-acting tag SNP (within 1 kb of a gene) in each population was identified as a TReQTL. In CEU, the SNP (rs16858621) in Pcnxl2 was found to be associated with the genes regulated by CREM whereas in YRI, the SNP (rs16909324) was linked to the targets of miRNA hsa-miR-125a. To infer the pathways that regulate expression, we ranked TReQTLs by connectivity within the structure of biological process subtrees. One TReQTL SNP (rs3790904) in CEU maps to Lphn2 and is associated (nominal p-value = 8.1×10⁻⁷) with the targets of the X-linked breast cancer suppressor Foxp3. The structure of the biological process subtree and a gene interaction network of the TReQTL revealed that tumor necrosis factor, NF-kappaB and variants in G-protein coupled receptors signaling may play a central role as communicators in Foxp3 functional regulation. The potential pleiotropic effect of the Foxp3 TReQTLs was gleaned from integrating mRNA-Seq data and SNP-set enrichment into the analysis.

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

FigShare

Genetically Engineered Alginate Lyase-PEG Conjugates Exhibit Enhanced Catalytic Function and Reduced Immunoreactivity

Author: AS Bayer
AS Bayer
DM Hoover
DM Ramsey
F Eftekhar
G Shankar
G Skjåk-Bræk
G Vaaje-Kolstad
GT Mai
H Sakakibara
H Schellekens
H-J Yoon
H-J Yoon
J Wingender
JA Simpson
Jennifer I. Lai
John W. Lamppa
K Murata
Karl E. Griswold
M Ackerman
M Alipour
M Hentzer
M Strathmann
MA Alkawash
Margaret E. Ackerman
ME Himmel
MJ Feldhaus
N Hoiby
RA Hatch
RJ Mrsny
Roy Roop II
S Elkin
S Moreau-Marquis
SJ Shire
TB May
Thomas C. Scanlon
TJ Giezen
W Hashimoto
Y Kodera
YM Smedley
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/10/2010

Alginate lyase enzymes represent prospective biotherapeutic agents for treating bacterial infections, particularly in the cystic fibrosis airway. To effectively deimmunize one therapeutic candidate while maintaining high level catalytic proficiency, a combined genetic engineering-PEGylation strategy was implemented. Rationally designed, site-specific PEGylation variants were constructed by orthogonal maleimide-thiol coupling chemistry. In contrast to random PEGylation of the enzyme by NHS-ester mediated chemistry, controlled mono-PEGylation of A1-III alginate lyase produced a conjugate that maintained wild type levels of activity towards a model substrate. Significantly, the PEGylated variant exhibited enhanced solution phase kinetics with bacterial alginate, the ultimate therapeutic target. The immunoreactivity of the PEGylated enzyme was compared to a wild type control using in vitro binding studies with both enzyme-specific antibodies, from immunized New Zealand white rabbits, and a single chain antibody library, derived from a human volunteer. In both cases, the PEGylated enzyme was found to be substantially less immunoreactive. Underscoring the enzyme's potential for practical utility, >90% of adherent, mucoid, Pseudomonas aeruginosa biofilms were removed from abiotic surfaces following a one hour treatment with the PEGylated variant, whereas the wild type enzyme removed only 75% of biofilms in parallel studies. In aggregate, these results demonstrate that site-specific mono-PEGylation of genetically engineered A1-III alginate lyase yielded an enzyme with enhanced performance relative to therapeutically relevant metrics. Funding: Cystic Fibrosis Foundation (Research Development Program); National Center for Research Resources (U.S.) (P20RR018787-06)

Public Library of Science (PLOS)

DSpace@MIT

Crossref

Directory of Open Access Journals

PubMed Central

Dartmouth Digital Commons (Dartmouth College)

Identification of Direct Target Genes Using Joint Sequence and Expression Likelihood with Application to DAF-16

Author: A Beyer
A Sandelin
B Ren
BC Foat
C Kenyon
C Yompakdee
CT Harbison
CT Murphy
D Das
D Das
DB Gordon
E Segal
E Segal
E Wingender
EI Boyle
EM Conlon
EM Schwarz
GZ Hertz
H Wurst
HJ Bussemaker
HJ Bussemaker
JD Hughes
JH An
Jie Liu
JS Flick
K Basso
KD MacIsaac
L Narlikar
M Tompa
N Ogawa
Nick True
P Martinez
Raya Khanin
Ron X. Yu
SS Lee
SW Oh
TL Bailey
VG Tusher
VR Iyer
W Wang
W Wang
Wei Wang
XS Liu
Y Honda
Y Kaneko
Publication venue: Public Library of Science
Publication date: 19/03/2008

A major challenge in the post-genome era is to reconstruct regulatory networks from the biological knowledge accumulated to date. The development of tools for identifying direct target genes of transcription factors (TFs) is critical to this endeavor. Given a set of microarray experiments, a probabilistic model called TRANSMODIS has been developed which can infer the direct targets of a TF by integrating sequence motif, gene expression and ChIP-chip data. The performance of TRANSMODIS was first validated on a set of transcription factor perturbation experiments (TFPEs) involving Pho4p, a well studied TF in Saccharomyces cerevisiae. TRANSMODIS removed elements of arbitrariness in the manual target gene selection process and produced results that concur with one's intuition. TRANSMODIS was further validated on a genome-wide scale by comparing it with two other methods in Saccharomyces cerevisiae. The usefulness of TRANSMODIS was then demonstrated by applying it to the identification of direct targets of DAF-16, a critical TF regulating ageing in Caenorhabditis elegans. We found that 189 genes were tightly regulated by DAF-16. In addition, DAF-16 has differential preference for motifs when acting as an activator or repressor, which awaits experimental verification. TRANSMODIS is computationally efficient and robust, making it a useful probabilistic framework for finding immediate targets.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Identification of functionally related genes using data mining and data integration: a breast cancer case study

Author: B Quesnel
C Cocola
DJ Watts
E Wingender
Eleonora Piscitelli
Ettore Mosca
Gloria Bertoli
H Kitano
H Kouros-Mehr
H Kouros-Mehr
HK Lee
HR Christofk
I Zucchi
I Zucchi
Ileana Zucchi
JA Sterling
K Cartharius
Laura Vilardo
Luciano Milanesi
M Bamshad
MR Barron
MT Lewis
P Shannon
R Catena
R Mehra
R Sharan
RG Bristow
Rolland A Reinbold
RV Hoch
S Struckmann
TR Brummelkamp
U Stelzl
VD Marinescu
W Fan
W Yarosh
Publication venue: 'Springer Science and Business Media LLC'

Crossref

De-Novo Discovery of Differentially Abundant Transcription Factor Binding Sites Including Their Positional Preference

Author: AD Smith
AM Benotmane
C Linhart
CE Lawrence
CT Harbison
DJ Galas
DJ Lockhart
DJC MacKay
DS Johnson
E Redhead
E Wingender
G Mönke
G Pavesi
GA Wray
GK Sandve
H Wettig
Harmen J. Bussemaker
HM Wallach
IA Paponov
Ivan A. Paponov
Ivo Grosse
J Cerquides
J Davis
J Wu
Jan Grau
JC Bryne
JD Hughes
Jens Keilwagen
LM Hellman
LV Sun
M Tompa
Marc Strickert
NK Kim
O Elemento
S Sonnenburg
S Sonnenburg
Stefan Posch
T Ulmasov
T Ulmasov
TD Schneider
TJ Guilfoyle
TL Bailey
V Matys
VV Raghavan
W Ao
W Thompson
WA Thompson
WD Teale
Publication venue: Public Library of Science
Publication date: 10/02/2011

Transcription factors are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in promoters. The de-novo discovery of transcription factor binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not been fully solved yet. Here, we present a de-novo motif discovery tool called Dispom for finding differentially abundant transcription factor binding sites that models existing positional preferences of binding sites and adjusts the length of the motif in the learning process. Evaluating Dispom, we find that its prediction performance is superior to existing tools for de-novo motif discovery for 18 benchmark data sets with planted binding sites, and for a metazoan compendium based on experimental data from micro-array, ChIP-chip, ChIP-DSL, and DamID as well as Gene Ontology data. Finally, we apply Dispom to find binding sites differentially abundant in promoters of auxin-responsive genes extracted from Arabidopsis thaliana microarray data, and we find a motif that can be interpreted as a refined auxin responsive element predominately positioned in the 250-bp region upstream of the transcription start site. Using an independent data set of auxin-responsive genes, we find in genome-wide predictions that the refined motif is more specific for auxin-responsive genes than the canonical auxin-responsive element. In general, Dispom can be used to find differentially abundant motifs in sequences of any origin. However, the positional distribution learned by Dispom is especially beneficial if all sequences are aligned to some anchor point like the transcription start site in case of promoter sequences. We demonstrate that the combination of searching for differentially abundant motifs and inferring a position distribution from the data is beneficial for de-novo motif discovery. 
Hence, we make the tool freely available as a component of the open-source Java framework Jstacs and as a stand-alone application at http://www.jstacs.de/index.php/Dispom.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Optimized mixed Markov models for motif identification

Author: AE Kel
B Matthews
B Negre
C Burge
D Cai
David M Umbach
E Roulet
E Wingender
G Schwarz
G Yeo
GA Wray
GD Stormo
GE Crooks
H Akaike
I Carmel
J Rissanen
JP Staley
K Ellrott
K Nandabalan
K Nelson
K Quandt
Leping Li
M Kellis
MG Reese
ML Bulyk
MP Ponomarenko
MQ Zhang
N Saitou
P Agarwal
P Bühlmann
PV Benos
Q Zhou
R Staden
RP Ketterling
S Salzberg
T Thanaraj
TD Schneider
TK Man
U Ohler
Uwe Ohler
W Krivan
Weichun Huang
X Xie
X Zhao
Y Barash
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Identifying functional elements, such as transcriptional factor binding sites, is a fundamental step in reconstructing gene regulatory networks and remains a challenging issue, largely due to limited availability of training samples. RESULTS: We introduce a novel and flexible model, the Optimized Mixture Markov model (OMiMa), and related methods to allow adjustment of model complexity for different motifs. In comparison with other leading methods, OMiMa can incorporate more than the NNSplice's pairwise dependencies; OMiMa avoids model over-fitting better than the Permuted Variable Length Markov Model (PVLMM); and OMiMa requires smaller training samples than the Maximum Entropy Model (MEM). Testing on both simulated and actual data (regulatory cis-elements and splice sites), we found OMiMa's performance superior to the other leading methods in terms of prediction accuracy, required size of training data or computational time. Our OMiMa system, to our knowledge, is the only motif finding tool that incorporates automatic selection of the best model. OMiMa is freely available at [1]. CONCLUSION: Our optimized mixture of Markov models represents an alternative to the existing methods for modeling dependent structures within a biological motif. Our model is conceptually simple and effective, and can improve prediction accuracy and/or computational speed over other leading methods.
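
To make "modeling dependent structures within a biological motif" concrete, the sketch below fits a plain position-specific first-order Markov model to aligned sites and scores a candidate by log-likelihood. It is not OMiMa and does no mixture optimization or model selection. It only shows the simplest kind of adjacent-position dependency that such models build on. The function names, the pseudocount, and the toy sites are assumptions.

```python
import math

ALPHABET = "ACGT"

def train_first_order(sites, pseudocount=1.0):
    """Estimate P(base_0) and position-specific P(base_i | base_{i-1}) from aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    # Distribution of the first base, smoothed with a pseudocount.
    first_counts = {b: pseudocount for b in ALPHABET}
    for site in sites:
        first_counts[site[0]] += 1
    total = sum(first_counts.values())
    first = {b: c / total for b, c in first_counts.items()}

    # One conditional table per position i >= 1: previous base -> next base.
    transitions = []
    for i in range(1, width):
        counts = {p: {b: pseudocount for b in ALPHABET} for p in ALPHABET}
        for site in sites:
            counts[site[i - 1]][site[i]] += 1
        table = {}
        for prev, row in counts.items():
            row_total = sum(row.values())
            table[prev] = {b: c / row_total for b, c in row.items()}
        transitions.append(table)
    return first, transitions

def log_likelihood(model, seq):
    """Log-probability of `seq` (same width as the training sites) under the model."""
    first, transitions = model
    if len(seq) != len(transitions) + 1:
        raise ValueError("sequence width does not match the model")
    score = math.log(first[seq[0]])
    for i, table in enumerate(transitions, start=1):
        score += math.log(table[seq[i - 1]][seq[i]])
    return score

# Toy aligned sites, invented for illustration only.
model = train_first_order(["ACGT", "ACGT", "ACGA", "TCGT"])
```

A candidate that resembles the training sites, such as "ACGT", scores higher than an unrelated one, such as "TTTT". Richer models trade this simplicity for the ability to capture non-adjacent dependencies.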

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MDC Repository