42 research outputs found

MetaDBSite: a meta approach to improve protein DNA-binding sites prediction

Author: Huang Bingding
Lin Biaoyang
Schroeder Michael
Si Jingna
Zhang Zengming
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

The Rough Guide to In Silico Function Prediction, or How To Use Sequence and Structure Information To Predict Protein Function

Author: A Armon
A Bateman
A Godzik
A Passerini
A Pierleoni
A Stark
AE Todd
AT Laurie
B Rost
B Rost
BA Shoemaker
C Notredame
CA Innis
CA Orengo
CA Wilson
CE Stebbins
CJ Jeffery
CJ Jeffery
CP Ponting
CT Porter
D Brown
D Desveaux
D Devos
D Pal
D Petrey
D Petrey
E Krissinel
E Reynolds
EP Gianchandani
F Corpet
F Ferron
F Zhou
Fran Lewitter
G Theissen
GJ Bartlett
GJ Kleywegt
GL Holliday
H Nakashima
HL Schubert
HM Berman
IM Wallace
J Hawkins
J Thompson
JA Barker
JB Bard
JC Whisstock
JG Henikoff
JM Thornton
JS Sodhi
JW Torrance
JZ Wang
K Goyal
K Hofmann
K Karplus
K Nakai
L Holm
L Jaroszewski
L Shapiro
L Wang
LJ Jensen
M Babor
M Gruber
M Linial
M Lippi
M Nayal
M Remm
Marco Punta
MJ Hartshorn
O Emanuelsson
O Lichtarge
OA Bateman
OC Redfern
P Puntervoll
PD Thomas
R Apweiler
R Kolodny
R Nair
R Nair
R Nair
RA Laskowski
RL Tatusov
S Altschul
S Shazman
SG Lee
T Gabaldon
TA Binkowski
TJ Hubbard
TK Attwood
VA Ivanisenko
W Humphrey
W Tian
Y Ofran
Y Ye
Yanay Ofran
Publication venue: Public Library of Science
Publication date: 01/10/2008

Crossref

Directory of Open Access Journals

PubMed Central

Prediction of DNA-binding residues in proteins from amino acid sequences using a random forest model with a hybrid feature

Author: Ahmad
Ahmad
Altschul
Altschul
Berman
Bhardwaj
Breiman
Bullock
Cohen
Coulocheri
Dimitriadou
Egan
Frishman
Ho
Hongde Liu
Hongtao Wu
Hwang
Jiansheng Wu
Jones
Kubat
Kuznetsov
Liaw
Luscombe
Matthews
Ofran
Scheffer
Shen
Siggers
Stawiski
Tjong
Tsuchiya
Vapnik
Wang
Wang
Wang
Xiao Sun
Xueye Duan
Yan
Yan Ding
Yunfei Bai
Publication venue: Oxford University Press

Motivation: In this work, we aim to develop a computational approach for predicting DNA-binding sites in proteins from amino acid sequences. To avoid overfitting with this method, all available DNA-binding proteins from the Protein Data Bank (PDB) are used to construct the models. The random forest (RF) algorithm is used because it is fast and has robust performance for different parameter values. A novel hybrid feature is presented which incorporates evolutionary information of the amino acid sequence, secondary structure (SS) information and orthogonal binary vector (OBV) information, which reflects the characteristics of the 20 kinds of amino acids for two physical–chemical properties (dipoles and volumes of the side chains). The numbers of binding and non-binding residues in proteins are highly unbalanced, so a novel scheme is proposed to deal with the problem of imbalanced datasets by downsizing the majority class.
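The abstract says the class imbalance is handled by downsizing the majority class but does not describe the scheme itself. As a rough illustration only, the sketch below uses plain random undersampling as a stand-in; the function name, the `ratio` parameter and the toy data are assumptions and not taken from the paper.

```python
import random


def downsample_majority(samples, labels, ratio=1.0, seed=0):
    """Keep every minority-class sample and a random subset of the majority class.

    `ratio` is the desired majority:minority size ratio after downsizing.
    Plain random undersampling is an assumption; the paper's own scheme
    is not described in the abstract.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep_n = min(len(majority), int(round(ratio * len(minority))))
    # Sort the kept indices so the original residue order is preserved.
    kept = sorted(minority + rng.sample(majority, keep_n))
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy data (hypothetical): 3 binding residues (label 1) among 12 residues.
features = [[i] for i in range(12)]
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
X_bal, y_bal = downsample_majority(features, labels, ratio=1.0, seed=42)
```

With `ratio=1.0` the balanced set holds as many non-binding as binding residues; a larger ratio keeps more of the majority class.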

Crossref

PubMed Central

PDNAsite: identification of DNA-binding site from protein sequence by incorporating spatial and sequence context

Author: A Bochkarev
AN Bullock
AP Bradley
B Liu
C Yan
CA BDavey
CC Chang
CO Pabo
EW Stawiski
H Tjong
HM Berman
IB Kuznetsov
J Wu
JA Swets
KL Griffith
L Wang
L Wang
L Wang
L Wang
M Ptashne
M Radlinska
M Terribilini
MY Gutfreund
N Bhardwaj
NM Luscombe
NM Luscombe
P Ozbek
QW Dong
R Liu
R Liu
R Xu
R Xu
RD Kornberg
S Ahmad
S Ahmad
S Hwang
SY Ho
T Li
W Kabsch
X Ma
X Zhao
Y Ofran
YC Chen
Z Yuan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context, and that combining them yields a further performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web server for our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely accessible to the biological research community.
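The abstract does not define the two contexts precisely. The sketch below is one plausible reading, stated as an assumption: the sequence context is the set of residues within a fixed window along the chain, and the spatial context is the set of residues whose coordinates lie within a distance cutoff of the target. The function names, the cutoff and the toy coordinates are hypothetical.

```python
import math


def sequence_context(index, n_residues, half_window=2):
    """Indices of residues within `half_window` positions along the chain."""
    lo = max(0, index - half_window)
    hi = min(n_residues, index + half_window + 1)
    return [j for j in range(lo, hi) if j != index]


def spatial_context(index, coords, cutoff=7.0):
    """Indices of residues within `cutoff` (same units as coords) of the target.

    A distance cutoff on one representative atom per residue is an
    assumption made for this sketch.
    """
    target = coords[index]
    return [j for j, xyz in enumerate(coords)
            if j != index and math.dist(target, xyz) <= cutoff]


# Toy hairpin (hypothetical coordinates): residue 5 folds back next to residue 0.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
seq_ctx = sequence_context(0, len(coords), half_window=1)
spa_ctx = spatial_context(0, coords, cutoff=5.0)
```

In this toy chain the spatial context of residue 0 picks up residue 5, which is far away in sequence, so it is absent from the sequence context.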

The Hong Kong Polytechnic University Pao Yue-kong Library

Crossref

PolyU Institutional Repository

PubMed Central

Aston Publications Explorer

NAPS: a residue-level nucleic acid-binding prediction server

Author: Ahmad
Altschul
Bhardwaj
Bhardwaj
Breiman
Cassiday
Fan
Henikoff
Hui Lu
Kumar
Kuznetsov
Langlois
Langlois
Matthew B. Carson
Ofran
Olson
Quinlan
Robert Langlois
Selvaraj
Sorzano
Stolfo
Terribilini
Wang
Wang
Zadrozny
Publication venue: Oxford University Press
Publication date: 01/01/2010

Nucleic acid-binding proteins are involved in a great number of cellular processes. Understanding the mechanisms underlying these proteins first requires the identification of specific residues involved in nucleic acid binding. Prediction of NA-binding residues can provide practical assistance in the functional annotation of NA-binding proteins. Predictions can also be used to expedite mutagenesis experiments, guiding researchers to the correct binding residues in these proteins. Here, we present a method for the identification of amino acid residues involved in DNA- and RNA-binding using sequence-based attributes. The method used in this work combines the C4.5 algorithm with bootstrap aggregation and cost-sensitive learning. Our DNA-binding model achieved 79.1% accuracy, while the RNA-binding model reached an accuracy of 73.2%. The NAPS web server is freely available at http://proteomics.bioengr.uic.edu/NAPS

CiteSeerX

Crossref

PubMed Central

Prediction of FAD interacting residues in a protein from its primary sequence using evolutionary information

Author: E Jeong
E Jeong
Gajendra PS Raghava
H Kaur
H Kaur
H Kaur
IB Kuznetsov
M Kumar
M Saito
N Bhardwaj
Nitish K Mishra
RA Bauer
S Ahmad
SF Altschul
T Joachims
V Sobolev
V Vapnik
W Li
Y Korllberg
Y Ofran
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Flavin binding proteins (FBP) play a critical role in several biological functions such as the electron transport system (ETS). These flavoproteins contain very tightly bound, sometimes covalently, flavin adenine dinucleotide (FAD) or flavin mononucleotide (FMN). The interaction between the flavin nucleotide and the amino acids of a flavoprotein is essential for its functionality. Thus identification of FAD interacting residues in an FBP is an important step for understanding its function and mechanism. Results: In this study, we describe models developed for predicting FAD interacting residues using window patterns of 15, 17 and 19 residues. Support vector machine (SVM) based models have been developed using the binary pattern of the amino acid sequence of a protein and achieved a maximum accuracy of 69.65% with a Matthews correlation coefficient (MCC) of 0.39 and an area under the curve (AUC) of 0.773. The performance of these models improved significantly, from 69.65% to 82.86% with MCC 0.66 and AUC 0.904, when evolutionary information was used as input to the SVM. The evolutionary information was generated in the form of a position-specific scoring matrix (PSSM) profile by using PSI-BLAST at e-value 0.001. All models were developed on 198 non-redundant FAD binding protein chains containing 5172 FAD interacting residues and evaluated using the fivefold cross-validation technique. Conclusion: This study suggests that evolutionary information of 17 amino acid patterns performs best for FAD interacting residue prediction. We also developed a web server which predicts FAD interacting residues in a protein and which is freely available for academics.
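To make the "binary pattern" of a sequence window concrete, the sketch below one-hot encodes a window centred on each residue. The abstract gives the window sizes (15, 17, 19) but not the encoding details, so the 21-symbol alphabet (20 amino acids plus a padding symbol `X` for the chain ends), the function names and the example sequence are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"  # dummy residue used to pad the chain ends (assumption)
ALPHABET = AMINO_ACIDS + PAD


def one_hot(residue):
    """Binary vector with a single 1 at the residue's alphabet position."""
    vec = [0] * len(ALPHABET)
    idx = ALPHABET.find(residue)
    # Unknown symbols are mapped to the padding slot in this sketch.
    vec[idx if idx >= 0 else len(ALPHABET) - 1] = 1
    return vec


def window_patterns(sequence, window=17):
    """One flattened binary pattern per residue, centred on that residue."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    half = window // 2
    padded = PAD * half + sequence + PAD * half
    patterns = []
    for i in range(len(sequence)):
        segment = padded[i:i + window]
        patterns.append([bit for res in segment for bit in one_hot(res)])
    return patterns


# Hypothetical 10-residue sequence; each pattern has 17 * 21 = 357 bits.
patterns = window_patterns("MKTAYIAKQR", window=17)
```

Each pattern could then be paired with a binding or non-binding label for the central residue and passed to a classifier; a PSSM-based variant would replace each 21-bit block with that position's profile scores.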

Crossref

Springer - Publisher Connector

PubMed Central

DNABINDPROT: fluctuation-based predictor of DNA-binding residues within a network of interacting residues

Author: Ahmad
Ahmad
Ahmad
Bahar
Bahar
Bhardwaj
Burak Erman
Chu
Ertekin
Gao
Gao
Gao
Haliloglu
Haliloglu
Haliloglu
Haliloglu
Hwang
Jones
Keil
Kuznetsov
Landau
Laskowski
Lejeune
Luscombe
Luscombe
Luscombe
Nimrod
Nimrod
Nimrod
Ofran
Ofran
Panchenko
Passner
Passner
Pemra Ozbek
Popovych
Rader
Res
Sathyapriya
Seren Soner
Stawiski
Szilagyi
Tjong
Tsuchiya
Tsuchiya
Turkan Haliloglu
van Dijk
Wang
Wang
Wang
Wang
Wu
Yan
Publication venue: Oxford University Press
Publication date: 01/01/2010

DNABINDPROT is designed to predict DNA-binding residues, based on the fluctuations of residues in high-frequency modes by the Gaussian network model. The residue pairs that display high mean-square distance fluctuations are analyzed with respect to DNA binding, then filtered with their evolutionary conservation profiles and ranked according to their DNA-binding propensities. If the analyses are based on the exact outcome of fluctuations in the highest mode, using a conservation threshold of 5, the results have a sensitivity, specificity, precision and accuracy of 9.3%, 90.5%, 18.1% and 78.6%, respectively, on a dataset of 36 unbound–bound protein structure pairs. These values increase up to 24.3%, 93.4%, 45.3% and 83.3%, respectively, when the neighboring two residues are considered. The relatively low sensitivity appears to arise because the identified residues are selective for the binding core residues rather than for all DNA-binding residues. The predicted residues that are not tagged as DNA-binding residues are those whose fluctuations are coupled with DNA-binding sites. They are in close proximity to these sites and are also plausible candidates for other functional residues, such as ligand and protein–protein interaction sites. DNABINDPROT is free and open to all users, with no login requirement, at: http://www.prc.boun.edu.tr/appserv/prc/dnabindprot/

CiteSeerX

Crossref

PubMed Central

Koç University Digital Collections

Predicting DNA-binding locations and orientation on proteins using knowledge-based learning of geometric properties

Author: Chen Chien-Yu
Wang Chien-Chih
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors

Author: Ahmad
Altschul
Berman
Boyer
Chien-Kang Huang
Chun-Chin Huang
Ferrer-Costa
Finn
Gewehr
Henikoff
Hwang
Jones
Jones
Liu
Luscombe
Ofran
Tjong
Tsuchiya
von Ohsen
Wang
Wen-Yi Chu
Yan
Yen-Jen Oyang
Yi-Sheng Cheng
Yu-Feng Huang
Publication venue: Oxford University Press

This article presents the design of a sequence-based predictor named ProteDNA for identifying the sequence-specific binding residues in a transcription factor (TF). Concerning protein–DNA interactions, there are two types of binding mechanisms involved, namely sequence-specific binding and nonspecific binding. Sequence-specific bindings occur between protein sidechains and nucleotide bases and correspond to sequence-specific recognition of genes. Therefore, sequence-specific bindings are essential for correct gene regulation. In this respect, ProteDNA is distinctive since it has been designed to identify sequence-specific binding residues. In order to accommodate users with different application needs, ProteDNA has been designed to operate under two modes, namely, the high-precision mode and the balanced mode. According to the experiments reported in this article, under the high-precision mode, ProteDNA has been able to deliver precision of 82.3%, specificity of 99.3%, sensitivity of 49.8% and accuracy of 96.5%. Meanwhile, under the balanced mode, ProteDNA has been able to deliver precision of 60.8%, specificity of 97.6%, sensitivity of 60.7% and accuracy of 95.4%. ProteDNA is available at the following websites
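The four figures quoted for each mode follow the standard confusion-matrix definitions. The sketch below computes them from residue counts; the counts are hypothetical and are not taken from the ProteDNA experiments.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard per-residue metrics from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding.
    fp/tn: non-binding residues predicted as binding / non-binding.
    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts: 100 binding residues among 1000, half of them recovered.
metrics = binary_metrics(tp=50, fp=10, tn=890, fn=50)
```

Because non-binding residues usually far outnumber binding ones, accuracy and specificity can stay high even when sensitivity is modest, which is why the abstract reports all four values together.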

Crossref

PubMed Central

Prediction of DNA-binding residues from protein sequence information using random forests

Author: Jack Y Yang
Liangjiang Wang
Mary Qu Yang
Publication venue: BioMed Central
Publication date: 01/01/2009

Springer

Springer - Publisher Connector

PubMed Central