7 research outputs found

Prediction of backbone dihedral angles and protein secondary structure using support vector machines

Author: AG de Brevern
AG Murzin
AK Jain
AP Dempster
B Oliva
B Rost
B Xue
BH Park
BW Matthews
C Bystroff
C Mooney
CB Anfinsen
CC Chang
CW Hsu
D Frishman
D Przybylski
DT Jones
E Faraggi
FM Richards
G Karypis
G Pollastri
GN Ramachandran
H Kim
IH Witten
J Guo
J Kyte
J MacQueen
JA Cuff
JJ Ward
Jonathan D Hirst
JR Green
K Karplus
K Lin
KY Yeung
M Ouali
MJ Rooman
MJ Wood
N Cristianini
N Qian
O Dor
O Zimmermann
Petros Kountouris
PY Chou
Q Dong
R Karchin
R Kuang
S Henikoff
S Hua
S Qin
S Wu
SC Lovell
SF Altschul
SK Riis
U Hobohm
V Vapnik
W Kabsch
XM Pan
Y Xu
YM Huang
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The prediction of the secondary structure of a protein is a critical step in the prediction of its tertiary structure and, potentially, its function. Moreover, the backbone dihedral angles, highly correlated with secondary structures, provide crucial information about the local three-dimensional structure. Results: We predict independently both the secondary structure and the backbone dihedral angles and combine the results in a loop to enhance each prediction reciprocally. Support vector machines, a state-of-the-art supervised classification technique, achieve secondary structure predictive accuracy of 80% on a non-redundant set of 513 proteins, significantly higher than other methods on the same dataset. The dihedral angle space is divided into a number of regions using two unsupervised clustering techniques in order to predict the region in which a new residue belongs. The performance of our method is comparable to, and in some cases more accurate than, other multi-class dihedral prediction methods. Conclusions: We have created an accurate predictor of backbone dihedral angles and secondary structure. Our method, called DISSPred, is available online at http://comp.chem.nottingham.ac.uk/disspred/.
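The DISSPred abstract describes dividing the dihedral angle space into regions and predicting the region a residue belongs to. The abstract does not name the clustering techniques or give the region centres, so the following is only a minimal sketch of the assignment step, assuming each region is represented by a single centre and that distances respect the 360° periodicity of both angles. The names `angular_diff`, `dihedral_distance`, `assign_region` and the values in `CENTRES` are hypothetical and chosen for illustration.

```python
import math


def angular_diff(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def dihedral_distance(p, q):
    """Distance between two (phi, psi) pairs, wrapping both angles at 360 degrees."""
    return math.hypot(angular_diff(p[0], q[0]), angular_diff(p[1], q[1]))


def assign_region(residue, centres):
    """Return the index of the centre closest to the residue's (phi, psi) pair."""
    return min(range(len(centres)),
               key=lambda i: dihedral_distance(residue, centres[i]))


# Hypothetical centres (illustrative only, not taken from the paper):
# a helix-like, a strand-like and a left-handed region.
CENTRES = [(-63.0, -43.0), (-120.0, 130.0), (60.0, 45.0)]

if __name__ == "__main__":
    print(assign_region((-60.0, -40.0), CENTRES))
    print(assign_region((-170.0, 175.0), CENTRES))
```

Wrapping the differences matters near the ±180° boundary: a residue at psi = 175° is treated as 45° away from a centre at psi = 130°, and one at phi = 179° as only 2° away from phi = -179°.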

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

TANGLE: Two-Level Support Vector Regression Approach for Protein Backbone Torsion Angle Prediction from Primary Sequences

Author: A Schlessinger
AG de Brevern
B Rost
B Xue
C Bystroff
C Haynes
C Mooney
C Zhang
C Zheng
Christian Schönbach
D Xie
DT Jones
E Faraggi
G Helles
Geoffrey I. Webb
GN Ramachandran
GP Raghava
H Zhang
Hao Tan
HJ Dyson
HS Kang
J Cheng
J Gao
J Gsponer
J Song
Jiangning Song
JJ Ward
JS Chauhan
K Chen
L Chen
L Kurgan
M Kumar
Mingjun Wang
MJ Mizianty
MJ Rooman
MJ Wood
MK Kalita
MN Nguyen
MV Berjanskii
O Dor
O Zimmermann
P Chen
P Kountouris
P Sliz
PC Chen
R Gaudet
R Karchin
R Kuang
R Verma
S Ahmad
S Liang
S Qiu
S Wu
SF Altschul
T Ishida
T Zhang
Tatsuya Akutsu
V Vapnik
W Kabsch
W Liu
W Zhang
X Miao
X Wang
XY Pan
Y Ofran
YM Huang
Z Markovic-Housley
Z Yuan
Publication venue: Public Library of Science
Publication date: 01/01/2012

Protein backbone torsion angles Phi and Psi are the rotation angles around the Cα-N bond (Phi) and the Cα-C bond (Psi). Due to the planarity of the linked rigid peptide bonds, these two angles can essentially determine the backbone geometry of proteins. Accordingly, the accurate prediction of protein backbone torsion angles from sequence information can assist the prediction of protein structures. In this study, we develop a new approach called TANGLE (Torsion ANGLE predictor) to predict the protein backbone torsion angles from amino acid sequences. TANGLE uses a two-level support vector regression approach to perform real-value torsion angle prediction using a variety of features derived from amino acid sequences, including the evolutionary profiles in the form of position-specific scoring matrices, predicted secondary structure, solvent accessibility and natively disordered regions, as well as other global sequence features. When evaluated on a large benchmark dataset of 1,526 non-homologous proteins, the mean absolute errors (MAEs) of the Phi and Psi angle predictions are 27.8° and 44.6°, respectively, which are 1% and 3% lower, respectively, than those of ANGLOR, one of the state-of-the-art prediction tools. Moreover, the prediction of TANGLE is significantly better than a random predictor built on an amino acid-specific basis, with p-values of <1.46e-147 and 7.97e-150, respectively, by the Wilcoxon signed-rank test. As a complementary approach to the current torsion angle prediction algorithms, TANGLE should prove useful in predicting protein structural properties and assisting protein fold recognition by applying the predicted torsion angles as restraints. TANGLE is freely accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/TANGLE/
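The TANGLE abstract reports prediction quality as a mean absolute error (MAE) in degrees. As a rough illustration of that metric, the sketch below computes an MAE over predicted and observed angles, assuming that each per-residue error is wrapped to the range 0° to 180° so that 170° and -170° count as 20° apart. The abstract does not say whether the published figures handle periodicity this way, and the function names `angular_error` and `mean_absolute_error` are hypothetical.

```python
def angular_error(predicted, observed):
    """Absolute error between two angles in degrees, wrapped to 0 to 180."""
    d = abs(predicted - observed) % 360.0
    return min(d, 360.0 - d)


def mean_absolute_error(predicted, observed):
    """Mean of the wrapped absolute errors over paired angle lists."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("angle lists must be non-empty and of equal length")
    total = sum(angular_error(p, o) for p, o in zip(predicted, observed))
    return total / len(predicted)


if __name__ == "__main__":
    # Made-up angles for illustration; not data from the paper.
    print(mean_absolute_error([170.0, -60.0], [-170.0, -50.0]))
```

Without the wrap, the first pair in the example would contribute an error of 340° instead of 20°, which would inflate the MAE for residues near the ±180° boundary.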

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Monash University Research Portal

Atomic Motif Recognition in (Bio)Polymers: Benchmarks From the Protein Data Bank

Author: Benjamin A. Helfrecht
Federico Giberti
Michele Ceriotti
Piero Gasparotto
Publication venue: Frontiers Media SA
Publication date: 01/04/2019

Rationalizing the structure and structure–property relations for complex materials such as polymers or biomolecules relies heavily on the identification of local atomic motifs, e.g., hydrogen bonds and secondary structure patterns, that are seen as building blocks of more complex supramolecular and mesoscopic structures. Over the past few decades, several automated procedures have been developed to identify these motifs in proteins given the atomic structure. Being based on a very precise understanding of the specific interactions, these heuristic criteria formulate the question in a way that implies the answer, by defining a list of motifs based on those that are known to be naturally occurring. This makes them less likely to identify unexpected phenomena, such as the occurrence of recurrent motifs in disordered segments of proteins, and less suitable to be applied to different polymers whose structure is not driven by hydrogen bonds, or even to polypeptides when appearing in unusual, non-biological conditions. Here we discuss how unsupervised machine learning schemes can be used to recognize patterns based exclusively on the frequency with which different motifs occur, taking high-resolution structures from the Protein Data Bank as benchmarks. We first discuss the application of a density-based motif recognition scheme in combination with traditional representations of protein structure (namely, interatomic distances and backbone dihedrals). Then, we proceed one step further toward an entirely unbiased scheme by using as input a structural representation based on the atomic density and by employing supervised classification to objectively assess the role played by the representation in determining the nature of atomic-scale patterns

Infoscience - École polytechnique fédérale de Lausanne

Directory of Open Access Journals

Learning sparse models for a dynamic Bayesian network classifier of protein secondary structure

Author: Aydin Zafer
Bilmes Jeff
Noble William S
Singh Ajit
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Protein secondary structure prediction provides insight into protein function and is a valuable preliminary step for predicting the 3D structure of a protein. Dynamic Bayesian networks (DBNs) and support vector machines (SVMs) have been shown to provide state-of-the-art performance in secondary structure prediction. As the size of the protein database grows, it becomes feasible to use a richer model in an effort to capture subtle correlations among the amino acids and the predicted labels. In this context, it is beneficial to derive sparse models that discourage over-fitting and provide biological insight. Results: In this paper, we first show that we are able to obtain accurate secondary structure predictions. Our per-residue accuracy on a well established and difficult benchmark (CB513) is 80.3%, which is comparable to the state-of-the-art evaluated on this dataset. We then introduce an algorithm for sparsifying the parameters of a DBN. Using this algorithm, we can automatically remove up to 70-95% of the parameters of a DBN while maintaining the same level of predictive accuracy on the SD576 set. At 90% sparsity, we are able to compute predictions three times faster than a fully dense model evaluated on the SD576 set. We also demonstrate, using simulated data, that the algorithm is able to recover true sparse structures with high accuracy, and using real data, that the sparse model identifies known correlation structure (local and non-local) related to different classes of secondary structure elements. Conclusions: We present a secondary structure prediction method that employs dynamic Bayesian networks and support vector machines. We also introduce an algorithm for sparsifying the parameters of the dynamic Bayesian network. The sparsification approach yields a significant speed-up in generating predictions, and we demonstrate that the amino acid correlations identified by the algorithm correspond to several known features of protein secondary structure. Datasets and source code used in this study are available at http://noble.gs.washington.edu/proj/pssp.
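The abstract above quotes a per-residue accuracy of 80.3% on CB513. A minimal sketch of how such a figure can be computed follows, assuming predictions and reference labels are given as equal-length strings with one state letter per residue (for example H, E and C for helix, strand and coil). The label alphabet and the function name `per_residue_accuracy` are assumptions made for illustration, not details taken from the paper.

```python
def per_residue_accuracy(predicted, observed):
    """Percentage of residues whose predicted state matches the reference state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    # Made-up label strings for illustration; not data from the paper.
    print(per_residue_accuracy("HHHEECCC", "HHHEEECC"))
```

In the example, seven of eight residues agree, so the function returns 87.5.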

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Protein secondary structure prediction using a small training set (compact model) combined with a Complex-valued neural network approach

Author: A Kloczkowski
A Kolinski
A Yaseen
A Zemla
AG Murzin
Andrzej Kloczkowski
Andrzej Kolinski
B Rost
B Yang
C Mirabello
C Sander
CA Orengo
D Kihara
DT Jones
E Faraggi
F Wilcoxon
G Karypis
G Pollastri
G Wang
H Cheng
H Zhang
IK McDonald
J Cheng
J Garnier
J Martin
J Skolnick
JA Cuff
K Bryson
K Lin
KD Pruitt
KJ Won
L Kurgan
L Pauling
M Blaszczyk
M Jamroz
M Kurcinski
O Dor
P Kountouris
PA Alexander
PJ Silva
PN Bryan
Q Huang
R Adamczak
R Heffernan
S Montgomerie
S Saraswathi
Saras Saraswathi
SB Needleman
SF Altschul
Shamima Rashid
SS Shapiro
Suresh Sundaram
T Nitta
TZ Sen
W Kabsch
Publication venue: Springer Science and Business Media LLC
Publication date

Crossref