PSP_MCSVM: brainstorming consensus prediction of protein secondary structures using two-stage multiclass support vector machines

By Piyali Chatterjee, Subhadip Basu, Mahantapas Kundu, Mita Nasipuri and Dariusz Plewczynski

Abstract

Secondary structure prediction is a crucial step toward understanding the variety of protein structures and the biological functions they perform. Predicting the secondary structure of a new protein from its amino acid sequence is of fundamental importance in bioinformatics. We propose a novel technique to predict protein secondary structures based on position-specific scoring matrices (PSSMs) and physicochemical properties of amino acids. It is a two-stage approach that uses multiclass support vector machines (SVMs) as classifiers for three structural conformations: helix, sheet and coil. In the first stage, PSSMs obtained from PSI-BLAST and five specially selected physicochemical properties of amino acids are fed into an SVM as features for sequence-to-structure prediction. The confidence values for helix, sheet and coil produced by the first-stage SVM are then used as input to the second-stage SVM, which performs structure-to-structure prediction. The two-stage cascaded classifiers (PSP_MCSVM) are trained on proteins from the RS126 dataset and then tested on target proteins from the ninth Critical Assessment of protein Structure Prediction experiment (CASP9). On randomly selected CASP9 targets, PSP_MCSVM with the brainstorming consensus procedure performs better than prediction servers such as Predator, DSC and SIMPA96. Its overall performance is found to be comparable with the current state of the art. The PSP_MCSVM source code, train-test datasets and supplementary files are freely available in the public domain at: http://sysbio.icm.edu.pl/secstruct and http://code.google.com/p/cmater-bioinfo
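The data flow of the two-stage cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sliding window, its half-width, the zero padding at the chain termini, and the names `window_features` and `two_stage_predict` are all assumptions made for illustration, and the two classifiers are passed in as plain callables standing in for trained multiclass SVMs.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# Hypothetical stand-in for a trained classifier: it maps one feature
# vector to three confidence values, ordered (helix, sheet, coil).
Classifier = Callable[[Vector], Sequence[float]]


def window_features(rows: Sequence[Sequence[float]], half_width: int) -> List[Vector]:
    """Concatenate the rows inside a centred window around each residue.

    Positions that fall outside the chain are zero-padded (an assumption).
    """
    if not rows:
        return []
    pad = [0.0] * len(rows[0])
    features: List[Vector] = []
    for i in range(len(rows)):
        vec: Vector = []
        for j in range(i - half_width, i + half_width + 1):
            vec.extend(rows[j] if 0 <= j < len(rows) else pad)
        features.append(vec)
    return features


def two_stage_predict(
    per_residue: Sequence[Sequence[float]],
    stage1: Classifier,
    stage2: Classifier,
    half_width: int = 1,
) -> str:
    """Run a sequence-to-structure stage, then a structure-to-structure stage.

    per_residue holds one feature row per residue, for example PSSM scores
    followed by physicochemical property values.
    """
    # Stage 1: per-residue features -> helix/sheet/coil confidences.
    stage1_conf = [list(stage1(x)) for x in window_features(per_residue, half_width)]
    # Stage 2: windows of stage-1 confidences -> refined confidences.
    stage2_conf = [stage2(x) for x in window_features(stage1_conf, half_width)]
    labels = "HEC"  # H = helix, E = sheet, C = coil
    return "".join(labels[max(range(3), key=lambda k: c[k])] for c in stage2_conf)
```

In practice each callable would wrap a trained multiclass SVM that returns per-class confidence values. Keeping the classifiers pluggable means the windowing logic can be checked on its own, without any training data.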

Topics: Original Paper
Publisher: Springer-Verlag
OAI identifier: oai:pubmedcentral.nih.gov:3168739
Provided by: PubMed Central

