BACKGROUND: Improvements in protein sequence annotation and an increase in the number of annotated protein databases have fueled the development of a growing number of software tools to predict secreted proteins. Six software programs capable of high throughput and employing a wide range of prediction methods, SignalP 3.0, SignalP 2.0, TargetP 1.01, PrediSi, Phobius, and ProtComp 6.0, are evaluated. RESULTS: Prediction accuracies were evaluated using 372 unbiased, eukaryotic, SwissProt protein sequences. TargetP, SignalP 3.0 maximum S-score and SignalP 3.0 D-score were the most accurate single scores (90–91% accurate). The combination of a positive TargetP prediction, SignalP 2.0 maximum Y-score, and SignalP 3.0 maximum S-score increased accuracy by six percent. CONCLUSION: Single predictive scores could be highly accurate, but almost all accuracies were slightly less than those reported by program authors. Predictive accuracy could be substantially improved by combining scores from multiple methods into a single composite prediction.
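The idea of a composite prediction can be illustrated with a minimal sketch. The abstract does not state how the three scores were combined, so the majority-vote rule, the 0.5 cutoffs, the `Scores` record and the five toy proteins below are all assumptions made for illustration; none of them are the authors' method or data.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    targetp_secreted: bool  # TargetP call: secretory pathway predicted or not
    sp2_max_y: float        # SignalP 2.0 maximum Y-score
    sp3_max_s: float        # SignalP 3.0 maximum S-score


# Hypothetical cutoffs, chosen only for this illustration.
Y_CUTOFF = 0.5
S_CUTOFF = 0.5


def composite_call(s: Scores) -> bool:
    """Assumed rule: call a protein secreted when at least 2 of 3 methods agree."""
    votes = [
        s.targetp_secreted,
        s.sp2_max_y >= Y_CUTOFF,
        s.sp3_max_s >= S_CUTOFF,
    ]
    return sum(votes) >= 2


def accuracy(predicted, actual):
    """Fraction of proteins whose predicted label matches the known label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up toy records: (scores, known secreted status).
toy = [
    (Scores(True, 0.8, 0.9), True),
    (Scores(True, 0.3, 0.7), True),
    (Scores(False, 0.6, 0.4), False),
    (Scores(True, 0.2, 0.3), False),
    (Scores(False, 0.1, 0.2), False),
]

truth = [label for _, label in toy]
targetp_acc = accuracy([s.targetp_secreted for s, _ in toy], truth)
composite_acc = accuracy([composite_call(s) for s, _ in toy], truth)

print(f"TargetP alone: {targetp_acc:.2f}")
print(f"Composite:     {composite_acc:.2f}")
```

In this toy set the composite call corrects the one protein that TargetP alone misclassifies, which is the kind of gain the abstract describes; the size of the gain on real data depends on the combination rule and cutoffs actually used.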

Ellis, Lynda BM

Klee, Eric W

English

PubMed


Directory of Open Access Journals

BMC Bioinformatics

Evaluating eukaryotic secreted protein prediction


Springer - Publisher Connector

Additional file 1: Protein test set. 372 eukaryotic Swiss-Prot protein sequences used to evaluate the six servers, in FASTA format.

