Esub8: A novel tool to predict protein subcellular localizations in eukaryotic organisms

Abstract

BACKGROUND: Subcellular localization of a new protein sequence is very important and fruitful for understanding its function. As the number of new genomes has dramatically increased over recent years, a reliable and efficient system to predict protein subcellular location is urgently needed. RESULTS: Esub8 was developed to predict protein subcellular localizations for eukaryotic proteins based on amino acid composition. In this research, the proteins are classified into the following eight groups: chloroplast, cytoplasm, extracellular, Golgi apparatus, lysosome, mitochondria, nucleus and peroxisome. We know subcellular localization is a typical classification problem; consequently, a one-against-one (1-v-1) multi-class support vector machine was introduced to construct the classifier. Unlike previous methods, ours considers the order information of protein sequences by a different method. Our method is tested in three subcellular localization predictions for prokaryotic proteins and four subcellular localization predictions for eukaryotic proteins on Reinhardt's dataset. The results are then compared to several other methods. The total prediction accuracies of two tests are both 100% by a self-consistency test, and are 92.9% and 84.14% by the jackknife test, respectively. Esub8 also provides excellent results: the total prediction accuracies are 100% by a self-consistency test and 87% by the jackknife test. CONCLUSIONS: Our method represents a different approach for predicting protein subcellular localization and achieved a satisfactory result; furthermore, we believe Esub8 will be a useful tool for predicting protein subcellular localizations in eukaryotic organisms

    Similar works