Update of PROFEAT: a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequence

By H. B. Rao, F. Zhu, G. B. Yang, Z. R. Li and Y. Z. Chen


Sequence-derived structural and physicochemical features have been extensively used for analyzing and predicting structural, functional, expression and interaction profiles of proteins and peptides. PROFEAT was developed as a web server for computing commonly used features of proteins and peptides from amino acid sequence. To facilitate more extensive studies of proteins and peptides, numerous improvements and updates have been made to PROFEAT. We added new functions for computing descriptors of protein–protein and protein–small molecule interactions, segment descriptors for local properties of protein sequences, and topological descriptors for peptide sequences and small molecule structures. We also added new feature groups for proteins and peptides (pseudo-amino acid composition, amphiphilic pseudo-amino acid composition, total amino acid properties and atomic-level topological descriptors) as well as for small molecules (atomic-level topological descriptors). Overall, PROFEAT computes 11 feature groups of descriptors for proteins and peptides, and a feature group of more than 400 descriptors for small molecules, plus the derived features for protein–protein and protein–small molecule interactions. Our computational algorithms have been extensively tested and used in a number of published works for predicting proteins of specific structural or functional classes, protein–protein interactions, peptides of specific functions and quantitative structure–activity relationships of small molecules. PROFEAT is accessible free of charge at
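As a rough illustration of what a sequence-derived feature looks like, the sketch below computes amino acid composition, the fraction of each of the 20 standard residues in a sequence. It is not PROFEAT's code and does not reproduce any of its 11 feature groups exactly. The function name `amino_acid_composition` and the choice to ignore non-standard residue codes are assumptions made for this example.

```python
from collections import Counter
from typing import Dict

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> Dict[str, float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Hypothetical helper for illustration only; characters outside the
    20 standard one-letter codes (e.g. 'X', gaps) are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acid residues")
    # Counter returns 0 for residues that never occur, so every one of
    # the 20 residues gets an entry and the vector has a fixed length.
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


if __name__ == "__main__":
    composition = amino_acid_composition("MKTAYIAKQR")
    for residue, fraction in composition.items():
        if fraction > 0:
            print(f"{residue}: {fraction:.2f}")
```

The result is a fixed-length, 20-element vector regardless of sequence length, which is the property that makes descriptors of this kind usable as input to the machine learning methods mentioned above.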

Topics: Articles
Publisher: Oxford University Press
Provided by: PubMed Central

