Protein Secondary Structure Prediction Based on Physicochemical Features and PSSM by KNN
In this paper, we propose a protein secondary structure prediction method based on the k-nearest neighbor (KNN) technique, using position-specific scoring matrix (PSSM) profiles, a propensity matrix of amino acids in the three conformations (H, E, and C), and three physicochemical features: hydrophobicity, net charge, and side-chain mass. First, the optimal k-value for the KNN is determined. Then, for each amino acid of a query protein, the Euclidean distance between its 26-dimensional feature vector and the feature vectors of the amino acids of all other proteins is computed. The conformations of the seven nearest amino acids are pooled, and the conformation that receives the majority of the pooled votes (H, E, or C) is assigned to the amino acid of the query protein. Finally, we use a filter to refine the KNN predictions. After filtering, the prediction accuracy reaches about 90% for some proteins. This indicates that combining PSSM, the propensity matrix, and physicochemical features may yield better performance.
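The voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy reference data, and the tie-breaking behavior are assumptions made for illustration only. The 26-dimensional vectors here are synthetic, and the split into 20 PSSM scores, 3 propensities, and 3 physicochemical features is an assumed reading of the abstract. The refinement filter is omitted because the abstract does not specify it.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(query, reference, k=7):
    """Assign H, E, or C to one residue by majority vote of its k nearest neighbors.

    reference is a list of (feature_vector, label) pairs taken from residues
    of other proteins (hypothetical layout, not taken from the paper).
    """
    nearest = sorted(reference, key=lambda item: euclidean(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, Counter.most_common returns the label encountered first,
    # i.e. the one whose earliest vote came from the closest neighbor.
    return votes.most_common(1)[0][0]


# Synthetic 26-dimensional reference residues: three loose clusters,
# one per conformation. Real vectors would come from PSSM profiles etc.
reference = (
    [([0.0 + 0.01 * i] * 26, "H") for i in range(5)]
    + [([1.0 + 0.01 * i] * 26, "E") for i in range(5)]
    + [([2.0 + 0.01 * i] * 26, "C") for i in range(5)]
)

query_residue = [0.95] * 26
print(knn_predict(query_residue, reference))
```

With k = 7, the query residue above has five E neighbors and two H neighbors, so the vote goes to E. In practice each residue of the query protein would be classified this way in turn, and k would be tuned on held-out data rather than fixed in advance.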