On Predicting Secondary Structure Transition

Publication date: 02/04/2008

The function of a protein depends on its structure; therefore, predicting a protein's structure from its amino acid sequence is an active area of research. Optimally predicting a structure from a sequence is an NP-hard problem, so several sub-optimal algorithms based on heuristics have been used to solve it. When a structure is predicted by an approximate algorithm, it must be validated, and such validation invariably involves validating the secondary structure using the predicted locations of all the residues. To improve the accuracy of secondary structure validation, we are studying the predictability of secondary structure transitions using the following machine learning algorithms: naïve Bayes, C4.5 decision tree, and random forest. The outcome of any machine learning algorithm depends on the quality of the training set, which should therefore be as free from errors and noise as possible. A completely error-free training data set is not possible to construct, but we have created a data set by filtering out possible errors, which are indicated by disagreement among secondary structure assignments or by inconsistency with the annotations in PDB, DSSP and STRIDE. We have demonstrated that predicting structure transitions with a high degree of certainty is possible, and we obtained prediction accuracy as high as 97.5%.
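The filtering idea described above can be illustrated with a minimal sketch. The abstract does not give the actual procedure, so everything here is an assumption: the three-state alphabet (H, E, C), the function names `consensus` and `transition_labels`, and the example strings are all hypothetical. The sketch keeps a residue only when all three annotation sources agree, then labels each adjacent residue pair as a transition or a non-transition.

```python
from typing import List, Optional


def consensus(assignments: List[str]) -> List[Optional[str]]:
    """Per-residue consensus: the shared label if every source agrees, else None.

    Assumes each string holds one secondary structure label per residue
    (a hypothetical three-state encoding: H = helix, E = strand, C = coil).
    """
    lengths = {len(a) for a in assignments}
    if len(lengths) != 1:
        raise ValueError("all assignment strings must have the same length")
    return [col[0] if len(set(col)) == 1 else None for col in zip(*assignments)]


def transition_labels(labels: List[Optional[str]]) -> List[Optional[bool]]:
    """Label each adjacent residue pair.

    True  -> the secondary structure state changes between the two residues
    False -> the state stays the same
    None  -> at least one residue was filtered out, so the pair is discarded
    """
    out: List[Optional[bool]] = []
    for a, b in zip(labels, labels[1:]):
        out.append(None if a is None or b is None else a != b)
    return out


# Made-up example strings standing in for PDB, DSSP and STRIDE annotations.
pdb = "CCHHHHCCEEEC"
dssp = "CCHHHHCCEEEC"
stride = "CCHHHHCEEEEC"  # disagrees with the other two at index 7

agreed = consensus([pdb, dssp, stride])
pairs = transition_labels(agreed)
kept = [p for p in pairs if p is not None]
print(f"{len(kept)} usable pairs, {sum(kept)} transitions")
```

The surviving pairs could then be turned into feature vectors for a classifier such as naïve Bayes or a random forest; the features used in the study are not stated in the abstract, so that step is left out.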

CiteSeerX