Support Vector Machine-based Fuzzy Systems for Quantitative Prediction of Peptide Binding Affinity
Reliable prediction of peptide binding affinity is one of the most challenging yet important modelling problems in the post-genome era,
owing to the diversity and functionality of the peptides discovered. Peptide binding prediction models
are commonly used to determine whether a given peptide binds to a major histocompatibility complex (MHC) molecule.
Recent research efforts have focused on quantifying these binding predictions.
The objective of this thesis is to develop reliable real-valued predictive models through the use of fuzzy systems.
A non-linear system is proposed in which support vector-based regression is used to improve the fuzzy system, and it is applied
to the real-valued prediction of the degree of peptide binding.
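As an illustration of how a fuzzy system can produce a real-valued output, the sketch below assumes a Takagi-Sugeno-Kang (TSK) style type-1 system with Gaussian premise membership functions and linear consequents. The abstract does not state the exact rule form, so the function name `tsk_predict`, the rule dictionary layout and every numeric value here are hypothetical. In the thesis, support vector-based regression is used to improve such a system, whereas in this sketch the parameters are simply set by hand.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership degree of x for a fuzzy set (center, sigma)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def tsk_predict(x, rules):
    """Return the weighted-average output of a type-1 TSK rule base.

    x     : list of input feature values
    rules : list of dicts with keys 'centers', 'sigmas' (premise part)
            and 'coeffs', 'bias' (linear consequent part)
    """
    numerator = 0.0
    denominator = 0.0
    for rule in rules:
        # Firing strength: product of the membership degrees of all inputs.
        strength = 1.0
        for xi, c, s in zip(x, rule["centers"], rule["sigmas"]):
            strength *= gaussian(xi, c, s)
        # Linear consequent: bias + sum(coeff_i * x_i).
        consequent = rule["bias"] + sum(a * xi for a, xi in zip(rule["coeffs"], x))
        numerator += strength * consequent
        denominator += strength
    return numerator / denominator


# Hypothetical two-rule system over a single input feature.
example_rules = [
    {"centers": [0.0], "sigmas": [1.0], "coeffs": [0.0], "bias": 1.0},
    {"centers": [2.0], "sigmas": [1.0], "coeffs": [0.0], "bias": 3.0},
]
prediction = tsk_predict([1.0], example_rules)
```

At the midpoint between the two rule centres both rules fire equally, so the prediction is the plain average of the two consequent values.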
This research study introduces two novel methods to improve the structure and parameter identification of fuzzy systems.
First, support vector-based regression is used to identify the initial parameter values of the consequent part of type-1 and
interval type-2 fuzzy systems. Second, an overlapping clustering concept is used to derive the interval-valued parameters of the premise part of the type-2 fuzzy system.
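To make the notion of interval-valued premise parameters concrete, the following sketch shows one common form of interval type-2 membership function: a Gaussian with a fixed centre and an uncertain width. It is a generic illustration, not the thesis's clustering procedure. The function name `it2_gaussian` is hypothetical, and the two widths are assumed to come from some pair of overlapping cluster estimates that the abstract does not specify.

```python
import math


def it2_gaussian(x, center, sigma_small, sigma_large):
    """Return the (lower, upper) membership interval of x.

    The narrower Gaussian bounds the membership from below and the wider
    one bounds it from above, so lower <= upper holds for every x.
    """
    if not 0 < sigma_small <= sigma_large:
        raise ValueError("expected 0 < sigma_small <= sigma_large")
    lower = math.exp(-0.5 * ((x - center) / sigma_small) ** 2)
    upper = math.exp(-0.5 * ((x - center) / sigma_large) ** 2)
    return lower, upper


# Hypothetical widths, e.g. taken from two overlapping cluster estimates.
lower, upper = it2_gaussian(1.0, center=0.0, sigma_small=0.5, sigma_large=1.0)
```

The gap between the two bounds is the footprint of uncertainty. It shrinks to zero when both widths coincide, which recovers an ordinary type-1 membership function.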
Publicly available peptide binding affinity data sets obtained from the literature are used in the
experimental studies of this thesis. First, the proposed models are blind validated using the peptide binding affinity
data sets obtained from a modelling competition. In that competition, a nearly equal number of
peptide sequences in the training and testing data sets
(89, 76, 133 and 133 peptides for training and 88, 76, 133 and 47 peptides for testing) was provided to the participants.
Each amino acid of a peptide in these data sets is
represented by 643 biochemical descriptors.
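The descriptor-based representation can be pictured as concatenating one descriptor vector per residue, so a peptide of length n yields n times 643 input features. The sketch below uses a toy table with two descriptors per amino acid instead of 643. The table `TOY_DESCRIPTORS`, its placeholder values and the function `encode_peptide` are hypothetical and are not taken from the competition data.

```python
def encode_peptide(peptide, descriptors):
    """Concatenate the descriptor vector of each residue, in sequence order."""
    features = []
    for residue in peptide:
        features.extend(descriptors[residue])
    return features


# Placeholder values only: two made-up descriptors per amino acid.
TOY_DESCRIPTORS = {
    "A": [0.1, 0.2],
    "L": [0.3, 0.4],
    "K": [0.5, 0.6],
}

encoded = encode_peptide("ALK", TOY_DESCRIPTORS)
```

With three residues and two descriptors each, the encoded vector has six entries. With the full descriptor set the same scheme gives a very high-dimensional input relative to the number of peptides.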
Second, the proposed models are cross validated using data sets for three mouse class I MHC alleles (H2-Db, H2-Kb and H2-Kk). The H2-Db, H2-Kb and H2-Kk data sets consist of
65 nonapeptides, 62 octapeptides and 154 octapeptides, respectively. Compared with previously published results in the literature,
the support vector-based type-1 and support vector-based interval type-2 fuzzy models yield an improvement in prediction accuracy.
The quantitative predictive performance improved
by as much as 33.6% for the first group of data sets and 1.32% for the
second group of data sets.
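The cross-validation used for the mouse MHC data sets could be set up along the following lines. The abstract does not state the number of folds or how the peptides were shuffled, so the choice of five folds, the fixed seed and the function name `k_fold_indices` are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# 65 samples matches the size of the H2-Db set; k=5 is an assumed value.
splits = list(k_fold_indices(65, 5))
```

Each peptide appears in exactly one test fold, so every prediction used for scoring comes from a model that did not see that peptide during training.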
The proposed models not only improved the performance of the fuzzy system through the use of support vector-based regression;
the support vector-based regression also benefited from the fuzzy concept.
The results obtained here set the platform for the presented models to be considered for other application domains in computational and/or systems biology.
Apart from improving prediction accuracy, this research study has also identified specific features that play a key role in making
reliable peptide binding affinity predictions. The amino acid features "Polarity", "Positive charge", "Hydrophobicity coefficient" and "Zimm-Bragg parameter" are
considered highly discriminating features in the peptide binding affinity data sets.
This information can be valuable in the design of peptides with strong binding affinity to MHC class I molecules. It may also be useful
when designing drugs and vaccines.