Motivation: In silico methods for the prediction of antigenic peptides
binding to MHC class I molecules play an increasingly important role in the
identification of T-cell epitopes. Statistical and machine learning methods, in
particular, are widely used to score candidate epitopes based on their
similarity to known epitopes and non-epitopes. The genes coding for the MHC
molecules, however, are highly polymorphic, and statistical methods have
difficulty building models for alleles with few known epitopes. In this case,
recent work has demonstrated the utility of leveraging information across
alleles to improve prediction performance.

Results: We design a
support vector machine algorithm that is able to learn epitope models for all
alleles simultaneously, by sharing information across similar alleles. The
sharing of information across alleles is controlled by a user-defined measure
of similarity between alleles. We show that this similarity can be defined in
terms of supertypes, or more directly by comparing key residues known to play a
role in peptide-MHC binding. We illustrate the potential of this approach
on various benchmark experiments, where it outperforms other state-of-the-art
methods.
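
One way to picture how a similarity measure between alleles can control the sharing of information is a kernel on (peptide, allele) pairs that multiplies a peptide kernel by an allele kernel. The sketch below is illustrative only and is not taken from the text above: the function names, the position-matching peptide kernel, the `shared` weight, and the toy alleles, supertypes, and peptides are all assumptions made for the example.

```python
# Hypothetical sketch: a product kernel over (peptide, allele) pairs.
# All names, weights, and data below are illustrative assumptions.

def peptide_kernel(p, q):
    """Fraction of positions at which two equal-length peptides agree."""
    if len(p) != len(q):
        raise ValueError("peptides must have the same length")
    return sum(a == b for a, b in zip(p, q)) / len(p)

def supertype_allele_kernel(a, b, supertype, shared=0.5):
    """Allele similarity: 1 if identical, `shared` if same supertype, else 0."""
    if a == b:
        return 1.0
    if supertype[a] == supertype[b]:
        return shared
    return 0.0

def pair_kernel(x, y, supertype):
    """Similarity of two (peptide, allele) training examples."""
    (p, a), (q, b) = x, y
    return peptide_kernel(p, q) * supertype_allele_kernel(a, b, supertype)

def gram_matrix(samples, supertype):
    """Full kernel matrix over a list of (peptide, allele) examples."""
    return [[pair_kernel(x, y, supertype) for y in samples] for x in samples]

# Toy data: placeholder allele names, supertype labels, and peptides.
supertype = {"allele_1": "S1", "allele_2": "S1", "allele_3": "S2"}
samples = [
    ("ACDEFGHIK", "allele_1"),
    ("ACDEFGHIK", "allele_2"),  # same peptide, same supertype
    ("ACDEFGHIK", "allele_3"),  # same peptide, different supertype
    ("ACDEFGHIL", "allele_1"),  # one residue differs, same allele
]
K = gram_matrix(samples, supertype)
```

With an allele kernel of this shape, examples from alleles in different supertypes contribute nothing to each other, while examples from alleles in the same supertype are partially shared. A matrix such as `K` could then be handed to any SVM solver that accepts a precomputed kernel; replacing `supertype_allele_kernel` with a comparison of key binding residues would give the second kind of similarity mentioned above.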