24 research outputs found

    Prediction of the binding affinities of peptides to class II MHC using a regularized thermodynamic model

    Background: The binding of peptide fragments of extracellular peptides to class II MHC is a crucial event in the adaptive immune response. Each MHC allotype generally binds a distinct subset of peptides and the enormous number of possible peptide epitopes prevents their complete experimental characterization. Computational methods can utilize the limited experimental data to predict the binding affinities of peptides to class II MHC. Results: We have developed the Regularized Thermodynamic Average, or RTA, method for predicting the affinities of peptides binding to class II MHC. RTA accounts for all possible peptide binding conformations using a thermodynamic average and includes a parameter constraint for regularization to improve accuracy on novel data. RTA was shown to achieve higher accuracy, as measured by AUC, than SMM-align on the same data for all 17 MHC allotypes examined. RTA also gave the highest accuracy on all but three allotypes when compared with results from 9 different prediction methods applied to the same data. In addition, the method correctly predicted the peptide binding register of 17 out of 18 peptide-MHC complexes. Finally, we found that suboptimal peptide binding registers, which are often ignored in other prediction methods, made significant contributions of at least 50% of the total binding energy for approximately 20% of the peptides. Conclusions: The RTA method accurately predicts peptide binding affinities to class II MHC and accounts for multiple peptide binding registers while reducing overfitting through regularization. The method has potential applications in vaccine design and in understanding autoimmune disorders. A web server implementing the RTA prediction method is available at http://bordnerlab.org/RTA/.
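    The idea of averaging over every binding register can be illustrated with a minimal sketch. It is not the published RTA model: the Boltzmann-weighted form, the additive per-position energy, the 9-residue core length, and the names `registers`, `register_energy`, `thermodynamic_average`, and `weights` are all assumptions made for illustration, and the regularization term mentioned in the abstract is not shown.

```python
import math

RT = 0.593  # kcal/mol near 298 K; assumed units for this illustration


def registers(peptide, core_len=9):
    """Return every contiguous core of length core_len in the peptide."""
    return [peptide[i:i + core_len] for i in range(len(peptide) - core_len + 1)]


def register_energy(core, weights):
    """Hypothetical additive energy: sum of (position, residue) terms, 0.0 if absent."""
    return sum(weights.get((pos, aa), 0.0) for pos, aa in enumerate(core))


def thermodynamic_average(peptide, weights, core_len=9):
    """Combine all registers as -RT * ln(sum_k exp(-E_k / RT))."""
    energies = [register_energy(c, weights) for c in registers(peptide, core_len)]
    if not energies:
        raise ValueError("peptide is shorter than the binding core")
    e_min = min(energies)  # shift by the minimum for numerical stability
    z = sum(math.exp(-(e - e_min) / RT) for e in energies)
    return e_min - RT * math.log(z)
```

    Under these assumptions the combined energy is never higher than that of the best single register, and suboptimal registers lower it further, which is the kind of contribution the abstract describes.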

    MultiRTA: A simple yet reliable method for predicting peptide binding affinities for multiple class II MHC allotypes

    abstract: Background The binding of peptide fragments of antigens to class II MHC is a crucial step in initiating a helper T cell immune response. The identification of such peptide epitopes has potential applications in vaccine design and in better understanding autoimmune diseases and allergies. However, comprehensive experimental determination of peptide-MHC binding affinities is infeasible due to MHC diversity and the large number of possible peptide sequences. Computational methods trained on the limited experimental binding data can address this challenge. We present the MultiRTA method, an extension of our previous single-type RTA prediction method, which allows the prediction of peptide binding affinities for multiple MHC allotypes not used to train the model. Thus predictions can be made for many MHC allotypes for which experimental binding data is unavailable. Results We fit MultiRTA models for both HLA-DR and HLA-DP using large experimental binding data sets. The performance in predicting binding affinities for novel MHC allotypes, not in the training set, was tested in two different ways. First, we performed leave-one-allele-out cross-validation, in which predictions are made for one allotype using a model fit to binding data for the remaining MHC allotypes. Comparison of the HLA-DR results with those of two other prediction methods applied to the same data sets showed that MultiRTA achieved performance comparable to NetMHCIIpan and better than the earlier TEPITOPE method. We also directly tested model transferability by making leave-one-allele-out predictions for additional experimentally characterized sets of overlapping peptide epitopes binding to multiple MHC allotypes. In addition, we determined the applicability of prediction methods like MultiRTA to other MHC allotypes by examining the degree of MHC variation accounted for in the training set. 
An examination of predictions for the promiscuous binding CLIP peptide revealed variations in binding affinity among alleles as well as potentially distinct binding registers for HLA-DR and HLA-DP. Finally, we analyzed the optimal MultiRTA parameters to discover the most important peptide residues for promiscuous and allele-specific binding to HLA-DR and HLA-DP allotypes. Conclusions The MultiRTA method yields competitive performance but with a significantly simpler and physically interpretable model compared with previous prediction methods. A MultiRTA prediction webserver is available at http://bordnerlab.org/MultiRTA. The electronic version of this article is the complete one and can be found online at: http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-48
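    The leave-one-allele-out protocol described in this abstract can be sketched as a simple data split. The record layout `(allele, peptide, affinity)` and the function name `leave_one_allele_out` are assumptions for illustration; the abstract does not specify how the authors organized their data.

```python
from collections import defaultdict


def leave_one_allele_out(records):
    """Yield (held_out_allele, train, test) for each allele in the data.

    records: iterable of (allele, peptide, affinity) tuples (assumed layout).
    """
    by_allele = defaultdict(list)
    for rec in records:
        by_allele[rec[0]].append(rec)
    for held_out in sorted(by_allele):
        # Train on every allele except the held-out one.
        train = [r for allele, recs in by_allele.items()
                 if allele != held_out for r in recs]
        yield held_out, train, by_allele[held_out]
```

    Each split would then be used to fit a model on `train` and score it on the held-out allele, so that the reported performance reflects allotypes the model has never seen.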

    Predicting Peptide Binding Affinities to MHC Molecules Using a Modified Semi-Empirical Scoring Function

    The Major Histocompatibility Complex (MHC) plays an important role in the human immune system. The MHC is involved in the antigen presentation system assisting T cells to identify foreign or pathogenic proteins. However, an MHC molecule binding a self-peptide may incorrectly trigger an immune response and cause an autoimmune disease, such as multiple sclerosis. Understanding the molecular mechanism of this process will greatly assist in determining the aetiology of various diseases and in the design of effective drugs. In the present study, we have used the Fresno semi-empirical scoring function and modified the approach to the prediction of peptide-MHC binding by using open-source and public domain software. We apply the method to HLA class II alleles DR15, DR1, and DR4, and the HLA class I allele HLA A2. Our analysis shows that using a large set of binding data and multiple crystal structures improves the predictive capability of the method. The performance of the method is also shown to be correlated with the structural similarity of the crystal structures used. We have exposed some of the obstacles faced by structure-based prediction methods and proposed possible solutions to those obstacles. It is envisaged that these obstacles need to be addressed before the performance of structure-based methods can be on par with the sequence-based methods.

    Peptide binding prediction for the human class II MHC allele HLA-DP2: a molecular docking approach

    MHC class II proteins bind oligopeptide fragments derived from proteolysis of pathogen antigens, presenting them at the cell surface for recognition by CD4+ T cells. Human MHC class II alleles are grouped into three loci: HLA-DP, HLA-DQ and HLA-DR. In contrast to HLA-DR and HLA-DQ, HLA-DP proteins have not been studied extensively, as they have been viewed as less important in immune responses than DRs and DQs. However, it is now known that HLA-DP alleles are associated with many autoimmune diseases. Quite recently, the X-ray structure of the HLA-DP2 molecule (DPA*0103, DPB1*0201) in complex with a self-peptide derived from the HLA-DR α-chain has been determined. In the present study, we applied a validated molecular docking protocol to a library of 247 modelled peptide-DP2 complexes, seeking to assess the contribution made by each of the 20 naturally occurring amino acids at each of the nine binding core peptide positions and the four flanking residues (two on each side).

    Towards Universal Structure-Based Prediction of Class II MHC Epitopes for Diverse Allotypes

    The binding of peptide fragments of antigens to class II MHC proteins is a crucial step in initiating a helper T cell immune response. The discovery of these peptide epitopes is important for understanding the normal immune response and its misregulation in autoimmunity and allergies and also for vaccine design. In spite of their biomedical importance, the high diversity of class II MHC proteins combined with the large number of possible peptide sequences make comprehensive experimental determination of epitopes for all MHC allotypes infeasible. Computational methods can address this need by predicting epitopes for a particular MHC allotype. We present a structure-based method for predicting class II epitopes that combines molecular mechanics docking of a fully flexible peptide into the MHC binding cleft followed by binding affinity prediction using a machine learning classifier trained on interaction energy components calculated from the docking solution. Although the primary advantage of structure-based prediction methods over the commonly employed sequence-based methods is their applicability to essentially any MHC allotype, this has not yet been convincingly demonstrated. In order to test the transferability of the prediction method to different MHC proteins, we trained the scoring method on binding data for DRB1*0101 and used it to make predictions for multiple MHC allotypes with distinct peptide binding specificities including representatives from the other human class II MHC loci, HLA-DP and HLA-DQ, as well as for two murine allotypes. The results showed that the prediction method was able to achieve significant discrimination between epitope and non-epitope peptides for all MHC allotypes examined, based on AUC values in the range 0.632–0.821. 
We also discuss how accounting for peptide binding in multiple registers to class II MHC largely explains the systematically worse performance of prediction methods for class II MHC compared with those for class I MHC based on quantitative prediction performance estimates for peptide binding to class II MHC in a fixed register.

    Learning a peptide-protein binding affinity predictor with kernel ridge regression

    We propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. The kernel incorporates physico-chemical properties of amino acids and elegantly generalizes eight kernels, such as the Oligo, the Weighted Degree, the Blended Spectrum, and the Radial Basis Function. We provide a low complexity dynamic programming algorithm for the exact computation of the kernel and a linear time algorithm for its approximation. Combined with kernel ridge regression and SupCK, a novel binding pocket kernel, the proposed kernel yields biologically relevant and good prediction accuracy on the PepX database. For the first time, a machine learning predictor is capable of accurately predicting the binding affinity of any peptide to any protein. The method was also applied to both single-target and pan-specific Major Histocompatibility Complex class II benchmark datasets and three Quantitative Structure Affinity Model benchmark datasets. On all benchmarks, our method significantly (p-value < 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. The proposed approach is flexible and can be applied to predict any quantitative biological activity. The method should be of value to a large segment of the research community with the potential to accelerate peptide-based drug and vaccine development. Comment: 22 pages, 4 figures, 5 tables
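    As a point of reference for the kernels named in this abstract, the sketch below implements a plain, unweighted Blended Spectrum kernel, one of the kernels the proposed kernel is said to generalize. It is not the authors' kernel: it ignores physico-chemical properties and substring positions, and the function names and the `max_k` parameter are illustrative assumptions.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def blended_spectrum_kernel(x, y, max_k=2):
    """Sum, over k = 1..max_k, of the products of shared k-mer counts."""
    total = 0
    for k in range(1, max_k + 1):
        cx, cy = kmer_counts(x, k), kmer_counts(y, k)
        # Only substrings present in both sequences contribute.
        total += sum(n * cy[s] for s, n in cx.items())
    return total
```

    A kernel of this kind yields a similarity matrix between peptides, which a method such as kernel ridge regression can then use to predict a quantitative target like binding affinity.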

    Host genotype and time dependent antigen presentation of viral peptides: predictions from theory

    The rate of progression of HIV infected individuals to AIDS is known to vary with the genotype of the host, and is linked to their allele of human leukocyte antigen (HLA) proteins, which present protein degradation products at the cell surface to circulating T-cells. HLA alleles are associated with Gag-specific T-cell responses that are protective against progression of the disease. While Pol is the most conserved HIV sequence, its association with immune control is not as strong. To gain a more thorough quantitative understanding of the factors that contribute to immunodominance, we have constructed a model of the recognition of HIV infection by the MHC class I pathway. Our model predicts surface presentation of HIV peptides over time, demonstrates the importance of viral protein kinetics, and provides evidence of the importance of Gag peptides in the long-term control of HIV infection. Furthermore, short-term dynamics are also predicted, with simulation of virion-derived peptides suggesting that efficient processing of Gag can lead to a 50% probability of presentation within 3 hours post-infection, as observed experimentally. In conjunction with epitope prediction algorithms, this modelling approach could be used to refine experimental targets for potential T-cell vaccines, both for HIV and other viruses.

    Algorithmes d'apprentissage automatique pour la conception de composés pharmaceutiques et de vaccins [Machine learning algorithms for the design of pharmaceutical compounds and vaccines]

    The discovery of pharmaceutical compounds is currently too time-consuming and too expensive, and the failure rate is too high. Biochemical and genomic databases continue to grow and it is now impracticable to interpret these data. A radical change is needed; some steps in this process must be automated. Peptides are molecules that play an important role in the immune system and in cell signaling. Their favorable properties make them prime candidates for initiating the design of new drugs and assisting in the design of vaccines. In addition, modern synthesis techniques can quickly generate these molecules at low cost. Statistical learning algorithms are well suited to managing large amounts of data and to learning models in an automated fashion. These methods and peptides thus offer a solution of choice to the challenges facing pharmaceutical research. We propose a kernel for learning statistical models of biochemical phenomena involving peptides. Among other things, this kernel makes it possible to learn a universal model that can reasonably quantify the binding energy between any peptide sequence and any binding site of a protein. In addition, it unifies the theory of many existing string kernels while maintaining a low computational complexity. This kernel is particularly suitable for quantifying the interaction between antigens and proteins of the major histocompatibility complex. We provide a tool to predict peptides that are likely to be processed by the antigen presentation pathway. This tool has won an international competition and has several applications in immunology, including vaccine design. Ultimately, a peptide should maximize the interaction with a target protein or maximize bioactivity in the host. We formalize this problem as a structured prediction problem. Then, we propose an algorithm exploiting the longest paths in a graph to identify peptides maximizing the bioactivity predicted by a previously learned model. We validate this new approach in the laboratory with the discovery of new antimicrobial peptides. Finally, we provide PAC-Bayes bounds for two structured prediction algorithms, one of which is new.