
    Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method

    BACKGROUND: Antigen presenting cells (APCs) sample the extracellular space and present peptides from it to T helper cells, which can be activated if the peptides are of foreign origin. The peptides are presented on the surface of the cells in complex with major histocompatibility class II (MHC II) molecules. Identification of peptides that bind MHC II molecules is thus a key step in rational vaccine design, and developing methods for accurate prediction of the peptide:MHC interactions plays a central role in epitope discovery. The MHC class II binding groove is open at both ends, making the correct alignment of a peptide in the binding groove a crucial part of identifying the core of an MHC class II binding motif. Here, we present a novel stabilization matrix alignment method, SMM-align, that allows for direct prediction of peptide:MHC binding affinities. The predictive performance of the method is validated on a large MHC class II benchmark data set covering 14 HLA-DR (human MHC) and three mouse H2-IA alleles. RESULTS: The predictive performance of the SMM-align method was demonstrated to be superior to that of the Gibbs sampler, TEPITOPE, SVRMHC, and MHCpred methods. Cross validation between peptide data sets obtained from different sources demonstrated that direct incorporation of peptide length potentially results in over-fitting of the binding prediction method. Focusing on amino terminal peptide flanking residues (PFR), we demonstrate a consistent gain in predictive performance by favoring binding registers with a minimum PFR length of two amino acids. Visualizing the binding motif as obtained by the SMM-align and TEPITOPE methods highlights a series of fundamental discrepancies between the two predicted motifs. For the DRB1*1302 allele, for instance, the TEPITOPE method favors basic amino acids at most anchor positions, whereas the SMM-align method identifies a preference for hydrophobic or neutral amino acids at the anchors. CONCLUSION: The SMM-align method was shown to outperform other state-of-the-art MHC class II prediction methods. The method predicts quantitative peptide:MHC binding affinity values, making it ideally suited for rational epitope discovery. The method has been trained and evaluated on what is, to our knowledge, the largest publicly available benchmark data set, and covers the nine HLA-DR supertypes suggested as well as three mouse H2-IA alleles. Both the peptide benchmark data set and the SMM-align prediction method (NetMHCII) are made publicly available.
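The register-alignment idea in the abstract above can be illustrated with a minimal sketch. It is not the SMM-align training procedure: it only scores every 9-residue window of a peptide against a position-specific matrix and keeps the best one. The function names, the toy matrix, the "higher score is better" convention, and the `short_pfr_penalty` parameter (a simple way to favor registers with at least `min_nterm_pfr` N-terminal flanking residues) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

CORE_LEN = 9  # assumed core length; class II binding cores are conventionally 9 residues


def score_core(core: str, matrix: List[Dict[str, float]]) -> float:
    """Sum the per-position matrix weights for one candidate core."""
    return sum(matrix[pos].get(aa, 0.0) for pos, aa in enumerate(core))


def best_register(
    peptide: str,
    matrix: List[Dict[str, float]],
    min_nterm_pfr: int = 0,
    short_pfr_penalty: float = 0.0,
) -> Tuple[float, int, str]:
    """Return (score, offset, core) for the best-scoring binding register.

    The offset of a core equals the length of its N-terminal flanking region.
    Registers whose flank is shorter than min_nterm_pfr are penalized
    (hypothetical scheme), not excluded.
    """
    if len(peptide) < CORE_LEN:
        raise ValueError("peptide is shorter than the binding core")
    best = None
    for offset in range(len(peptide) - CORE_LEN + 1):
        core = peptide[offset:offset + CORE_LEN]
        score = score_core(core, matrix)
        if offset < min_nterm_pfr:
            score -= short_pfr_penalty
        if best is None or score > best[0]:
            best = (score, offset, core)
    return best
```

Because the groove is open at both ends, the same peptide can be read in several registers; scanning all offsets is what lets a fixed-length matrix handle variable-length peptides.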

    Predicting Class II MHC-Peptide binding: a kernel based approach using similarity scores

    BACKGROUND: Modelling the interaction between potentially antigenic peptides and Major Histocompatibility Complex (MHC) molecules is a key step in identifying potential T-cell epitopes. For Class II MHC alleles, the binding groove is open at both ends, causing ambiguity in the positional alignment between the groove and peptide, as well as creating uncertainty as to what parts of the peptide interact with the MHC. Moreover, the antigenic peptides have variable lengths, making naive modelling methods difficult to apply. This paper introduces a kernel method that can handle variable length peptides effectively by quantifying similarities between peptide sequences and integrating these into the kernel. RESULTS: The kernel approach presented here shows increased prediction accuracy with a significantly higher number of true positives and negatives on multiple MHC class II alleles, when testing data sets from MHCPEP [1], MHCBN [2], and MHCBench [3]. Evaluation by cross validation, when segregating binders and non-binders, produced an average of 0.824 A(ROC) for the MHCBench data sets (up from 0.756), and an average of 0.96 A(ROC) for multiple alleles of the MHCPEP database. CONCLUSION: The method improves performance over existing state-of-the-art methods of MHC class II peptide binding predictions by using a custom, knowledge-based representation of peptides. Similarity scores, in contrast to a fixed-length, pocket-specific representation of amino acids, provide a flexible and powerful way of modelling MHC binding, and can easily be applied to other dynamic sequence problems.

    Modeling the bound conformation of Pemphigus Vulgaris-associated peptides to MHC Class II DR and DQ Alleles

    BACKGROUND: Pemphigus vulgaris (PV) is a severe autoimmune blistering disorder characterized by the presence of pathogenic autoantibodies directed against desmoglein-3 (Dsg3), involving specific DR4 and DR6 alleles in Caucasians and DQ5 allele in Asians. The development of sequence-based predictive algorithms to identify potential Dsg3 epitopes has encountered limited success due to the paucity of PV-associated allele-specific peptides as training data. RESULTS: In this work we constructed atomic models of ten PV associated, non-associated and protective alleles. Nine previously identified stimulatory Dsg3 peptides, Dsg3 96–112, Dsg3 191–205, Dsg3 206–220, Dsg3 252–266, Dsg3 342–356, Dsg3 380–394, Dsg3 763–777, Dsg3 810–824 and Dsg3 963–977, were docked into the binding groove of each model to analyze the structural aspects of allele-specific binding. CONCLUSION: Our docking simulations are entirely consistent with functional data obtained from in vitro competitive binding assays and T cell proliferation studies in DR4 and DR6 PV patients. Our findings ascertain that DRB1*0402 plays a crucial role in the selection of specific self-peptides in DR4 PV. DRB1*0402 and DQB1*0503 do not necessarily share the same core residues, indicating that both alleles may have different binding specificities. In addition, our results lend credence to the hypothesis that the alleles DQB1*0201 and *0202 play a protective role by binding Dsg3 peptides with greater affinity than the susceptible alleles, allowing for efficient deletion of autoreactive T cells

    Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan

    CD4 positive T helper cells control many aspects of specific immunity. These cells are specific for peptides derived from protein antigens and presented by molecules of the extremely polymorphic major histocompatibility complex (MHC) class II system. The identification of peptides that bind to MHC class II molecules is therefore of pivotal importance for rational discovery of immune epitopes. HLA-DR is a prominent example of a human MHC class II. Here, we present a method, NetMHCIIpan, that allows for pan-specific predictions of peptide binding to any HLA-DR molecule of known sequence. The method is derived from a large compilation of quantitative HLA-DR binding events covering 14 of the more than 500 known HLA-DR alleles. Taking both peptide and HLA sequence information into account, the method can generalize and predict peptide binding also for HLA-DR molecules where experimental data is absent. Validation of the method includes identification of endogenously derived HLA class II ligands, cross-validation, leave-one-molecule-out, and binding motif identification for hitherto uncharacterized HLA-DR molecules. The validation shows that the method can successfully predict binding for HLA-DR molecules, even in the absence of specific data for the particular molecule in question. Moreover, when compared to TEPITOPE, currently the only other publicly available prediction method aiming at providing broad HLA-DR allelic coverage, NetMHCIIpan performs equivalently for alleles included in the training of TEPITOPE while outperforming TEPITOPE on novel alleles. We propose that the method can be used to identify those hitherto uncharacterized alleles, which should be addressed experimentally in future updates of the method to cover the polymorphism of HLA-DR most efficiently. We thus conclude that the presented method meets the challenge of keeping up with the MHC polymorphism discovery rate and that it can be used to sample the MHC "space," enabling a highly efficient iterative process for improving MHC class II binding predictions.

    STATISTICAL LEARNING FOR IMMUNOMICS AND GENOMICS

    Cancer immunotherapy has become a new pillar of cancer treatment. However, durable clinical benefits are only observed in a subset of patients. Identifying patients who can benefit from immunotherapy is among the most pressing questions in precision medicine. In Chapter 2, we introduce a computational pipeline to prioritize neoantigens for the development of cancer vaccines. To improve the prediction of which peptides are presented, we also propose our novel method PEPPRMINT, a pan-specific method that uses multi-HLA-allele (MA) data to predict HLA-I peptide presentation. MA data is expected to accumulate quickly in the near future; however, most current methods cannot use MA data since it is unknown which HLA the peptide binds to. We show that the set of neoantigens prioritized using PEPPRMINT in the computational pipeline had a significant association with cytolytic activity in melanoma patients, while current methods did not capture this association. In Chapter 3, we propose a novel method, PEPPRMINT-2, which extends the framework developed in PEPPRMINT to improve HLA-II peptide presentation prediction using MA data and accounting for the binding core that is unique to HLA-II. Both PEPPRMINT methods combine a rigorous statistical mixture model and the power of a neural network to analyze MA data. We also propose a new definition for neoantigen burden that uses both HLA-I and HLA-II neoantigen burdens. Our definition of neoantigen burden helps predict the patient’s response to immunotherapy, even after adjusting for the effect of mutation burden. Polygenic risk scores (PRS) are a popular method for predicting complex traits using genome wide association studies (GWAS) data, but the performance highly depends on the relatedness of the testing and GWAS populations. Admixed individuals have recent ancestry from two or more different populations. In Chapter 4, we explore the importance of thresholding and the effect of the global ancestry proportion in the performance of PRS constructed with different GWAS for admixed Hispanic/Latino individuals. Additionally, we examine the performance of different constructed PRS, including a combined PRS of different GWAS, in the Hispanic/Latino population with respect to chronic kidney disease and hypertension, which has not been shown in the current literature. Doctor of Philosophy
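For readers unfamiliar with how thresholding enters a polygenic risk score, the following is a minimal sketch, assuming the common p-value thresholding scheme: the score is the sum of GWAS effect sizes times allele dosages over the variants whose GWAS p-value passes a cutoff. The function name, argument layout, and default cutoff are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def polygenic_risk_score(
    dosages: Sequence[float],    # per-variant allele counts for one individual (0, 1, or 2)
    effects: Sequence[float],    # per-variant GWAS effect sizes (e.g. log odds ratios)
    p_values: Sequence[float],   # per-variant GWAS p-values
    p_threshold: float = 5e-8,   # assumed cutoff; only variants at or below it contribute
) -> float:
    """Weighted sum of dosages over the variants that pass the p-value cutoff."""
    if not (len(dosages) == len(effects) == len(p_values)):
        raise ValueError("inputs must have one entry per variant")
    return sum(
        beta * dose
        for dose, beta, p in zip(dosages, effects, p_values)
        if p <= p_threshold
    )
```

Loosening the cutoff admits more variants with weaker evidence, which is why the choice of threshold can change how well a score transfers to a population that differs from the GWAS cohort.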

    Understanding peptide specificity through structural immunoinformatics

    Ph.D. (Doctor of Philosophy)

    The AddACO: A bio-inspired modified version of the ant colony optimization algorithm to solve travel salesman problems

    The Travel Salesman Problem (TSP) consists in finding the minimal-length closed tour that connects the entire group of nodes of a given graph. We propose to solve such a combinatorial optimization problem with the AddACO algorithm: it is a version of the Ant Colony Optimization method that is characterized by a modified probabilistic law at the basis of the exploratory movement of the artificial insects. In particular, the ant decisional rule is here set to amount to a convex linear combination of competing behavioral stimuli and has therefore an additive form (hence the name of our algorithm), rather than the canonical multiplicative one. The AddACO intends to address two conceptual shortcomings that characterize classical ACO methods: (i) the population of artificial insects is in principle allowed to simultaneously minimize/maximize all migratory guidance cues (which is implausible from a biological/ecological point of view) and (ii) a given edge of the graph has a null probability of being explored if at least one of the movement traits is therein equal to zero, i.e., regardless of the intensity of the others (this in principle reduces the exploratory potential of the ant colony). Three possible variants of our method are then specified: the AddACO-V1, which includes pheromone trail and visibility as insect decisional variables, and the AddACO-V2 and the AddACO-V3, which in turn add random effects and inertia, respectively, to the two classical migratory stimuli. The three versions of our algorithm are tested on benchmark middle-scale TSP instances, in order to assess their performance and to find their optimal parameter setting. The best performing variant is finally applied to large-scale TSPs, compared to the naive Ant-Cycle Ant System, proposed by Dorigo and colleagues, and evaluated in terms of quality of the solutions, computational time, and convergence speed. The aim is in fact to show that the proposed transition probability, along with its conceptual advantages, is competitive from a performance perspective, i.e., that it does not reduce the exploratory capacity of the ant population w.r.t. the canonical one (at least in the case of selected TSPs). A theoretical study of the asymptotic behavior of the AddACO is given in the appendix of the work, whose conclusive section contains some hints for further improvements of our algorithm, also in the perspective of its application to other optimization problems.
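The contrast between the canonical multiplicative rule and an additive one can be made concrete with a small sketch. The multiplicative function follows the classical Ant System form (pheromone raised to `alpha` times visibility raised to `beta`, normalized over candidate edges). The additive function is only an assumed illustration of a convex combination: each stimulus is normalized over the candidate edges and mixed with a weight `w`. The exact AddACO formula is not given in the abstract, and all names and default values here are hypothetical.

```python
from typing import List, Sequence


def multiplicative_probs(
    tau: Sequence[float], eta: Sequence[float], alpha: float = 1.0, beta: float = 2.0
) -> List[float]:
    """Classical rule: p_j proportional to tau_j**alpha * eta_j**beta."""
    weights = [t ** alpha * e ** beta for t, e in zip(tau, eta)]
    total = sum(weights)
    if total == 0:
        # Every edge has a zero stimulus; fall back to a uniform choice.
        return [1.0 / len(weights)] * len(weights)
    return [x / total for x in weights]


def _normalize(values: Sequence[float]) -> List[float]:
    """Scale non-negative values to sum to 1 (uniform if they are all zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def additive_probs(
    tau: Sequence[float], eta: Sequence[float], w: float = 0.5
) -> List[float]:
    """Assumed additive rule: convex combination of the normalized stimuli."""
    tau_n, eta_n = _normalize(tau), _normalize(eta)
    return [w * t + (1.0 - w) * e for t, e in zip(tau_n, eta_n)]
```

With pheromone `[0.0, 1.0, 1.0]` and visibility `[1.0, 0.5, 0.5]`, the multiplicative rule gives the first edge probability zero, while the additive rule still leaves it reachable, which is shortcoming (ii) described in the abstract.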

    A Comparative Study Of Ant Colony Optimization

    Ant Colony Optimization (ACO) belongs to a class of biologically-motivated approaches to computing that includes such metaheuristics as artificial neural networks, evolutionary algorithms, and artificial immune systems, among others. Emulating to varying degrees the particular biological phenomena from which their inspiration is drawn, these alternative computational systems have succeeded in finding solutions to complex problems that had heretofore eluded more traditional techniques. Often, the resulting algorithm bears little resemblance to its biological progenitor, evolving instead into a mathematical abstraction of a singularly useful quality of the phenomenon. In such cases, these abstract computational models may be termed biological metaphors. Mindful that a fine line separates metaphor from distortion, this paper outlines an attempt to better understand the potential consequences an insufficient understanding of the underlying biological phenomenon may have on its transformation into mathematical metaphor. To that end, the author independently develops a rudimentary ACO, remaining as faithful as possible to the behavioral qualities of an ant colony. Subsequently, the performance of this new ACO is compared with that of a more established ACO in three categories: (1) the hybridization of evolutionary computing and ACO, (2) the efficacy of daemon actions, and (3) theoretical properties and convergence proofs.