Directory of Open Access Journals

Automated benchmarking of peptide-MHC class I binding predictions

Author: Greenbaum Jason
Kim Yohan
Lund Ole
Metushi Imir G.
Nielsen Morten
Peters Bjoern
Sette Alessandro
Sidney John
Trolle Thomas
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2015

Motivation: Numerous in silico methods predicting peptide binding to major histocompatibility complex (MHC) class I molecules have been developed over the last decades. However, the multitude of available prediction tools makes it non-trivial for the end-user to select which tool to use for a given task. To provide a solid basis on which to compare different prediction tools, we here describe a framework for the automated benchmarking of peptide-MHC class I binding prediction tools. The framework runs weekly benchmarks on data that are newly entered into the Immune Epitope Database (IEDB), giving the public access to frequent, up-to-date performance evaluations of all participating tools. To overcome potential selection bias in the data included in the IEDB, a strategy was implemented that suggests a set of peptides for which different prediction methods give divergent predictions as to their binding capability. Upon experimental binding validation, these peptides entered the benchmark study. Results: The benchmark has run for 15 weeks and includes evaluation of 44 datasets covering 17 MHC alleles and more than 4000 peptide-MHC binding measurements. Inspection of the results allows the end-user to make educated selections between participating tools. Of the four participating servers, NetMHCpan performed the best, followed by ANN, SMM and finally ARB. Availability and implementation: Up-to-date performance evaluations of each server can be found online at http://tools.iedb.org/auto-bench/mhci/weekly. All prediction tool developers are invited to participate in the benchmark. Sign-up instructions are available at http://tools.iedb.org/auto-bench/mhci/join.
Affiliations:
Trolle, Thomas: Technical University of Denmark, Denmark
Metushi, Imir G.: La Jolla Institute for Allergy and Immunology, United States
Greenbaum, Jason A.: La Jolla Institute for Allergy and Immunology, United States
Kim, Yohan: La Jolla Institute for Allergy and Immunology, United States
Sidney, John: La Jolla Institute for Allergy and Immunology, United States
Lund, Ole: Technical University of Denmark, Denmark
Sette, Alessandro: La Jolla Institute for Allergy and Immunology, United States
Peters, Bjoern: La Jolla Institute for Allergy and Immunology, United States
Nielsen, Morten: Technical University of Denmark, Denmark; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - La Plata, Instituto de Investigaciones Biotecnológicas; Universidad Nacional de San Martín, Instituto de Investigaciones Biotecnológicas, Argentina
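The peptide-suggestion strategy mentioned in the abstract (picking peptides on which prediction methods disagree) could be sketched roughly as follows. This is only an illustration: the function name, the tool names, the example values, and the divergence measure (the range of log10 predicted IC50 across tools) are all assumptions, not the framework's actual implementation.

```python
import math

def rank_by_divergence(predictions):
    """Rank peptides by how much the tools disagree.

    predictions: dict mapping peptide -> dict of tool name -> predicted IC50 (nM).
    Divergence is the range (max - min) of log10 IC50 across tools.
    This measure is an assumption made for illustration only.
    """
    scored = []
    for peptide, by_tool in predictions.items():
        logs = [math.log10(value) for value in by_tool.values()]
        scored.append((max(logs) - min(logs), peptide))
    # Largest disagreement first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [peptide for _, peptide in scored]

# Hypothetical predictions for three peptides from three placeholder tools.
example = {
    "SIINFEKL": {"tool_a": 50.0, "tool_b": 5000.0, "tool_c": 500.0},
    "GILGFVFTL": {"tool_a": 20.0, "tool_b": 25.0, "tool_c": 30.0},
    "NLVPMVATV": {"tool_a": 100.0, "tool_b": 1000.0, "tool_c": 300.0},
}
ranked = rank_by_divergence(example)
```

Peptides at the front of `ranked` would be the first candidates for experimental binding validation under this assumed measure.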

CONICET Digital

Dataset size and composition impact the reliability of performance benchmarks for peptide-MHC binding predictions

Author: Buus Søren
Kim Yohan
Nielsen Morten
Peters Bjoern
Sette Alessandro
Sidney John
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

BACKGROUND: It is important to accurately determine the performance of peptide:MHC binding predictions, as this enables users to compare and choose between different prediction methods and provides estimates of the expected error rate. Two common approaches to determine prediction performance are cross-validation, in which all available data are iteratively split into training and testing data, and the use of blind sets generated separately from the data used to construct the predictive method. In the present study, we have compared cross-validated prediction performances generated on our last benchmark dataset from 2009 with prediction performances generated on data subsequently added to the Immune Epitope Database (IEDB), which served as a blind set. RESULTS: We found that cross-validated performances systematically overestimated performance on the blind set. This was found not to be due to the presence of similar peptides in the cross-validation dataset. Rather, we found that small size and low sequence/affinity diversity of either training or blind datasets were associated with large differences in cross-validated vs. blind prediction performances. We use these findings to derive quantitative rules of how large and diverse datasets need to be to provide generalizable performance estimates. CONCLUSION: It has long been known that cross-validated prediction performance estimates often overestimate performance on independently generated blind set data. We here identify and quantify the specific factors contributing to this effect for MHC-I binding predictions. An increasing number of peptides for which MHC binding affinities are measured experimentally have been selected based on binding predictions and thus are less diverse than historic datasets sampling the entire sequence and affinity space, making them more difficult benchmark datasets. This has to be taken into account when comparing performance metrics between different benchmarks, and when deriving error estimates for predictions based on benchmark performance.
Affiliations:
Kim, Yohan: La Jolla Institute for Allergy and Immunology, United States
Sidney, John: La Jolla Institute for Allergy and Immunology, United States
Buus, Søren: University of Copenhagen, Denmark
Sette, Alessandro: La Jolla Institute for Allergy and Immunology, United States
Nielsen, Morten: Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - La Plata, Instituto de Investigaciones Biotecnológicas; Universidad Nacional de San Martín, Instituto de Investigaciones Biotecnológicas, Argentina; Technical University of Denmark, Denmark
Peters, Bjoern: La Jolla Institute for Allergy and Immunology, United States
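To make the cross-validation half of that comparison concrete, here is a minimal sketch of how data can be iteratively split into training and testing folds. The function name and the fold-assignment scheme (contiguous folds, no shuffling) are assumptions chosen for brevity and do not reproduce the study's procedure. A blind set, by contrast, would simply be data that never appear in any of these indices.

```python
def kfold_splits(n_items, k):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds.

    Folds are contiguous and unshuffled, which is an illustrative
    simplification; real benchmarks usually shuffle or stratify first.
    """
    if k < 2 or k > n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# Ten measurements split into three folds.
splits = list(kfold_splits(10, 3))
```

Each measurement lands in exactly one test fold, so every data point is predicted once by a model that was not trained on it.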

Springer - Publisher Connector

CONICET Digital

Immune epitope database analysis resource (IEDB-AR)

Author: A. Sette
B. Peters
C. Lundegaard
H.-H. Bui
J. Beaver
J. Greenbaum
J. Ponomarenko
M. Nielsen
O. Lund
P. E. Bourne
P. Haste-Andersen
P. Wang
Q. Zhang
S. Buus
S. Frankild
Y. Kim
Z. Zhu
Publication venue: Oxford University Press
Publication date: 01/01/2008

We present a new release of the immune epitope database analysis resource (IEDB-AR, http://tools.immuneepitope.org), a repository of web-based tools for the prediction and analysis of immune epitopes. New functionalities have been added to most of the previously implemented tools, and a total of eight new tools were added, including two B-cell epitope prediction tools, four T-cell epitope prediction tools and two analysis tools.

Sabanci University Research Database

Prediction of peptides binding to MHC class I alleles by partial periodic pattern mining

Author: Meydan Cem
Otu Hasan
Sezerman Uğur
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 07/08/2009

MHC (Major Histocompatibility Complex) is a key player in the immune response of an organism. It is important to be able to predict which antigenic peptides will bind to a specific MHC allele and which will not, creating possibilities for controlling the immune response and for applications in immunotherapy. However, a problem for MHC class I is the presence of bulges and loops in the peptides, which change the total length. Most machine learning methods in use today require the sequences to be of the same length to successfully mine the binding motifs. We propose the use of time-based data mining methods in motif mining to be able to mine motifs position-independently. In addition, information from both binding and non-binding peptides is used, in contrast to other methods, which rely only on binding peptides. The prediction results are between 60% and 95% for the tested alleles.

SVRMHC prediction server for MHC-binding peptides

Author: Flower Darren R
Li Tongbin
Liu Wen
Ren Yongliang
Wan Ji
Xu Qiqi
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The binding between antigenic peptides (epitopes) and the MHC molecule is a key step in the cellular immune response. Accurate in silico prediction of epitope-MHC binding affinity can greatly expedite epitope screening by reducing costs and experimental effort. RESULTS: Recently, we demonstrated the appealing performance of SVRMHC, an SVR-based quantitative modeling method for peptide-MHC interactions, when applied to three mouse class I MHC molecules. Subsequently, we have greatly extended the construction of SVRMHC models and have established such models for more than 40 class I and class II MHC molecules. Here we present the SVRMHC web server for predicting peptide-MHC binding affinities using these models. Benchmarked percentile scores are provided for all predictions. The larger number of SVRMHC models available allowed for an updated evaluation of the performance of the SVRMHC method compared to other well-known linear modeling methods. CONCLUSION: SVRMHC is an accurate and easy-to-use prediction server for epitope-MHC binding with significant coverage of MHC molecules. We believe it will prove to be a valuable resource for T cell epitope researchers.

Springer - Publisher Connector

Directory of Open Access Journals

Aston Publications Explorer

The Immune Epitope Database and Analysis Resource Program 2003–2018: reflections and outlook

Author: Martini Sheridan
Nielsen Morten
Peters Bjoern
Sette Alessandro
Publication venue: Springer Science and Business Media LLC
Publication date: 01/11/2019

The Immune Epitope Database and Analysis Resource (IEDB) contains information related to antibodies and T cells across an expansive scope of research fields (infectious diseases, allergy, autoimmunity, and transplantation). Capture and representation of the data to reflect growing scientific standards and techniques have required continual refinement of our rigorous curation and query and reporting processes, beginning with the automated classification of over 28 million PubMed abstracts and resulting in easily searchable data from over 20,000 published manuscripts. Data related to MHC binding and elution, nonpeptidics, natural processing, receptors, and 3D structure are first captured through manual curation and subsequently maintained through recuration to reflect evolving scientific standards. Upon promotion to the free, public database, users can query and export records of specific relevance via the online web portal, which undergoes iterative development to best enable efficient data access. In parallel, the companion Analysis Resource site hosts a variety of tools that assist in the bioinformatic analyses of epitopes and related structures, which can be applied to IEDB-derived and independent datasets alike. Available tools are classified into two categories: analysis and prediction. Analysis tools include epitope clustering, sequence conservancy, and more, while prediction tools cover T and B cell epitope binding, immunogenicity, and TCR/BCR structures. In addition to these tools, benchmarking servers which allow for unbiased performance comparison are also offered. In order to expand and support the user-base of both the database and Analysis Resource, the research team actively engages in community outreach through publication of ongoing work, conference attendance and presentations, hosting of user workshops, and the provision of online help. This review provides a description of the IEDB database infrastructure, curation and recuration processes, query and reporting capabilities, the Analysis Resource, and our Community Outreach efforts, including assessment of the impact of the IEDB across the research community.
Affiliations:
Martini, Sheridan: La Jolla Institute for Allergy and Immunology, United States
Nielsen, Morten: Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - La Plata, Instituto de Investigaciones Biotecnológicas; Universidad Nacional de San Martín, Instituto de Investigaciones Biotecnológicas, Argentina; Technical University of Denmark, Denmark
Peters, Bjoern: La Jolla Institute for Allergy and Immunology, United States; University of California at San Diego, United States
Sette, Alessandro: La Jolla Institute for Allergy and Immunology, United States; University of California at San Diego, United States

CONICET Digital

NetMHC-3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8–11

Author: Buus Søren
Harndahl Mikkel
Lamberth Kasper
Lund Ole
Lundegaard Claus
Nielsen Morten
Publication venue: Oxford University Press
Publication date: 01/01/2008

NetMHC-3.0 is trained on a large amount of quantitative peptide data using both affinity data from the Immune Epitope Database and Analysis Resource (IEDB) and elution data from SYFPEITHI. The method generates high-accuracy predictions of major histocompatibility complex (MHC):peptide binding. The predictions are based on artificial neural networks trained on data from 55 MHC alleles (43 human and 12 non-human), and position-specific scoring matrices (PSSMs) for an additional 67 HLA alleles. As only the MHC class I prediction server is available, predictions are possible for peptides of length 8–11 for all 122 alleles. Artificial neural network predictions are given as actual IC50 values, whereas PSSM predictions are given as log-odds likelihood scores. The output is optionally available as a download for easy post-processing. The training method underlying the server is the best available, and has been used to predict possible MHC-binding peptides in a series of pathogen viral proteomes including SARS, Influenza and HIV, resulting in an average of 75–80% confirmed MHC binders. Here, the performance is further validated and benchmarked using a large set of newly published affinity data, non-redundant to the training set. The server is free to use and available at: http://www.cbs.dtu.dk/services/NetMHC.
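As a rough illustration of what a PSSM log-odds score is, the sketch below sums, over peptide positions, the log ratio of a position-specific residue frequency to a background frequency. The function name, the base-2 logarithm, the two-letter toy alphabet, and all frequencies are assumptions made for readability; none of them are taken from NetMHC-3.0.

```python
import math

def pssm_log_odds(peptide, position_freqs, background):
    """Sum log2(position frequency / background frequency) over positions.

    position_freqs: list with one dict (residue -> frequency) per position.
    background: dict mapping residue -> background frequency.
    The log base and the toy values below are illustrative assumptions.
    """
    if len(peptide) != len(position_freqs):
        raise ValueError("peptide length must match the matrix length")
    score = 0.0
    for residue, freqs in zip(peptide, position_freqs):
        score += math.log2(freqs[residue] / background[residue])
    return score

# Toy two-position matrix over a two-letter alphabet (hypothetical values).
toy_background = {"A": 0.5, "L": 0.5}
toy_matrix = [
    {"A": 0.25, "L": 0.75},  # position 1 prefers L
    {"A": 0.5, "L": 0.5},    # position 2 is uninformative
]
score_la = pssm_log_odds("LA", toy_matrix, toy_background)
score_aa = pssm_log_odds("AA", toy_matrix, toy_background)
```

A positive score means the peptide's residues are enriched relative to background at their positions; a real class I matrix would have one row per position of an 8–11 residue peptide and cover all 20 amino acids.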

Strength in numbers: achieving greater accuracy in MHC-I binding prediction by combining the results from multiple prediction tools

Author: Bickis Mik
Kusalik Anthony
Trost Brett
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Peptides derived from endogenous antigens can bind to MHC class I molecules. Those which bind with high affinity can invoke a CD8(+) immune response, resulting in the destruction of infected cells. Much work in immunoinformatics has involved the algorithmic prediction of peptide binding affinity to various MHC-I alleles. A number of tools for MHC-I binding prediction have been developed, many of which are available on the web. RESULTS: We hypothesize that peptides predicted by a number of tools are more likely to bind than those predicted by just one tool, and that the likelihood of a particular peptide being a binder is related to the number of tools that predict it, as well as the accuracy of those tools. To this end, we have built and tested a heuristic-based method of making MHC-binding predictions by combining the results from multiple tools. The predictive performance of each individual tool is first ascertained. These performance data are used to derive weights such that the predictions of tools with better accuracy are given greater credence. The combined tool was evaluated using ten-fold cross-validation and was found to significantly outperform the individual tools when a high specificity threshold is used. It performs comparably well to the best-performing individual tools at lower specificity thresholds. Finally, it also outperforms the combination of the tools resulting from linear discriminant analysis. CONCLUSION: A heuristic-based method of combining the results of the individual tools better facilitates the scanning of large proteomes for potential epitopes, yielding more actual high-affinity binders while reporting very few false positives.
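The idea of weighting each tool's prediction by its measured accuracy can be sketched as a weighted vote. The abstract does not specify the heuristic, so the function name, the use of accuracy values directly as weights, and the example numbers below are assumptions for illustration only.

```python
def combined_score(calls, weights):
    """Return the weighted fraction of tools that call a peptide a binder.

    calls: dict mapping tool name -> True if that tool predicts binding.
    weights: dict mapping tool name -> non-negative weight, for example a
    previously measured accuracy (an assumed choice, not the paper's).
    """
    total = sum(weights[tool] for tool in calls)
    if total <= 0:
        raise ValueError("weights of the voting tools must sum to > 0")
    in_favour = sum(weights[tool] for tool, is_binder in calls.items() if is_binder)
    return in_favour / total

# Hypothetical accuracies for three placeholder tools.
tool_weights = {"tool_a": 0.9, "tool_b": 0.6, "tool_c": 0.5}
peptide_calls = {"tool_a": True, "tool_b": False, "tool_c": True}
score = combined_score(peptide_calls, tool_weights)  # (0.9 + 0.5) / 2.0
```

Thresholding this score trades sensitivity against specificity: a high cut-off keeps only peptides that the more accurate tools agree on.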

Springer - Publisher Connector

MHC Class II Binding Prediction—A Little Help from a Friend

Author: Dimitrov Ivan
Doytchinova Irini
Flower Darren R.
Garnev Panayot
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2010

Vaccines are the greatest single instrument of prophylaxis against infectious diseases, with immeasurable benefits to human wellbeing. The accurate and reliable prediction of peptide-MHC binding is fundamental to the robust identification of T-cell epitopes and thus the successful design of peptide- and protein-based vaccines. The prediction of MHC class II peptide binding has hitherto proved recalcitrant and refractory. Here we illustrate the utility of existing computational tools for in silico prediction of peptides binding to class II MHCs. Most of the methods tested in the present study detect more than half of the true binders in the top 5% of all possible nonamers generated from one protein. This number increases in the top 10% and 15% and then does not change significantly. For the top 15%, the identified binders approach 86%. In terms of lab work, this means 85% less expenditure on materials, labour and time. We show that while existing caveats are well founded, use of computational models of class II binding can nonetheless offer viable help to the work of the immunologist and vaccinologist.
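The "top 5% of all possible nonamers" evaluation can be pictured with a short sketch: enumerate every overlapping 9-residue window of a protein, rank the windows by a predictor score, and keep the top percentage. The function names, the made-up sequence, and the placeholder scoring function are assumptions; a real study would plug in an actual class II binding predictor.

```python
def nonamers(protein):
    """Return all overlapping 9-residue windows of a protein sequence."""
    return [protein[i:i + 9] for i in range(len(protein) - 8)]

def top_percent(peptides, score, percent):
    """Keep the best-scoring `percent` of peptides (rounded up, at least one).

    score: callable mapping a peptide to a number, higher meaning better.
    Integer arithmetic avoids floating-point rounding at the cut-off.
    """
    ranked = sorted(peptides, key=score, reverse=True)
    keep = max(1, (len(ranked) * percent + 99) // 100)
    return ranked[:keep]

def placeholder_score(peptide):
    """Stand-in scorer (counts a few hydrophobic residues); NOT a real predictor."""
    return sum(peptide.count(residue) for residue in "LIVFM")

# A made-up 28-residue sequence gives 20 overlapping nonamers.
sequence = "MKTLLVAGAVIFSLMQDERNTKPWYCGH"
windows = nonamers(sequence)
top5 = top_percent(windows, placeholder_score, 5)
top15 = top_percent(windows, placeholder_score, 15)
```

Testing only the peptides in `top15` rather than all windows is where the abstract's 85% saving in lab work comes from, since 85% of the candidates are never synthesized.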

Directory of Open Access Journals