207 research outputs found
Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan
CD4 positive T helper cells control many aspects of specific immunity. These cells are specific for peptides derived from protein antigens and presented by molecules of the extremely polymorphic major histocompatibility complex (MHC) class II system. The identification of peptides that bind to MHC class II molecules is therefore of pivotal importance for rational discovery of immune epitopes. HLA-DR is a prominent example of a human MHC class II. Here, we present a method, NetMHCIIpan, that allows for pan-specific predictions of peptide binding to any HLA-DR molecule of known sequence. The method is derived from a large compilation of quantitative HLA-DR binding events covering 14 of the more than 500 known HLA-DR alleles. Taking both peptide and HLA sequence information into account, the method can generalize and predict peptide binding also for HLA-DR molecules for which experimental data are absent. Validation of the method includes identification of endogenously derived HLA class II ligands, cross-validation, leave-one-molecule-out, and binding motif identification for hitherto uncharacterized HLA-DR molecules. The validation shows that the method can successfully predict binding for HLA-DR molecules, even in the absence of specific data for the particular molecule in question. Moreover, when compared to TEPITOPE, currently the only other publicly available prediction method aiming at providing broad HLA-DR allelic coverage, NetMHCIIpan performs equivalently for alleles included in the training of TEPITOPE while outperforming TEPITOPE on novel alleles. We propose that the method can be used to identify those hitherto uncharacterized alleles that should be addressed experimentally in future updates of the method to cover the polymorphism of HLA-DR most efficiently.
We thus conclude that the presented method meets the challenge of keeping up with the MHC polymorphism discovery rate and that it can be used to sample the MHC "space," enabling a highly efficient iterative process for improving MHC class II binding predictions.
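The leave-one-molecule-out validation mentioned in this abstract can be illustrated with a minimal sketch: all measurements for one HLA-DR molecule are held out for testing, and the remaining molecules form the training set. This is not the NetMHCIIpan implementation; the function name `leave_one_molecule_out`, the record layout, and the toy peptides and affinity values below are all hypothetical and only show the shape of the split.

```python
from collections import defaultdict


def leave_one_molecule_out(records):
    """Yield (held_out_allele, train, test) splits.

    `records` is an iterable of (allele, peptide, affinity) tuples.
    For each allele, every record of that allele goes to the test set
    and every record of the other alleles goes to the training set.
    """
    by_allele = defaultdict(list)
    for record in records:
        by_allele[record[0]].append(record)

    for held_out in sorted(by_allele):
        test = by_allele[held_out]
        train = [
            record
            for allele, group in by_allele.items()
            if allele != held_out
            for record in group
        ]
        yield held_out, train, test


# Toy data: peptides and affinities are made up for illustration only.
records = [
    ("DRB1*01:01", "AAAKAAAAA", 0.80),
    ("DRB1*01:01", "GGGSGGGGG", 0.10),
    ("DRB1*03:01", "AAAKAAAAA", 0.35),
    ("DRB1*04:01", "LLLDLLLLL", 0.60),
    ("DRB1*04:01", "GGGSGGGGG", 0.05),
]

splits = list(leave_one_molecule_out(records))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

A pan-specific predictor would be trained on `train` and scored on `test` in each round, so its performance on the held-out molecule reflects generalization to a molecule without its own data.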
Prediction of MHC-peptide binding: a systematic and comprehensive overview
T cell immune responses are driven by the recognition of peptide antigens (T cell epitopes) that are bound to major histocompatibility complex (MHC) molecules. T cell epitope immunogenicity is thus contingent on several events, including appropriate and effective processing of the peptide from its protein source, stable peptide binding to the MHC molecule, and recognition of the MHC-bound peptide by the T cell receptor. Of these three hallmarks, MHC-peptide binding is the most selective event that determines T cell epitopes. Therefore, prediction of MHC-peptide binding constitutes the principal basis for anticipating potential T cell epitopes. The tremendous relevance of epitope identification in vaccine design and in the monitoring of T cell responses has spurred the development of many computational methods for predicting MHC-peptide binding that improve the efficiency and economics of T cell epitope identification. In this report, we systematically examine the available methods for predicting MHC-peptide binding and discuss their most relevant advantages and drawbacks.
NNAlign: A Web-Based Prediction Method Allowing Non-Expert End-User Discovery of Sequence Motifs in Quantitative Peptide Data
Recent advances in high-throughput technologies have made it possible to generate both gene and protein sequence data at an unprecedented rate and scale, thereby enabling entirely new “omics”-based approaches towards the analysis of complex biological processes. However, the amount and complexity of data that even a single experiment can produce seriously challenge researchers with limited bioinformatics expertise, who need to handle, analyze and interpret the data before it can be understood in a biological context. Thus, there is an unmet need for tools allowing non-bioinformatics users to interpret large data sets. We have recently developed a method, NNAlign, which is generally applicable to any biological problem where quantitative peptide data is available. This method efficiently identifies underlying sequence patterns by simultaneously aligning peptide sequences and identifying motifs associated with quantitative readouts. Here, we provide a web-based implementation of NNAlign allowing non-expert end-users to submit their data (optionally adjusting method parameters), and in return receive a trained method (including a visual representation of the identified motif) that subsequently can be used as a prediction method and applied to unknown proteins/peptides. We have successfully applied this method to several different data sets including peptide microarray-derived sets containing more than 100,000 data points.
From Functional Genomics to Functional Immunomics: New Challenges, Old Problems, Big Rewards
The development of DNA microarray technology a decade ago led to the establishment of functional genomics as one of the most active and successful scientific disciplines today. With the ongoing development of immunomic microarray technology (a spatially addressable, large-scale technology for measurement of specific immunological response), the new challenge of functional immunomics is emerging, which bears similarities to but is also significantly different from functional genomics. Immunomic data has been successfully used to identify biological markers involved in autoimmune diseases, allergies, viral infections such as human immunodeficiency virus (HIV), influenza, diabetes, and responses to cancer vaccines. This review intends to provide a coherent vision of this nascent scientific field, and speculate on future research directions. We discuss at some length issues such as epitope prediction, immunomic microarray technology and its applications, and computation and statistical challenges related to functional immunomics. Based on the recent discovery of regulation mechanisms in T cell responses, we envision the use of immunomic microarrays as a tool for advances in systems biology of cellular immune responses, by means of immunomic regulatory network models.
APPLICATION OF MACHINE LEARNING APPROACHES TO EMPOWER DRUG DEVELOPMENT
Human health, one of the major topics in Life Science, is facing intensified challenges, including cancer, pandemic outbreaks, and antimicrobial resistance. Thus, new medicines with unique advantages, including peptide-based vaccines and permeable small molecule antimicrobials, are urgently needed. However, the drug development process is long, complex, and risky with no guarantee of success. Also, the improvements in techniques applied in genomics, proteomics, computational biology, and clinical trials significantly increase the data complexity and volume, which imposes higher requirements on the drug development pipeline. In recent years, machine learning (ML) methods were employed to support drug development in various aspects and were shown to be highly effective. Here, we explored the application of advanced ML approaches to empower the development of peptide-based vaccines and permeable antimicrobials. First, the peptide-based vaccines targeting pancreatic cancer and COVID-19 were predicted and screened via multiple approaches. Next, novel structure-based methods to improve the performance of peptide:MHC binding affinity prediction were developed, including an HLA modeling pipeline that provides structures for docking-based peptide binder validation, and hierarchical clustering of HLA I into supertypes and subtypes that have similar peptide binding specificity. Finally, the physicochemical properties governing the permeability of small molecules into multidrug-resistant Pseudomonas aeruginosa cells were selected using a random forest model. In conclusion, the use of machine learning methods could accelerate the drug development process at a lower cost and promote data-based decision-making if used properly.
Quantitative approaches for profiling the T cell receptor repertoire in human tissues
The study of B and T cell receptor repertoires from high-throughput sequencing is a recent development that allows for unprecedented resolution and quantification of the adaptive immune response. The immense diversity and long-tailed distribution of these repertoires have up until now limited such studies to expanded clonal signatures or to analysis of imprecise signals with limited dynamic range collected by techniques such as radioactive and fluorescent labeling. This thesis presents a number of quantitative methods to characterize the repertoire and examine the questions of sequence diversity and inter-repertoire divergence of T cell repertoires. These approaches attempt to accurately parametrize the inherent distribution of T cell clones, drawing from statistical tools derived from ecological literature and information theory.
The methods presented are applied to T cell analyses of various tissue compartments of the human body, including peripheral blood mononucleocytes, thymic tissues, spleen, inguinal lymph nodes, lung lymph nodes and the brain. A number of applications are explored with strong implications for translational use in medicine. Novel insights are made into the mechanism of maintenance and compartmentalization of naïve T cells from human donors of many different ages. Diversity and divergence of the tumor infiltrating sequence repertoire is measured in low grade gliomas and glioblastomas from cancer patients, and potential sequence based biomarkers are assessed for studying glioma phenotype progression. A careful investigation of the immune response to allogeneic stimulus reveals the effect of HLA on sequence sharing and diversity of the alloresponse, and quantifies for the first time using sequence data the fraction of T cells in a repertoire that are alloreactive.
The use of repertoire sequencing and mathematical models within immunology is a new and emerging concept within the rapidly expanding field of systems immunology and will undoubtedly have a profound impact on the future of immunology research. It is hoped that the tools presented in this thesis will give insight into how to quantitatively explore the breadth and depth of the T cell receptor repertoire, and provide future directions for TCR repertoire analysis.
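The abstract does not state which diversity and divergence statistics the thesis uses. As an illustrative sketch only, two common information-theoretic choices are Shannon entropy for the diversity of one repertoire and Jensen-Shannon divergence for the distance between two repertoires. The function names and the toy clone counts below are assumptions, not taken from the thesis.

```python
import math


def frequencies(counts):
    """Convert {clone: read_count} into {clone: relative_frequency}.

    Assumes at least one clone has a non-zero count.
    """
    total = sum(counts.values())
    return {clone: n / total for clone, n in counts.items()}


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a clone-count distribution."""
    return -sum(p * math.log2(p) for p in frequencies(counts).values() if p > 0)


def jensen_shannon_divergence(counts_a, counts_b):
    """Jensen-Shannon divergence (in bits) between two repertoires.

    With base-2 logarithms the value is 0 for identical distributions
    and 1 for repertoires that share no clones.
    """
    p = frequencies(counts_a)
    q = frequencies(counts_b)
    jsd = 0.0
    for clone in set(p) | set(q):
        pa = p.get(clone, 0.0)
        qb = q.get(clone, 0.0)
        m = 0.5 * (pa + qb)  # mixture frequency of this clone
        if pa > 0:
            jsd += 0.5 * pa * math.log2(pa / m)
        if qb > 0:
            jsd += 0.5 * qb * math.log2(qb / m)
    return jsd


# Toy repertoires: clone labels and counts are made up for illustration.
blood = {"clone_A": 25, "clone_B": 25, "clone_C": 25, "clone_D": 25}
tumor = {"clone_E": 90, "clone_F": 10}

print(shannon_entropy(blood))                   # diversity of one sample
print(jensen_shannon_divergence(blood, tumor))  # divergence between samples
```

Entropy is sensitive to sequencing depth for long-tailed distributions, so real analyses typically compare samples after subsampling to a common depth or use estimators that correct for unseen clones.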