1,122 research outputs found

    Recent advances in B-cell epitope prediction methods

    Identification of epitopes that invoke strong responses from B-cells is one of the key steps in designing effective vaccines against pathogens. Because experimental determination of epitopes is expensive in terms of cost, time, and effort involved, there is an urgent need for computational methods for reliable identification of B-cell epitopes. Although several computational tools for predicting B-cell epitopes have become available in recent years, the predictive performance of existing tools remains far from ideal. We review recent advances in computational methods for B-cell epitope prediction, identify some gaps in the current state of the art, and outline some promising directions for improving the reliability of such methods

    The Empirical Comparison of Machine Learning Algorithm for the Class Imbalanced Problem in Conformational Epitope Prediction

    A conformational epitope is a part of a protein-based vaccine. Identifying it experimentally is challenging, so computational models have been developed to support identification. However, class imbalance is one of the constraints on achieving optimal performance in conformational B-cell epitope prediction. In this paper, we compare several conformational B-cell epitope prediction models built with non-ensemble and ensemble approaches. A sampling method (random undersampling, SMOTE, or cluster-based undersampling) is combined with a decision tree or SVM to build each non-ensemble model. A random forest model and several variants of the bagging method are used to construct the ensemble models. A 10-fold cross-validation method is used to validate the models. The experimental results show that the combination of cluster-based undersampling and a decision tree outperformed the other sampling methods when combined with the non-ensemble and the ensemble methods. This study provides a baseline to improve existing models for dealing with class imbalance in conformational epitope prediction
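As a rough illustration of two ingredients named in this abstract, random undersampling and k-fold cross-validation, here is a minimal pure-Python sketch. It is not the authors' implementation: the function names `random_undersample` and `k_fold_indices` are hypothetical, and the cluster-based variant and the classifiers themselves are omitted.

```python
import random


def random_undersample(X, y, seed=0):
    """Randomly drop samples so every class keeps as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    n_keep = min(len(rows) for rows in by_class.values())
    X_bal, y_bal = [], []
    for label, rows in by_class.items():
        # Sample without replacement from each class.
        for row in rng.sample(rows, n_keep):
            X_bal.append(row)
            y_bal.append(label)
    return X_bal, y_bal


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Slicing with a stride of k splits the shuffled indices into k disjoint folds.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test
```

In a sketch like this, the undersampling step would normally be applied to the training indices of each fold only, so that every test fold keeps the original class ratio.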

    A novel ensemble fuzzy classification model in SARS-CoV-2 B-cell epitope identification for development of protein-based vaccine

    B-cell epitope prediction research has received growing interest since the development of the first method. B-cell epitope identification with the aid of an accurate prediction method is one of the most important steps in epitope-based vaccine development, immunodiagnostic testing, antibody production, disease diagnosis, and treatment. Nevertheless, using experimental methods in epitope mapping is very time-consuming, costly, and labor-intensive. Although successful in silico prediction is therefore very important, studies in this area are limited. The aim of this study is to propose a new approach for successfully predicting B-cell epitopes for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In this study, the SARS-CoV B-cell epitope prediction performance of several fuzzy learning classification models was compared: genetic cooperative competitive learning (GCCL), fuzzy genetics-based machine learning (GBML), Chi's method (CHI), Ishibuchi's method with weight factor (W), structural learning algorithm on vague environment (SLAVE), and the state-of-the-art ensemble fuzzy classification model. The results showed that the proposed ensemble approach has the lowest error in SARS-CoV B-cell epitope estimation compared to the base fuzzy learners (average error rates: ensemble fuzzy = 8.33, GCCL = 30.42, GBML = 23.82, CHI = 29.17, W = 46.25, and SLAVE = 20.42). SARS-CoV and SARS-CoV-2 have high genome similarity. Therefore, the most successful method determined for SARS-CoV B-cell epitope prediction was used in SARS-CoV-2 B-cell epitope prediction. Finally, the B-cell epitope prediction results obtained for SARS-CoV-2 with the ensemble fuzzy classification model were compared, according to VaxiJen 2.0 scores, with the epitope sequences predicted for the same protein sequences by the BepiPred server and by immunoinformatics studies in the literature.
We hope that the developed epitope prediction method will help design effective vaccines and drugs against future outbreaks of the coronavirus family, especially SARS-CoV-2 and its possible mutations. © 2021 Elsevier B.V. This study was supported by The Scientific and Technological Research Council of Turkey-TÜBİTAK (Project Number: 121E326).

    A new approach for determining SARS-CoV-2 epitopes using machine learning-based in silico methods

    The emergence of machine learning-based in silico tools has enabled rapid and high-quality predictions in the biomedical field. In the COVID-19 pandemic, machine learning methods have been used for many tasks, such as predicting patient deaths, modeling the spread of infection, determining future effects, diagnosis with medical image analysis, and forecasting the vaccination rate. However, there is a gap in the literature regarding the identification of epitopes for fast, useful, and effective vaccine design using machine learning methods and bioinformatics tools. Machine learning methods can give medical biotechnologists an advantage in designing a faster and more successful vaccine. The motivation of this study is to propose a successful hybrid machine learning method for SARS-CoV-2 epitope prediction and, using bioinformatics tools, to identify nonallergenic, nontoxic, antigenic peptides among the predicted epitopes that can be used in vaccine design. The identified epitopes will be effective not only in the design of the COVID-19 vaccine but also against viruses from the SARS family that may be encountered in the future. For this purpose, the epitope prediction performance of random forest, support vector machine, logistic regression, bagging with decision tree, k-nearest neighbor, and decision tree methods was examined. Because epitope class samples were in the minority compared to the nonepitope class in the SARS-CoV and B-cell datasets used for training, epitope prediction was repeated after the datasets were balanced with the synthetic minority oversampling technique (SMOTE). The experimental results were compared, and the most successful predictions were obtained with the random forest (RF) method.
The epitope prediction performance in balanced datasets was found to be higher than that in the original datasets (94.0% AUC and 94.4% PRC for the SMOTE-SARS-CoV dataset; 95.6% AUC and 95.3% PRC for the SMOTE-B-cell dataset). In this study, 252 peptides out of 20312 peptides were determined to be epitopes with the SMOTE-RF-SVM hybrid method proposed for SARS-CoV-2 epitope prediction. Determined epitopes were analyzed with AllerTOP 2.0, VaxiJen 2.0 and ToxinPred tools, and allergic, nonantigen, and toxic epitopes were eliminated. As a result, 11 possible nonallergic, high antigen and nontoxic epitope candidates were proposed that could be used in protein-based COVID-19 vaccine design (“VGGNYNY”, “VNFNFNGLTG”, “RQIAPGQTGKI”, “QIAPGQTGKIA”, “SYECDIPIGAGI”, “STFKCYGVSPTKL”, “GVVFLHVTYVPAQ”, “KNHTSPDVDLGDI”, “NHTSPDVDLGDIS”, “AGAAAYYVGYLQPR”, “KKSTNLVKNKCVNF”). It is predicted that the few epitopes determined by machine learning-based in silico methods will help biotechnologists design fast and accurate vaccines by reducing the number of trials in the laboratory environment. © 2022 Elsevier Ltd. This study was supported by Turkish Scientific and Technical Research Council, Turkey-TÜBİTAK (Project Number: 121E326).
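The balancing step mentioned in this abstract (SMOTE) can be illustrated with a minimal pure-Python sketch. This is only an assumption-laden outline of the general technique, not the authors' pipeline: the function names `smote` and `balance`, the choice of `k`, and the squared Euclidean distance on plain numeric feature lists are all illustrative.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=3, seed=0):
    """Oversample the minority class until it matches the size of the rest."""
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_rows = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new_rows, y + [minority_label] * len(new_rows)
```

Each synthetic point lies on the line segment between two existing minority samples, which is why this style of oversampling adds variety rather than exact duplicates. The sketch assumes at least two minority samples and two classes.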

    Insights to Protein Pathogenicity from the Lens of Protein Evolution

    As protein sequences evolve, differences in selective constraints may lead to outcomes ranging from sequence conservation to structural and functional divergence. Evolutionary protein family analysis can illuminate which protein regions are likely to diverge or remain conserved in sequence, structure, and function. Moreover, nonsynonymous mutations in pathogens may result in the emergence of protein regions that affect the behavior of pathogenic proteins within a host and host response. I aimed to gain insight on pathogenic proteins from cancer and viruses using an evolutionary perspective. First, I examined p53, a conformationally flexible, multifunctional protein mutated in ~50% of human cancers. Multifunctional proteins may experience rapid sequence divergence given trade-offs between functions, while proteins with important functions may be more constrained. How, then, does a protein like p53 evolve? I assessed the evolutionary dynamics of structural and regulatory properties in the p53 family, revealing paralog-specific patterns of functional divergence. I also studied flaviviruses, like Dengue and Zika virus, whose conformational flexibility contributes to antibody-dependent enhancement (ADE). ADE has long complicated vaccine development for these viruses, making antiviral drug development an attractive alternative. I identified fitness-critical sites conserved in sequence and structure in the proteome of flaviviruses with the potential to act as broadly neutralizing antiviral drug target sites. I later developed Epitopedia, a computational method for epitope-based prediction of molecular mimicry. Molecular mimicry occurs when regions of antigenic proteins resemble protein regions from the host or other pathogens, leading to antibody cross-reactivity at these sites which can result in autoimmunity or have a protective effect. I applied Epitopedia to the antigenic Spike protein from SARS-CoV-2, the causative agent of COVID-19. 
Molecular mimicry may explain the varied symptoms and outcomes seen in COVID-19 patients. I found instances of molecular mimicry in Spike associated with COVID-19-related blood-clotting disorders and cardiac disease, with implications on disease treatment and vaccine design

    Automated Detection of Conformational Epitopes Using Phage Display Peptide Sequences

    Background: Precise determination of conformational epitopes of neutralizing antibodies represents a key step in the rational design of novel vaccines. A powerful experimental method to gain insights on the physical-chemical nature of conformational epitopes is the selection of linear peptides that bind with high affinities to a monoclonal antibody of interest by phage display technology. However, the structural characterization of conformational epitopes from these mimotopes is not straightforward, and in the past the interpretation of peptide sequences from phage display experiments focused on linear sequence analysis to find a consensus sequence or common sequence motifs. Results: We present a fully automated search method, EpiSearch, that predicts the possible location of conformational epitopes on the surface of an antigen. The algorithm uses peptide sequences from phage display experiments as input, and ranks all surface-exposed patches according to the frequency distribution of similar residues in the peptides and in the patch. We have tested the performance of the EpiSearch algorithm for six experimental data sets of phage display experiments, the human epidermal growth factor receptor-2 (HER-2/neu), the antibody mAb Bo2C11 targeting the C2 domain of FVIII, antibodies mAb 17b and mAb b12 of the HIV envelope protein gp120, mAb 13b5 targeting HIV-1 capsid protein and 80R of the SARS coronavirus spike protein. In all these examples th

    The MEPS server for identifying protein conformational epitopes

    Background: One of the most interesting problems in molecular immunology is epitope mapping, i.e. the identification of the regions of interaction between an antigen and an antibody. The solution to this problem, even if approximate, would help in designing experiments to precisely map the residues involved in the interaction and could be instrumental both in designing peptides able to mimic the interacting surface of the antigen and in understanding where immunologically important regions are located in its three-dimensional structure. From an experimental point of view, both genetically encoded and chemically synthesised peptide libraries can be used to identify sequences recognized by a given antibody. The problem then arises of which region of a folded protein the selected peptides correspond to. Results: We have developed a method able to find the surface region of a protein that can be effectively mimicked by a peptide, given the structure of the protein and the maximum number of side chains deemed to be required for recognition. The method is implemented as a publicly available server. It can also find and report all peptide sequences of a specified length that can mimic the surface of a given protein and store them in a database. The immediate application of the server is the mapping of antibody epitopes; however, the system is sufficiently flexible to allow other questions to be asked. For example, one can compare the peptides representing the surface of two proteins known to interact with the same macromolecule to find which is the most likely interacting region. Conclusion: We believe that the MEPS server, available at http://www.caspur.it/meps, will be a useful tool for immunologists and structural and computational biologists.
We plan to use it ourselves to implement a database of "surface mimicking peptides" for all proteins of known structure and proteins that can be reliably modelled by comparative modelling.

    AI driven B-cell Immunotherapy Design

    Antibodies, a prominent class of approved biologics, play a crucial role in detecting foreign antigens. The effectiveness of antigen neutralisation and elimination hinges upon the strength, sensitivity, and specificity of the paratope-epitope interaction, which demands resource-intensive experimental techniques for characterisation. In recent years, artificial intelligence and machine learning methods have made significant strides, revolutionising the prediction of protein structures and their complexes. The past decade has also witnessed the evolution of computational approaches aiming to support immunotherapy design. This review focuses on the progress of machine learning-based tools and their frameworks in the domain of B-cell immunotherapy design, encompassing linear and conformational epitope prediction, paratope prediction, and antibody design. We mapped the most commonly used data sources, evaluation metrics, and method availability and thoroughly assessed their significance and limitations, discussing the main challenges ahead

    Computational Prediction of Broadly Neutralizing HIV-1 Antibody Epitopes from Neutralization Activity Data

    Broadly neutralizing monoclonal antibodies effective against the majority of circulating isolates of HIV-1 have been isolated from a small number of infected individuals. Definition of the conformational epitopes on the HIV spike to which these antibodies bind is of great value in defining targets for vaccine and drug design. Drawing on techniques from compressed sensing and information theory, we developed a computational methodology to predict key residues constituting the conformational epitopes on the viral spike from cross-clade neutralization activity data. Our approach does not require the availability of structural information for either the antibody or antigen. Predictions of the conformational epitopes of ten broadly neutralizing HIV-1 antibodies are shown to be in good agreement with new and existing experimental data. Our findings suggest that our approach offers a means to accelerate epitope identification for diverse pathogenic antigens