15 research outputs found

    Interpreting BERT architecture predictions for peptide presentation by MHC class I proteins

    The major histocompatibility complex (MHC) class I pathway supports the detection of cancer and viruses by the immune system. It presents parts of proteins (peptides) from inside a cell on its membrane surface, enabling visiting immune cells that detect non-self peptides to terminate the cell. The ability to predict whether a peptide will be presented on MHC class I molecules helps in designing vaccines so they can activate the immune system to destroy the invading disease protein. We designed a prediction model using a BERT-based architecture (ImmunoBERT) that takes as input a peptide and its surrounding regions (N- and C-terminals) along with a set of MHC class I (MHC-I) molecules. We present a novel application of well-known interpretability techniques, SHAP and LIME, to this domain, and we use these results along with 3D structure visualizations and amino acid frequencies to understand and identify the most influential parts of the input amino acid sequences contributing to the output. In particular, we find that amino acids close to the peptides' N- and C-terminals are highly relevant. Additionally, some positions within the MHC proteins (in particular in the A, B and F pockets) are often assigned a high importance ranking, which confirms biological studies and the distances in the structure visualizations.

    The immunopeptidome from a genomic perspective:Establishing the noncanonical landscape of MHC class I–associated peptides

    G.B., D.B., K.W., A.P., R.F., T.R.H., S.K., and J.A.A. received support from Fundacja na rzecz Nauki Polskiej (FNP) (grant ID: MAB/3/2017). D.R.G. received support from Genome Canada & Genome BC (grant ID: 264PRO). D.J.H. received support from NuCana plc (grant ID: SMD0-ZIUN05). H.A. received support from Swedish Cancer Foundation (grant ID: 211709). H.G. received support from United Kingdom Research and Innovation (UKRI) (grant ID: EP/S02431X/1). C.P. received support from Fundação para a Ciência e a Tecnologia (FCT) through LASIGE Research Unit (grant ID: UIDB/00408/2020 and UIDP/00408/2020). A.L., F.M.Z., C.P., A.R., A.P., and J.A.A. received support from European Union’s Horizon 2020 research and innovation programme (grant ID: 101017453). C.B. received support from Agence Nationale de la Recherche (ANR) through GRAL LabEX (grant ID: ANR-10-LABX-49-01) and CBH-EUR-GS 32 (grant ID: ANR-17-EURE0003). S.N.S. received support from Cancer Research UK (CRUK) and the Chief Scientist's Office of Scotland (CSO): Experimental Cancer Medicine Centre (ECMC) (grant ID: ECMCQQR-2022/100017). A.L. received support from Chief Scientist's Office of Scotland (CSO) NRS Career Researcher Fellowship. R.O.N. received support from CRUK Cambridge Centre Thoracic Cancer Programme (grant ID: CTRQQR-2021\100012).
    Tumor antigens can emerge through multiple mechanisms, including translation of non-coding genomic regions. This non-canonical category of antigens has recently gained attention; however, our understanding of how they recur within and between cancer types is still in its infancy. Therefore, we developed a proteogenomic pipeline based on deep learning de novo mass spectrometry to enable the discovery of non-canonical MHC-associated peptides (ncMAPs) from non-coding regions. Considering that the emergence of tumor antigens can also involve post-translational modifications, we included an open search component in our pipeline.
    Leveraging the wealth of mass spectrometry-based immunopeptidomics, we analyzed 26 MHC class I immunopeptidomic studies of 9 different cancer types. We validated the de novo identified ncMAPs, along with the most abundant post-translational modifications, using spectral matching and controlled their false discovery rate (FDR) to 1%. Interestingly, the non-canonical presentation appeared to be 5 times enriched for the A03 HLA supertype, with a projected population coverage of 54.85%. Here, we reveal an atlas of 8,601 ncMAPs with varying levels of cancer selectivity and suggest 17 cancer-selective ncMAPs as attractive targets according to a stringent cutoff. In summary, the combination of the open-source pipeline and the atlas of ncMAPs reported herein could facilitate the identification and screening of ncMAPs as targeting agents for T-cell therapies or vaccine development.

    The emerging landscape of single-molecule protein sequencing technologies

    Single-cell profiling methods have had a profound impact on the understanding of cellular heterogeneity. While genomes and transcriptomes can be explored at the single-cell level, single-cell profiling of proteomes is not yet established. Here we describe new single-molecule protein sequencing and identification technologies alongside innovations in mass spectrometry that will eventually enable broad sequence coverage in single-cell profiling. These technologies will in turn facilitate biological discovery and open new avenues for ultrasensitive disease diagnostics. This Perspective describes new single-molecule protein sequencing and identification technologies alongside innovations in mass spectrometry that will eventually enable broad sequence coverage in single-cell proteomics.

    A Comprehensive Review of Neurodegenerative Manifestations of SARS-CoV-2

    The World Health Organization reports that severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has impacted a staggering 770 million individuals to date. Despite the widespread nature of this viral infection, its precise effects remain largely elusive. This scientific inquiry aims to shed light on the intricate interplay between SARS-CoV-2 infection and the development of neurodegenerative disorders—an affliction that weighs heavily on millions worldwide and stands as the fourth most prevalent cause of mortality. By comprehensively understanding the repercussions of SARS-CoV-2 on neurodegenerative disorders, we strive to unravel critical insights that can potentially shape our approach to the diagnosis, prevention, and treatment of these debilitating conditions. To achieve this goal, we conducted a comprehensive literature review of the scientific data available to date showing that SARS-CoV-2 infection is associated with increased risk and severity of neurodegenerative disorders, as well as altered expression of key genes and pathways involved in their pathogenesis.

    Mass Spectrometry-Based Identification of MHC-Associated Peptides

    Neoantigen-based immunotherapies promise to improve patient outcomes over the current standard of care. However, detecting these cancer-specific antigens is one of the significant challenges in the field of mass spectrometry. Even though the first sequencing of immunopeptides was done decades ago, there is still a diversity of protocols used for neoantigen isolation from the cell surface. This heterogeneity makes it difficult to compare results between laboratories and studies. Isolation of neoantigens from the cell surface is usually done by a mild acid elution (MAE) or immunoprecipitation (IP) protocol. However, the limited amounts of neoantigens present on the cell surface pose a challenge and require instrumentation with enough sensitivity and accuracy for their detection. Detecting these neopeptides from small amounts of available patient tissue limits the scope of most studies to cell cultures. Here, we summarize protocols for the extraction and identification of the major histocompatibility complex (MHC) class I and II peptides. We aimed to evaluate existing methods in terms of the appropriateness of the isolation procedure, as well as instrumental parameters used for neoantigen detection. We also focus on the amount of material used in the protocols as the critical factor to consider when analyzing neoantigens. Beyond experimental aspects, there are numerous readily available proteomics suites/tools applicable for neoantigen discovery; however, experimental validation is still necessary for neoantigen characterization.

    Untranslated regions (UTRs) are a potential novel source of neoantigens for personalised immunotherapy

    Background: Neoantigens, mutated tumour-specific antigens, are key targets of anti-tumour immunity during checkpoint inhibitor (CPI) treatment. Their identification is fundamental to designing neoantigen-directed therapy. Non-canonical neoantigens arising from the untranslated regions (UTR) of the genome are an overlooked source of immunogenic neoantigens. Here, we describe the landscape of UTR-derived neoantigens and release a computational tool, PrimeCUTR, to predict UTR neoantigens generated by start-gain and stop-loss mutations. Methods: We applied PrimeCUTR to a whole genome sequencing dataset of pre-treatment tumour samples from CPI-treated patients (n = 341). Cancer immunopeptidomic datasets were interrogated to identify MHC class I presentation of UTR neoantigens. Results: Start-gain neoantigens were predicted in 72.7% of patients, while stop-loss mutations were found in 19.3% of patients. While UTR neoantigens accounted for only 2.6% of total predicted neoantigen burden, they contributed 12.4% of neoantigens with high dissimilarity to self-proteome. More start-gain neoantigens were found in CPI responders, but this relationship was not significant when correcting for tumour mutational burden. While most UTR neoantigens are private, we identified two recurrent start-gain mutations in melanoma. Using immunopeptidomic datasets, we identify two distinct MHC class I-presented UTR neoantigens: one from a recurrent start-gain mutation in melanoma, and one private to Jurkat cells. Conclusion: PrimeCUTR is a novel tool which complements existing neoantigen discovery approaches and has potential to increase the detection yield of neoantigens in personalised therapeutics, particularly for neoantigens with high dissimilarity to self. Further studies are warranted to confirm the expression and immunogenicity of UTR neoantigens.

    Table_1_Untranslated regions (UTRs) are a potential novel source of neoantigens for personalised immunotherapy.xlsx


    Table_2_Untranslated regions (UTRs) are a potential novel source of neoantigens for personalised immunotherapy.xlsx
