A large peptidome dataset improves HLA class I epitope prediction across most of the human population

Bachireddy, Pavan; Braun, David A.; Carr, Steven A.; Clauser, Karl R.; Eisenhaure, Thomas; Hacohen, Nir; Hartigan, Christina R.; Justesen, Sune; Keshishian, Hasmik; Keskin, Derin B.; Klaeger, Susan; Lan Zhang, Guang; Lane, William J.; Law, Travis; Le, Phuong M.; Li, Letitia W.; Ligon, Keith L.; Oliveira, Giacomo; Ouspenskaia, Tamara; Rosenbluth, Jennifer M.; Sarkizova, Siranush; Stevens, Jonathan; Wu, Catherine J.; Zervantonakis, Ioannis K.; Zhang, Wandi

Publication date: 1 February 2020
Publisher: Springer Science and Business Media LLC
DOI: 10.1038/s41587-019-0322-9

Abstract

Published in final edited form as: Nat Biotechnol. 2020 February; 38(2): 199–209. doi:10.1038/s41587-019-0322-9.

Prediction of HLA epitopes is important for the development of cancer immunotherapies and vaccines. However, current prediction algorithms have limited predictive power, in part because they were not trained on high-quality epitope datasets covering a broad range of HLA alleles. To enable prediction of endogenous HLA class I-associated peptides across a large fraction of the human population, we used mass spectrometry to profile >185,000 peptides eluted from 95 HLA-A, -B, -C and -G mono-allelic cell lines. We identified canonical peptide motifs per HLA allele, unique and shared binding submotifs across alleles and distinct motifs associated with different peptide lengths. By integrating these data with transcript abundance and peptide processing, we developed HLAthena, providing allele-and-length-specific and pan-allele-pan-length prediction models for endogenous peptide presentation. These models predicted endogenous HLA class I-associated ligands with 1.5-fold improvement in positive predictive value compared with existing tools and correctly identified >75% of HLA-bound peptides that were observed experimentally in 11 patient-derived tumor cell lines.

Funding: P01 CA229092 - NCI NIH HHS; P50 CA101942 - NCI NIH HHS; T32 HG002295 - NHGRI NIH HHS; T32 CA009172 - NCI NIH HHS; U24 CA224331 - NCI NIH HHS; R21 CA216772 - NCI NIH HHS; R01 CA155010 - NCI NIH HHS; U01 CA214125 - NCI NIH HHS; T32 CA207021 - NCI NIH HHS; R01 HL103532 - NHLBI NIH HHS; U24 CA210986 - NCI NIH HHS

Accepted manuscript

Available Versions

Crossref

Last time updated on 20/04/2021

Boston University Institutional Repository (OpenBU)

oai:open.bu.edu:2144/41361

Last time updated on 04/12/2020