MAD Bayes for Tumor Heterogeneity Feature Allocation with Non-Normal Sampling
We propose small-variance asymptotic approximations for the inference of
tumor heterogeneity (TH) using next-generation sequencing data. Understanding
TH is an important and open research problem in biology. The lack of
appropriate statistical inference is a critical gap in existing methods that
the proposed approach aims to fill. We build on a hierarchical model with an
exponential family likelihood and a feature allocation prior. The proposed
approach generalizes similar small-variance approximations proposed by Kulis
and Jordan (2012) and Broderick et al. (2012) for inference with Dirichlet
process mixture and Indian buffet prior models under normal sampling. We show
that the new algorithm can successfully recover latent structures of different
subclones and is also an order of magnitude faster than available Markov chain Monte Carlo
samplers, the latter often practically infeasible for high-dimensional genomics
data. The proposed approach is scalable, simple to implement and benefits from
the flexibility of Bayesian nonparametric models. More importantly, it provides
a useful tool for the biological community for estimating cell subtypes in
tumor samples.
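For orientation, the normal-sampling special case that this abstract says it generalizes can be sketched in a few lines. The block below is a minimal illustration of DP-means, the small-variance limit of a Dirichlet process mixture described by Kulis and Jordan (2012). It is not the feature-allocation algorithm proposed in the paper. The function names, the penalty parameter `lam`, and the toy data are assumptions made for this example.

```python
# Sketch of DP-means (Kulis and Jordan, 2012): k-means with a penalty `lam`
# that decides when a point is far enough from every center to open a new
# cluster. Names and data are illustrative, not taken from the paper.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def dp_means(points, lam, max_iter=100):
    # Start with a single cluster at the global mean.
    centers = [mean(points)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            # Open a new cluster when even the nearest center is too far.
            if dists[k] > lam:
                centers.append(tuple(p))
                k = len(centers) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Recompute centers as cluster means and drop empty clusters.
        groups = {}
        for lab, p in zip(labels, points):
            groups.setdefault(lab, []).append(p)
        kept = sorted(groups)
        remap = {old: new for new, old in enumerate(kept)}
        centers = [mean(groups[old]) for old in kept]
        labels = [remap[lab] for lab in labels]
        if not changed:
            break
    return centers, labels

if __name__ == "__main__":
    toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    centers, labels = dp_means(toy, lam=1.0)
    print(len(centers), labels)
```

The number of clusters is not fixed in advance: a larger `lam` yields fewer clusters, and a smaller one yields more.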
Toxoplasma gondii peptide ligands open the gate of the HLA class I binding groove
HLA class I presentation of pathogen-derived peptide ligands is essential for CD8+ T-cell recognition of Toxoplasma gondii-infected cells. Currently, little data exist pertaining to peptides that are presented after T. gondii infection. Herein we purify HLA-A*02:01 complexes from T. gondii-infected cells and characterize the peptide ligands using LC-MS. We identify 195 T. gondii-encoded ligands originating from both secreted and cytoplasmic proteins. Surprisingly, T. gondii ligands are significantly longer than uninfected host ligands, and these longer pathogen-derived peptides maintain a canonical N-terminal binding core yet exhibit a C-terminal extension of 1-30 amino acids. Structural analysis demonstrates that binding of extended peptides opens the HLA class I F' pocket, allowing the C-terminal extension to protrude through one end of the binding groove. In summary, we demonstrate that unrealized structural flexibility makes MHC class I receptive to parasite-derived ligands that exhibit unique C-terminal peptide extensions.
Affiliations: McMurtrey, Curtis (University of Oklahoma, United States); Trolle, Thomas (Technical University of Denmark, Denmark; La Jolla Institute for Allergy and Immunology, United States); Sansom, Tiffany (University at Buffalo, United States); Remesh, Soumya G. (La Jolla Institute for Allergy and Immunology, United States); Kaever, Thomas (La Jolla Institute for Allergy and Immunology, United States); Bardet, Wilfried (University of Oklahoma, United States); Jackson, Kenneth (University of Oklahoma, United States); McLeod, Rima (University of Chicago, United States); Sette, Alessandro (La Jolla Institute for Allergy and Immunology, United States); Nielsen, Morten (Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - La Plata, Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Universidad Nacional de San Martín, Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús), Argentina; Technical University of Denmark, Denmark); Zajonc, Dirk M. (La Jolla Institute for Allergy and Immunology, United States); Blader, Ira J. (University at Buffalo, United States); Peters, Bjoern (La Jolla Institute for Allergy and Immunology, United States); Hildebrand, William (University of Oklahoma, United States)
PepDist: A New Framework for Protein-Peptide Binding Prediction based on Learning Peptide Distance Functions
BACKGROUND: Many different aspects of cellular signalling, trafficking and targeting mechanisms are mediated by interactions between proteins and peptides. Representative examples are MHC-peptide complexes in the immune system. Developing computational methods for protein-peptide binding prediction is therefore an important task with applications to vaccine and drug design.
METHODS: Previous learning approaches address the binding prediction problem using traditional margin-based binary classifiers. In this paper we propose PepDist, a novel approach for predicting binding affinity. Our approach is based on learning peptide-peptide distance functions. Moreover, we suggest learning a single peptide-peptide distance function over an entire family of proteins (e.g. MHC class I). This distance function can be used to compute the affinity of a novel peptide to any of the proteins in the given family. In order to learn these peptide-peptide distance functions, we formalize the problem as a semi-supervised learning problem with partial information in the form of equivalence constraints. Specifically, we propose to use DistBoost [1,2], which is a semi-supervised distance learning algorithm.
RESULTS: We compare our method to various state-of-the-art binding prediction algorithms on MHC class I and MHC class II datasets. In almost all cases, our method outperforms all of its competitors. One of the major advantages of our novel approach is that it can also learn an affinity function over proteins for which only small amounts of labeled peptides exist. In these cases, our method's performance gain, when compared to other computational methods, is even more pronounced. We have recently uploaded the PepDist webserver, which provides binding prediction of peptides to 35 different MHC class I alleles. The webserver is powered by a prediction engine that was trained using the framework presented in this paper.
CONCLUSION: The results obtained suggest that learning a single distance function over an entire family of proteins achieves higher prediction accuracy than learning a set of binary classifiers for each of the proteins separately. We also show the importance of obtaining information on experimentally determined non-binders. Learning with real non-binders generalizes better than learning with randomly generated peptides that are assumed to be non-binders. This suggests that information about non-binding peptides should also be published and made publicly available.
Effects of thymic selection of the T cell repertoire on HLA-class I associated control of HIV infection
Without therapy, most people infected with human immunodeficiency virus (HIV) ultimately progress to AIDS. Rare individuals (‘elite controllers’) maintain very low levels of HIV RNA without therapy, thereby making disease progression and transmission unlikely. Certain HLA class I alleles are markedly enriched in elite controllers, with the highest association observed for HLA-B57 (ref. 1). Because HLA molecules present viral peptides that activate CD8+ T cells, an immune-mediated mechanism is probably responsible for superior control of HIV. Here we describe how the peptide-binding characteristics of HLA-B57 molecules affect thymic development such that, compared to other HLA-restricted T cells, a larger fraction of the naive repertoire of B57-restricted clones recognizes a viral epitope, and these T cells are more cross-reactive to mutants of targeted epitopes. Our calculations predict that such a T-cell repertoire imposes strong immune pressure on immunodominant HIV epitopes and emergent mutants, thereby promoting efficient control of the virus. Supporting these predictions, in a large cohort of HLA-typed individuals, our experiments show that the relative ability of HLA-B alleles to control HIV correlates with their peptide-binding characteristics that affect thymic development. Our results provide a conceptual framework that unifies diverse empirical observations, and have implications for vaccination strategies.
Funding: Mark and Lisa Schwartz Foundation; National Institutes of Health (U.S.) (Director’s Pioneer award); Philip T. and Susan M. Ragon Foundation; Jane Coffin Childs Memorial Fund for Medical Research; Bill & Melinda Gates Foundation; National Institute of Allergy and Infectious Diseases (U.S.); National Institutes of Health (U.S.) (contract no. HHSN261200800001E); National Institutes of Health (U.S.), Intramural Research Program; National Cancer Institute (U.S.); Center for Cancer Research (National Cancer Institute (U.S.))
NetMHCpan, a Method for Quantitative Predictions of Peptide Binding to Any HLA-A and -B Locus Protein of Known Sequence
Binding of peptides to Major Histocompatibility Complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC class I system (HLA-I) is extremely polymorphic. The number of registered HLA-I molecules has now surpassed 1500. Characterizing the specificity of each separately would be a major undertaking. Here, we have drawn on a large database of known peptide-HLA-I interactions to develop a bioinformatics method, which takes both peptide and HLA sequence information into account, and generates quantitative predictions of the affinity of any peptide-HLA-I interaction. Prospective experimental validation of peptides predicted to bind to previously untested HLA-I molecules, cross-validation, and retrospective prediction of known HIV immune epitopes and endogenous presented peptides, all successfully validate this method. We further demonstrate that the method can be applied to perform a clustering analysis of MHC specificities and suggest using this clustering to select particularly informative novel MHC molecules for future biochemical and functional analysis. Encompassing all HLA molecules, this high-throughput computational method lends itself to epitope searches that are not only genome- and pathogen-wide, but also HLA-wide. Thus, it offers a truly global analysis of immune responses supporting rational development of vaccines and immunotherapy. It also promises to provide new basic insights into HLA structure-function relationships. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan
A shared MHC supertype motif emerges by convergent evolution in macaques and mice, but is totally absent in human MHC molecules
The SIV-infected rhesus macaque (Macaca mulatta) is the most established model of AIDS disease systems, providing insight into pathogenesis and a model system for testing novel vaccines. The understanding of cellular immune responses based on the identification and study of Major Histocompatibility Complex (MHC) molecules, including their MHC:peptide-binding motif, provides valuable information to decipher outcomes of infection and vaccine efficacy. Detailed characterization of Mamu-B*039:01, a common allele expressed in Chinese rhesus macaques, revealed a unique MHC:peptide-binding preference consisting of glycine at the second position. Peptides containing a glycine at the second position were shown to be antigenic in animals positive for Mamu-B*039:01. A similar motif was previously described for the Dd mouse MHC allele, but for none of the human HLA molecules for which a motif is known. Further investigation showed that one additional macaque allele, present in Indian rhesus macaques, Mamu-B*052:01, shares this same motif. These “G2” alleles were associated with the presence of specific residues in their B pocket. This pocket structure was found in 6% of macaque sequences but none of 950 human HLA class I alleles. Evolutionary studies using the “G2” alleles point to common ancestry for the macaque sequences, while convergent evolution is suggested when murine and macaque sequences are considered. This is the first detailed characterization of the pocket residues yielding this specific motif in nonhuman primates and mice, revealing a new supertype motif not present in humans.
Functional analysis of frequently expressed Chinese rhesus macaque MHC class I molecules Mamu-A1*02601 and Mamu-B*08301 reveals HLA-A2 and HLA-A3 supertypic specificities
The Simian immunodeficiency virus (SIV)-infected Indian rhesus macaque (Macaca mulatta) is the most established model of HIV infection and AIDS-related research, despite the potential that macaques of Chinese origin are a more relevant model. Ongoing efforts to further characterize the Chinese rhesus macaques’ major histocompatibility complex (MHC) for composition and function should facilitate greater utilization of the species. Previous studies have demonstrated that Chinese-origin M. mulatta (Mamu) class I alleles are more polymorphic than their Indian counterparts, perhaps implying a model more representative of the human MHC, human leukocyte antigen (HLA). Furthermore, the Chinese rhesus macaque class I allele Mamu-A1*02201, the most frequent allele thus far identified, has recently been characterized and shown to be an analog of the HLA-B7 supertype, the most frequent supertype in human populations. In this study, we have characterized two additional alleles expressed with high frequency in Chinese rhesus macaques, Mamu-A1*02601 and Mamu-B*08301. Upon the development of MHC–peptide-binding assays and definition of their associated motifs, we reveal that these Mamu alleles share peptide-binding characteristics with the HLA-A2 and HLA-A3 supertypes, respectively, the next most frequent human supertypes after HLA-B7. These data suggest that Chinese rhesus macaques may indeed be a more representative model of HLA gene diversity and function as compared to the species of Indian origin and therefore a better model for investigating human immune responses.
The most common Chinese rhesus macaque MHC class I molecule shares peptide binding repertoire with the HLA-B7 supertype
Of the two rhesus macaque subspecies used for AIDS studies, the Simian immunodeficiency virus-infected Indian rhesus macaque (Macaca mulatta) is the most established model of HIV infection, providing both insight into pathogenesis and a system for testing novel vaccines. Despite the Chinese rhesus macaque potentially being a more relevant model for AIDS outcomes than the Indian rhesus macaque, the Chinese-origin rhesus macaques have not been well characterized for their major histocompatibility complex (MHC) composition and function, which has limited their wider use. In this study, we characterized a total of 50 unique Chinese rhesus macaques from several varying origins for their entire MHC class I allele composition and identified a total of 58 unique complete MHC class I sequences. Only nine of the sequences had been associated with Indian rhesus macaques, and 28/58 (48.3%) of the sequences identified were novel. From all MHC alleles detected, we prioritized Mamu-A1*02201 for functional characterization based on its higher frequency of expression. Upon the development of MHC/peptide binding assays and definition of its associated motif, we revealed that this allele shares peptide binding characteristics with the HLA-B7 supertype, the most frequent supertype in human populations. These studies provide the first functional characterization of an MHC class I molecule in the context of Chinese rhesus macaques and the first instance of HLA-B7 analogy for rhesus macaques.
- …