71 research outputs found

    State of the art and challenges in sequence based T-cell epitope prediction

    Sequence-based T-cell epitope predictions have improved immensely in the last decade. From predictions of peptide binding to major histocompatibility complex (MHC) molecules with moderate accuracy, limited allele coverage, and no good estimates of the other events in the antigen-processing pathway, the field has evolved significantly. Methods have now been developed that produce highly accurate binding predictions for many alleles and integrate both proteasomal cleavage and transport events. Moreover, so-called pan-specific methods have been developed, which allow for prediction of peptide binding to MHC alleles characterized by limited or no peptide binding data. Most of the developed methods are publicly available and have proven very useful as a shortcut in epitope discovery. Here, we go through some of the history of sequence-based predictions of helper as well as cytotoxic T cell epitopes, focusing on some of the most accurate methods and their basic background

    Phylogenetic analysis of condensation domains in NRPS sheds light on their functional evolution

    Background: Non-ribosomal peptide synthetases (NRPSs) are large multimodular enzymes that synthesize a wide range of biologically active natural peptide compounds, of which many are pharmacologically important. Peptide bond formation is catalyzed by the Condensation (C) domain. Various functional subtypes of the C domain exist: an (L)C(L) domain catalyzes a peptide bond between two L-amino acids, a (D)C(L) domain links an L-amino acid to a growing peptide ending with a D-amino acid, a Starter C domain (first denominated and classified as a separate subtype here) acylates the first amino acid with a β-hydroxy-carboxylic acid (typically a β-hydroxyl fatty acid), and Heterocyclization (Cyc) domains catalyze both peptide bond formation and subsequent cyclization of cysteine, serine or threonine residues. The homologous Epimerization (E) domain flips the chirality of the last amino acid in the growing peptide; Dual E/C domains catalyze both epimerization and condensation. Results: In this paper, we report on the reconstruction of the phylogenetic relationship of NRPS C domain subtypes and analyze in detail the sequence motifs of recently discovered subtypes (Dual E/C, (D)C(L) and Starter domains) and their characteristic sequence differences, mutually and in comparison with (L)C(L) domains. Based on their phylogeny and the comparison of their sequence motifs, (L)C(L) and Starter domains appear to be more closely related to each other than to other subtypes, though pronounced differences in some segments of the protein account for the unequal donor substrates (amino vs. β-hydroxy-carboxylic acid). Furthermore, on the basis of phylogeny and the comparison of sequence motifs, we conclude that Dual E/C and (D)C(L) domains share a common ancestor. In the same way, the evolutionary origin of a C domain of unknown function in glycopeptide (GP) NRPSs can be determined to be an (L)C(L) domain. In the case of two GP C domains which are most similar to (D)C(L) but which have (L)C(L) activity, we postulate convergent evolution. Conclusion: We systematize all C domain subtypes including the novel Starter C domain. With our results, it will be easier to decide the subtype of unknown C domains as we provide profile Hidden Markov Models (pHMMs) for the sequence motifs as well as for the entire sequences. The determined specificity conferring positions will be helpful for the mutation of one subtype into another, e.g. turning (D)C(L) to (L)C(L), which can be a useful step for obtaining novel products.

    HLA class I allele promiscuity revisited

    The peptide repertoire presented on human leukocyte antigen (HLA) class I molecules is largely determined by the structure of the peptide binding groove. It is expected that molecules having similar grooves (i.e., belonging to the same supertype) might present similar or overlapping peptides. However, the extent of promiscuity among HLA class I ligands remains controversial: while in many studies T cell responses are detected against epitopes presented by alternative molecules across HLA class I supertypes and loci, peptide elution studies report minute overlaps between the peptide repertoires of even related HLA molecules. To get more insight into promiscuous peptide binding by HLA molecules, we analyzed the HLA peptide binding data from the large epitope repository, the Immune Epitope Database (IEDB), and further performed in silico analysis to estimate the promiscuity at the population level. Both analyses suggest that an unexpectedly large fraction of HLA ligands (>50%) bind two or more HLA molecules, often across supertypes or even loci. These results suggest that different HLA class I molecules can nevertheless present largely overlapping peptide sets, and that “functional” HLA polymorphism at the individual and population level is probably much lower than previously anticipated
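
The promiscuity measure described in this abstract (the fraction of ligands binding two or more HLA molecules) can be illustrated with a short sketch. The function name, the record layout, and the toy peptide and allele entries below are hypothetical and are not taken from the paper or from the IEDB export format; the sketch only shows the counting step, assuming each record is one positive binding observation.

```python
from collections import defaultdict


def promiscuous_fraction(binding_records, min_alleles=2):
    """Fraction of ligands seen binding at least `min_alleles` distinct HLA molecules.

    binding_records: iterable of (peptide, allele) pairs, each assumed to be
    one positive binding observation (hypothetical input layout).
    """
    alleles_per_peptide = defaultdict(set)
    for peptide, allele in binding_records:
        # A set ignores repeated observations of the same peptide/allele pair.
        alleles_per_peptide[peptide].add(allele)
    if not alleles_per_peptide:
        return 0.0
    promiscuous = sum(
        1 for alleles in alleles_per_peptide.values() if len(alleles) >= min_alleles
    )
    return promiscuous / len(alleles_per_peptide)


# Toy records, invented for illustration only.
records = [
    ("PEPTIDE_A", "HLA-A*02:01"),
    ("PEPTIDE_A", "HLA-B*07:02"),   # same ligand, different locus
    ("PEPTIDE_B", "HLA-A*02:01"),
    ("PEPTIDE_C", "HLA-A*01:01"),
    ("PEPTIDE_C", "HLA-A*01:01"),   # duplicate observation, counted once
    ("PEPTIDE_D", "HLA-A*03:01"),
    ("PEPTIDE_D", "HLA-A*11:01"),
]
fraction = promiscuous_fraction(records)
print(f"{fraction:.2f} of ligands bind two or more HLA molecules")
```

In this toy set, two of the four distinct ligands bind more than one molecule. A real analysis would also need a binding-affinity threshold to decide what counts as a positive observation, which the abstract does not specify.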

    SigniSite: Identification of residue-level genotype-phenotype correlations in protein multiple sequence alignments

    Identifying which mutation(s) within a given genotype is responsible for an observable phenotype is important in many aspects of molecular biology. Here, we present SigniSite, an online application for subgroup-free residue-level genotype–phenotype correlation. In contrast to similar methods, SigniSite does not require any pre-definition of subgroups or binary classification. Input is a set of protein sequences where each sequence has an associated real number, quantifying a given phenotype. SigniSite will then identify which amino acid residues are significantly associated with the data set phenotype. As output, SigniSite displays a sequence logo, depicting the strength of the phenotype association of each residue and a heat-map identifying ‘hot’ or ‘cold’ regions. SigniSite was benchmarked against SPEER, a state-of-the-art method for the prediction of specificity determining positions (SDP) using a set of human immunodeficiency virus protease-inhibitor genotype–phenotype data and corresponding resistance mutation scores from the Stanford University HIV Drug Resistance Database, and a data set of protein families with experimentally annotated SDPs. For both data sets, SigniSite was found to outperform SPEER. SigniSite is available at: http://www.cbs.dtu.dk/services/SigniSite/
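
To make the idea of subgroup-free, residue-level association concrete, here is a minimal sketch that scores each (position, residue) pair in an alignment by how far the mean phenotype rank of the sequences carrying that residue deviates from its expectation. This is an assumed, simplified rank-based statistic (the normal approximation of a rank-sum test, without tie or multiple-testing correction); the abstract does not state which statistic SigniSite actually uses, and all function names and toy data are hypothetical.

```python
import math
from collections import defaultdict


def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def residue_z_scores(sequences, phenotypes):
    """Map (position, residue) -> z-score of the mean phenotype rank.

    Assumes equal-length aligned sequences and one real-valued phenotype
    per sequence. No tie correction is applied to the variance.
    """
    n_total = len(sequences)
    ranks = average_ranks(phenotypes)
    groups = defaultdict(list)
    for seq, rank in zip(sequences, ranks):
        for pos, residue in enumerate(seq):
            groups[(pos, residue)].append(rank)
    expected = (n_total + 1) / 2  # mean rank under no association
    scores = {}
    for key, group in groups.items():
        n = len(group)
        if n == n_total:
            continue  # invariant column: nothing to test
        sd = math.sqrt((n_total + 1) * (n_total - n) / (12 * n))
        scores[key] = (sum(group) / n - expected) / sd
    return scores


# Toy alignment and phenotype values, invented for illustration only.
alignment = ["ACD", "ACD", "AKD", "AKD"]
phenotype = [1.0, 2.0, 8.0, 9.0]
scores = residue_z_scores(alignment, phenotype)
for (pos, residue), z in sorted(scores.items()):
    print(pos, residue, round(z, 3))
```

A positive score means the residue tends to co-occur with high phenotype values and a negative score with low values. Turning scores into significance calls would require p-values and a correction for testing many positions, which this sketch leaves out.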

    Identification of CD8+ T Cell Epitopes in the West Nile Virus Polyprotein by Reverse-Immunology Using NetCTL

    West Nile virus (WNV) is a growing threat to public health and a greater understanding of the immune response raised against WNV is important for the development of prophylactic and therapeutic strategies. In a reverse-immunology approach, we used bioinformatics methods to predict WNV-specific CD8(+) T cell epitopes and selected a set of peptides that constitutes maximum coverage of 20 fully-sequenced WNV strains. We then tested these putative epitopes for cellular reactivity in a cohort of WNV-infected patients. We identified 26 new CD8(+) T cell epitopes, which we propose are restricted by 11 different HLA class I alleles. Aiming for optimal coverage of human populations, we suggest that 11 of these new WNV epitopes would be sufficient to cover from 48% to 93% of ethnic populations in various areas of the world. The 26 identified CD8(+) T cell epitopes contribute to our knowledge of the immune response against WNV infection and greatly extend the list of known WNV CD8(+) T cell epitopes. A polytope incorporating these and other epitopes could possibly serve as the basis for a WNV vaccine
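
The abstract does not say how population coverage was computed. One common way to estimate it, sketched below, is to sum the frequencies of the restricting alleles at each locus and assume Hardy-Weinberg proportions within a locus and independence between loci (no linkage disequilibrium). Both assumptions, the function name, and the allele frequencies are illustrative and are not taken from the paper.

```python
def population_coverage(covered_allele_freqs):
    """Estimate the fraction of individuals carrying at least one covered allele.

    covered_allele_freqs: dict mapping a locus name to a list of population
    frequencies of the alleles restricting the chosen epitopes (hypothetical
    input layout). Assumes Hardy-Weinberg proportions per locus and
    independent loci.
    """
    uncovered = 1.0
    for locus, freqs in covered_allele_freqs.items():
        total = sum(freqs)
        if not 0.0 <= total <= 1.0:
            raise ValueError(f"allele frequencies at {locus} sum to {total}")
        # An individual is uncovered at this locus only if neither of the
        # two chromosomes carries a covered allele.
        uncovered *= (1.0 - total) ** 2
    return 1.0 - uncovered


# Invented frequencies for illustration only.
toy_freqs = {"HLA-A": [0.25, 0.25], "HLA-B": [0.5]}
coverage = population_coverage(toy_freqs)
print(f"estimated coverage: {coverage:.4f}")
```

With these toy numbers, a quarter of individuals are uncovered at each locus, so about 94% carry at least one restricting allele. Repeating the calculation with frequencies from different populations is one way a range such as the reported 48% to 93% could arise.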

    The Immune Epitope Database 2.0

    The Immune Epitope Database (IEDB, www.iedb.org) provides a catalog of experimentally characterized B and T cell epitopes, as well as data on Major Histocompatibility Complex (MHC) binding and MHC ligand elution experiments. The database represents the molecular structures recognized by adaptive immune receptors and the experimental contexts in which these molecules were determined to be immune epitopes. Epitopes recognized in humans, nonhuman primates, rodents, pigs, cats and all other tested species are included. Both positive and negative experimental results are captured. Over the course of 4 years, the data from 180 978 experiments were curated manually from the literature, which covers ∼99% of all publicly available information on peptide epitopes mapped in infectious agents (excluding HIV) and 93% of those mapped in allergens. In addition, data that would otherwise be unavailable to the public from 129 186 experiments were submitted directly by investigators. The curation of epitopes related to autoimmunity is expected to be completed by the end of 2010. The database can be queried by epitope structure, source organism, MHC restriction, assay type or host organism, among other criteria. The database structure, as well as its querying, browsing and reporting interfaces, was completely redesigned for the IEDB 2.0 release, which became publicly available in early 2009

    Proteome Sampling by the HLA Class I Antigen Processing Pathway

    The peptide repertoire that is presented by the set of HLA class I molecules of an individual is formed by the different players of the antigen processing pathway and the stringent binding environment of the HLA class I molecules. Peptide elution studies have shown that only a subset of the human proteome is sampled by the antigen processing machinery and represented on the cell surface. In our study, we quantified the role of each factor relevant in shaping the HLA class I peptide repertoire by combining peptide elution data, in silico predictions of antigen processing and presentation, and data on gene expression and protein abundance. Our results indicate that gene expression level, protein abundance, and rate of potential binding peptides per protein have a clear impact on sampling probability. Furthermore, once a protein is available for the antigen processing machinery in sufficient amounts, C-terminal processing efficiency and binding affinity to the HLA class I molecule determine the identity of the presented peptides. Having studied the impact of each of these factors separately, we subsequently combined all factors in a logistic regression model in order to quantify their relative impact. This model demonstrated the superiority of protein abundance over gene expression level in predicting sampling probability. Being able to discriminate between sampled and non-sampled proteins to a significant degree, our approach can potentially be used to predict the sampling probability of self proteins and of pathogen-derived proteins, which is of importance for the identification of autoimmune antigens and vaccination targets

    Estimating the Fitness Cost of Escape from HLA Presentation in HIV-1 Protease and Reverse Transcriptase

    Human immunodeficiency virus (HIV-1) is, like most pathogens, under selective pressure to escape the immune system of its host. In particular, HIV-1 can avoid recognition by cytotoxic T lymphocytes (CTLs) by altering the binding affinity of viral peptides to human leukocyte antigen (HLA) molecules, the role of which is to present those peptides to the immune system. It is generally assumed that HLA escape mutations carry a replicative fitness cost, but these costs have not been quantified. In this study, we assess the replicative cost of mutations which are likely to escape presentation by HLA molecules in the region of HIV-1 protease and reverse transcriptase. Specifically, we combine computational approaches for prediction of in vitro replicative fitness and peptide binding affinity to HLA molecules. We find that mutations which impair binding to HLA-A molecules tend to have lower in vitro replicative fitness than mutations which do not impair binding to HLA-A molecules, suggesting that HLA-A escape mutations carry higher fitness costs than non-escape mutations. We argue that the association between fitness and HLA-A binding impairment is probably due to an intrinsic cost of escape from HLA-A molecules, and these costs are particularly strong for HLA-A alleles associated with efficient virus control. Counter-intuitively, we do not observe a significant effect in the case of HLA-B, but, as discussed, this does not argue against the relevance of HLA-B in virus control. Overall, this article points to the intriguing possibility that HLA-A molecules preferentially target more conserved regions of HIV-1, emphasizing the importance of HLA-A genes in the evolution of HIV-1 and RNA viruses in general

    A shared MHC supertype motif emerges by convergent evolution in macaques and mice, but is totally absent in human MHC molecules

    The SIV-infected rhesus macaque (Macaca mulatta) is the most established model of AIDS disease systems, providing insight into pathogenesis and a model system for testing novel vaccines. The understanding of cellular immune responses based on the identification and study of Major Histocompatibility Complex (MHC) molecules, including their MHC:peptide-binding motif, provides valuable information to decipher outcomes of infection and vaccine efficacy. Detailed characterization of Mamu-B*039:01, a common allele expressed in Chinese rhesus macaques, revealed a unique MHC:peptide-binding preference consisting of glycine at the second position. Peptides containing a glycine at the second position were shown to be antigenic in animals positive for Mamu-B*039:01. A similar motif was previously described for the Dd mouse MHC allele, but for none of the human HLA molecules for which a motif is known. Further investigation showed that one additional macaque allele, present in Indian rhesus macaques, Mamu-B*052:01, shares this same motif. These “G2” alleles were associated with the presence of specific residues in their B pocket. This pocket structure was found in 6% of macaque sequences but none of 950 human HLA class I alleles. Evolutionary studies using the “G2” alleles point to common ancestry for the macaque sequences, while convergent evolution is suggested when murine and macaque sequences are considered. This is the first detailed characterization of the pocket residues yielding this specific motif in nonhuman primates and mice, revealing a new supertype motif not present in humans