217 research outputs found

    Discovering sequence motifs in quantitative and qualitative peptide data


    Computer aided selection of candidate vaccine antigens

    Immunoinformatics is an emergent branch of informatics science that long ago pullulated from the tree of knowledge that is bioinformatics. It is a discipline which applies informatic techniques to problems of the immune system. To a great extent, immunoinformatics is typified by epitope prediction methods. It has found disappointingly limited use in the design and discovery of new vaccines, which is an area where proper computational support is generally lacking. Most extant vaccines are not based on isolated epitopes but rather on chemically treated or attenuated whole pathogens, individual proteins extracted from whole pathogens, or complex carbohydrates. In this chapter we attempt to review what progress there has been in an as-yet-underexplored area of immunoinformatics: the computational discovery of whole protein antigens. The effective development of antigen prediction methods would significantly reduce the laboratory resource required to identify pathogenic proteins as candidate subunit vaccines. We begin our review by placing antigen prediction firmly into context, exploring the role of reverse vaccinology in the design and discovery of vaccines. We also highlight several competing yet ultimately complementary methodological approaches: sub-cellular location prediction, identifying antigens using sequence similarity, and the use of sophisticated statistical approaches for predicting the probability of antigen characteristics. We end by exploring how a systems immunomics approach to the prediction of immunogenicity would prove helpful in the prediction of antigens.

    Computational Modeling and Simulations of Protein-Drug and Protein-Protein Complexes: as potential target for therapeutics development

    The main objective of my thesis is to illustrate the potential of computational modeling techniques in determining decisive protein-protein and protein-ligand interactions in two relevant macromolecular biological systems associated with human diseases. Computational tools such as homology modeling, molecular docking and molecular dynamics simulations are presented, together with the protocols developed for the preparation, simulation and analysis of each biological system. The first contribution is the modeling and simulation of protein-peptide-protein complexes related to the adaptive immune system and multiple sclerosis. An investigation of the molecular similarity between a self-peptide and two microbial peptides in these complexes, with respect to the molecular recognition mechanism, is presented. The second contribution is the investigation of protein-ligand interactions in biological systems associated with Alzheimer’s disease. Computational results are compared with experiments to establish the origin and degree of the selective inhibition displayed by 2-phenylbenzofuran ligands against the butyrylcholinesterase (BChE) protein. The final contribution is the application of a priori knowledge gathered on protein-ligand interactions to the design of ligands with specific structural modifications that display improved inhibitory activity against BChE. In conclusion, therapeutic perspectives and the application of hybrid computational approaches to the design and development of potential drugs are discussed.

    Structural and Conformational Analysis of B-cell Epitopes − component to guide peptide vaccine design

    Peptide vaccines have many potential advantages including low cost, lack of need for cold-chain storage and safety. However, it is well known that approximately 90% of B-cell epitopes (BCEs) are discontinuous in nature, making it difficult to mimic them for creating vaccines. To perform a detailed structural analysis of these epitopes, they need to be mapped onto antigen structures that are complexed with antibody. In order to obtain a clean dataset of antibody-antigen complex crystal structures, a pipeline was designed to automatically process and clean the antibody-related structures from the PDB. To store this processed antibody structural data, a database (AbDb) was built and made available online. The degree of discontinuity in B-cell epitopes and their conformational nature was studied by mapping epitopes in the antibody-antigen dataset. The discontinuity of B-cell epitopes was analysed by defining extended ‘regions’ (R, consisting of at least 3 antibody-contacting residues each separated by ≤ 3 residues) and small fragments (F, antibody-contacting residues that do not satisfy the requirements for a region). Secondly, an algorithm was developed to classify region shape as linear, curved or folded. Molecular dynamics simulations were carried out on isolated epitope regions (wild type and mutant peptides). The mutant peptides were designed by mutating non-contacting and hydrophobic residues in epitopes. Two types of mutations (hydrophobic to alanine and hydrophobic to glutamine) were studied using molecular dynamics simulations. Furthermore, the effect of end-capping on wild type and mutant epitope regions was studied. Simulation studies were carried out on 5 linear and 5 folded shape regions. Out of these, 2 epitopes (one linear and one folded), along with their mutants and derivatives, were tested experimentally for conformational stability by CD spectroscopy and NMR. The binding of isolated epitopes with antibody was also validated by ELISA and SPR.
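
The region/fragment definition in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's own code: the function name `split_epitope`, its parameters, and the example positions are hypothetical, and it assumes that contacting residues lie on a single chain with sequential numbering and that "separated by ≤ 3 residues" means at most three intervening residues between consecutive contacts.

```python
def split_epitope(contact_positions, max_gap=3, min_size=3):
    """Group antibody-contacting residue positions into regions and fragments.

    Assumption (not from the abstract): positions are sequential residue
    numbers on one chain, and the gap is the number of intervening residues.
    """
    regions, fragments = [], []
    group = []
    for pos in sorted(set(contact_positions)):
        # Intervening residues between two contacts = position difference - 1.
        if group and pos - group[-1] - 1 > max_gap:
            # Close the current group: large groups are regions, small ones fragments.
            (regions if len(group) >= min_size else fragments).append(group)
            group = []
        group.append(pos)
    if group:
        (regions if len(group) >= min_size else fragments).append(group)
    return regions, fragments


# Hypothetical contact positions, for illustration only.
regions, fragments = split_epitope([10, 12, 15, 30, 31, 50])
print(regions)    # [[10, 12, 15]]
print(fragments)  # [[30, 31], [50]]
```

Under these assumptions, an epitope with one region and several fragments would count as discontinuous; the shape classification (linear, curved or folded) mentioned in the abstract needs 3D coordinates and is not attempted here.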

    Studies of MHC class I antigen presentation & the origins of the immunopeptidome

    Antigen presentation by major histocompatibility complex class I (MHCI) molecules allows the adaptive immune system to detect and eliminate intracellular pathogens or abnormal cells. Immune surveillance is executed by CD8 T cells that monitor the repertoire of MHCI-associated peptides (MAPs) presented at the surface of all nucleated cells. The primary human MHCI genes, HLA-A and HLA-B, are highly polymorphic and consequently demonstrate differences in antigen presentation. We investigated qualitative and quantitative differences in the expression and peptide binding of several HLA allotypes. Using quantitative flow cytometry we established a clear hierarchy of expression for the four HLA-A,B allotypes investigated. Our results are consistent with an inverse correlation between expression and peptide diversity, although further work is necessary to solidify this hypothesis. The global origins of the MAP repertoire remain a central question both fundamentally and in the search for immunotherapeutic targets. Using proteogenomics, we identified and analyzed 25,172 MAPs isolated from B lymphocytes of 18 individuals who collectively expressed 27 HLA-A,B allotypes. While 58% of genes were the source of 1-64 MAPs per gene, 42% of genes were not represented in the immunopeptidome. Overall, we estimate the immunopeptidome presented by 27 HLA-A,B allotypes covered only 17% of exomic sequences expressed in subjects’ cells. We identified several features of transcripts and proteins that enhance MAP production. From these data we built a logistic regression model that predicts with high accuracy whether a gene from our dataset or from independent datasets would generate MAPs. Our results show preferential selection of MAPs from a limited repertoire of gene products with distinct features. The notion that the immune system can monitor MAPs covering only a fraction of the protein-coding genome has profound implications in autoimmunity and cancer immunology.

    Functional characterisation of pncr003;2L, a small open reading frame gene conserved from Drosophila to humans

    Small open reading frame genes (smORFs) are a new class of genes, which emerged from the revision of the idea that open reading frames have to be longer than 100 codons to be protein coding and functional. Although bioinformatics evidence suggests that thousands of smORF genes could exist in any given genome, proof of their functional relevance can only be obtained through their functional characterization. This work represents such a study for a Drosophila smORF (pncr003;2L), which was initially misannotated as a non-coding RNA because of its lack of a canonical long open reading frame. Here I show that pncr003;2L codes for two small peptides of 28 and 29 aa, expressed in somatic and cardiac muscles. After generating a null condition for this gene, I use the adult Drosophila heart as a system to assess the function of pncr003;2L. With this system, I show that the small pncr003;2L peptides regulate heart contractions by modulating Ca2+ cycling in cardiac muscles, with either lack or excess of function of these peptides leading to cardiac arrhythmias and abnormal calcium dynamics. Finally, through an extensive homology study, I show that these small peptides share a great amount of structural and functional homology with the peptides encoded by the vertebrate smORFs sarcolipin (sln) and phospholamban (pln), which act as major regulators of the Sarco-Endoplasmic Reticulum Calcium ATPase (SERCA), the channel responsible for calcium uptake into the ER following muscle contraction. These results highlight the importance of the pncr003;2L smORF and the Drosophila system for the study of cardiac pathologies, but most importantly, they show that this family of peptides, conserved across evolution, represents an ancient system for the regulation of calcium trafficking in muscles. This work corroborates the prevalence and relevance of this novel class of genes, and shows that closer attention should be given to smORFs in order to determine the full extent of their biological contribution.

    Enumeration, conformation sampling and population of libraries of peptide macrocycles for the search of chemotherapeutic cardioprotection agents

    Peptides are uniquely endowed with features that allow them to perturb previously difficult-to-drug biomolecular targets. Peptide macrocycles in particular have seen a flurry of recent interest due to their enhanced bioavailability, tunability and specificity. Although these properties make them attractive hit candidates in early-stage drug discovery, knowing which peptides to pursue is non-trivial due to the magnitude of the peptide sequence space. Computational screening approaches show promise in their ability to address the size of this search space but suffer from their inability to accurately interrogate the conformational landscape of peptide macrocycles. We developed an in-silico compound enumerator that was tasked with populating a conformation-laden peptide virtual library. This library was then used in the search for cardio-protective agents (agents that may be administered to reduce tissue damage during reperfusion after ischemia, i.e. heart attacks). Our enumerator successfully generated a library of 15.2 billion compounds, requiring the use of compression algorithms, conformational sampling protocols and management of aggregated compute resources in the context of a local cluster. In the absence of experimental biophysical data, we performed biased sampling during alchemical molecular dynamics simulations in order to observe cyclophilin-D perturbation by cyclosporine A and its mitochondrial targeted analogue. Reliable intermediate state averaging through a WHAM analysis of the biased dynamic pulling simulations confirmed that the cardio-protective activity of cyclosporine A was due to its mitochondrial targeting. Parallel-tempered solution molecular dynamics in combination with efficient clustering isolated the essential dynamics of a cyclic peptide scaffold. The rapid enumeration of skeletons from these essential dynamics gave rise to a conformation-laden virtual library of all the 15.2 billion unique cyclic peptides (given the limits on peptide sequence imposed). Analysis of this library showed the exact extent of physicochemical properties covered, relative to the bare scaffold precursor. Molecular docking of a subset of the virtual library against cyclophilin-D showed significant improvements in affinity to the target (relative to cyclosporine A). The conformation-laden virtual library, accessed by our methodology, provided derivatives that were able to make many interactions per peptide with the cyclophilin-D target. Machine learning methods showed promise in the training of Support Vector Machines for synthetic feasibility prediction for this library. The synergy between enumeration and conformational sampling greatly improves the performance of this library during virtual screening, even when only a subset is used.

    Computational studies of protein-ligand molecular recognition

    Structure-based drug design is made possible by our understanding of molecular recognition. The utility of this approach was apparent in the development of the clinically effective HIV-1 PR inhibitors, where crystal structures of complexes of HIV-1 protease and inhibitors gave pivotal information. Computational methods drawing upon structural data are of increasing relevance to the drug design process. Nonetheless, these methods are quite rudimentary and significant improvements are needed. The aim of this thesis was to investigate techniques which may lead to improved modelling of molecular recognition and a better ability to make predictions about the binding affinity of ligands. The two main themes were the modelling of acid-base titration behaviour of ligand and receptor, and the application of the simulation technique of configurational bias Monte Carlo (CBMC). The studies were performed with HIV-1 PR and its inhibitors as a model system. Biological processes are influenced by the pH of the medium in which they take place. Ligand-receptor binding equilibria are often thermodynamically linked to protonation changes in ligand and/or receptor, as seen in the binding of a number of HIV-1 PR inhibitors. In Chapter 2, a series of sixteen continuum electrostatics pKa calculations of HIV-1 PR inhibitor complexes was done, in order to characterize the nature and size of these linkages. The most important effects concern changes in the pKa of the enzyme active site aspartate dyad. Large pKa shifts were predicted in all cases, and at least one of the two dyad pKas became more basic on binding. At physiologically relevant pH, different ligands induced different protonation states, with different tautomeric forms favoured. The fully deprotonated form of the dyad was not significantly populated for any of the complexes. For about a third of the complexes, both singly and doubly protonated forms were predicted to be populated. The predicted predominant protonation states of MVT-101 and VX-478 were consistent with previous theoretical studies. The size of the predicted pKa shifts for MVT-101 and XK263 differed from a previous study using similar methods. The paucity and ambiguity of available experimental data makes it difficult to evaluate the results fully; however the tendency to exaggerate shifts, as observed in other studies, appears to be present. Scoring is the prediction of binding affinity from the structure of the ligand-receptor complex, according to an empirical scheme. Scoring studies usually neglect or grossly simplify the contribution of protonation equilibria to affinity, so in Chapter 3 proton linkage data was included in a regression analysis of the HIV-1 PR complexes from Chapter 2. Parameters previously shown to correlate with binding, namely electrostatic free energy changes and buried surface areas, were the basis for the analysis, and terms describing proton linkage, in the form of a correction for assay pH and an indicator variable for predicted dyad pKa shift on binding, were also considered. The complex with MVT-101 was an outlier in the analysis and was excluded. Further analysis demonstrated that the correction for assay pH made a significant contribution to the regression equation. Amendment of the parameters for XK263 according to the available experimental data led to an improved regression in which the term for calculated pKa shifts also made a significant contribution. The regression equations obtained had the same form and similar coefficients to scoring functions of the master equation type, and fit the experimental data with comparable accuracy. More physically realistic simulations of ligand-receptor binding using the techniques of molecular dynamics (MD) or Monte Carlo (MC) are potentially more accurate than scoring function approaches. These methods are slow, so the alternative of CBMC, which has been shown to give faster convergence for polymer simulations, was implemented for CHARMM22, an all-atom protein force field (Chapter 4). The correctness of the implementation was demonstrated by comparison with exact and stochastic dynamics (SD) results for individual terms in the force field. The algorithm is more complex than those typically used with alkane force fields, and this has possible consequences for the efficiency. CBMC was used to generate a Ramachandran plot for the alanine dipeptide, and the results were found to be in agreement with those generated by an SD simulation. Analysis of statistical errors suggests that CBMC should be competitive with umbrella sampling for simulating conformational equilibria, particularly when the cost of non-bonded energy evaluations dominates the simulation. CBMC can be applied to ligand-receptor binding, as demonstrated in grand canonical simulations of alkane adsorption in zeolites. The more limited problem of finding the predominant bound conformation of a flexible ligand given a rigid protein receptor (i.e. docking) was treated in Chapter 5, using the example of a tripeptide inhibitor which binds to HIV-1 PR. Attempts to perform the docking using the Metropolis MC/simulated annealing and Lamarckian genetic algorithm methods implemented in the program AutoDock failed to reproduce the native configuration (with runs on the order of two days execution time). Docking using CBMC, combined with parallel tempering to further improve sampling, was successful in finding the native binding mode, although this success was dependent on ad hoc adjustments to the force field, and a priori knowledge of the ligand protonation state and binding site. The efficiency of the method was considerably lower than hoped, with problems due to the force field- and model-dependent coupling between terms in the potential energy function, and the greedy nature of the CBMC algorithm. Various conclusions can be drawn from these studies. Chapters 2 and 3 provide evidence of the importance of protonation equilibria in ligand-protein molecular recognition, and underline the sizable contribution of electrostatic interactions to binding energies. In the face of this finding, neglect of electrostatic terms, as often seen in past studies, appears to be counterproductive. The scoring study also shows how experimental data can be used more effectively if factors such as assay conditions are carefully taken into account. Implementation of CBMC for a widely-used protein force field and application of the algorithm to docking (Chapters 4 and 5) represents a proof of concept for a broadly useful simulation technique. Further work will be required to find the right niche for CBMC and fully explore the potential of this and related techniques. A final point is the demonstrated utility of the HIV-1 PR test system which formed the focus of the studies. Abundant structural data has enabled many new approaches to be tested, and further insights are expected from the analysis of unusual cases, such as the anomalous results for MVT-101. As well as the question of scoring, studies of mutation and resistance are likely to attract considerable interest in the future.
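
The link between pKa shifts, assay pH and protonation state described in this abstract can be illustrated with a short Python sketch. It uses the textbook Henderson-Hasselbalch relation for a single, independent titratable site, which is a deliberate simplification: the thesis used continuum electrostatics calculations on a coupled aspartate dyad, not this formula. The function names and all pKa and pH values below are hypothetical and are not taken from the thesis.

```python
def protonated_fraction(pka, ph):
    """Fraction of an independent titratable site that is protonated at a given pH.

    Henderson-Hasselbalch for a single site; site-site coupling is ignored.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def proton_uptake(pka_free, pka_bound, ph):
    """Net protons taken up by one site on binding, under the same simplification."""
    return protonated_fraction(pka_bound, ph) - protonated_fraction(pka_free, ph)


# Hypothetical values: a site whose pKa rises from 4.0 to 6.0 on ligand binding.
for ph in (4.0, 5.0, 6.0, 7.0):
    print(f"pH {ph:.1f}: uptake = {proton_uptake(4.0, 6.0, ph):.3f}")
```

In this simplified picture, an upward pKa shift on binding produces a pH-dependent proton uptake, which is one way to see why a correction for assay pH could matter when comparing measured affinities.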

    The roles of processing, presentation and T cell receptor recognition in the T lymphocyte response to a protein antigen

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Biology, 1994. Includes bibliographical references (leaves 126-145). By Alex Szabo. Ph.D.

    Systems biology of the human MHC class I immunopeptidome

    The self/nonself discrimination system of vertebrates allows detection and rejection of pathogens and allogeneic cells. It requires the surveillance of short peptides presented by major histocompatibility complex class I (MHC I) molecules on the cell surface. MHC I molecules are heterodimers that consist of a heavy chain produced by MHC genes and a light chain encoded by the β2-microglobulin gene. The peptides presented by MHC I molecules are collectively referred to as the MHC I immunopeptidome. We employed systems biology approaches to define the composition and cellular origin of the self MHC I immunopeptidome presented by B lymphoblastoid cells derived from two pairs of MHC-identical siblings. We found that the MHC I immunopeptidome is subject- and cell-specific, derives preferentially from abundant transcripts, is enriched in transcripts bearing microRNA response elements and shows no bias toward invariant vs. polymorphic genomic sequences. We also developed a novel personalized approach combining mass spectrometry, next-generation sequencing and bioinformatics for high-throughput identification of MHC I peptides including those caused by nonsynonymous single nucleotide polymorphisms (ns-SNPs), termed minor histocompatibility antigens (MiHAs), which are the targets of allo-immune responses. Comparison of the genomic landscape of the immunopeptidome of MHC-identical siblings revealed that 0.5% of ns-SNPs were represented in the immunopeptidome and that 0.3% of the MHC I-peptide repertoire would be immunogenic for one of the siblings. We discovered new factors that shape the self MHC I immunopeptidome and present a novel strategy for the identification of MHC I-associated peptides that could greatly accelerate the development of MiHA-targeted immunotherapy.