217 research outputs found
Computer aided selection of candidate vaccine antigens
Immunoinformatics is an emergent branch of informatics science that long ago pullulated from the tree of knowledge that is bioinformatics. It is a discipline which applies informatic techniques to problems of the immune system. To a great extent, immunoinformatics is typified by epitope prediction methods. It has found disappointingly limited use in the design and discovery of new vaccines, an area where proper computational support is generally lacking. Most extant vaccines are not based around isolated epitopes but rather correspond to chemically treated or attenuated whole pathogens, to individual proteins extracted from whole pathogens, or to complex carbohydrates. In this chapter we attempt to review what progress there has been in an as-yet-underexplored area of immunoinformatics: the computational discovery of whole protein antigens. The effective development of antigen prediction methods would significantly reduce the laboratory resources required to identify pathogenic proteins as candidate subunit vaccines. We begin our review by placing antigen prediction firmly into context, exploring the role of reverse vaccinology in the design and discovery of vaccines. We also highlight several competing yet ultimately complementary methodological approaches: sub-cellular location prediction, identifying antigens using sequence similarity, and the use of sophisticated statistical approaches for predicting the probability of antigen characteristics. We end by exploring how a systems immunomics approach to the prediction of immunogenicity would prove helpful in the prediction of antigens.
Computational Modeling and Simulations of Protein-Drug and Protein-Protein Complexes: as potential target for therapeutics development
The main objective of my thesis is to illustrate the potential of computational modeling techniques in determining decisive protein-protein and protein-ligand interactions in two relevant macromolecular biological systems associated with human diseases. Computational tools such as homology modeling, molecular docking and molecular dynamics simulations are presented, together with the protocols developed for the preparation, simulation and analysis of each biological system. The first contribution is the modeling and simulation of protein-peptide-protein complexes related to the adaptive immune system and multiple sclerosis. The molecular similarity between a self-peptide and two microbial peptides in these complexes is investigated with respect to the molecular recognition mechanism.
The second contribution is the investigation of protein-ligand interactions in biological systems associated with Alzheimer's disease. Computational results are compared with experiments to establish the origin and degree of the selective inhibition displayed by 2-phenylbenzofuran ligands against the butyrylcholinesterase (BChE) protein. The final contribution is the application of a priori knowledge gathered on protein-ligand interactions to the design of ligands with specific structural modifications that display improved inhibitory activity against BChE. In conclusion, therapeutic perspectives and the application of hybrid computational approaches to the design and development of potential drugs are discussed.
Structural and Conformational Analysis of B-cell Epitopes - component to guide peptide vaccine design
Peptide vaccines have many potential advantages, including low cost, no need for cold-chain storage, and safety. However, it is well known that approximately 90% of B-cell epitopes (BCEs) are discontinuous in nature, making it difficult to mimic them for creating vaccines. To perform a detailed structural analysis of these epitopes, they need to be mapped onto antigen structures that are complexed with antibody. In order to obtain a clean dataset of antibody-antigen complex crystal structures, a pipeline was designed to automatically process and clean the antibody-related structures from the PDB. To store this processed antibody structural data, a database (AbDb) was built and made available online. The degree of discontinuity in B-cell epitopes and their conformational nature was studied by mapping epitopes in the antibody-antigen dataset. The discontinuity of B-cell epitopes was analysed by defining extended "regions" (R, consisting of at least 3 antibody-contacting residues each separated by ≤ 3 residues) and small fragments (F, antibody-contacting residues that do not satisfy the requirements for a region). Secondly, an algorithm was developed to classify region shape as linear, curved or folded. Molecular dynamics simulations were carried out on isolated epitope regions (wild type and mutant peptides). The mutant peptides were designed by mutating non-contacting and hydrophobic residues in epitopes. Two types of mutations (hydrophobic to alanine and hydrophobic to glutamine) were studied using molecular dynamics simulations. Furthermore, the effect of end-capping on wild type and mutant epitope regions was studied. Simulation studies were carried out on 5 linear and 5 folded shape regions. Of these, 2 epitopes (one linear and one folded), along with their mutants and derivatives, were tested experimentally for conformational stability by CD spectroscopy and NMR.
The binding of isolated epitopes with antibody was also validated by ELISA and SPR.
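The region/fragment definition above can be illustrated with a minimal Python sketch. This is not the thesis pipeline: the function name `split_regions_fragments`, the parameters `min_size` and `max_gap`, and the residue numbers are hypothetical, and the sketch assumes that "separated by at most 3 residues" means at most three non-contacting residues between consecutive contacting residues on a single chain. Chain identifiers and PDB insertion codes are ignored.

```python
def split_regions_fragments(contacts, min_size=3, max_gap=3):
    """Group antibody-contacting residue numbers into regions and fragments.

    Assumption (not from the source): consecutive contacts belong to the same
    run when at most `max_gap` non-contacting residues lie between them.
    A run with at least `min_size` contacts is a region; any shorter run is
    reported as a fragment.
    """
    regions, fragments = [], []
    run = []
    for pos in sorted(set(contacts)):
        # pos - run[-1] - 1 is the number of residues lying between two contacts.
        if run and pos - run[-1] - 1 > max_gap:
            (regions if len(run) >= min_size else fragments).append(run)
            run = []
        run.append(pos)
    if run:
        (regions if len(run) >= min_size else fragments).append(run)
    return regions, fragments


# Hypothetical contact list for one antigen chain.
regions, fragments = split_regions_fragments([10, 12, 15, 30, 31, 50, 52, 56, 90])
print("regions:", regions)
print("fragments:", fragments)
```

With this hypothetical input the contacts at 10, 12, 15 and at 50, 52, 56 form two regions, while 30, 31 and 90 are left as fragments. Classifying a region as linear, curved or folded would need 3D coordinates and is outside this sketch.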
Studies of MHC class I antigen presentation & the origins of the immunopeptidome
Antigen presentation by major histocompatibility complex class I (MHCI) molecules allows the adaptive immune system to detect and eliminate intracellular pathogens or abnormal cells. Immune surveillance is executed by CD8 T cells that monitor the repertoire of MHCI-associated peptides (MAPs) presented at the surface of all nucleated cells.
The primary human MHCI genes, HLA-A and HLA-B, are highly polymorphic and consequently show differences in antigen presentation. We investigated qualitative and quantitative differences in expression and peptide binding. Using quantitative flow cytometry, we established a clear hierarchy of expression for the four HLA-A,B allotypes investigated. Our results are consistent with an inverse correlation between expression and peptide diversity, although further work is necessary to solidify this hypothesis.
The global origins of the MAP repertoire remain a central question both fundamentally and in the search for immunotherapeutic targets. Using proteogenomics, we identified and analyzed 25,172 MAPs isolated from B lymphocytes of 18 individuals who collectively expressed 27 HLA-A,B allotypes. While 58% of genes were the source of 1-64 MAPs per gene, 42% of genes were not represented in the immunopeptidome. Overall, we estimate that the immunopeptidome presented by 27 HLA-A,B allotypes covered only 17% of exomic sequences expressed in subjects' cells. We identified several features of transcripts and proteins that enhance MAP production. From these data we built a logistic regression model that predicts with high accuracy whether a gene from our dataset or from independent datasets would generate MAPs. Our results show preferential selection of MAPs from a limited repertoire of gene products with distinct features. The notion that the immune system can monitor MAPs covering only a fraction of the protein-coding genome has profound implications in autoimmunity and cancer immunology.
Functional characterisation of pncr003;2L, a small open reading frame gene conserved from drosophila to humans
Small open reading frame genes (smORFs) are a new class of genes, which emerged from the revision of the idea that open reading frames have to be longer than 100 codons to be protein coding and functional. Although bioinformatics evidence suggests that thousands of smORF genes could exist in any given genome, proof of their functional relevance can only be obtained through their functional characterization. This work represents such a study for a Drosophila smORF (pncr003;2L), which was initially misannotated as a non-coding RNA because of its lack of a canonical long open reading frame. Here I show that pncr003;2L codes for two small peptides of 28 and 29 aa, expressed in somatic and cardiac muscles. After generating a null condition for this gene, I use the adult Drosophila heart as a system to assess the function of pncr003;2L. With this system, I show that the small pncr003;2L peptides regulate heart contractions by modulating Ca2+ cycling in cardiac muscles, with either lack or excess of function of these peptides leading to cardiac arrhythmias and abnormal calcium dynamics. Finally, through an extensive homology study, I show that these small peptides share a great amount of structural and functional homology with the peptides encoded by the vertebrate smORFs sarcolipin (sln) and phospholamban (pln), which act as major regulators of the Sarco-Endoplasmic Reticulum Calcium ATPase (SERCA), the channel responsible for calcium uptake into the ER following muscle contraction. These results highlight the importance of the pncr003;2L smORF and the Drosophila system for the study of cardiac pathologies, but most importantly, they show that this family of peptides, conserved across evolution, represents an ancient system for the regulation of calcium trafficking in muscles.
This work corroborates the prevalence and relevance of this novel class of genes, and shows that closer attention should be given to smORFs in order to determine the full extent of their biological contribution.
Enumeration, conformation sampling and population of libraries of peptide macrocycles for the search of chemotherapeutic cardioprotection agents
Peptides are uniquely endowed with features that allow them to perturb previously difficult-to-drug biomolecular targets. Peptide macrocycles in particular have seen a flurry of recent interest due to their enhanced bioavailability, tunability and specificity. Although these properties make them attractive hit candidates in early-stage drug discovery, knowing which peptides to pursue is non-trivial due to the magnitude of the peptide sequence space. Computational screening approaches show promise in their ability to address the size of this search space but suffer from their inability to accurately interrogate the conformational landscape of peptide macrocycles. We developed an in silico compound enumerator that was tasked with populating a conformationally laden peptide virtual library. This library was then used in the search for cardio-protective agents (agents that may be administered to reduce tissue damage during reperfusion after ischemia, as in heart attacks). Our enumerator successfully generated a library of 15.2 billion compounds, requiring the use of compression algorithms, conformational sampling protocols and management of aggregated compute resources in the context of a local cluster. In the absence of experimental biophysical data, we performed biased sampling during alchemical molecular dynamics simulations in order to observe cyclophilin-D perturbation by cyclosporine A and its mitochondrial targeted analogue. Reliable intermediate state averaging through a WHAM analysis of the biased dynamic pulling simulations confirmed that the cardio-protective activity of cyclosporine A was due to its mitochondrial targeting. Parallel-tempered solution molecular dynamics in combination with efficient clustering isolated the essential dynamics of a cyclic peptide scaffold.
The rapid enumeration of skeletons from these essential dynamics gave rise to a conformation-laden virtual library of all 15.2 billion unique cyclic peptides (given the limits imposed on peptide sequence). Analysis of this library showed the exact extent of physicochemical properties covered, relative to the bare scaffold precursor. Molecular docking of a subset of the virtual library against cyclophilin-D showed significant improvements in affinity to the target (relative to cyclosporine A). The conformation-laden virtual library, accessed by our methodology, provided derivatives that were able to make many interactions per peptide with the cyclophilin-D target. Machine learning methods showed promise in the training of Support Vector Machines for synthetic feasibility prediction for this library. The synergy between enumeration and conformational sampling greatly improves the performance of this library during virtual screening, even when only a subset is used.
Computational studies of protein-ligand molecular recognition
Structure-based drug design is made possible by our understanding of molecular recognition. The utility of this approach was apparent in the development of the clinically effective HIV-1 PR inhibitors, where crystal structures of complexes of HIV-1 protease and inhibitors gave pivotal information. Computational methods drawing upon structural data are of increasing relevance to the drug design process. Nonetheless, these methods are quite rudimentary and significant improvements are needed. The aim of this thesis was to investigate techniques which may lead to improved modelling of molecular recognition and a better ability to make predictions about the binding affinity of ligands. The two main themes were the modelling of acid-base titration behaviour of ligand and receptor, and the application of the simulation technique of configurational bias Monte Carlo (CBMC). The studies were performed with HIV-1 PR and its inhibitors as a model system.
Biological processes are influenced by the pH of the medium in which they take place. Ligand-receptor binding equilibria are often thermodynamically linked to protonation changes in ligand and/or receptor, as seen in the binding of a number of HIV-1 PR inhibitors. In Chapter 2, a series of sixteen continuum electrostatics pKa calculations of HIV-1 PR inhibitor complexes was done, in order to characterize the nature and size of these linkages. The most important effects concern changes in the pKa of the enzyme active site aspartate dyad. Large pKa shifts were predicted in all cases, and at least one of the two dyad pKas became more basic on binding. At physiologically relevant pH, different ligands induced different protonation states, with different tautomeric forms favoured. The fully deprotonated form of the dyad was not significantly populated for any of the complexes. For about a third of the complexes, both singly and doubly protonated forms were predicted to be populated. The predicted predominant protonation states of MVT-101 and VX-478 were consistent with previous theoretical studies. The size of the predicted pKa shifts for MVT-101 and XK263 differed from a previous study using similar methods. The paucity and ambiguity of available experimental data make it difficult to evaluate the results fully; however, the tendency to exaggerate shifts, as observed in other studies, appears to be present.
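The link between a pKa shift and the protonation state at a given pH can be illustrated with the Henderson-Hasselbalch relation for a single, independent titratable site. The sketch below is only that simplified model, not the continuum electrostatics treatment of the coupled aspartate dyad described above; the function name `protonated_fraction` and all pKa and pH values are hypothetical.

```python
def protonated_fraction(pka, ph):
    """Fraction of an isolated titratable site that is protonated at a given pH.

    Henderson-Hasselbalch for one independent site:
        fraction = 1 / (1 + 10**(pH - pKa))
    Interacting sites such as the two aspartates of the dyad need a
    multi-site treatment, which this sketch does not attempt.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical numbers: a site whose pKa shifts upward by 2 units on binding.
assay_ph = 5.5
for label, pka in (("free enzyme", 4.5), ("complex", 6.5)):
    print(label, round(protonated_fraction(pka, assay_ph), 3))
```

In this hypothetical case a site that is mostly deprotonated in the free enzyme becomes mostly protonated in the complex, which is the kind of proton uptake on binding that links measured affinity to assay pH.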
Scoring is the prediction of binding affinity from the structure of the ligand-receptor complex, according to an empirical scheme. Scoring studies usually neglect or grossly simplify the contribution of protonation equilibria to affinity, so in Chapter 3 proton linkage data was included in a regression analysis of the HIV-1 PR complexes from Chapter 2. Parameters previously shown to correlate with binding, namely electrostatic free energy changes and buried surface areas, were the basis for the analysis, and terms describing proton linkage, in the form of a correction for assay pH and an indicator variable for predicted dyad pKa shift on binding, were also considered. The complex with MVT-101 was an outlier in the analysis and was excluded. Further analysis demonstrated that the correction for assay pH made a significant contribution to the regression equation. Amendment of the parameters for XK263 according to the available experimental data led to an improved regression in which the term for calculated pKa shifts also made a significant contribution. The regression equations obtained had the same form and similar coefficients to scoring functions of the master equation type, and fit the experimental data with comparable accuracy.
More physically realistic simulations of ligand-receptor binding using the techniques of molecular dynamics (MD) or Monte Carlo (MC) are potentially more accurate than scoring function approaches. These methods are slow, so the alternative of CBMC, which has been shown to give faster convergence for polymer simulations, was implemented for CHARMM22, an all-atom protein force field (Chapter 4). The correctness of the implementation was demonstrated by comparison with exact and stochastic dynamics (SD) results for individual terms in the force field. The algorithm is more complex than those typically used with alkane force fields, and this has possible consequences for the efficiency. CBMC was used to generate a Ramachandran plot for the alanine dipeptide, and the results were found to be in agreement with those generated by an SD simulation. Analysis of statistical errors suggests that CBMC should be competitive with umbrella sampling for simulating conformational equilibria, particularly when the cost of non-bonded energy evaluations dominates the simulation.
CBMC can be applied to ligand-receptor binding, as demonstrated in grand canonical simulations of alkane adsorption in zeolites. The more limited problem of finding the predominant bound conformation of a flexible ligand given a rigid protein receptor (i.e. docking) was treated in Chapter 5, using the example of a tripeptide inhibitor which binds to HIV-1 PR. Attempts to perform the docking using the Metropolis MC/simulated annealing and Lamarckian genetic algorithm methods implemented in the program AutoDock failed to reproduce the native configuration (with runs on the order of two days execution time). Docking using CBMC, combined with parallel tempering to further improve sampling, was successful in finding the native binding mode, although this success was dependent on ad hoc adjustments to the force field, and a priori knowledge of the ligand protonation state and binding site. The efficiency of the method was considerably lower than hoped, with problems due to the force field- and model-dependent coupling between terms in the potential energy function, and the greedy nature of the CBMC algorithm.
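Parallel tempering, mentioned above as the sampling aid combined with CBMC, runs replicas at different temperatures and periodically attempts to exchange their configurations. The sketch below shows only the standard replica-exchange acceptance rule in Python, as a generic illustration rather than the thesis implementation; the function names, energies and inverse temperatures are hypothetical, and energies and inverse temperatures are assumed to be in mutually consistent units.

```python
import math
import random


def swap_probability(energy_i, energy_j, beta_i, beta_j):
    """Standard replica-exchange acceptance probability.

    Replica i holds a configuration of energy `energy_i` at inverse
    temperature `beta_i`, and likewise for replica j. The exchange is
    accepted with probability min(1, exp((beta_i - beta_j) * (E_i - E_j))).
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(energy_i, energy_j, beta_i, beta_j, rng=random):
    """Return True if the two replicas should exchange configurations."""
    return rng.random() < swap_probability(energy_i, energy_j, beta_i, beta_j)


# Hypothetical values: a cold replica (beta = 1.0) and a hot one (beta = 0.5).
print(swap_probability(-10.0, -12.0, 1.0, 0.5))  # hot replica found the lower energy
print(swap_probability(-12.0, -10.0, 1.0, 0.5))  # cold replica already lower
```

When the hot replica holds the lower-energy configuration the exchange is always accepted, which is how low-energy binding modes found at high temperature are passed down to the cold replica.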
Various conclusions can be drawn from these studies. Chapters 2 and 3 provide evidence of the importance of protonation equilibria in ligand-protein molecular recognition, and underline the sizable contribution of electrostatic interactions to binding energies. In the face of this finding, neglect of electrostatic terms, as often seen in past studies, appears to be counterproductive. The scoring study also shows how experimental data can be used more effectively if factors such as assay conditions are carefully taken into account. Implementation of CBMC for a widely-used protein force field and application of the algorithm to docking (Chapters 4 and 5) represents a proof of concept for a broadly useful simulation technique. Further work will be required to find the right niche for CBMC and fully explore the potential of this and related techniques. A final point is the demonstrated utility of the HIV-1 PR test system which formed the focus of the studies. Abundant structural data has enabled many new approaches to be tested, and further insights are expected from the analysis of unusual cases, such as the anomalous results for MVT-101. As well as the question of scoring, studies of mutation and resistance are likely to attract considerable interest in the future.
The roles of processing, presentation and T cell receptor recognition in the T lymphocyte response to a protein antigen
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Biology, 1994. Includes bibliographical references (leaves 126-145). By Alex Szabo. Ph.D.
Systems biology of the human MHC class I immunopeptidome
The self/nonself discrimination system of vertebrates allows detection and rejection of pathogens and allogeneic cells. It requires the surveillance of short peptides presented by major histocompatibility complex class I (MHC I) molecules on the cell surface. MHC I molecules are heterodimers that consist of a heavy chain encoded by MHC genes and a light chain encoded by the β2-microglobulin gene. The peptides presented by MHC I molecules are collectively referred to as the MHC I immunopeptidome. We employed systems biology approaches to define the composition and cellular origin of the self MHC I immunopeptidome presented by B lymphoblastoid cells derived from two pairs of MHC-identical siblings. We found that the MHC I immunopeptidome is subject- and cell-specific, derives preferentially from abundant transcripts, is enriched in transcripts bearing microRNA response elements and shows no bias toward invariant vs. polymorphic genomic sequences. We also developed a novel personalized approach combining mass spectrometry, next-generation sequencing and bioinformatics for high-throughput identification of MHC I peptides, including those caused by nonsynonymous single nucleotide polymorphisms (ns-SNPs), termed minor histocompatibility antigens (MiHAs), which are the targets of allo-immune responses.
Comparison of the genomic landscape of the immunopeptidome of MHC-identical siblings revealed that 0.5% of ns-SNPs were represented in the immunopeptidome and that 0.3% of the MHC I-peptide repertoire would be immunogenic for one of the siblings. We discovered new factors that shape the self MHC I immunopeptidome and present a novel strategy for the identification of MHC I-associated peptides that could greatly accelerate the development of MiHA-targeted immunotherapy.