1,831 research outputs found

    Enhancing in silico protein-based vaccine discovery for eukaryotic pathogens using predicted peptide-MHC binding and peptide conservation scores

    © 2014 Goodswen et al. Given thousands of proteins constituting a eukaryotic pathogen, the principal objective for a high-throughput in silico vaccine discovery pipeline is to select those proteins worthy of laboratory validation. Accurate prediction of T-cell epitopes on protein antigens is one crucial piece of evidence that would aid in this selection. Prediction of peptides recognised by T-cell receptors has to date proved to be of insufficient accuracy. The in silico approach is consequently reliant on an indirect method, which involves the prediction of peptides binding to major histocompatibility complex (MHC) molecules. There is nevertheless no guarantee that predicted peptide-MHC complexes will be presented by antigen-presenting cells and/or recognised by cognate T-cell receptors. The aim of this study was to determine if predicted peptide-MHC binding scores could provide contributing evidence to establish a protein's potential as a vaccine. Using T-cell MHC class I binding prediction tools provided by the Immune Epitope Database and Analysis Resource, peptide binding affinities to 76 common MHC I alleles were predicted for 160 Toxoplasma gondii proteins: 75 taken from published studies represented proteins known or expected to induce T-cell immune responses and 85 were considered less likely vaccine candidates. The results show there is no universal set of rules that can be applied directly to binding scores to distinguish a vaccine from a non-vaccine candidate. We present, however, two proposed strategies exploiting binding scores that provide supporting evidence that a protein is likely to induce a T-cell immune response: one using random forest (a machine learning algorithm) with a 72% sensitivity and 82.4% specificity, and the other using amino acid conservation scores with a 74.6% sensitivity and 70.5% specificity when applied to the 160 benchmark proteins. More importantly, the binding score strategies are valuable evidence contributors to the overall in silico vaccine discovery pool of evidence.
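The sensitivity and specificity figures quoted above follow from confusion-matrix counts over the positive and negative benchmark proteins. The sketch below shows the arithmetic only. The abstract does not report the individual counts, so the values of `tp`, `fn`, `tn` and `fp` are assumptions chosen to match the benchmark sizes (75 and 85) and the reported random forest percentages.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of expected candidates recovered
    specificity = tn / (tn + fp)  # fraction of unlikely candidates rejected
    return sensitivity, specificity


# Hypothetical counts (not taken from the paper): 75 positives, 85 negatives.
sens, spec = sensitivity_specificity(tp=54, fn=21, tn=70, fp=15)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

With these assumed counts, 54 of 75 positives and 70 of 85 negatives are classified correctly, which reproduces figures of roughly 72% and 82.4%.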

    Evaluating High-Throughput Ab Initio Gene Finders to Discover Proteins Encoded in Eukaryotic Pathogen Genomes Missed by Laboratory Techniques

    Next generation sequencing technology is advancing genome sequencing at an unprecedented level. By unravelling the code within a pathogen's genome, every possible protein (prior to post-translational modifications) can theoretically be discovered, irrespective of life cycle stages and environmental stimuli. Now more than ever there is a great need for high-throughput ab initio gene finding. Ab initio gene finders use statistical models to predict genes and their exon-intron structures from the genome sequence alone. This paper evaluates whether existing ab initio gene finders can effectively predict genes to deduce proteins that have so far escaped capture by laboratory techniques. A further aim is to identify possible patterns of prediction inaccuracies for gene finders as a whole, irrespective of the target pathogen. All currently available ab initio gene finders are considered in the evaluation but only four fulfil high-throughput capability: AUGUSTUS, GeneMark_hmm, GlimmerHMM, and SNAP. These gene finders require training data specific to a target pathogen and consequently the evaluation results are inextricably linked to the availability and quality of the data. The pathogen Toxoplasma gondii is used to illustrate the evaluation methods. The results support current opinion that exons predicted by ab initio gene finders are inaccurate in the absence of experimental evidence. However, the results reveal some patterns of inaccuracy that are common to all gene finders and these inaccuracies may provide a focus area for future gene finder developers. © 2012 Goodswen et al.
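One common way to quantify exon prediction accuracy is to compare predicted exon coordinates with a reference annotation and count exact matches. The sketch below illustrates that idea and is not necessarily the evaluation used in the paper. The function name and the toy coordinates are assumptions for illustration. Note that in the gene-finding literature, "specificity" at the exon level usually means the fraction of predicted exons that are correct.

```python
def exon_level_accuracy(predicted, reference):
    """Exon-level sensitivity and specificity using exact coordinate matches.

    Each exon is a (start, end) tuple on the same sequence and strand.
    """
    pred, ref = set(predicted), set(reference)
    exact = pred & ref  # exons with both boundaries predicted correctly
    sensitivity = len(exact) / len(ref) if ref else 0.0
    specificity = len(exact) / len(pred) if pred else 0.0
    return sensitivity, specificity


# Toy coordinates, invented for illustration only.
reference_exons = [(100, 250), (400, 520), (700, 910)]
predicted_exons = [(100, 250), (400, 530), (700, 910), (1200, 1300)]
sn, sp = exon_level_accuracy(predicted_exons, reference_exons)
print(f"exon sensitivity={sn:.2f}, specificity={sp:.2f}")
```

In this toy case two of three reference exons are recovered exactly, and two of four predictions are correct. The second predicted exon misses only one boundary, yet it counts as wrong under exact matching.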

    Vacceed: A high-throughput in silico vaccine candidate discovery pipeline for eukaryotic pathogens based on reverse vaccinology

    Summary: We present Vacceed, a highly configurable and scalable framework designed to automate the process of high-throughput in silico vaccine candidate discovery for eukaryotic pathogens. Given thousands of protein sequences from the target pathogen as input, the main output is a ranked list of protein candidates determined by a set of machine learning algorithms. Vacceed has the potential to save time and money by reducing the number of false candidates allocated for laboratory validation. Vacceed, if required, can also predict protein sequences from the pathogen's genome. © The Author 2014

    A gene-based positive selection detection approach to identify vaccine candidates using Toxoplasma gondii as a test case protozoan pathogen

    © 2018 Goodswen, Kennedy and Ellis. Over the last two decades, various in silico approaches have been developed and refined that attempt to identify protein and/or peptide vaccine candidates from informative signals encoded in protein sequences of a target pathogen. To date, no signal has been identified that clearly indicates a protein will effectively contribute to a protective immune response in a host. The premise for this study is that proteins under positive selection from the immune system are more likely to be suitable vaccine candidates than proteins exposed to other selection pressures. Furthermore, our expectation is that protein sequence regions encoding major histocompatibility complex (MHC) binding peptides will contain consecutive positive selection sites. Using freely available data and bioinformatic tools, we present a high-throughput approach through a pipeline that predicts positive selection sites, protein subcellular locations, and sequence locations of medium to high T-cell MHC class I binding peptides. Positive selection sites are estimated from a sequence alignment by comparing rates of synonymous (dS) and non-synonymous (dN) substitutions among protein coding sequences of orthologous genes in a phylogeny. The main pipeline output is a list of protein vaccine candidates predicted to be naturally exposed to the immune system and containing sites under positive selection. Candidates are ranked with respect to the number of consecutive sites located on protein sequence regions encoding MHCI-binding peptides. Results are constrained by the reliability of prediction programs and quality of input data. Protein sequences from Toxoplasma gondii ME49 strain (TGME49) were used as a case study. Surface antigen (SAG), dense granule (GRA), microneme (MIC), and rhoptry (ROP) proteins are considered worthy T. gondii candidates. Given 8263 TGME49 protein sequences processed anonymously, the top 10 predicted candidates were all worthy candidates. In particular, the top 10 included ROP5 and ROP18, which are T. gondii virulence determinants. The chance of randomly selecting a ROP protein was 0.2% given 8263 sequences. We conclude that the approach described is a valuable addition to other in silico approaches to identify vaccine candidates worthy of laboratory validation and could be adapted for other apicomplexan parasite species (with appropriate data).
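The ranking criterion described above can be sketched as follows. This is one possible reading of "number of consecutive sites", taken here as the longest run of adjacent positively selected positions that fall inside predicted MHC I binding regions. The function names, the scoring rule, and the toy data are assumptions and not the pipeline's actual implementation.

```python
def longest_consecutive_run(sites):
    """Length of the longest run of adjacent sequence positions in `sites`."""
    best = run = 0
    prev = None
    for pos in sorted(set(sites)):
        run = run + 1 if prev is not None and pos == prev + 1 else 1
        best = max(best, run)
        prev = pos
    return best


def score_protein(selected_sites, binding_regions):
    """Longest run of positively selected sites lying in binding regions.

    `binding_regions` is a list of inclusive (start, end) positions.
    """
    inside = [
        pos for pos in selected_sites
        if any(start <= pos <= end for start, end in binding_regions)
    ]
    return longest_consecutive_run(inside)


# Toy proteins with invented site positions and binding regions.
proteins = {
    "protA": ([10, 11, 12, 40], [(8, 16)]),
    "protB": ([5, 7, 9], [(1, 9)]),
}
ranked = sorted(proteins, key=lambda name: score_protein(*proteins[name]), reverse=True)
print(ranked)
```

Here `protA` ranks first because three adjacent selected sites lie within one binding region, while the selected sites of `protB` are never adjacent.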

    Predicting Protein Therapeutic Candidates for Bovine Babesiosis Using Secondary Structure Properties and Machine Learning

    Bovine babesiosis causes significant annual global economic loss in the beef and dairy cattle industry. It is a disease caused by infection of red blood cells by haemoprotozoan parasites of the genus Babesia in the phylum Apicomplexa. Principal species are Babesia bovis, Babesia bigemina, and Babesia divergens. There is no subunit vaccine. Potential therapeutic targets against babesiosis include members of the exportome. This study investigates the novel use of protein secondary structure characteristics and machine learning algorithms to predict exportome membership probabilities. The premise of the approach is to detect characteristic differences that can help distinguish one protein type from another. Structural properties such as a protein's local conformational classification states, backbone torsion angles ϕ (phi) and ψ (psi), solvent-accessible surface area, contact number, and half-sphere exposure are explored here as potential distinguishing protein characteristics. The presented methods that exploit these structural properties via machine learning are shown to have the capacity to distinguish exportome from non-exportome Babesia bovis proteins with an 86-92% accuracy (based on 10-fold cross validation and independent testing). These methods are encapsulated in freely available Linux pipelines set up for automated, high-throughput processing. Furthermore, proposed therapeutic candidates for laboratory investigation are provided for B. bovis, B. bigemina, and two other haemoprotozoan species, Babesia canis and Plasmodium falciparum.
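For readers unfamiliar with 10-fold cross validation, the sketch below shows how a data set can be partitioned so that each sample is used for testing exactly once. It covers only the splitting step, with no classifier, and the function name `k_fold_indices` is hypothetical. How the folds were actually built in the study is not stated in the abstract.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross validation."""
    # Assign sample i to fold i % k so fold sizes differ by at most one.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [
            idx
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for idx in fold
        ]
        yield train, test


# Example: 25 samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
print(len(splits), [len(test) for _, test in splits])
```

Accuracy would then be averaged over the ten held-out folds. In practice the samples are usually shuffled, and often stratified by class, before splitting.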

    Machine learning and applications in microbiology

    Understanding the intricacies of microorganisms at the molecular level requires making sense of copious volumes of data, such that it may now be humanly impossible to detect insightful data patterns without an artificial intelligence application called machine learning. Applying machine learning to address biological problems is expected to grow at an unprecedented rate, yet it is perceived by the uninitiated as a mysterious and daunting entity entrusted to the domain of mathematicians and computer scientists. The aim of this review is to identify key points required to start the journey of becoming an effective machine learning practitioner. These key points are further reinforced with an evaluation of how machine learning has been applied so far in a broad scope of real-life microbiology examples. This includes predicting drug targets or vaccine candidates, diagnosing microorganisms causing infectious diseases, classifying drug resistance against antimicrobial medicines, predicting disease outbreaks and exploring microbial interactions. Our hope is to inspire microbiologists and other related researchers to join the emerging machine learning revolution.

    Solution structure of a repeated unit of the ABA-1 nematode polyprotein allergen of Ascaris reveals a novel fold and two discrete lipid-binding sites

    Parasitic nematode worms cause serious health problems in humans and other animals. They can induce allergic-type immune responses, which can be harmful but may at the same time protect against the infections. Allergens are proteins that trigger allergic reactions and these parasites produce a type that is confined to nematodes, the nematode polyprotein allergens (NPAs). These are synthesized as large precursor proteins comprising repeating units of similar amino acid sequence that are subsequently cleaved into multiple copies of the allergen protein. NPAs bind small lipids such as fatty acids and retinol (Vitamin A) and probably transport these sensitive and insoluble compounds between the tissues of the worms. Nematodes cannot synthesize these lipids, so NPAs may also be crucial for extracting nutrients from their hosts. They may also be involved in altering immune responses by controlling the lipids by which the immune and inflammatory cells communicate. We describe the molecular structure of one unit of an NPA, the well-known ABA-1 allergen of Ascaris, and find its structure to be of a type not previously found for lipid-binding proteins, and we describe the unusual sites where lipids bind within this structure.

    Caterpillars and fungal pathogens: two co-occurring parasites of an ant-plant mutualism

    In mutualisms, each interacting species obtains resources from its partner that it would obtain less efficiently if alone, and so derives a net fitness benefit. In exchange for shelter (domatia) and food, mutualistic plant-ants protect their host myrmecophytes from herbivores, encroaching vines and fungal pathogens. Although selective filters enable myrmecophytes to host those ant species most favorable to their fitness, some insects can by-pass these filters, exploiting the rewards supplied whilst providing nothing in return. This is the case in French Guiana for Cecropia obtusa (Cecropiaceae) as Pseudocabima guianalis caterpillars (Lepidoptera, Pyralidae) can colonize saplings before the installation of their mutualistic Azteca ants. The caterpillars shelter in the domatia and feed on food bodies (FBs) whose production increases as a result. They delay colonization by ants by weaving a silk shield above the youngest trichilium, where the FBs are produced, blocking access to them. This probable temporal priority effect also allows female moths to lay new eggs on trees that already shelter caterpillars, and so to occupy the niche longer and exploit Cecropia resources before colonization by ants. However, once incipient ant colonies are able to develop, they prevent further colonization by the caterpillars. Although no higher herbivory rates were noted, these caterpillars are ineffective in protecting their host trees from a pathogenic fungus, Fusarium moniliforme (Deuteromycetes), that develops on the trichilium in the absence of mutualistic ants. Therefore, the Cecropia treelets can be parasitized by two often overlooked species: the caterpillars that shelter in the domatia and feed on FBs, delaying colonization by mutualistic ants, and the fungal pathogen that develops on old trichilia. The cost of greater FB production plus the presence of the pathogenic fungus likely affect tree growth.