1,516 research outputs found

    Meta-analytic approach to the accurate prediction of secreted virulence effectors in gram-negative bacteria

    Abstract. Background: Many pathogens use a type III secretion system to translocate virulence proteins (called effectors) in order to adapt to the host environment. To date, many prediction tools for effector identification have been developed. However, these tools are insufficiently accurate for producing a list of putative effectors that can be applied directly for labor-intensive experimental verification. This also suggests that important features of effectors have yet to be fully characterized. Results: In this study, we have constructed an accurate approach to predicting secreted virulence effectors from Gram-negative bacteria. It consists of a support vector machine-based discriminant analysis followed by simple criteria-based filtering. The accuracy was assessed by estimating the average number of true positives in the top-20 ranking in the genome-wide screening. In the validation, 10 sets of 20 training and 20 testing examples were randomly selected from 40 known effectors of Salmonella enterica serovar Typhimurium LT2. On average, the SVM portion of our system predicted 9.7 true positives from 20 testing examples in the top-20 of the prediction. Removal of the N-terminal instability, codon adaptation index and ProtParam indices decreased the score to 7.6, 8.9 and 7.9, respectively. These discrimination features suggested that the following characteristics of effectors had been uncovered: unstable N-terminus, non-optimal codon usage, hydrophilic, and less aliphatic. The secondary filtering process represented by coexpression analysis and domain distribution analysis further refined the average true positive counts to 12.3. We further confirmed that our system can correctly predict known effectors of P. syringae DC3000, strongly indicating its feasibility. Conclusions: We have successfully developed an accurate prediction system for screening effectors on a genome-wide scale. We confirmed the accuracy of our system by external validation using known effectors of Salmonella and obtained the accurate list of putative effectors of the organism. The level of accuracy was sufficient to yield candidates for gene-directed experimental verification. Furthermore, new features of effectors were revealed: non-optimal codon usage and instability of the N-terminal region. From these findings, a new working hypothesis is proposed regarding mechanisms controlling the translocation of virulence effectors and determining the substrate specificity encoded in the secretion system.
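    The top-20 evaluation described in this abstract can be illustrated with a minimal sketch. It assumes each validation trial provides a dictionary of per-gene classifier scores and a set of held-out known effectors; the function names and the toy gene identifiers are hypothetical and are not taken from the paper.

```python
def true_positives_in_top_k(scores, known_effectors, k=20):
    """Count held-out known effectors among the k highest-scoring genes.

    scores: dict mapping gene identifier -> classifier score (assumed input).
    known_effectors: set of gene identifiers held out for testing.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sum(1 for gene in ranked[:k] if gene in known_effectors)


def mean_top_k_hits(trials, k=20):
    """Average the top-k true-positive count over several validation trials.

    trials: list of (scores, known_effectors) pairs, one per random split.
    """
    counts = [true_positives_in_top_k(s, e, k) for s, e in trials]
    return sum(counts) / len(counts)


if __name__ == "__main__":
    # Toy data with hypothetical gene names, only to show the call shape.
    toy_scores = {"g1": 0.9, "g2": 0.8, "g3": 0.1, "g4": 0.5}
    print(true_positives_in_top_k(toy_scores, {"g1", "g4"}, k=2))
```

    Averaging this count over the 10 random splits would give a figure of the same kind as the 9.7 and 12.3 values reported in the abstract.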

    PredT4SE-Stack: Prediction of Bacterial Type IV Secreted Effectors From Protein Sequences Using a Stacked Ensemble Method

    Gram-negative bacteria use various secretion systems to deliver their secreted effectors. Among them, the type IV secretion system exists widely in a variety of bacterial species and secretes type IV secreted effectors (T4SEs), which play vital roles in host-pathogen interactions. However, experimental approaches to identify T4SEs are time- and resource-consuming. In the present study, we aimed to develop an in silico stacked ensemble method that predicts, from sequence information, whether a protein is an effector of the type IV secretion system. Protein sequences were encoded as position-specific scoring matrix (PSSM)-composition features, obtained by summing the rows of the PSSM profile that correspond to the same amino acid residue. Based on the PSSM-composition features, we developed a stacked ensemble model, PredT4SE-Stack, to predict T4SEs. It uses an ensemble of base classifiers implemented with various machine learning algorithms, such as support vector machine, gradient boosting machine, and extremely randomized trees, to generate outputs for the meta-classifier in the classification system. Our results demonstrated that the framework of PredT4SE-Stack was a feasible and effective way to accurately identify T4SEs based on protein sequence information. The datasets and source code of PredT4SE-Stack are freely available at http://xbioinfo.sjtu.edu.cn/PredT4SE_Stack/index.php
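    The PSSM-composition encoding mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the PredT4SE-Stack code: the column order in `AMINO_ACIDS` (the order used in PSI-BLAST ASCII output), the function name, and the choice to skip non-standard residues are assumptions made here.

```python
# Assumed column order of the PSSM (as in PSI-BLAST ASCII output).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def pssm_composition(sequence, pssm):
    """Sum PSSM rows that belong to the same residue type.

    sequence: protein sequence of length L.
    pssm: list of L rows, each with 20 scores (one per amino acid column).
    Returns a flat list of 20 * 20 = 400 features.
    """
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    sums = [[0.0] * 20 for _ in range(20)]
    for residue, row in zip(sequence, pssm):
        i = index.get(residue)
        if i is None:
            continue  # assumption: ignore non-standard residues such as X
        for j, value in enumerate(row):
            sums[i][j] += value
    # Flatten the 20 x 20 matrix row by row.
    return [value for row in sums for value in row]
```

    Whether the summed rows are further normalized (for example by sequence length) is not stated in the abstract, so the sketch leaves the sums unscaled.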

    Computational prediction of type III secreted proteins from gram-negative bacteria

    Abstract Background Type III secretion system (T3SS) is a specialized protein delivery system in gram-negative bacteria that injects proteins (called effectors) directly into the eukaryotic host cytosol and facilitates bacterial infection. For many plant and animal pathogens, T3SS is indispensable for disease development. Recently, T3SS has also been found in rhizobia, where it plays a crucial role in the nodulation process. Although a great deal of effort has gone into understanding type III secretion, the precise mechanism underlying the secretion and translocation process is not yet fully understood. In particular, defined secretion and translocation signals enabling the secretion have not been identified in the type III secreted effectors (T3SEs), which makes the identification of these important virulence factors notoriously challenging. The availability of a large number of sequenced genomes for plant- and animal-associated bacteria demands the development of efficient and effective prediction methods for the identification of T3SEs using bioinformatics approaches. Results We have developed a machine learning method based on the N-terminal amino acid sequences to predict novel type III effectors in the plant pathogen Pseudomonas syringae and the microsymbiont rhizobia. The extracted features used in the learning model (or classifier) include amino acid composition, secondary structure and solvent accessibility information. The method achieved a precision of over 90% on P. syringae in a cross validation study. In combination with a promoter screen for the type III specific promoters, this classifier trained on the P. syringae data was applied to predict novel T3SEs from the genomic sequences of four rhizobial strains. This application resulted in 57 candidate type III secreted proteins, 17 of which are confirmed effectors. 
Conclusion Our experimental results demonstrate that the machine learning method based on N-terminal amino acid sequences combined with a promoter screen could prove to be a very effective computational approach for predicting novel type III effectors in gram-negative bacteria. Our method and data are available to the public upon request
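    One of the feature types named in this abstract, amino acid composition of the N-terminal region, can be sketched as below. The window of 50 residues, the function name, and the handling of non-standard residues are assumptions for illustration; the abstract does not state the window length or the exact encoding used.

```python
STANDARD_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"


def n_terminal_composition(sequence, window=50):
    """Fraction of each standard amino acid within the first `window` residues.

    window=50 is a hypothetical default, not a value from the abstract.
    """
    segment = sequence[:window].upper()
    counts = {aa: 0 for aa in STANDARD_RESIDUES}
    for residue in segment:
        if residue in counts:  # non-standard residues are ignored
            counts[residue] += 1
    total = sum(counts.values())
    return {
        aa: (counts[aa] / total if total else 0.0)
        for aa in STANDARD_RESIDUES
    }
```

    The resulting 20 fractions could be concatenated with secondary structure and solvent accessibility features before training a classifier.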

    Comprehensive analysis of the HEPN superfamily: identification of novel roles in intra-genomic conflicts, defense, pathogenesis and RNA processing

    BACKGROUND: The major role of enzymatic toxins that target nucleic acids in biological conflicts at all levels has become increasingly apparent thanks in large part to the advances of comparative genomics. Typically, toxins evolve rapidly hampering the identification of these proteins by sequence analysis. Here we analyze an unexpectedly widespread superfamily of toxin domains most of which possess RNase activity. RESULTS: The HEPN superfamily is comprised of all α-helical domains that were first identified as being associated with DNA polymerase β-type nucleotidyltransferases in prokaryotes and animal Sacsin proteins. Using sensitive sequence and structure comparison methods, we vastly extend the HEPN superfamily by identifying numerous novel families and by detecting diverged HEPN domains in several known protein families. The new HEPN families include the RNase LS and LsoA catalytic domains, KEN domains (e.g. RNaseL and Ire1) and the RNase domains of RloC and PrrC. The majority of HEPN domains contain conserved motifs that constitute a metal-independent endoRNase active site. Some HEPN domains lacking this motif probably function as non-catalytic RNA-binding domains, such as in the case of the mannitol repressor MtlR. Our analysis shows that HEPN domains function as toxins that are shared by numerous systems implicated in intra-genomic, inter-genomic and intra-organismal conflicts across the three domains of cellular life. In prokaryotes HEPN domains are essential components of numerous toxin-antitoxin (TA) and abortive infection (Abi) systems and in addition are tightly associated with many restriction-modification (R-M) and CRISPR-Cas systems, and occasionally with other defense systems such as Pgl and Ter. We present evidence of multiple modes of action of HEPN domains in these systems, which include direct attack on viral RNAs (e.g. LsoA and RNase LS) in conjunction with other RNase domains (e.g. 
a novel RNase H fold domain, NamA), suicidal or dormancy-inducing attack on self RNAs (RM systems and possibly CRISPR-Cas systems), and suicidal attack coupled with direct interaction with phage components (Abi systems). These findings are compatible with the hypothesis on coupling of pathogen-targeting (immunity) and self-directed (programmed cell death and dormancy induction) responses in the evolution of robust antiviral strategies. We propose that altruistic cell suicide mediated by HEPN domains and other functionally similar RNases was essential for the evolution of kin and group selection and cell cooperation. HEPN domains were repeatedly acquired by eukaryotes and incorporated into several core functions such as endonucleolytic processing of the 5.8S-25S/28S rRNA precursor (Las1), a novel ER membrane-associated RNA degradation system (C6orf70), sensing of unprocessed transcripts at the nuclear periphery (Swt1). Multiple lines of evidence suggest that, similar to prokaryotes, HEPN proteins were recruited to antiviral, antitransposon, apoptotic systems or RNA-level response to unfolded proteins (Sacsin and KEN domains) in several groups of eukaryotes. CONCLUSIONS: Extensive sequence and structure comparisons reveal unexpectedly broad presence of the HEPN domain in an enormous variety of defense and stress response systems across the tree of life. In addition, HEPN domains have been recruited to perform essential functions, in particular in eukaryotic rRNA processing. These findings are expected to stimulate experiments that could shed light on diverse cellular processes across the three domains of life. REVIEWERS: This article was reviewed by Martijn Huynen, Igor Zhulin and Nick Grishi

    STRUCTURAL AND FUNCTIONAL STUDIES OF SALMONELLA TYPHIMURIUM EFFECTORS GTGE AND SPVB

    Salmonella is a genus of Gram-negative bacteria, which is a major cause of foodborne disease. To create a safe intracellular environment for the pathogen within the host cells, Salmonella secretes a large number of effectors into the host cytosol. My research concentrates on a structural and functional characterization of two Salmonella effectors, GtgE and SpvB. GtgE is a cysteine protease that specifically cleaves closely related GTPases, Rab29, Rab32 and Rab38. The full-length GtgE and several truncated constructs were cloned, expressed and purified. Extensive crystallization trials were performed with these constructs. In addition, several crystallization rescue strategies have been employed, including introducing entropy-reducing surface mutations, chemical modification to the protein surface, and co-crystallization of inactive GtgE variants with substrate peptide. Despite these extensive efforts, no crystals were obtained. However, new information about GtgE was discovered such as low protease activity in vitro against GST-Rab32 and identification of the N-terminal ~30 residues of GtgE as required for the full function. SpvB is a mono-ADP-ribosyltransferase that modifies G-actin. SpvB is composed of two structural domains. The C-terminal domain of SpvB (SpvB-C) possesses the mono-ADP-ribosyltransferase activity and its structure has been determined. This study provides the first structural determination of the N-terminal domain of SpvB (SpvB-N) at 2.4 Å resolution. This domain is made primarily of β-strands and shows similarity to the N-terminal segment of YenB, an ABC toxin component. A long groove on the protein surface suggests that it functions as a recognition domain. The hypothesis that SpvB-N guides the localization of SpvB and targets the protein to actin was tested, but a co-immunoprecipitation assay excluded strong interaction between SpvB-N and actin. 
Moreover, SpvB-N expressed as a GFP fusion localized to the nucleus while SpvB and SpvB-C localized to the cytosol. I have shown further that SpvB-C was sufficient to disrupt the actin cytoskeleton and induce cell apoptosis. The similarity of SpvB-N to YenB prompted us to investigate its cytotoxicity but only a marginal effect on the host cells was noted. Experiments to study the function of the SpvB-N were able to exclude some possibilities and narrowed down the spectrum of potential functions

    Computational and experimental analysis of TAL effector-DNA binding

    TAL effectors, from the plant-pathogenic bacterial genus Xanthomonas, are DNA binding proteins that can be engineered to bind to almost any sequence of interest. The DNA target of the TAL effector is encoded by a modular central repeat region, with each repeat specifying a single binding site nucleotide. TAL effectors can be targeted to novel DNA sequences by assembling the corresponding repeat sequence. Therefore, custom TAL effectors have become important tools for manipulating gene expression and creating site-specific DNA modifications. This dissertation explores TAL effector-DNA binding through computational and experimental analyses. I identified positional and compositional biases in known TAL effector-target pairs and proposed guidelines for designing custom TAL effectors and TAL effector nucleases (TALENs). Using these guidelines, I created a software tool for TAL effector design. We expanded this tool into a suite of tools for TAL effector/TALEN design and target site prediction. Target site predictions can be used to estimate potential off-target binding of custom TAL effector constructs or to identify unknown targets of natural TAL effectors. Next, I present a case study in engineering disease-resistant rice plants. Inserting multiple TAL effector binding elements (EBEs) into the promoter of a rice resistance gene conferred resistance to diverse strains of Xanthomonas oryzae. Analysis of the EBE sequences revealed that TAL effectors have evolved to target specific host regulatory sequences, and caution is warranted when introducing such sequences into the promoter of an executor resistance gene. Finally, I examine the role of the TAL effector N terminus in DNA binding. Most natural TAL effector binding sites are preceded by a T at the 5' end (T0). Structural data suggest T0 is encoded by tryptophan 232 (W232) in the cryptic -1st repeat. We show that substitutions for W232 alter TAL effector activity and specificity for T0. 
However, we find that the TAL effector-T0 interaction is complex and may depend on other residues in the -1st repeat, the 0th cryptic repeat, or repeat sequence context. Better understanding of TAL effector-DNA binding will improve TAL effector design and target prediction and enhance understanding of the role of TAL effectors in plant disease
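    The one-repeat-per-nucleotide relationship described in this abstract can be sketched with the commonly cited repeat-variable diresidue (RVD) code. The abstract does not name RVDs, so the mapping table, the function name, and the simplification that NN pairs only with G (it is also reported to bind A) are assumptions of this sketch and not the dissertation's software.

```python
# Commonly cited RVD-to-nucleotide associations (simplified: NN -> G only).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def predicted_target(rvds, include_t0=True):
    """Predict the DNA target of a TAL effector from its ordered RVD list.

    rvds: list of two-letter RVD strings, one per repeat.
    include_t0: prepend the 5' T (T0) that precedes most natural sites.
    """
    try:
        bases = [RVD_TO_BASE[rvd] for rvd in rvds]
    except KeyError as err:
        raise ValueError(f"RVD not covered by this sketch: {err}") from None
    return ("T" if include_t0 else "") + "".join(bases)
```

    A real target-site predictor would score every candidate site with per-RVD nucleotide preferences rather than emit a single best-match string, which is why off-target estimation needs more than this lookup.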