22 research outputs found

    HHomp—prediction and classification of outer membrane proteins

    Outer membrane proteins (OMPs) are the transmembrane proteins found in the outer membranes of Gram-negative bacteria, mitochondria and plastids. Most prediction methods have focused on analogous features, such as alternating hydrophobicity patterns. Here, we start from the observation that almost all β-barrel OMPs are related by common ancestry. We identify proteins as OMPs by detecting their homologous relationships to known OMPs using sequence similarity. Given an input sequence, HHomp builds a profile hidden Markov model (HMM) and compares it with an OMP database by pairwise HMM comparison, integrating OMP predictions by PROFtmb. A crucial ingredient is the OMP database, which contains profile HMMs for over 20 000 putative OMP sequences. These were collected with the exhaustive, transitive homology detection method HHsenser, starting from 23 representative OMPs in the PDB database. In a benchmark on TransportDB, HHomp detects 63.5% of the true positives before including the first false positive. This is 70% more than PROFtmb, four times more than BOMP and 10 times more than TMB-Hunt. In Escherichia coli, HHomp identifies 57 out of 59 known OMPs and correctly assigns them to their functional subgroups. HHomp can be accessed at http://toolkit.tuebingen.mpg.de/hhomp
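    The benchmark figure quoted above (true positives detected before the first false positive) can be illustrated with a minimal sketch. This is not HHomp's evaluation code: the function name and the toy ranking are hypothetical, and the sketch only assumes that hits are sorted from best to worst score and labelled as true or false OMPs.

```python
from typing import Sequence


def sensitivity_before_first_fp(ranked_labels: Sequence[bool]) -> float:
    """Fraction of all true positives ranked above the first false positive.

    ranked_labels: hits sorted from best to worst score,
    True for a real OMP and False for a non-OMP (hypothetical input format).
    """
    total_positives = sum(ranked_labels)
    if total_positives == 0:
        raise ValueError("ranking contains no true positives")

    found = 0
    for is_true_omp in ranked_labels:
        if not is_true_omp:
            break  # stop counting at the first false positive
        found += 1
    return found / total_positives


# Toy ranking (illustrative only): two true hits, then a false hit, then one more true hit.
toy_ranking = [True, True, False, True]
toy_score = sensitivity_before_first_fp(toy_ranking)
```

    In the toy ranking, two of the three true OMPs appear before the first false hit, so the score is 2/3. The metric rewards methods that place true hits at the very top of the list.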

    Outer membrane proteins can be simply identified using secondary structure element alignment

    Background: Outer membrane proteins (OMPs) are frequently found in the outer membranes of Gram-negative bacteria, mitochondria and chloroplasts and have been found to play diverse functional roles. Computational discrimination of OMPs from globular proteins and other types of membrane proteins helps to accelerate new genome annotation and drug discovery. Results: Based on the observation that almost all OMPs consist of antiparallel β-strands in a barrel shape and that their secondary structure arrangements differ from those of other types of proteins, we propose a simple method called SSEA-OMP to identify OMPs using secondary structure element alignment. In intensive benchmark experiments, the proposed SSEA-OMP method performs better than some well-established OMP detection methods. Conclusions: The major advantage of SSEA-OMP is its good prediction performance considering its simplicity. The web server implementing the method is freely accessible at http://protein.cau.edu.cn/SSEA-OMP/index.html.

    Protein Domain of Unknown Function 3233 is a Translocation Domain of Autotransporter Secretory Mechanism in Gamma proteobacteria

    Vibrio cholerae, an enteropathogenic Gram-negative bacterium, is one of the main causative agents of waterborne diseases such as cholera. About one third of the organism's genome is uncharacterised, with many protein-coding genes lacking structural and functional information. These proteins form a significant fraction of the genome and are crucial to understanding the organism's complete functional makeup. In this study we report the general structure and function of a family of hypothetical proteins, Domain of Unknown Function 3233 (DUF3233), which are conserved across Gram-negative gammaproteobacteria (especially in Vibrio sp. and similar bacteria). Profile and HMM based sequence search methods were used to screen homologues of DUF3233. The I-TASSER fold recognition method was used to build a three-dimensional structural model of the domain. The structure resembles a transmembrane beta-barrel with an axial N-terminal helix and twelve antiparallel beta-strands. Using a combination of amphipathy and discrimination analysis, we analysed the potential transmembrane beta-barrel forming properties of DUF3233. Sequence, structure and phylogenetic analyses of DUF3233 indicate that this Gram-negative bacterial hypothetical protein resembles the beta-barrel translocation unit of the autotransporter Va secretory mechanism, with a gene organisation that differs from the conventional Va system.

    The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment

    Eukaryote-wide sequence analysis of mitochondrial β-barrel outer membrane proteins

    Background: The outer membranes of mitochondria are thought to be homologous to the outer membranes of Gram-negative bacteria, which contain hundreds of distinct families of β-barrel membrane proteins (BOMPs), often forming channels for transport of nutrients or drugs. However, only four families of mitochondrial BOMPs (MBOMPs) have been confirmed to date. Although estimates as high as 100 have been made in the past, the number of yet undiscovered MBOMPs is an open question. Fortunately, the recent discovery of a membrane integration signal (the β-signal) for MBOMPs gave us an opportunity to look for undiscovered MBOMPs. Results: We present the results of a comprehensive survey of eukaryotic protein sequences intended to identify new MBOMPs. Our search employs recent results on β-signals as well as structural information and a novel BOMP predictor trained on both bacterial and mitochondrial BOMPs. Our principal finding is circumstantial evidence suggesting that few MBOMPs remain to be discovered, if one assumes that, like known MBOMPs, novel MBOMPs will be monomeric and β-signal dependent. In addition, our analysis of MBOMP homologs reveals some exceptions to the current model of the β-signal, but confirms its consistent presence in the C-terminal region of MBOMP proteins. We also report a β-signal independent search for MBOMPs against the yeast and Arabidopsis proteomes. We find no good candidate MBOMPs in yeast, but the Arabidopsis results are less conclusive. Conclusions: Our results suggest there are no remaining MBOMPs left to discover in yeast; and if one assumes all MBOMPs are β-signal dependent, few MBOMP families remain undiscovered in any sequenced organism.

    In silico proteomic and phylogenetic analysis of the outer membrane protein repertoire of gastric Helicobacter species

    Helicobacter (H.) pylori is an important risk factor for gastric malignancies worldwide. Its outer membrane proteome plays an important role in colonization of the human gastric mucosa. However, in zoonotic non-H. pylori helicobacters (NHPHs) also associated with human gastric disease, the composition of the outer membrane (OM) proteome and its relative contribution to disease remain largely unknown. By means of a comprehensive survey of the diversity and distribution of predicted outer membrane proteins (OMPs) identified in all known gastric Helicobacter species with fully annotated genome sequences, we found genus- and species-specific families known or thought to be implicated in virulence. Hop adhesins, part of the Helicobacter-specific family 13 (Hop, Hor and Hom), were restricted to the gastric species H. pylori, H. cetorum and H. acinonychis. Hof proteins (family 33) were putative adhesins with predicted Occ- or MOMP-family like 18-stranded beta-barrels. They were found to be widespread amongst all gastric Helicobacter species but only sporadically detected in enterohepatic Helicobacter species. The latter are other members of the genus Helicobacter, although ecologically and genetically distinct. LpxR, a lipopolysaccharide remodeling factor, was also detected in all gastric Helicobacter species but was lacking from the enterohepatic species H. cinaedi, H. equorum and H. hepaticus. In conclusion, our systemic survey of Helicobacter OMPs points to species and infection-site specific members that are interesting candidates for future virulence and colonization studies.

    CoBaltDB: Complete bacterial and archaeal orfeomes subcellular localization database and associated resources

    BACKGROUND: The functions of proteins are strongly related to their localization in cell compartments (for example the cytoplasm or membranes) but the experimental determination of the sub-cellular localization of proteomes is laborious and expensive. A fast and low-cost alternative approach is in silico prediction, based on features of the protein primary sequences. However, biologists are confronted with a very large number of computational tools that use different methods that address various localization features with diverse specificities and sensitivities. As a result, exploiting these computer resources to predict protein localization accurately involves querying all tools and comparing every prediction output; this is a painstaking task. Therefore, we developed a comprehensive database, called CoBaltDB, that gathers all prediction outputs concerning complete prokaryotic proteomes. DESCRIPTION: The current version of CoBaltDB integrates the results of 43 localization predictors for 784 complete bacterial and archaeal proteomes (2,548,292 proteins in total). CoBaltDB supplies a simple user-friendly interface for retrieving and exploring relevant information about predicted features (such as signal peptide cleavage sites and transmembrane segments). Data are organized into three work-sets ("specialized tools", "meta-tools" and "additional tools"). The database can be queried using the organism name, a locus tag or a list of locus tags and may be browsed using numerous graphical and text displays. CONCLUSIONS: With its new functionalities, CoBaltDB is a novel powerful platform that provides easy access to the results of multiple localization tools and support for predicting prokaryotic protein localizations with higher confidence than previously possible. CoBaltDB is available at http://www.umr6026.univ-rennes1.fr/english/home/research/basic/software/cobalten

    Predicting the outer membrane proteome of Pasteurella multocida based on consensus prediction enhanced by results integration and manual confirmation

    Background: Outer membrane proteins (OMPs) of Pasteurella multocida have various functions related to virulence and pathogenesis and represent important targets for vaccine development. Various bioinformatic algorithms can predict outer membrane localization and discriminate OMPs by structure or function. Designing a confident prediction framework that integrates different predictors, followed by consensus prediction, results integration and manual confirmation, will improve the prediction of the outer membrane proteome. Results: In the present study, we used 10 different predictors classified into three groups (subcellular localization, transmembrane β-barrel protein and lipoprotein predictors) to identify putative OMPs from two available P. multocida genomes: those of avian strain Pm70 and porcine non-toxigenic strain 3480. Predicted proteins in each group were filtered by optimized criteria for consensus prediction: at least two positive predictions for the subcellular localization predictors, three for the transmembrane β-barrel protein predictors and one for the lipoprotein predictors. The consensus predicted proteins from each group were integrated into a single list of proteins. We further incorporated a manual confirmation step, including a public database search against PubMed and sequence analyses (e.g. sequence and structural homology, conserved motifs/domains, functional prediction and protein-protein interactions), to enhance the confidence of prediction. As a result, we were able to confidently predict 98 putative OMPs from the avian strain genome and 107 OMPs from the porcine strain genome, with 83% overlap between the two genomes. Conclusions: The bioinformatic framework developed in this study has increased the number of putative OMPs identified in P. multocida and allowed these OMPs to be identified with a higher degree of confidence. Our approach can be applied to investigate the outer membrane proteomes of other Gram-negative bacteria.
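    The consensus filter described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the group names, the data layout, the protein identifiers and the number of predictors per group are hypothetical; only the three thresholds (two, three and one) come from the abstract.

```python
from typing import Dict, List

# Minimum number of positive calls required within each predictor group.
# The thresholds are those quoted in the abstract; the group keys are made up.
THRESHOLDS: Dict[str, int] = {
    "localization": 2,
    "beta_barrel": 3,
    "lipoprotein": 1,
}


def consensus_omps(calls: Dict[str, Dict[str, List[bool]]]) -> List[str]:
    """Return proteins that reach the consensus threshold in at least one group.

    calls maps protein ID -> group name -> one boolean per predictor
    (hypothetical layout). Per-group consensus lists are merged into one list.
    """
    selected = []
    for protein, groups in calls.items():
        for group, minimum in THRESHOLDS.items():
            if sum(groups.get(group, [])) >= minimum:
                selected.append(protein)
                break  # one qualifying group is enough to enter the merged list
    return sorted(selected)


# Invented example calls; the split of predictors across groups is an assumption.
example_calls = {
    "protA": {"localization": [True, True, False],
              "beta_barrel": [False, False, False, False],
              "lipoprotein": [False]},
    "protB": {"localization": [True, False, False],
              "beta_barrel": [True, True, False, False],
              "lipoprotein": [False]},
    "protC": {"localization": [False, False, False],
              "beta_barrel": [False, False, False, False],
              "lipoprotein": [True]},
}
putative_omps = consensus_omps(example_calls)
```

    Here protA passes on localization and protC on the lipoprotein predictor, while protB falls short in every group. The manual confirmation step described in the abstract is outside the scope of this sketch.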

    Accelerated microevolution in an outer membrane protein (OMP) of the intracellular bacteria Wolbachia

    Background: Outer membrane proteins (OMPs) of Gram-negative bacteria are key players in the biology of bacterial-host interactions. However, while considerable attention has been given to OMPs of vertebrate pathogens, relatively little is known about the role of these proteins in bacteria that primarily infect invertebrates. One such OMP is found in the intracellular bacteria Wolbachia, which are widespread symbionts of arthropods and filarial nematodes. Recent experimental studies have shown that the Wolbachia surface protein (WSP) can trigger host immune responses and control cell death programming in humans, suggesting a key role of WSP for establishment and persistence of the symbiosis in arthropods. Results: Here we performed an analysis of 515 unique alleles found in 831 Wolbachia isolates, to investigate WSP structure, microevolution and population genetics. WSP shows an eight-strand transmembrane β-barrel structure with four extracellular loops containing hypervariable regions (HVRs). A clustering approach based upon patterns of HVR haplotype diversity was used to group similar WSP sequences and to estimate the relative contribution of mutation and recombination during early stages of protein divergence. Results indicate that although point mutations generate most of the new protein haplotypes, recombination is a predominant force triggering diversity since the very first steps of protein evolution, causing at least 50% of the total amino acid variation observed in recently diverged proteins. Analysis of synonymous variants indicates that individual WSP protein types are subject to a very rapid turnover and that HVRs can accommodate a virtually unlimited repertoire of peptides. Overall distribution of WSP across hosts supports a non-random association of WSP with the host genus, although extensive horizontal transfer has occurred also in recent times. Conclusions: In OMPs of vertebrate pathogens, large recombination impact, positive selection, reduced structural and compositional constraints, and extensive lateral gene transfer are considered hallmarks of evolution in response to the adaptive immune system. However, Wolbachia do not infect vertebrates. Here we predict that the rapid turnover of WSP loop motifs could aid in evading or inhibiting the invertebrate innate immune response. Overall, these features identify WSP as a strong candidate for future studies of host-Wolbachia interactions that affect establishment and persistence of this widespread endosymbiosis.

    Machine learning tools for protein annotation: the cases of transmembrane β-barrel and myristoylated proteins

    Biology is now a “Big Data Science” thanks to technological advancements allowing the characterization of the whole macromolecular content of a cell or a collection of cells. This opens interesting perspectives, but only a small portion of this data can be experimentally characterized. This creates a demand for accurate and efficient computational tools for automatic annotation of biological molecules. This is even more true when dealing with membrane proteins, on which my research project is focused, leading to the development of two machine learning-based methods: BetAware-Deep and SVMyr. BetAware-Deep is a tool for the detection and topology prediction of transmembrane beta-barrel proteins found in Gram-negative bacteria. These proteins are involved in many biological processes and are primary candidates as drug targets. BetAware-Deep exploits the combination of a deep learning framework (bidirectional long short-term memory) and a probabilistic graphical model (grammatical-restrained hidden conditional random field). Moreover, it introduces a modified formulation of the hydrophobic moment, designed to include evolutionary information. BetAware-Deep outperformed all the available methods in topology prediction and reported high scores in the detection task. Glycine myristoylation in eukaryotes is the binding of a myristic acid to an N-terminal glycine. SVMyr is a fast method based on support vector machines designed to predict this modification in datasets of proteomic scale. It uses octapeptides as input and exploits computational scores derived from experimental examples and mean physicochemical features. SVMyr outperformed all the available methods for co-translational myristoylation prediction. In addition, it allows (as a unique feature) the prediction of post-translational myristoylation. Both tools described here are designed with the best practices for the development of machine learning-based tools outlined by the bioinformatics community in mind. Moreover, they are made available via user-friendly web servers. All this makes them valuable tools for filling the gap between sequence data and annotated data.