2,218 research outputs found

    Investigation of transmembrane proteins using a computational approach

    Background: An important subfamily of membrane proteins are the transmembrane α-helical proteins, in which the membrane-spanning regions are made up of α-helices. Given the obvious biological and medical significance of these proteins, it is of tremendous practical importance to identify the location of transmembrane segments. The difficulty of inferring the secondary or tertiary structure of transmembrane proteins using experimental techniques has led to a surge of interest in applying techniques from machine learning and bioinformatics to infer secondary structure from primary structure in these proteins. We are therefore interested in determining which physicochemical properties are most useful for discriminating transmembrane segments from non-transmembrane segments in transmembrane proteins, and for discriminating intrinsically unstructured segments from intrinsically structured segments in transmembrane proteins, and in using the results of these investigations to develop classifiers to identify transmembrane segments in transmembrane proteins. Results: We determined that the most useful properties for discriminating transmembrane segments from non-transmembrane segments and for discriminating intrinsically unstructured segments from intrinsically structured segments in transmembrane proteins were hydropathy, polarity, and flexibility, and used the results of this analysis to construct classifiers to discriminate transmembrane segments from non-transmembrane segments using four classification techniques: two variants of the Self-Organizing Global Ranking algorithm, a decision tree algorithm, and a support vector machine algorithm. All four techniques exhibited good performance, with out-of-sample accuracies of approximately 75%. Conclusions: Several interesting observations emerged from our study: intrinsically unstructured segments and transmembrane segments tend to have opposite properties; transmembrane proteins appear to be much richer in intrinsically unstructured segments than other proteins; and, in approximately 70% of transmembrane proteins that contain intrinsically unstructured segments, the intrinsically unstructured segments are close to transmembrane segments.
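
To illustrate why hydropathy is a useful discriminating property, here is a minimal sketch of a sliding-window hydropathy scan. It is not the classifiers described in the abstract (SOGR, decision tree, SVM); it swaps in the classic Kyte-Doolittle scale, and the function names, the 19-residue window, and the 1.6 cutoff are illustrative assumptions.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, window=19):
    """Return the mean hydropathy of every full window of length `window`.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KD[aa] for aa in seq.upper()]
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    means = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        means.append(total / window)
    return means


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """Return start indices of windows whose mean hydropathy meets the cutoff.

    The window length and threshold are assumed defaults, not values
    taken from the abstract above.
    """
    means = window_hydropathy(seq, window)
    return [i for i, mean in enumerate(means) if mean >= threshold]
```

A window start index returned by `candidate_tm_windows` marks a stretch hydrophobic enough to be a candidate membrane-spanning helix; a trained classifier would combine this signal with other properties such as polarity and flexibility.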

    TRAMPLE: the transmembrane protein labelling environment

    TRAMPLE is a web application server dedicated to the detection and the annotation of transmembrane protein sequences. TRAMPLE includes different state-of-the-art algorithms for the prediction of signal peptides, transmembrane segments (both beta-strands and alpha-helices), secondary structure and fast fold recognition. TRAMPLE also includes a complete content management system to manage the results of the predictions. Each user of the server has his/her own workplace, where the data can be stored, organized, accessed and annotated with documents through a simple web-based interface. In this manner, TRAMPLE significantly improves usability with respect to other more traditional web servers.

    CoBaltDB: Complete bacterial and archaeal orfeomes subcellular localization database and associated resources

    BACKGROUND: The functions of proteins are strongly related to their localization in cell compartments (for example the cytoplasm or membranes) but the experimental determination of the sub-cellular localization of proteomes is laborious and expensive. A fast and low-cost alternative approach is in silico prediction, based on features of the protein primary sequences. However, biologists are confronted with a very large number of computational tools that use different methods that address various localization features with diverse specificities and sensitivities. As a result, exploiting these computer resources to predict protein localization accurately involves querying all tools and comparing every prediction output; this is a painstaking task. Therefore, we developed a comprehensive database, called CoBaltDB, that gathers all prediction outputs concerning complete prokaryotic proteomes. DESCRIPTION: The current version of CoBaltDB integrates the results of 43 localization predictors for 784 complete bacterial and archaeal proteomes (2,548,292 proteins in total). CoBaltDB supplies a simple user-friendly interface for retrieving and exploring relevant information about predicted features (such as signal peptide cleavage sites and transmembrane segments). Data are organized into three work-sets ("specialized tools", "meta-tools" and "additional tools"). The database can be queried using the organism name, a locus tag or a list of locus tags and may be browsed using numerous graphical and text displays. CONCLUSIONS: With its new functionalities, CoBaltDB is a novel powerful platform that provides easy access to the results of multiple localization tools and support for predicting prokaryotic protein localizations with higher confidence than previously possible. CoBaltDB is available at http://www.umr6026.univ-rennes1.fr/english/home/research/basic/software/cobalten

    A Combination of Compositional Index and Genetic Algorithm for Predicting Transmembrane Helical Segments

    Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset were 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. Availability: The datasets and software, together with supplementary materials, are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm
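
The idea of scoring each residue by how differently it occurs in TMH and non-TMH segments, then thresholding, can be sketched as follows. This is an illustration under stated assumptions, not the TMHindex formula: the abstract does not give the exact definition of the Compositional Index, so a smoothed log-ratio of frequencies is swapped in, the threshold is passed in by hand instead of being tuned by a genetic algorithm, and all function names and the pseudocount are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def compositional_index(tm_segments, non_tm_segments, pseudocount=1.0):
    """Assumed index: log ratio of smoothed residue frequencies (TMH vs non-TMH)."""
    tm = Counter("".join(tm_segments))
    non = Counter("".join(non_tm_segments))
    smoothing = pseudocount * len(AMINO_ACIDS)
    tm_total = sum(tm[a] for a in AMINO_ACIDS) + smoothing
    non_total = sum(non[a] for a in AMINO_ACIDS) + smoothing
    return {
        a: math.log(((tm[a] + pseudocount) / tm_total)
                    / ((non[a] + pseudocount) / non_total))
        for a in AMINO_ACIDS
    }


def label_residues(seq, index, threshold, window=5):
    """Label residue i as TMH (True) when the mean index of the window
    centred on i exceeds `threshold`. Unknown residues score 0.0."""
    half = window // 2
    labels = []
    for i in range(len(seq)):
        chunk = seq[max(0, i - half): i + half + 1]
        mean = sum(index.get(a, 0.0) for a in chunk) / len(chunk)
        labels.append(mean > threshold)
    return labels


def sensitivity_specificity(predicted, actual):
    """Per-residue sensitivity and specificity from boolean label lists."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity
```

In this sketch a threshold search would simply call `label_residues` and `sensitivity_specificity` repeatedly over candidate thresholds on training data; the paper reports using a genetic algorithm for that step.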

    The Use of Internal and External Functional Domains to Improve Transmembrane Protein Topology Prediction

    Membrane proteins are involved in vital cellular functions and have important implications in disease processes, drug design and therapy. However, it is difficult to obtain diffraction quality crystals to study transmembrane protein structure. Transmembrane protein topology prediction tools try to fill in the gap between the abundant number of transmembrane proteins and the scarce number of known membrane protein structures (3D structure and biochemically characterized topology). However, at present, the prediction accuracy is still far from perfect. TMHMM is the current state-of-the-art method for membrane protein topology prediction. In order to improve the prediction accuracy of TMHMM, based upon the method of GenomeScan, the author implemented AHMM (augmented HMM) by incorporating functional domain information externally to TMHMM. Results show that AHMM is better than TMHMM on both helix and sidedness prediction. This improvement is verified by statistical tests as well as sensitivity and specificity studies. It is expected that as more functional domain predictors become available, the prediction accuracy will be further improved.

    PONGO: a web server for multiple predictions of all-alpha transmembrane proteins

    The annotation efforts of the BIOSAPIENS European Network of Excellence have generated several distributed annotation systems (DAS) with the aim of integrating Bioinformatics resources and annotating metazoan genomes. In this context, the PONGO DAS server provides the annotation on predictive basis for the all-alpha membrane proteins in the human genome, not only through DAS queries, but also directly using a simple web interface. In order to produce a more comprehensive analysis of the sequence at hand, this annotation is carried out with four selected and high scoring predictors: TMHMM2.0, MEMSAT, PRODIV and ENSEMBLE1.0. The stored and pre-computed predictions for the human proteins can be searched and displayed in a graphical view. However, the web service allows the prediction of the topology of any kind of putative membrane proteins, regardless of the organism and more importantly with the same sequence profile for a given sequence when required. Here we present a new web server that incorporates the state-of-the-art topology predictors in a single framework, so that putative users can interactively compare and evaluate four predictions simultaneously for a given sequence. Together with the predicted topology, the server also displays a signal peptide prediction determined with SPEP. The PONGO web server is available at

    Algorithms for incorporating prior topological information in HMMs: application to transmembrane proteins

    BACKGROUND: Hidden Markov Models (HMMs) have been extensively used in computational molecular biology, for modelling protein and nucleic acid sequences. In many applications, such as transmembrane protein topology prediction, the incorporation of limited amount of information regarding the topology, arising from biochemical experiments, has been proved a very useful strategy that increased remarkably the performance of even the top-scoring methods. However, no clear and formal explanation of the algorithms that retains the probabilistic interpretation of the models has been presented so far in the literature. RESULTS: We present here, a simple method that allows incorporation of prior topological information concerning the sequences at hand, while at the same time the HMMs retain their full probabilistic interpretation in terms of conditional probabilities. We present modifications to the standard Forward and Backward algorithms of HMMs and we also show explicitly, how reliable predictions may arise by these modifications, using all the algorithms currently available for decoding HMMs. A similar procedure may be used in the training procedure, aiming at optimizing the labels of the HMM's classes, especially in cases such as transmembrane proteins where the labels of the membrane-spanning segments are inherently misplaced. We present an application of this approach developing a method to predict the transmembrane regions of alpha-helical membrane proteins, trained on crystallographically solved data. We show that this method compares well against already established algorithms presented in the literature, and it is extremely useful in practical applications. CONCLUSION: The algorithms presented here, are easily implemented in any kind of a Hidden Markov Model, whereas the prediction method (HMM-TM) is freely available for academic users at , offering the most advanced decoding options currently available
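
One common way to incorporate known labels into the Forward algorithm is to zero out, at each labelled position, every state whose label disagrees with the prior information. The sketch below illustrates that idea under stated assumptions; it is not the HMM-TM implementation, and the function name, the dictionary-based parameter layout, and the `known_labels` argument are all hypothetical.

```python
def constrained_forward(obs, states, start, trans, emit, state_label, known_labels):
    """Forward algorithm restricted to paths that agree with prior labels.

    obs          : sequence of observed symbols
    states       : list of state names
    start[s]     : initial probability of state s
    trans[r][s]  : transition probability from r to s
    emit[s][x]   : probability that state s emits symbol x
    state_label  : maps each state to its label (e.g. "M" for membrane)
    known_labels : maps position -> required label; unlisted positions are free

    Returns the joint probability of the observations and the event that
    the hidden path is consistent with every known label.
    """
    if len(obs) == 0:
        raise ValueError("observation sequence must not be empty")

    def allowed(state, pos):
        required = known_labels.get(pos)
        return required is None or state_label[state] == required

    # Initialisation: states that contradict a known label get probability 0.
    alpha = {
        s: start[s] * emit[s][obs[0]] if allowed(s, 0) else 0.0
        for s in states
    }
    # Recursion: the usual forward sum, masked by the label constraint.
    for pos in range(1, len(obs)):
        alpha = {
            s: (emit[s][obs[pos]] * sum(alpha[r] * trans[r][s] for r in states)
                if allowed(s, pos) else 0.0)
            for s in states
        }
    return sum(alpha.values())
```

With an empty `known_labels` dictionary the function reduces to the standard Forward likelihood. Dividing a constrained result by the unconstrained one gives the conditional probability of the labelling given the sequence, which is one way the probabilistic interpretation is preserved.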

    Membrane proteins in the outer membrane of plastids and mitochondria

    Channels of the plastid and mitochondrial outer membranes facilitate the turnover of molecules and ions via these membranes. Although channels have been studied, many questions pertaining to the whole diversity of plastid and mitochondrial channels in Arabidopsis thaliana and Pisum sativum remain unanswered. In this thesis I studied OEP16, OEP37 and VDAC families in two model plants, Arabidopsis and pea. The Arabidopsis OEP16 family represents four channels of α-helical structure, similar to the pea OEP16 protein. These channels are suggested to transport amino acids and compounds with primary amino groups. Immunoblot analysis, GFP/RFP protein fusion expression, as well as proteomic analysis showed that AtOEP16.1, AtOEP16.2 and AtOEP16.4 are located in the outer envelope membrane of plastids, while AtOEP16.3 is in mitochondria. The gene expression and immunoblot analyses revealed that AtOEP16.1 and AtOEP16.3 proteins are highly abundant and ubiquitous; expression of AtOEP16.1 is regulated by light and cold. AtOEP16.2 is highly expressed in pollen, seeds and seedlings. AtOEP16.4 is a low expressed housekeeping protein. Single knockout mutants of AtOEP16.1, AtOEP16.2 and AtOEP16.4, and double mutants of AtOEP16 gene family did not show any remarkable phenotype. However, macroarray analysis of Atoep16.1-p T-DNA mutant revealed 10 down-regulated and 6 up-regulated genes. In contrast to the α-helical OEP16 proteins, the OEP37 and VDAC proteins are of β-barrel structure. The PsOEP37 and AtOEP37 channel proteins form a selective barrier in the outer envelope of chloroplasts. Electrophysiological studies in lipid bilayer membranes showed that the PsOEP37 channel is permeable for cations. Specific expression profiles showed that AtOEP37 and PsOEP37 are highly expressed in the entire plant. The isolated PsVDAC gene encodes a protein, which is located in mitochondria. In the Arabidopsis gene database, five Arabidopsis genes that code for VDAC-like proteins were announced. One gene was not detected, whereas four of these genes were expressed in leaves, roots, flower buds and pollen.