115 research outputs found

    Transmembrane protein topology prediction using support vector machines

    Background: Alpha-helical transmembrane (TM) proteins are involved in a wide range of important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion. Many are also prime drug targets, and it has been estimated that more than half of all drugs currently on the market target membrane proteins. However, due to the experimental difficulties involved in obtaining high quality crystals, this class of protein is severely under-represented in structural databases. In the absence of structural data, sequence-based prediction methods allow TM protein topology to be investigated. Results: We present a support vector machine (SVM)-based TM protein topology predictor that integrates both signal peptide and re-entrant helix prediction, benchmarked with full cross-validation on a novel data set of 131 sequences with known crystal structures. The method achieves topology prediction accuracy of 89%, while signal peptides and re-entrant helices are predicted with 93% and 44% accuracy respectively. An additional SVM trained to discriminate between globular and TM proteins detected zero false positives, with a low false negative rate of 0.4%. We present the results of applying these tools to a number of complete genomes. Source code, data sets and a web server are freely available from http://bioinf.cs.ucl.ac.uk/psipred/. Conclusion: The high accuracy of TM topology prediction, which includes detection of both signal peptides and re-entrant helices, combined with the ability to effectively discriminate between TM and globular proteins, makes this method ideally suited to whole genome annotation of alpha-helical transmembrane proteins.

    Solution-Based Structural Analysis of the Decaheme Cytochrome, MtrA, by Small-Angle X-ray Scattering and Analytical Ultracentrifugation

    The potential exploitation of metal-reducing bacteria as a means for environmental cleanup or alternative fuel is an exciting prospect; however, the cellular processes that would allow for these applications need to be better understood. MtrA is a periplasmic decaheme c-type cytochrome from Shewanella oneidensis involved in the reduction of extracellular iron oxides and therefore is a critical element in Shewanella's ability to engage in extracellular charge transfer. As a relatively small 333-residue protein, the heme content is surprisingly high. MtrA is believed to obtain electrons from the inner membrane-bound quinol oxidoreductase, CymA, and shuttle them across the outer membrane to MtrC, another decaheme cytochrome that directly interacts with insoluble metal oxides. How MtrA is able to perform this task is a question of interest. Here, through the use of two solution-based techniques, small-angle X-ray scattering (SAXS) and analytical ultracentrifugation (AUC), we present the first structural analysis of MtrA. Our results establish that between 0.5 and 4 mg/mL, MtrA exists as a monomeric protein that is shaped like an extended molecular “wire” with a maximum protein dimension (D_max) of 104 Å and a rod-like aspect ratio of 2.2 to 2.5. This study contributes to a greater understanding of how MtrA fulfills its role in the redox processes that must occur before electrons reach the outside of the cell. National Science Foundation (U.S.) (0546323); National Institutes of Health (U.S.) (Grant Number F32GM904862); Howard Hughes Medical Institute (Investigator); National Science Foundation (U.S.) (Award DMR-0936384)

    Bivariate genome-wide association meta-analysis of pediatric musculoskeletal traits reveals pleiotropic effects at the SREBF1/TOM1L2 locus

    Bone mineral density is known to be a heritable, polygenic trait, whereas genetic variants contributing to lean mass variation remain largely unknown. We estimated the shared SNP heritability and performed a bivariate GWAS meta-analysis of total-body lean mass (TB-LM) and total-body less head bone mineral density (TBLH-BMD) regions in 10,414 children. The estimated SNP heritability is 43% for TBLH-BMD, and 39% for TB-LM, with a shared genetic component of 43%. We identify variants with pleiotropic effects in eight loci, including seven established bone mineral density loci: _WNT4, GALNT3, MEPE, CPED1/WNT16, TNFSF11, RIN3, and PPP6R3/LRP5_. Variants in the _TOM1L2/SREBF1_ locus exert opposing effects on TB-LM and TBLH-BMD, and have a stronger association with the former trait. We show that _SREBF1_ is expressed in murine and human osteoblasts, as well as in human muscle tissue. This is the first bivariate GWAS meta-analysis to demonstrate genetic factors with pleiotropic effects on bone mineral density and lean mass.

    Comparative study of the extracellular proteome of Sulfolobus species reveals limited secretion

    Although a large number of potentially secreted proteins can be predicted on the basis of genomic distribution of signal sequence-bearing proteins, protein secretion in Archaea has barely been studied. A proteomic inventory and comparison of the growth medium proteins in three hyperthermoacidophiles, i.e., Sulfolobus solfataricus, S. acidocaldarius and S. tokodaii, indicates that only a few proteins are freely secreted into the growth medium and that the majority originate from cell envelope-bound forms. In S. acidocaldarius, both cell-associated and secreted α-amylase activities are detected. Inactivation of the amyA gene resulted in a complete loss of activity, suggesting that the same protein is responsible for the α-amylase activity at both locations. It is concluded that protein secretion in Sulfolobus is a limited process, and it is suggested that the S-layer may act as a barrier for the free diffusion of folded proteins into the medium.

    A comprehensive assessment of N-terminal signal peptides prediction methods

    Background: Amino-terminal signal peptides (SPs) are short regions that guide the targeting of secretory proteins to the correct subcellular compartments in the cell. They are cleaved off upon the passenger protein reaching its destination. The explosive growth in sequencing technologies has led to the deposition of vast numbers of protein sequences, necessitating rapid functional annotation techniques, with subcellular localization being a key feature. Of the myriad software prediction tools developed to automate the task of assigning the SP cleavage site of these new sequences, we review here the performance and reliability of commonly used SP prediction tools. Results: The available signal peptide data has been manually curated and organized into three datasets representing eukaryotes, Gram-positive and Gram-negative bacteria. These datasets are used to evaluate thirteen prediction tools that are publicly available. SignalP (both the HMM and ANN versions) maintains consistency and achieves the best overall accuracy in all three benchmarking experiments, ranging from 0.872 to 0.914, although other prediction tools are narrowing the performance gap. Conclusion: The majority of the tools evaluated in this study encounter no difficulty in discriminating between secretory and non-secretory proteins. The challenge clearly remains with pinpointing the correct SP cleavage site. The composite scoring schemes employed by SignalP may help to explain its accuracy. The prediction task is divided into a number of separate steps, thus allowing each score to tackle a particular aspect of the prediction.

    A structural comparison of human serum transferrin and human lactoferrin

    The transferrins are a family of proteins that bind free iron in the blood and bodily fluids. Serum transferrins function to deliver iron to cells via a receptor-mediated endocytotic process as well as to remove toxic free iron from the blood and to provide an anti-bacterial, low-iron environment. Lactoferrins (found in bodily secretions such as milk) are only known to have an anti-bacterial function, via their ability to tightly bind free iron even at low pH, and have no known transport function. Though these proteins keep the level of free iron low, pathogenic bacteria are able to thrive by obtaining iron from their host via expression of outer membrane proteins that can bind to and remove iron from host proteins, including both serum transferrin and lactoferrin. Furthermore, even though human serum transferrin and lactoferrin are quite similar in sequence and structure, and coordinate iron in the same manner, they differ in their affinities for iron as well as their receptor binding properties: the human transferrin receptor only binds serum transferrin, and two distinct bacterial transport systems are used to capture iron from serum transferrin and lactoferrin. Comparison of the recently solved crystal structure of iron-free human serum transferrin to that of human lactoferrin provides insight into these differences.

    The Twin-Arginine Translocation Pathway in α-Proteobacteria Is Functionally Preserved Irrespective of Genomic and Regulatory Divergence

    The twin-arginine translocation (Tat) pathway exports fully folded proteins out of the cytoplasm of Gram-negative and Gram-positive bacteria. Although much progress has been made in unraveling the molecular mechanism and biochemical characterization of the Tat system, little is known concerning its functionality and its biological role in conferring adaptive skills, symbiosis or pathogenesis in the α-proteobacteria class. A comparative genomic analysis in the α-proteobacteria class confirmed the presence of tatA, tatB, and tatC genes in almost all genomes, but significant variations in gene synteny and rearrangements were found in the order Rickettsiales with respect to the typically described operon organization. Transcription of tat genes was confirmed for Anaplasma marginale str. St. Maries and Brucella abortus 2308, two α-proteobacteria with full and partial intracellular lifestyles, respectively. The tat genes of A. marginale are scattered throughout the genome, in contrast to the more generalized operon organization. In particular, tatA showed an approximately 20-fold increase in mRNA levels relative to tatB and tatC. We showed Tat functionality in B. abortus 2308 for the first time, and confirmed conservation of functionality in A. marginale. We present the first experimental description of the Tat system in the Anaplasmataceae and Brucellaceae families. In particular, in A. marginale Tat functionality is conserved despite operon splitting as a consequence of genome rearrangements. Further studies will be required to understand how the proper stoichiometry of the Tat protein complex and its biological role are achieved. In addition, the predicted substrates might provide evidence of a role for the Tat translocation system in the transition process from a free-living to a parasitic lifestyle in these α-proteobacteria.

    Using graph theory to analyze biological networks

    Understanding complex systems often requires a bottom-up analysis towards a systems biology approach. The need emerges to investigate a system not only as individual components but as a whole. This can be done by examining the elementary constituents individually and then how these are connected. The myriad components of a system and their interactions are best characterized as networks, and they are mainly represented as graphs in which thousands of nodes are connected by thousands of edges. In this article we demonstrate approaches, models and methods from graph theory and discuss ways in which they can be used to reveal hidden properties and features of a network. This network profiling, combined with knowledge extraction, will help us to better understand the biological significance of the system.

    From protein sequences to 3D-structures and beyond: the example of the UniProt Knowledgebase

    With the dramatic increase in the volume of experimental results in every domain of life sciences, assembling pertinent data and combining information from different fields has become a challenge. Information is dispersed over numerous specialized databases and is presented in many different formats. Rapid access to experiment-based information about well-characterized proteins helps predict the function of uncharacterized proteins identified by large-scale sequencing. In this context, universal knowledgebases play essential roles in providing access to data from complementary types of experiments and serving as hubs with cross-references to many specialized databases. This review outlines how the value of experimental data is optimized by combining high-quality protein sequences with complementary experimental results, including information derived from protein 3D-structures, using as an example the UniProt knowledgebase (UniProtKB) and the tools and links provided on its website (http://www.uniprot.org/). It also notes the precautions that are necessary for successful predictions and extrapolations.