155 research outputs found

    Inferring Function Using Patterns of Native Disorder in Proteins

    Get PDF
    Natively unstructured regions are a common feature of eukaryotic proteomes. Between 30% and 60% of proteins are predicted to contain long stretches of disordered residues, and not only have many of these regions been confirmed experimentally, but they have also been found to be essential for protein function. In this study, we directly address the potential contribution of protein disorder in predicting protein function using standard Gene Ontology (GO) categories. Initially we analyse the occurrence of protein disorder in the human proteome and report ontology categories that are enriched in disordered proteins. Pattern analysis of the distributions of disordered regions in human sequences demonstrated that the functions of intrinsically disordered proteins are both length- and position-dependent. These dependencies were then encoded in feature vectors to quantify the contribution of disorder in human protein function prediction using Support Vector Machine classifiers. The prediction accuracies of 26 GO categories relating to signalling and molecular recognition are improved using the disorder features. The most significant improvements were observed for kinase, phosphorylation, growth factor, and helicase categories. Furthermore, we provide predicted GO term assignments using these classifiers for a set of unannotated and orphan human proteins. In this study, the importance of capturing protein disorder information and its value in function prediction is demonstrated. The GO category classifiers generated can be used to provide more reliable predictions and further insights into the behaviour of orphan and unannotated proteins

    Protein annotation and modelling servers at University College London

    Get PDF
    The UCL Bioinformatics Group web portal offers several high quality protein structure prediction and function annotation algorithms including PSIPRED, pGenTHREADER, pDomTHREADER, MEMSAT, MetSite, DISOPRED2, DomPred and FFPred for the prediction of secondary structure, protein fold, protein structural domain, transmembrane helix topology, metal binding sites, regions of protein disorder, protein domain boundaries and protein function, respectively. We also now offer a fully automated 3D modelling pipeline: BioSerf, which performed well in CASP8 and uses a fragment-assembly approach which placed it in the top five servers in the de novo modelling category. The servers are available via the group web site at http://bioinf.cs.ucl.ac.uk/

    The N-terminal intrinsically disordered domain of mgm101p is localized to the mitochondrial nucleoid.

    Get PDF
    The mitochondrial genome maintenance gene, MGM101, is essential for yeasts that depend on mitochondrial DNA replication. Previously, in Saccharomyces cerevisiae, it has been found that the carboxy-terminal two-thirds of Mgm101p has a functional core. Furthermore, there is a high level of amino acid sequence conservation in this region from widely diverse species. By contrast, the amino-terminal region, that is also essential for function, does not have recognizable conservation. Using a bioinformatic approach we find that the functional core from yeast and a corresponding region of Mgm101p from the coral Acropora millepora have an ordered structure, while the N-terminal domains of sequences from yeast and coral are predicted to be disordered. To examine whether ordered and disordered domains of Mgm101p have specific or general functions we made chimeric proteins from yeast and coral by swapping the two regions. We find, by an in vivo assay in S.cerevisiae, that the ordered domain of A.millepora can functionally replace the yeast core region but the disordered domain of the coral protein cannot substitute for its yeast counterpart. Mgm101p is found in the mitochondrial nucleoid along with enzymes and proteins involved in mtDNA replication. By attaching green fluorescent protein to the N-terminal disordered domain of yeast Mgm101p we find that GFP is still directed to the mitochondrial nucleoid where full-length Mgm101p-GFP is targeted

    MOTIPS: Automated Motif Analysis for Predicting Targets of Modular Protein Domains

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Many protein interactions, especially those involved in signaling, involve short linear motifs consisting of 5-10 amino acid residues that interact with modular protein domains such as the SH3 binding domains and the kinase catalytic domains. One straightforward way of identifying these interactions is by scanning for matches to the motif against all the sequences in a target proteome. However, predicting domain targets by motif sequence alone without considering other genomic and structural information has been shown to be lacking in accuracy.</p> <p>Results</p> <p>We developed an efficient search algorithm to scan the target proteome for potential domain targets and to increase the accuracy of each hit by integrating a variety of pre-computed features, such as conservation, surface propensity, and disorder. The integration is performed using naïve Bayes and a training set of validated experiments.</p> <p>Conclusions</p> <p>By integrating a variety of biologically relevant features to predict domain targets, we demonstrated a notably improved prediction of modular protein domain targets. Combined with emerging high-resolution data of domain specificities, we believe that our approach can assist in the reconstruction of many signaling pathways.</p

    ProteinArchitect: Protein Evolution above the Sequence Level

    Get PDF
    While many authors have discussed models and tools for studying protein evolution at the sequence level, molecular function is usually mediated by complex, higher order features such as independently folding domains and linear motifs that are based on or embedded in a particular arrangment of features such as secondary structure elements, transmembrane domains and regions with intrinsic disorder. This 'protein architecture' can, in its most simplistic representation, be visualized as domain organization cartoons that can be used to compare proteins in terms of the order of their mostly globular domains.Here, we describe a visual approach and a webserver for protein comparison that extend the domain organization cartoon concept. By developing an information-rich, compact visualization of different protein features above the sequence level, potentially related proteins can be compared at the level of propensities for secondary structure, transmembrane domains and intrinsic disorder, in addition to PFAM domains. A public Web server is available at www.proteinarchitect.net, while the code is provided at protarchitect.sourceforge.net.Due to recent advances in sequencing technologies we are now flooded with millions of predicted proteins that await comparative analysis. In many cases, mature tools focused on revealing hits with considerable global or local similarity to well-characterized proteins will not be able to lead us to testable hypotheses about a protein's function, or the function of a particular region. The visual comparison of different types of protein features with ProteinArchitect will be useful when assessing the relevance of similarity search hits, to discover subgroups in protein families and superfamilies, and to understand protein regions with conserved features outside globular regions. Therefore, this approach is likely to help researchers to develop testable hypotheses about a protein's function even if is somewhat distant from the more characterized proteins, by facilitating the discovery of features that are conserved above the sequence level for comparison and further experimental investigation

    Structure of the C-terminal domain of the Prokaryotic Sodium Channel Orthologue NsvBa

    Get PDF
    Crystallographic and electrophysiological studies have recently provided insight into the structure, function and drug binding of prokaryotic sodium channels. These channels exhibit significant sequence identities, especially in their transmembrane regions, with human voltage-gated sodium channels. However, rather than being single polypeptides with four homologous domains, they are tetramers of single domain polypeptides, with a C-terminal domain (CTD) composed of an inter-subunit four helix coiled-coil. The structures of the CTDs differ between orthologues. In NavBh and NavMs, the C-termini form a disordered region adjacent to the final transmembrane helix, followed by a coiled-coil region, as demonstrated by synchrotron radiation circular dichroism (SRCD) and double electron-electron resonance electron paramagnetic resonance spectroscopic measurements. In contrast, in the crystal structure of the NavAe orthologue, the entire C-terminus is comprised of a helical region followed by a coiled-coil. In this study we have examined the CTD of the NsvBa from Bacillus alcalophilus, which unlike other orthologues is predicted by different methods to have different types of structures: either a disordered adjacent to the transmembrane region, followed by a helical coiled-coil, or a fully helical CTD. To discriminate between the two possible structures we have used SRCD spectroscopy to experimentally determine the secondary structure of the C-terminus of this orthologue and used the results as the basis for modelling the transition between open and closed conformations of the channel

    The role of disorder in interaction networks: a structural analysis

    Get PDF
    Recent studies have emphasized the value of including structural information into the topological analysis of protein networks. Here, we utilized structural information to investigate the role of intrinsic disorder in these networks. Hub proteins tend to be more disordered than other proteins (i.e. the proteome average); however, we find this only true for those with one or two binding interfaces (‘single'-interface hubs). In contrast, the distribution of disordered residues in multi-interface hubs is indistinguishable from the overall proteome. Surprisingly, we find that the binding interfaces in single-interface hubs are highly structured, as is the case for multi-interface hubs. However, the binding partners of single-interface hubs tend to have a higher level of disorder than the proteome average, suggesting that their binding promiscuity is related to the disorder of their binding partners. In turn, the higher level of disorder of single-interface hubs can be partly explained by their tendency to bind to each other in a cascade. A good illustration of this trend can be found in signaling pathways and, more specifically, in kinase cascades. Finally, our findings have implications for the current controversy related to party and date-hubs
    corecore