34 research outputs found

    Prioritizing orphan proteins for further study using phylogenomics and gene expression profiles in Streptomyces coelicolor

    BACKGROUND: Streptomyces coelicolor, a model organism for antibiotic-producing bacteria, has one of the largest bacterial genomes, with 7825 predicted protein-coding genes. Nearly 34% of these genes are functionally orphan (hypothetical proteins of unknown function). However, in gene expression time-course data, many of these functionally orphan genes show interesting expression patterns. RESULTS: In this paper, we analyzed all functionally orphan genes of Streptomyces coelicolor and identified a list of "high priority" orphans by combining gene expression analysis with additional phylogenetic information (i.e. the level of evolutionary conservation of each protein). CONCLUSIONS: The prioritized orphan genes are promising candidates for experimental examination in the lab to further characterize their function.

    A Method of Protein Model Classification and Retrieval Using Bag-of-Visual-Features

    In this paper we propose a novel visual method for protein model classification and retrieval. Unlike conventional methods, the proposed method extracts image features of proteins and measures the visual similarity between proteins. First, multiview images are captured from the vertices and planes of an octahedron surrounding the protein. Second, local features are extracted from each view image by the SURF algorithm and vector quantized into visual words using a visual codebook. Finally, KLD is employed to calculate the similarity distance between two feature vectors. Experimental results show that the proposed method performs encouragingly for protein retrieval and categorization in comparison with other methods.
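    The quantization and distance steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes KLD denotes the Kullback-Leibler divergence, it skips image capture and SURF extraction entirely, and the toy codebook, the two-dimensional descriptors, the smoothing constant `eps`, and all function names are hypothetical.

```python
import math

def quantize(descriptors, codebook):
    """Assign each local descriptor to the index of its nearest codeword
    (squared Euclidean distance), i.e. to a 'visual word'."""
    words = []
    for d in descriptors:
        dists = [sum((a - b) ** 2 for a, b in zip(d, c)) for c in codebook]
        words.append(dists.index(min(dists)))
    return words

def histogram(words, vocab_size, eps=1e-6):
    """Normalized visual-word histogram. The additive smoothing (eps) is an
    assumption made here so that no bin is zero and the divergence stays finite."""
    counts = [eps] * vocab_size
    for w in words:
        counts[w] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between two histograms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical toy data: a 3-word codebook and 2-D descriptors for two proteins.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
protein_a = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (4.9, 5.2)]
protein_b = [(0.0, 0.2), (0.2, 0.1), (5.1, 4.8), (5.0, 5.0)]

words_a = quantize(protein_a, codebook)
words_b = quantize(protein_b, codebook)
hist_a = histogram(words_a, len(codebook))
hist_b = histogram(words_b, len(codebook))
distance_ab = kl_divergence(hist_a, hist_b)
```

    Because the Kullback-Leibler divergence is asymmetric, a retrieval system might average the two directions; the abstract does not say which variant the authors use.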

    Local protein structures to bridge sequence-structure knowledge

    Protein sequences can be classified by structural similarity and/or common evolutionary origin; the resulting category is called the structural class. Information on structural class, where available, eases the study of protein structure and function. SCOP and CATH are two prominent classification schemes used to assign the structural class of proteins. Both schemes determine the structural class manually, based on known protein tertiary structures. However, the number of known protein sequences is growing exponentially relative to the number of known tertiary structures. Although SCOP and CATH are well-established databases containing reliable structural class information, the laborious wet-lab routine needed to determine structures limits high-throughput structural class assignment, and the tedious, time-consuming manual assignment limits it further. As a consequence, the assignment of structural class by computational methods suffers from arbitrated statistical inference. Thus, this study aims to provide a structural class prediction method that acquires knowledge of local protein structures, derived from the abundant known primary sequences, to produce high-throughput sequence-structure class assignment in place of the laborious experiment-based method. This structural class prediction method is termed SVM-LpsSCPred.

    Quantification of structure/dynamics correlation of globular proteins.


    Predictions of Hot Spot Residues at Protein-Protein Interfaces Using Support Vector Machines

    Protein-protein interactions are critically dependent on just a few 'hot spot' residues at the interface. Hot spots make a dominant contribution to the free energy of binding and they can disrupt the interaction if mutated to alanine. Here, we present HSPred, a support vector machine (SVM)-based method to predict hot spot residues, given the structure of a complex. HSPred represents an improvement over a previously described approach (Lise et al, BMC Bioinformatics 2009, 10: 365). It achieves higher accuracy by treating separately predictions involving either an arginine or a glutamic acid residue, the amino acid types on which the original model did not perform well. We have therefore developed two additional SVM classifiers, specifically optimised for these cases. HSPred reaches an overall precision and recall of 61% and 69%, respectively, which roughly corresponds to a 10% improvement. An implementation of the described method is available as a web server at http://bioinf.cs.ucl.ac.uk/hspred. It is free to non-commercial users.
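    As a reminder of what the two reported metrics mean, the sketch below computes precision and recall from confusion counts and combines them into an F1 score. The counts are hypothetical and chosen only for illustration, and the F1 value derived from the reported 61% and 69% is a calculation added here; the abstract itself does not report an F1 score.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for a hot spot classifier (not from the paper):
# 9 true hot spots predicted, 3 false alarms, 1 hot spot missed.
p, r = precision_recall(tp=9, fp=3, fn=1)

# F1 implied by the precision and recall quoted in the abstract.
reported_f1 = f1_score(0.61, 0.69)
print(f"example precision={p:.2f}, recall={r:.2f}, implied F1={reported_f1:.3f}")
```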

    The structure of latherin, a surfactant allergen protein from horse sweat and saliva

    Latherin is a highly surface-active allergen protein found in the sweat and saliva of horses and other equids. Its surfactant activity is intrinsic to the protein in its native form, and is manifest without associated lipids or glycosylation. Latherin probably functions as a wetting agent in evaporative cooling in horses, but it may also assist in mastication of fibrous food as well as inhibition of microbial biofilms. It is a member of the PLUNC family of proteins abundant in the oral cavity and saliva of mammals, one of which has also been shown to be a surfactant and capable of disrupting microbial biofilms. How these proteins work as surfactants while remaining soluble and cell membrane-compatible is not known. Nor have their structures previously been reported. We have used protein nuclear magnetic resonance spectroscopy to determine the conformation and dynamics of latherin in aqueous solution. The protein is a monomer in solution with a slightly curved cylindrical structure exhibiting a ‘super-roll’ motif comprising a four-stranded anti-parallel β-sheet and two opposing α-helices which twist along the long axis of the cylinder. One end of the molecule has prominent, flexible loops that contain a number of apolar amino acid side chains. This, together with previous biophysical observations, leads us to a plausible mechanism for surfactant activity in which the molecule is first localized to the non-polar interface via these loops, and then unfolds and flattens to expose its hydrophobic interior to the air or non-polar surface. Intrinsically surface-active proteins are relatively rare in nature, and this is the first structure of such a protein from mammals to be reported. Both its conformation and proposed method of action are different from other, non-mammalian surfactant proteins investigated so far.

    PROFESS: a PROtein Function, Evolution, Structure and Sequence database

    The proliferation of biological databases and the easy access enabled by the Internet are having a beneficial impact on the biological sciences and transforming the way research is conducted. There are ~1100 molecular biology databases dispersed throughout the Internet. To assist in the functional, structural and evolutionary analysis of the abundant novel proteins continually identified from whole-genome sequencing, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database. Our database is designed to be versatile and expandable and will not confine analysis to a pre-existing set of data relationships. A fundamental component of this approach is the development of an intuitive query system that incorporates a variety of similarity functions capable of generating data relationships not conceived during the creation of the database. The utility of PROFESS is demonstrated by the analysis of the structural drift of homologous proteins and the identification of potential pancreatic cancer therapeutic targets based on the observation of protein–protein interaction networks.

    Classification and Exploration of 3D Protein Domain Interactions Using Kbdock

    Comparing and classifying protein domain interactions according to their three-dimensional (3D) structures can help to understand protein structure-function and evolutionary relationships. Additionally, structural knowledge of existing domain–domain interactions can provide a useful way to find structural templates with which to model the 3D structures of unsolved protein complexes. Here we present a straightforward guide to using the “Kbdock” protein domain structure database and its associated web site for exploring and comparing protein domain–domain interactions (DDIs) and domain–peptide interactions (DPIs) at the Pfam domain family level. We also briefly explain how the Kbdock web site works, and we provide some notes and suggestions which should help to avoid some common pitfalls when working with 3D protein domain structures.