252 research outputs found

    FLORA: a novel method to predict protein function from structure in diverse superfamilies

    Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures by purely using patterns of structural conservation of all residues

    In silico assessment of potential druggable pockets on the surface of α1-Antitrypsin conformers

    The search for druggable pockets on the surface of a protein is often performed on a single conformer, treated as a rigid body. Transient druggable pockets may be missed in this approach. Here, we describe a methodology for systematic in silico analysis of surface clefts across multiple conformers of the metastable protein α1-antitrypsin (A1AT). Pathological mutations disturb the conformational landscape of A1AT, triggering polymerisation that leads to emphysema and hepatic cirrhosis. Computational screens for small molecule inhibitors of polymerisation have generally focused on one major druggable site visible in all crystal structures of native A1AT. In an alternative approach, we scan all surface clefts observed in crystal structures of A1AT and in 100 computationally produced conformers, mimicking the native solution ensemble. We assess the persistence, variability and druggability of these pockets. Finally, we employ molecular docking using publicly available libraries of small molecules to explore scaffold preferences for each site. Our approach identifies a number of novel target sites for drug design. In particular one transient site shows favourable characteristics for druggability due to high enclosure and hydrophobicity. Hits against this and other druggable sites achieve docking scores corresponding to a Kd in the µM–nM range, comparing favourably with a recently identified promising lead. Preliminary ThermoFluor studies support the docking predictions. In conclusion, our strategy shows considerable promise compared with the conventional single pocket/single conformer approach to in silico screening. Our best-scoring ligands warrant further experimental investigation

    SSMap: A new UniProt-PDB mapping resource for the curation of structural-related information in the UniProt/Swiss-Prot Knowledgebase

    Background: Sequences and structures provide valuable complementary information on protein features and functions. However, it is not always straightforward for users to gather information concurrently from the sequence and structure levels. The UniProt knowledgebase (UniProtKB) strives to help users on this undertaking by providing complete cross-references to Protein Data Bank (PDB) as well as coherent feature annotation using available structural information. In this study, SSMap – a new UniProt-PDB residue-residue level mapping – was generated. The primary objective of this mapping is not only to facilitate the two tasks mentioned above, but also to palliate a number of shortcomings of existent mappings. SSMap is the first isoform sequence-specific mapping resource and is up-to-date for UniProtKB annotation tasks. The method employed by SSMap differs from the other mapping resources in that it stresses on the correct reconstruction of the PDB sequence from structures, and on the correct attribution of a UniProtKB entry to each PDB chain by using a series of post-processing steps. Results: SSMap was compared to other existing mapping resources in terms of the correctness of the attribution of PDB chains to UniProtKB entries, and of the quality of the pairwise alignments supporting the residue-residue mapping. It was found that SSMap shared about 80% of the mappings with other mapping sources. New and alternative mappings proposed by SSMap were mostly good as assessed by manual verification of data subsets. As for local pairwise alignments, it was shown that major discrepancies (both in terms of alignment lengths and boundaries), when present, were often due to differences in methodologies used for the mappings. Conclusion: SSMap provides an independent, good quality UniProt-PDB mapping. The systematic comparison conducted in this study allows the further identification of general problems in UniProt-PDB mappings so that both the coverage and the quality of the mappings can be systematically improved for the benefit of the scientific community. SSMap mapping is currently used to provide PDB cross-references in UniProtKB.

    Epithelioid sarcoma in the thoracic spine

    Epithelioid sarcoma is a rare and highly malignant soft tissue tumor that is commonly found in the extremities and rarely in the trunk area. This malignant tumor often mimics granuloma or nodular fasciitis, which causes a delay in establishing the diagnosis. This type of cancer has a high recurrence rate. Surgical treatment requires wide radical resection. The objective of this case report is to highlight the unique location of a rare neoplasm and to illustrate the relentless course of epithelioid sarcoma despite initial radical resection. A 14-year-old boy was admitted to our facility with a soft tissue mass on the right lower thoracic spine. The large tumor mass had deeply penetrated into the muscles, infiltrated the neuroforamen of T9–T10 level, and compressed the dural sac. Immunohistological study of the biopsy was highly consistent with an epithelioid sarcoma. Wide excision of the mass, laminectomy and spine fusion with instrumentation was performed. The patient received chemotherapy and irradiation. The first recurrence of the neoplasm was seen as a contralateral metastasis 21 months after the resection. On the last follow-up, 3 years postoperatively, the patient was in a good general condition. However, further progression of the sarcoma had to be recognized. Our case encompasses multiple features that represent negative prognostic factors. Initial wide excision of the neoplasm and adjuvant therapy including chemotherapy and irradiation seem to slow down the relentless course of epithelioid sarcoma in the trunk

    Local comparison of protein structures highlights cases of convergent evolution in analogous functional sites

    We performed an exhaustive search for local structural similarities in an ensemble of non-redundant protein functional sites. With the purpose of finding new examples of convergent evolution, we selected only those matching sites composed of structural regions whose residue order is inverted in the relative protein sequences.

    Discriminative structural approaches for enzyme active-site prediction

    Background: Predicting enzyme active-sites in proteins is an important issue not only for protein sciences but also for a variety of practical applications such as drug design. Because enzyme reaction mechanisms are based on the local structures of enzyme active-sites, various template-based methods that compare local structures in proteins have been developed to date. In comparing such local sites, a simple measurement, RMSD, has been used so far. Results: This paper introduces new machine learning algorithms that refine the similarity/deviation for comparison of local structures. The similarity/deviation is applied to two types of applications, single template analysis and multiple template analysis. In the single template analysis, a single template is used as a query to search proteins for active sites, whereas a protein structure is examined as a query to discover the possible active-sites using a set of templates in the multiple template analysis. Conclusions: This paper experimentally illustrates that the machine learning algorithms effectively improve the similarity/deviation measurements for both the analyses.

    Lectin-like bacteriocins from Pseudomonas spp. utilise D-rhamnose containing lipopolysaccharide as a cellular receptor

    Lectin-like bacteriocins consist of tandem monocot mannose-binding domains and display a genus-specific killing activity. Here we show that pyocin L1, a novel member of this family from Pseudomonas aeruginosa, targets susceptible strains of this species through recognition of the common polysaccharide antigen (CPA) of P. aeruginosa lipopolysaccharide that is predominantly a homopolymer of d-rhamnose. Structural and biophysical analyses show that recognition of CPA occurs through the C-terminal carbohydrate-binding domain of pyocin L1 and that this interaction is a prerequisite for bactericidal activity. Further to this, we show that the previously described lectin-like bacteriocin putidacin L1 shows a similar carbohydrate-binding specificity, indicating that oligosaccharides containing d-rhamnose and not d-mannose, as was previously thought, are the physiologically relevant ligands for this group of bacteriocins. The widespread inclusion of d-rhamnose in the lipopolysaccharide of members of the genus Pseudomonas explains the unusual genus-specific activity of the lectin-like bacteriocins

    BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server

    Background: Genome sequencing and post-genomics projects such as structural genomics are extending the frontier of the study of sequence-structure-function relationship of genes and their products. Although many sequence/structure-based methods have been devised with the aim of deciphering this delicate relationship, there still remain large gaps in this fundamental problem, which continuously drives researchers to develop novel methods to extract relevant information from sequences and structures and to infer the functions of newly identified genes by genomics technology. Results: Here we present an ultrafast method, named BSSF (Binding Site Similarity & Function), which enables researchers to conduct similarity searches in a comprehensive three-dimensional binding site database extracted from PDB structures. This method utilizes a fingerprint representation of the binding site and a validated statistical Z-score function scheme to judge the similarity between the query and database items, even if their similarities are only constrained in a sub-pocket. This fingerprint based similarity measurement was also validated on a known binding site dataset by comparing with geometric hashing, which is a standard 3D similarity method. The comparison clearly demonstrated the utility of this ultrafast method. After conducting the database searching, the hit list is further analyzed to provide basic statistical information about the occurrences of Gene Ontology terms and Enzyme Commission numbers, which may benefit researchers by helping them to design further experiments to study the query proteins. Conclusions: This ultrafast web-based system will not only help researchers interested in drug design and structural genomics to identify similar binding sites, but also assist them by providing further analysis of hit list from database searching.
