146 research outputs found

    Protein structure database search and evolutionary classification

    As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at
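As an illustration of the segment-pair idea in the abstract above, the following minimal Python sketch scores two structural-alphabet strings without gaps and reports the highest-scoring aligned segment. It is not the 3D-BLAST implementation: it scans every diagonal exhaustively instead of using BLAST's seed-and-extend heuristics, and `toy_score` is a hypothetical stand-in for the SASM, whose actual values are not given here. The function names and the example strings are likewise assumptions.

```python
def best_ungapped_segment_pair(query, target, score):
    """Return (best_score, query_start, target_start, length) for the
    highest-scoring ungapped local alignment of two letter strings.

    Every diagonal of the comparison matrix is scanned with a running sum
    that is reset whenever it drops to zero or below.
    """
    best = (0, 0, 0, 0)
    for offset in range(-(len(query) - 1), len(target)):
        i = max(0, -offset)  # current index in query
        j = max(0, offset)   # current index in target
        running, start_i, start_j = 0, i, j
        while i < len(query) and j < len(target):
            running += score(query[i], target[j])
            if running <= 0:
                # Segment is no longer worth extending: restart after this pair.
                running, start_i, start_j = 0, i + 1, j + 1
            elif running > best[0]:
                best = (running, start_i, start_j, i - start_i + 1)
            i += 1
            j += 1
    return best


def toy_score(a, b):
    """Hypothetical scoring: +2 for identical letters, -1 otherwise.
    A real search would look the pair up in a substitution matrix."""
    return 2 if a == b else -1


# Hypothetical structural-alphabet strings for two backbone traces.
print(best_ungapped_segment_pair("KLMNA", "ZZKLMN", toy_score))  # (8, 0, 2, 4)
```

Swapping `toy_score` for a lookup into a 23 x 23 substitution table would leave the scanning logic unchanged, which is why the scoring function is passed in as a parameter.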

    fastSCOP: a fast web server for recognizing protein structural domains and SCOP superfamilies

    fastSCOP is a web server that rapidly identifies the structural domains and determines the evolutionary superfamilies of a query protein structure. The server uses 3D-BLAST to quickly scan a large structural classification database (SCOP1.71 with <95% identity with each other) and takes from the hit list the top 10 hit domains that have different superfamily classifications. MAMMOTH, a detailed structural alignment tool, then aligns these top 10 structures to refine domain boundaries and to identify evolutionary superfamilies. Our previous work demonstrated that 3D-BLAST is as fast as BLAST and shares its characteristics (e.g. a robust statistical basis, effective search and reliable database search capabilities) in large structural database searches based on a structural alphabet database and a structural alphabet substitution matrix. The classification accuracy of this server is ∼98% for 586 query structures and the average execution time is ∼5. The server was also evaluated on 8700 structures that have no annotations in SCOP; it automatically assigned 7311 (84%) proteins (9420 domains) to SCOP superfamilies in 9.6 h. These results suggest that fastSCOP is robust and can be a useful server for recognizing the evolutionary classifications and the protein functions of novel structures. The server is accessible at http://fastSCOP.life.nctu.edu.tw
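The hit-selection step described above (keep the top 10 hits that belong to different superfamilies) can be sketched in a few lines of Python. This is only one reading of the abstract, not the server's code: the tuple layout `(domain_id, superfamily, e_value)`, the function name and the example identifiers are all hypothetical.

```python
def top_distinct_superfamilies(hits, k=10):
    """Return up to k hits, best E-value first, keeping only the
    best-scoring hit for each superfamily.

    hits: iterable of (domain_id, superfamily, e_value) tuples
          (an assumed layout, chosen for illustration).
    """
    chosen, seen = [], set()
    for hit in sorted(hits, key=lambda h: h[2]):  # smaller E-value = better
        superfamily = hit[1]
        if superfamily not in seen:
            seen.add(superfamily)
            chosen.append(hit)
            if len(chosen) == k:
                break
    return chosen


# Made-up hit list; identifiers and E-values are placeholders.
example_hits = [
    ("dom1", "sfA", 1e-30),
    ("dom2", "sfA", 1e-25),
    ("dom3", "sfB", 1e-10),
    ("dom4", "sfC", 1e-3),
]
print(top_distinct_superfamilies(example_hits, k=2))
```

The selected representatives would then be passed to a detailed structural aligner for the refinement stage.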

    (PS)(2): protein structure prediction server

    Protein structure prediction provides valuable insights into function, and comparative modeling is one of the most reliable methods to predict 3D structures directly from amino acid sequences. However, critical problems arise during the selection of the correct templates and the alignment of query sequences therewith. We have developed an automatic protein structure prediction server, (PS)(2), which uses an effective consensus strategy both in template selection, which combines PSI-BLAST and IMPALA, and target–template alignment integrating PSI-BLAST, IMPALA and T-Coffee. (PS)(2) was evaluated for 47 comparative modeling targets in CASP6 (Critical Assessment of Techniques for Protein Structure Prediction). For the benchmark dataset, the predictive performance of (PS)(2), based on the mean GDT_TS score, was superior to 10 other automatic servers. Our method is based solely on the consensus sequence and thus is considerably faster than other methods that rely on the additional structural consensus of templates. Our results show that (PS)(2), coupled with suitable consensus strategies and a new similarity score, can significantly improve structure prediction. Our approach should be useful in structure prediction and modeling. The (PS)(2) is available through the website at

    Kappa-alpha plot derived structural alphabet and BLOSUM-like substitution matrix for rapid search of protein structure database

    3D-BLAST, a novel protein structure database search tool, is useful for analysing novel structures and returns a list of aligned structures ordered by E-value.

    3D-interologs: an evolution database of physical protein-protein interactions across multiple genomes

    [Background] Comprehensive exploration of protein-protein interactions is a challenging route to understanding biological processes. To efficiently enlarge the set of protein interactions annotated with residue-based binding models, we proposed a new concept, "3D-domain interolog mapping", with a scoring system to explore all possible protein pairs between the two homolog families, derived from a known 3D-structure dimer (template), across multiple species. Each family consists of homologous proteins that have the interacting domains of the template, for studying domain interface evolution of two interacting homolog families. [Results] The 3D-interologs database records the evolution of protein-protein interactions across multiple species. Based on "3D-domain interolog mapping" and a new scoring function, we infer 173,294 protein-protein interactions by using 1,895 three-dimensional (3D) structure heterodimers to search the UniProt database (4,826,134 protein sequences). The 3D-interologs database comprises 15,124 species and 283,980 protein-protein interactions, including 173,294 inferred interactions (61%) and 110,686 interactions (39%) summarized from the IntAct database. For a protein-protein interaction, the 3D-interologs database shows functional annotations (e.g. Gene Ontology), interacting domains and binding models (e.g. hydrogen-bond interactions and conserved residues). Additionally, this database provides couple-conserved residues and the interacting evolution by exploring the interologs across multiple species. Experimental results reveal that the proposed scoring function obtains good agreement for the binding affinity of 275 mutated residues from the ASEdb. The precision and recall of our method are 0.52 and 0.34, respectively, by using 563 non-redundant heterodimers to search on the Integr8 database (549 complete genomes). [Conclusions] Experimental results demonstrate that the proposed method can infer reliable physical protein-protein interactions and be useful for studying the protein-protein interaction evolution across multiple species. In addition, the top-ranked strategy and template interface score are able to significantly improve the accuracies of identifying protein-protein interactions in a complete genome. The 3D-interologs database is available at http://3D-interologs.life.nctu.edu.tw.

    Densest subgraph-based methods for protein-protein interaction hot spot prediction

    [Background] Hot spots play an important role in protein binding analysis. The residue interaction network is a key point in hot spot prediction, and several graph theory-based methods have been proposed to detect hot spots. Although the existing methods can yield some interesting residues by network analysis, low recall has limited their abilities in finding more potential hot spots. [Result] In this study, we develop three graph theory-based methods to predict hot spots from only a single residue interaction network. We detect the important residues by finding subgraphs with high densities, i.e., high average degrees. Generally, a high degree implies a high binding possibility between protein chains, and thus a subgraph with high density usually relates to binding sites that have a high rate of hot spots. By evaluating the results on 67 complexes from the SKEMPI database, our methods clearly outperform existing graph theory-based methods on recall and F-score. In particular, our main method, Min-SDS, has an average recall of over 0.665 and an f2-score of over 0.364, while the recall and f2-score of the existing methods are less than 0.400 and 0.224, respectively. [Conclusion] The Min-SDS method performs best among all tested methods on the hot spot prediction problem, and all three of our methods provide useful approaches for analyzing bionetworks. In addition, the densest subgraph-based methods predict hot spots with only one residue interaction network, which is constructed from spatial atomic coordinate data to mitigate the shortage of data from wet-lab experiments
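To make the notion of a high-density subgraph concrete, here is a minimal Python sketch of the classic greedy peeling heuristic for the densest-subgraph problem. It is a generic textbook approximation, not the Min-SDS method or either of the other two methods from the abstract, whose details are not given here. Density is taken as edges divided by nodes, which is half the average degree; the function name and the toy residue labels are assumptions.

```python
from collections import defaultdict


def densest_subgraph_peel(edges):
    """Greedy peeling: repeatedly delete a minimum-degree node and remember
    the node set with the highest density (edges / nodes) seen on the way.

    edges: iterable of (u, v) pairs of an undirected graph.
    Returns (best_node_set, best_density).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    nodes = set(adj)
    n_edges = sum(len(neighbours) for neighbours in adj.values()) // 2
    best_density, best_nodes = 0.0, set(nodes)

    while nodes:
        density = n_edges / len(nodes)
        if density > best_density:
            best_density, best_nodes = density, set(nodes)
        # Pick a minimum-degree node; the label breaks ties deterministically.
        u = min(nodes, key=lambda x: (len(adj[x]), x))
        for v in adj[u]:
            adj[v].discard(u)
        n_edges -= len(adj[u])
        nodes.remove(u)
        del adj[u]

    return best_nodes, best_density


# Toy residue interaction network: a tightly connected core (A-D) plus a tail.
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"),
]
print(densest_subgraph_peel(toy_edges))  # core {A, B, C, D} with density 1.5
```

On this toy network the peeling removes the two tail residues first and returns the four-residue core, which mirrors the intuition in the abstract that densely connected residues are candidate hot spots.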

    An integrated approach with new strategies for QSAR models and lead optimization

    Compound testing set for huAChE collected from Guo et al. (PDF 52 kb)

    iGEMDOCK: a graphical environment of enhancing GEMDOCK using pharmacological interactions and post-screening analysis

    [Background] Pharmacological interactions are useful for understanding ligand binding mechanisms of a therapeutic target. These interactions are often inferred from a set of active compounds that were acquired experimentally. Moreover, most docking programs only loosely couple the stages (binding-site and ligand preparations, virtual screening, and post-screening analysis) of structure-based virtual screening (VS). An integrated VS environment, which provides a friendly interface to seamlessly combine these VS stages and to identify the pharmacological interactions directly from screening compounds, is valuable for drug discovery. [Results] We developed an easy-to-use graphic environment, iGEMDOCK, integrating VS stages (from preparations to post-screening analysis). For post-screening analysis, iGEMDOCK provides biological insights by deriving the pharmacological interactions from screening compounds without relying on the experimental data of active compounds. The pharmacological interactions represent conserved interacting residues, which often form binding pockets with specific physico-chemical properties, to play the essential functions of a target protein. Our experimental results show that the pharmacological interactions derived by iGEMDOCK are often hot spots involved in the biological functions. In addition, iGEMDOCK provides visualizations of the protein-compound interaction profiles and the hierarchical clustering dendrogram of the compounds for post-screening analysis. [Conclusions] We have developed iGEMDOCK to facilitate steps from preparations of target proteins and ligand libraries toward post-screening analysis. iGEMDOCK is especially useful for post-screening analysis and inferring pharmacological interactions from screening compounds. We believe that iGEMDOCK is useful for understanding the ligand binding mechanisms and discovering lead compounds. iGEMDOCK is available at http://gemdock.life.nctu.edu.tw/dock/igemdock.php.

    Genome-wide structural modelling of TCR-pMHC interactions
