
    TOPOFIT-DB, a database of protein structural alignments based on the TOPOFIT method

    TOPOFIT-DB (T-DB) is a public web-based database of protein structural alignments based on the TOPOFIT method, providing a comprehensive resource for comparative analysis of protein structure families. The TOPOFIT method is based on the discovery of a saturation point on the alignment curve (the topomax point), which makes it possible to objectively identify the border between the common and variable parts of a protein structural family, providing additional insight into protein comparison and functional annotation. TOPOFIT also effectively detects non-sequential relations between protein structures. T-DB lets users retrieve and analyze structural neighbors for a protein; run a one-to-all comparison of a user-provided structure against the entire current PDB release with T-Server; and perform pairwise comparisons using the TOPOFIT method through the T-Pair web page. All outputs are reported in web-based tables and graphics, with automated viewing of the structure-sequence alignments in the Friend software package for complete, detailed analysis. T-DB gives researchers the opportunity for comprehensive studies of variability in proteins and is publicly available at

    A comprehensive analysis of non-sequential alignments between all protein structures

    Background: The majority of relations between proteins can be represented as a conventional sequential alignment. Nevertheless, unusual non-sequential alignments, in which the aligned fragments are connected differently in the compared proteins, have been reported by many researchers. It is of interest to understand these non-sequential alignments: are they unique, sporadic cases or do they occur frequently, and do they belong to a few specific folds or spread among many different folds as a common feature of protein structure? We present here a comprehensive large-scale study of non-sequential alignments between available protein structures in the Protein Data Bank. Results: The study was conducted on a non-redundant set of 8,865 protein structures aligned with the aid of the TOPOFIT method. Between 17.4% and 35.2% of all alignments were estimated to be non-sequential, depending on variations in the parameters. Analysis of the data revealed that non-sequential relations between proteins occur systematically and in large quantities. Various sizes and numbers of non-sequential fragments were observed, with all possible complexities of fragment rearrangements found for alignments consisting of up to 12 fragments. Non-sequential alignments are not limited to proteins of any particular fold and are present in more than two hundred folds. Moreover, many of them are found between proteins with different fold assignments. Protein structure symmetry was shown not to explain non-sequential alignments. Compelling evidence has therefore been provided that non-sequential alignments between proteins are systematic and widespread across the protein universe. Conclusion: The widespread occurrence of non-sequential alignments between proteins might represent a missing rule of protein structure organization. More detailed study of this phenomenon will enhance our understanding of protein stability, folding, and evolution.
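
As a rough illustration of the distinction drawn in this abstract, an alignment can be written as a list of aligned fragments, each recording where it starts in the two proteins. The sketch below is a hypothetical helper, not part of TOPOFIT: the function name `is_sequential` and the `(start_a, start_b, length)` fragment representation are assumptions made for the example. It treats an alignment as sequential when the fragments appear in the same order along both chains.

```python
from typing import List, Tuple

# A fragment is (start_in_A, start_in_B, length). This representation is
# an assumption for illustration, not TOPOFIT's data format.
Fragment = Tuple[int, int, int]


def is_sequential(fragments: List[Fragment]) -> bool:
    """Return True if the fragments keep the same order in both proteins."""
    # Sort fragments by their position in protein A.
    ordered = sorted(fragments, key=lambda f: f[0])
    # The alignment is sequential if each fragment starts in protein B
    # at or after the end of the previous fragment in protein B.
    for prev, curr in zip(ordered, ordered[1:]):
        prev_end_b = prev[1] + prev[2]
        if curr[1] < prev_end_b:
            return False
    return True


sequential = [(0, 5, 10), (12, 20, 8), (25, 30, 6)]
# The second and third fragments swap order in protein B,
# as happens in a segment swap or circular permutation.
non_sequential = [(0, 5, 10), (12, 40, 8), (25, 20, 6)]

print(is_sequential(sequential))
print(is_sequential(non_sequential))
```

The first call reports a sequential alignment and the second a non-sequential one, because the third fragment maps to a region of protein B that precedes the second fragment.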

    A Score of the Ability of a Three-Dimensional Protein Model to Retrieve Its Own Sequence as a Quantitative Measure of Its Quality and Appropriateness

    BACKGROUND: Despite the remarkable progress of bioinformatics, how the primary structure of a protein leads to a three-dimensional fold, and in turn determines its function, remains an elusive question. Alignments of sequences with known function can be used to identify proteins with the same or similar function with high success. However, identification of function-related and structure-related amino acid positions is only possible after a detailed study of every protein. Folding pattern diversity seems to be much narrower than sequence diversity, and the amino acid sequences of natural proteins have evolved under a selective pressure comprising structural and functional requirements acting in parallel. PRINCIPAL FINDINGS: The approach described in this work begins by generating a large number of amino acid sequences using ROSETTA [Dantas G et al. (2003) J Mol Biol 332:449-460], a program with notable robustness in the assignment of amino acids to a known three-dimensional structure. The resulting sequence sets showed no conservation of amino acids at active sites or protein-protein interfaces. Hidden Markov models built from the resulting sequence sets were used to search sequence databases. Surprisingly, the sequences that the models retrieved from the database belonged to proteins with the same or a very similar function. Given an appropriate cutoff, the rate of false positives was zero. According to our results, this protocol, here referred to as Rd.HMM, detects fine structural details in the folding patterns that seem to be tightly linked to the fitness of a structural framework for a specific biological function. CONCLUSION: Because the sequence of the native protein used to create the Rd.HMM model was always amongst the top hits, the procedure is a reliable tool to score, very accurately, the quality and appropriateness of computer-modeled 3D structures, without the need for spectroscopy data. However, Rd.HMM is very sensitive to the conformational features of the models' backbone.

    deconSTRUCT: general purpose protein database search on the substructure level

    The deconSTRUCT web server offers an interface to a protein database search engine for general-purpose detection of similar protein (sub)structures. It first deconstructs the query structure into its secondary structure elements (SSEs) and then reassembles the match to the target by requiring a (tunable) degree of similarity in the direction and sequential order of SSEs. Hierarchical organization and judicious use of the information about protein structure enable deconSTRUCT to achieve the sensitivity and specificity of established search engines at speeds orders of magnitude higher, without irretrievably tying up the substructure information in the form of a hash. In a post-processing step, a match at the level of the backbone atoms is constructed. The results presented to the user consist of the list of matched SSEs, the transformation matrix for rigid superposition of the structures, and several visualization options, both downloadable and implemented as a web-browser plug-in. The server is available at http://epsf.bmad.bii.a-star.edu.sg/struct_server.html
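
The matching criterion described in this abstract, similarity in the direction and sequential order of SSEs with a tunable tolerance, can be pictured with a small sketch. It is not the deconSTRUCT implementation: the `SSE` tuple, the `min_cosine` parameter, and the greedy in-order pairing are all assumptions chosen to keep the example short. It also assumes both structures are already expressed in a common frame, whereas a real search must find the rotation as well.

```python
import math
from typing import List, Tuple

# An SSE is (type, direction vector); "H" = helix, "E" = strand.
# This representation is hypothetical, chosen for illustration only.
Vector = Tuple[float, float, float]
SSE = Tuple[str, Vector]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two non-zero direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def match_in_order(query: List[SSE], target: List[SSE],
                   min_cosine: float = 0.8) -> List[Tuple[int, int]]:
    """Greedily pair each query SSE with the next compatible target SSE.

    Two SSEs are compatible if they have the same type and their
    directions agree to within the tunable `min_cosine` threshold.
    Pairs are only accepted in increasing target order, which enforces
    the same sequential order of SSEs in both structures.
    """
    pairs = []
    next_target = 0
    for qi, (q_type, q_dir) in enumerate(query):
        for ti in range(next_target, len(target)):
            t_type, t_dir = target[ti]
            if q_type == t_type and cosine(q_dir, t_dir) >= min_cosine:
                pairs.append((qi, ti))
                next_target = ti + 1
                break
    return pairs


query = [("H", (1.0, 0.0, 0.0)), ("E", (0.0, 1.0, 0.0)), ("H", (0.0, 0.0, 1.0))]
target = [("H", (0.9, 0.1, 0.0)), ("H", (0.0, 0.0, -1.0)),
          ("E", (0.1, 1.0, 0.0)), ("H", (0.0, 0.2, 1.0))]

print(match_in_order(query, target))
print(match_in_order(query, target, min_cosine=0.999))
```

Raising `min_cosine` makes the direction test stricter, so fewer SSE pairs survive; lowering it trades specificity for sensitivity.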

    Structure SNP (StSNP): a web server for mapping and modeling nsSNPs on protein structures with linkage to metabolic pathways

    SNPs located within the open reading frame of a gene that alter the amino acid sequence of the encoded protein [nonsynonymous SNPs (nsSNPs)] might directly or indirectly affect the functionality of the protein, alone or in its interactions within a multi-protein complex, by increasing or decreasing the activity of the metabolic pathway. Understanding the functional consequences of such changes, and drawing conclusions about the molecular basis of diseases, involves integrating information from multiple heterogeneous sources, including sequence data, structure data and pathway relations between proteins. The data from NCBI's SNP database (dbSNP), gene and protein databases from Entrez, protein structures from the PDB and pathway information from KEGG have all been cross-referenced in the StSNP web server, in an effort to provide combined, integrated reports about nsSNPs. StSNP provides ‘on the fly’ comparative modeling of nsSNPs with links to metabolic pathway information, along with real-time visual comparative analysis of the modeled structures using the Friend software application. The use of metabolic pathways in StSNP allows a researcher to examine possible disease-related pathways associated with particular nsSNPs, and to link the diseases with the currently available molecular structure data. The server is publicly available at http://glinka.bio.neu.edu/StSNP/

    A novel method to compare protein structures using local descriptors

    Background: Protein structure comparison is one of the most widely performed tasks in bioinformatics. However, currently used methods have problems with the so-called "difficult similarities", including considerable shifts and distortions of structure, sequential swaps and circular permutations. There is a demand for efficient and automated systems capable of overcoming these difficulties, which may lead to the discovery of previously unknown structural relationships. Results: We present a novel method for protein structure comparison based on the formalism of local descriptors of protein structure, DEscriptor Defined Alignment (DEDAL). Local similarities identified by pairs of similar descriptors are extended into global structural alignments. We demonstrate the method's capability by aligning structures in difficult benchmark sets: curated alignments in the SISYPHUS database, as well as SISY and RIPC sets, including non-sequential and non-rigid-body alignments. On the most difficult RIPC set of sequence alignment pairs the method achieves an accuracy of 77% (the second best method tested achieves 60% accuracy). Conclusions: DEDAL is fast enough to be used in whole proteome applications, and by lowering the threshold of detectable structure similarity it may shed additional light on molecular evolution processes. It is well suited to improving automatic classification of structure domains, helping analyze protein fold space, or to improving protein classification schemes. DEDAL is available online at http://bioexploratorium.pl/EP/DEDAL.

    Alignment of protein structures in the presence of domain motions

    Background: Structural alignment is an important step in protein comparison. Well-established methods exist for solving this problem under the assumption that the structures under comparison are rigid bodies. However, proteins are flexible entities, often undergoing movements that alter the positions of domains or subdomains with respect to each other. Such movements can impede the identification of structural equivalences when rigid aligners are used. Results: We introduce a new method called RAPIDO (Rapid Alignment of Proteins in terms of Domains) for the three-dimensional alignment of protein structures in the presence of conformational changes. The flexible aligner is coupled to a genetic algorithm for the identification of structurally conserved regions. RAPIDO is capable of aligning protein structures in the presence of large conformational changes. Structurally conserved regions are reliably detected even if they are discontinuous in sequence but continuous in space, and can be used for superpositions revealing subtle differences. Conclusion: RAPIDO is more sensitive than other flexible aligners when applied to cases of closely homologous proteins undergoing large conformational changes. When applied to a set of kinase structures, it is able to detect similarities that are missed by other alignment algorithms. The algorithm is sufficiently fast to be applied to the comparison of large sets of protein structures.

    A Mathematical Framework for Protein Structure Comparison

    Comparison of protein structures is important for revealing the evolutionary relationship among proteins, predicting protein functions and predicting protein structures. Many methods have been developed in the past to align two or multiple protein structures. Despite the importance of this problem, rigorous mathematical or statistical frameworks have seldom been pursued for general protein structure comparison. One notable issue in this field is that, although many different distances are used to measure the similarity between protein structures, none of them is a proper distance when protein structures of different sequences are compared. Statistical approaches that treat those non-proper distances or similarity scores as random variables are thus not mathematically rigorous. In this work, we develop a mathematical framework for protein structure comparison by treating protein structures as three-dimensional curves. Using an elastic Riemannian metric on spaces of curves, the geodesic distance, a proper distance on spaces of curves, can be computed for any two protein structures. In this framework, protein structures can be treated as random variables on the shape manifold, and means and covariances can be computed for populations of protein structures. Furthermore, these moments can be used to build Gaussian-type probability distributions of protein structures for use in hypothesis testing. The covariance of a population of protein structures can reveal population-specific variations and help improve structure classification. With curves representing protein structures, the matching is performed using elastic shape analysis of curves, which can effectively model conformational changes and insertions/deletions. We show that our method performs comparably with commonly used methods in protein structure classification on a large manually annotated data set.
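
To give a feel for treating a backbone as a curve, the sketch below uses the square-root velocity representation that is common in elastic shape analysis. The abstract does not say which representation the authors use, so this choice is an assumption, as are the function names `srvf` and `srvf_l2_distance`. The sketch also omits the optimization over rotations and reparameterizations that a geodesic distance on shape space requires, so it is only a simplified, unaligned distance between two curves sampled with the same number of points.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def srvf(curve: List[Point]) -> List[List[float]]:
    """Discrete square-root velocity: q_i = v_i / sqrt(|v_i|)."""
    n_segments = len(curve) - 1
    dt = 1.0 / n_segments  # uniform parameter step on [0, 1]
    q = []
    for p0, p1 in zip(curve, curve[1:]):
        # Finite-difference velocity along this segment.
        v = [(b - a) / dt for a, b in zip(p0, p1)]
        speed = math.sqrt(sum(c * c for c in v))
        # A zero-length segment has zero velocity, so q is zero there.
        scale = math.sqrt(speed) if speed > 0 else 1.0
        q.append([c / scale for c in v])
    return q


def srvf_l2_distance(curve_a: List[Point], curve_b: List[Point]) -> float:
    """L2 distance between the SRVFs of two equally sampled curves.

    No rotation or reparameterization alignment is performed here.
    """
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must have the same number of points")
    qa, qb = srvf(curve_a), srvf(curve_b)
    dt = 1.0 / len(qa)
    total = sum((x - y) ** 2
                for row_a, row_b in zip(qa, qb)
                for x, y in zip(row_a, row_b))
    return math.sqrt(total * dt)


# Toy "backbone traces" with made-up coordinates.
trace = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
shifted = [(x + 5, y + 5, z + 5) for x, y, z in trace]  # translated copy
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]

print(srvf_l2_distance(trace, shifted))
print(srvf_l2_distance(trace, straight))
```

Because the representation depends only on velocities, the translated copy is at distance zero from the original, while the straight trace is at a positive distance.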

    Development of methods for the computer-aided design of mimotopes

    The growing number of experimentally determined protein-protein complexes, or more generally protein-ligand complexes, allows biomolecular interactions to be studied in ever greater detail. One subset of these interactions comprises the antigen-antibody interactions that are important for adaptive immune responses. The chemical groups on the surface of antigens determine the specific interactions with antibodies and are referred to as antigenic determinants or epitopes. The complementary binding sites on the antibodies are called paratopes. The term "epitope" is often used more generally for parts of molecules that are specifically recognized. The specificity of epitopes is determined both by the geometric arrangement and by the chemical configuration of the monomeric groups. Mimotopes are synthetically produced proteins that mimic the structural recognition features of epitopes and can thus, for example, trigger a defined immune response. For example, it is now possible to identify epitopes down to atomic resolution and to search for similar structural motifs on other protein structures. This kind of structure comparison opens up interesting applications: epitopes may be transplanted onto other carrier molecules, or cross-reactivities could be predicted. Crucial to these approaches is the availability of a method that compares structural motifs quickly and accurately. Developing such a method (EpitopeMatch) is the goal of this doctoral thesis. Specifically, EpitopeMatch should have the following properties:
    • Incorporation of geometric and chemical similarity.
    • Flexible definition of generally discontinuous epitopes on the basis of known complex structures.
    • Efficient search of large structure databases.
    • Ability to transplant complete epitopes.
    • Linking of hits with functional biological data.