    CPDB: a database of circular permutation in proteins

    Circular permutation (CP) in a protein can be regarded as the circularization of its sequence followed by the creation of termini at a new location. Since the first observation of CP in 1979, a substantial number of studies have concluded that circular permutants (CPs) usually retain native structures and functions, sometimes with increased stability or functional diversity. Although this property has made CP useful in many protein engineering and folding studies, no large-scale collection of CP-related information was available until this study. Here we describe CPDB, the first CP DataBase. CPDB is organized as a hierarchical categorization in which pairs of circular permutants are grouped into CP clusters, which are in turn grouped into folds and then classes. CPDB also provides a set of tools and resources for the identification, characterization, comparison and visualization of CP. In addition, several viable CP site prediction methods are implemented and assessed in CPDB. This database can be useful in protein folding and evolution studies, in the discovery of novel protein structural and functional relationships, and in facilitating the production of new CPs of biotechnological or industrial interest. The CPDB database can be accessed at http://sarst.life.nthu.edu.tw/cpd
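The definition in this abstract (circularize the sequence, then open it at a new position) can be illustrated at the sequence level with a minimal sketch. The function names and the toy sequence are hypothetical, and the sketch ignores the linker residues that engineered permutants often need to join the original termini.

```python
def circular_permutant(sequence: str, new_start: int) -> str:
    """Join the original termini and reopen the chain at `new_start` (0-based).

    Assumption: the original N- and C-termini are joined directly, with no linker.
    """
    if not sequence:
        raise ValueError("sequence must be non-empty")
    if not 0 <= new_start < len(sequence):
        raise ValueError("new_start must be a valid 0-based residue index")
    # Residues from new_start onward come first, followed by the original head.
    return sequence[new_start:] + sequence[:new_start]


def is_circular_permutation(a: str, b: str) -> bool:
    """True if `b` is some rotation of `a` (exact sequence identity only)."""
    # Every rotation of `a` appears as a substring of `a + a`.
    return len(a) == len(b) and b in a + a


if __name__ == "__main__":
    toy = "ABCDEFG"  # made-up sequence, not a real protein
    print(circular_permutant(toy, 3))
    print(is_circular_permutation(toy, "DEFGABC"))
```

Real permutant pairs are rarely exact rotations of each other, so the identity check here only illustrates the concept; it is not how a database such as CPDB would detect CP.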

    Ligand Binding and Circular Permutation Modify Residue Interaction Network in DHFR

    Residue interaction networks and loop motions are important for catalysis in dihydrofolate reductase (DHFR). Here, we investigate the effects of ligand binding and chain connectivity on network communication in DHFR. We carry out systematic network analysis and molecular dynamics simulations of the native DHFR and 19 of its circularly permuted variants by breaking the chain connections in ten folding element regions and in nine nonfolding element regions as observed by experiment. Our studies suggest that chain cleavage in folding element areas may deactivate DHFR due to large perturbations in the network properties near the active site. The protein active site is near or coincides with residues through which the shortest paths in the residue interaction network tend to go. Further, our network analysis reveals that ligand binding has “network-bridging effects” on the DHFR structure. Our results suggest that ligand binding leads to a modification, with most of the interaction networks now passing through the cofactor, shortening the average shortest path. Ligand binding at the active site has profound effects on the network centrality, especially the closeness

    Optimized Null Model for Protein Structure Networks

    Much attention has recently been given to the statistical significance of topological features observed in biological networks. Here, we consider residue interaction graphs (RIGs) as network representations of protein structures with residues as nodes and inter-residue interactions as edges. Degree-preserving randomized models have been widely used for this purpose in biomolecular networks. However, such a single summary statistic of a network may not be detailed enough to capture the complex topological characteristics of protein structures and their network counterparts. Here, we investigate a variety of topological properties of RIGs to find a well-fitting network null model for them. The RIGs are derived from a structurally diverse protein data set at various distance cut-offs and for different groups of interacting atoms. We compare the network structure of RIGs to several random graph models. We show that 3-dimensional geometric random graphs, which model spatial relationships between objects, provide the best fit to RIGs. We investigate the relationship between the strength of the fit and various protein structural features. We show that the fit depends on protein size, structural class, and thermostability, but not on quaternary structure. We apply our model to the identification of significantly over-represented structural building blocks, i.e., network motifs, in protein structure networks. As expected, choosing geometric graphs as a null model results in the most specific identification of motifs. Our geometric random graph model may facilitate further graph-based studies of protein conformation space and have important implications for protein structure comparison and prediction. The choice of a well-fitting null model is crucial for finding structural motifs that play an important role in protein folding, stability and function. To our knowledge, this is the first study that addresses the challenge of finding an optimized null model for RIGs, by comparing various RIG definitions against a series of network models.
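A 3-dimensional geometric random graph of the kind named in this abstract can be sketched in a few lines: points are placed at random in space and joined when they lie within a fixed distance. The unit-cube sampling, the function name and the parameter values below are assumptions made for illustration; the abstract does not state how the authors generated their graphs.

```python
import random
from math import dist


def geometric_random_graph_3d(n, radius, seed=None):
    """Sample n points uniformly in the unit cube and connect pairs within `radius`.

    Returns (points, adjacency), where adjacency maps node index -> set of neighbours.
    Assumption: uniform sampling in [0, 1]^3 with Euclidean distance.
    """
    rng = random.Random(seed)
    points = [(rng.random(), rng.random(), rng.random()) for _ in range(n)]
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(points[i], points[j]) <= radius:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return points, adjacency


if __name__ == "__main__":
    _, graph = geometric_random_graph_3d(n=50, radius=0.3, seed=0)
    edge_count = sum(len(neighbours) for neighbours in graph.values()) // 2
    print("edges:", edge_count)
```

The distance threshold plays the same role as the distance cut-off used to define contacts in a residue interaction graph, which is one intuitive reason a geometric model could fit such graphs well.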

    Deciphering the Preference and Predicting the Viability of Circular Permutations in Proteins

    Circular permutation (CP) refers to situations in which the termini of a protein are relocated to other positions in the structure. CP occurs naturally and has been artificially created to study protein function, stability and folding. Recently, CP has been increasingly applied to engineer enzyme structure and function, and to create bifunctional fusion proteins unachievable by tandem fusion. CP is a complicated and expensive technique. An intrinsic difficulty in its application lies in the fact that not every position in a protein is amenable to creating a viable permutant. To examine the preferences of CP and develop CP viability prediction methods, we carried out comprehensive analyses of the sequence, structural, and dynamical properties of known CP sites using a variety of statistics and simulation methods, such as bootstrap aggregating, permutation tests and molecular dynamics simulations. CP particularly favors Gly, Pro, Asp and Asn. Positions preferred by CP lie within coils, loops, turns, and at residues that are exposed to solvent, weakly hydrogen-bonded, environmentally unpacked, or flexible. Disfavored positions include Cys, bulky hydrophobic residues, and residues located within helices or near the protein's core. These results fostered the development of an effective viable CP site prediction system, which combined four machine learning methods: artificial neural networks, the support vector machine, a random forest, and a hierarchical feature integration procedure developed in this work. As assessed by using the dihydrofolate reductase dataset as the independent evaluation dataset, this prediction system achieved an AUC of 0.9. Large-scale predictions have been performed for nine thousand representative protein structures; several new potential applications of CP were thus identified. Many unreported preferences of CP are revealed in this study. The developed system is the best CP viability prediction method currently available. This work will facilitate the application of CP in research and biotechnology.
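The AUC figure quoted in this abstract is the area under the ROC curve. As a minimal sketch of what that number measures, the function below computes it as the probability that a randomly chosen viable site scores higher than a randomly chosen non-viable one. The function name, labels and scores are invented for illustration and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels: iterable of 1 (viable site) or 0 (non-viable site)
    scores: predicted viability scores, higher meaning more likely viable
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    toy_labels = [1, 0, 1, 0]          # made-up viability labels
    toy_scores = [0.9, 0.8, 0.3, 0.1]  # made-up predictor outputs
    print(roc_auc(toy_labels, toy_scores))
```

The pairwise form is quadratic in the number of sites, which is acceptable for a single protein; a rank-based formulation would suit larger datasets.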

    Prediction of viable circular permutants using a graph theoretic approach

    Motivation: In recent years graph-theoretic descriptions have been applied to aid the analysis of a number of complex biological systems. However, such an approach has only just begun to be applied to examine protein structures and the network of interactions between residues with promising results. Here we examine whether a graph measure known as closeness is capable of predicting regions where a protein can be split to form a viable circular permutant. Circular permutants are a powerful experimental tool to probe folding mechanisms and more recently have been used to design split enzyme reporter proteins. Results: We test our method on an extensive set of experiments carried out on dihydrofolate reductase in which circular permutants were constructed for every amino acid position in the sequence, together with partial data from studies on other proteins. Results show that closeness is capable of correctly identifying significantly more residues which are suitable for circular permutation than solvent accessibility. This has potential implications for the design of successful split enzymes having particular importance for the development of protein–protein interaction screening methods and offers new perspectives on protein folding. More generally, the method illustrates the success with which graph-theoretic measures encapsulate the variety of long and short range interactions between residues during the folding process
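The closeness measure mentioned in this abstract can be computed on a residue contact graph with a breadth-first search. The sketch below is illustrative only: building the graph from one coordinate per residue with an 8 Å cut-off, and the function names, are assumptions rather than the authors' protocol, and the coordinates are a toy example.

```python
from collections import deque
from math import dist


def contact_graph(coords, cutoff=8.0):
    """Connect residues whose representative atoms lie within `cutoff`.

    Assumption: one (x, y, z) coordinate per residue and a hypothetical 8 A cut-off.
    """
    adjacency = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if dist(coords[i], coords[j]) <= cutoff:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def closeness(adjacency, source):
    """Closeness of `source`: reachable nodes divided by the sum of their path lengths."""
    distances = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives shortest paths in an unweighted graph
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0


if __name__ == "__main__":
    # Toy "protein": five residues on a straight line, 1 unit apart.
    toy_coords = [(float(i), 0.0, 0.0) for i in range(5)]
    graph = contact_graph(toy_coords, cutoff=1.5)
    for residue in graph:
        print(residue, round(closeness(graph, residue), 3))
```

In this toy chain the central residue has the highest closeness and the terminal residues the lowest. How such values map onto viable permutation sites is the question the paper addresses, and the sketch does not settle it.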

    Graph-based Approaches to Protein Structure- and Function Prediction
