16,771 research outputs found

    SPRITE and ASSAM: web servers for side chain 3D-motif searching in protein structures

    Get PDF
    Similarities in the 3D patterns of amino acid side chains can provide insights into their function despite the absence of any detectable sequence or fold similarities. Search for protein sites (SPRITE) and amino acid pattern search for substructures and motifs (ASSAM) are graph theoretical programs that can search for 3D amino side chain matches in protein structures, by representing the amino acid side chains as pseudo-atoms. The geometric relationship of the pseudo-atoms to each other as a pattern can be represented as a labeled graph where the pseudo-atoms are the graph's nodes while the edges are the inter-pseudo-atomic distances. Both programs require the input file to be in the PDB format. The objective of using SPRITE is to identify matches of side chains in a query structure to patterns with characterized function. In contrast, a 3D pattern of interest can be searched for existing occurrences in available PDB structures using ASSAM. Both programs are freely accessible without any login requirement. SPRITE is available at http://mfrlab.org/grafss/sprite/while ASSAM can be accessed at http://mfrlab.org/grafss/assam/

    NASSAM: a server to search for and annotate tertiary interactions and motifs in three-dimensional structures of complex RNA molecules

    Get PDF
    Similarities in the 3D patterns of RNA base interactions or arrangements can provide insights into their functions and roles in stabilization of the RNA 3D structure. Nucleic Acids Search for Substructures and Motifs (NASSAM) is a graph theoretical program that can search for 3D patterns of base arrangements by representing the bases as pseudo-atoms. The geometric relationship of the pseudo-atoms to each other as a pattern can be represented as a labeled graph where the pseudo-atoms are the graph's nodes while the edges are the inter-pseudo-atomic distances. The input files for NASSAM are PDB formatted 3D coordinates. This web server can be used to identify matches of base arrangement patterns in a query structure to annotated patterns that have been reported in the literature or that have possible functional and structural stabilization implications. The NASSAM program is freely accessible without any login requirement at http://mfrlab.org/grafss/nassam/

    Representation, searching and discovery of patterns of bases in complex RNA structures

    Get PDF
    We describe a graph theoretic method designed to perform efficient searches for substructural patterns in nucleic acid structural coordinate databases using a simplified vectorial representation. Two vectors represent each nucleic acid base and the relative positions of bases with respect to one another are described in terms of distances between the defined start and end points of the vectors on each base. These points comprise the nodes and the distances the edges of a graph, and a pattern search can then be performed using a subgraph isomorphism algorithm. The minimal representation was designed to facilitate searches for complex patterns but was first tested on simple, well-characterised arrangements of bases such as base pairs and GNRA-tetraloop receptor interactions. The method performed very well for these interaction types. A survey of side-by-side base interactions, of which the adenosine platform is the best known example, also locates examples of similar base rearrangements that we consider to be important in structural regulation. A number of examples were found, with GU platforms being particularly prevalent. A GC platform in the RNA of the Thermus thermophilus small ribosomal subunit is in an analogous position to an adenosine platform in other species. An unusual GG platform is also observed close to one of the substrate binding sites in Haloarcula marismortui large ribosomal subunit RNA

    MRFalign: Protein Homology Detection through Alignment of Markov Random Fields

    Full text link
    Sequence-based protein homology detection has been extensively studied and so far the most sensitive method is based upon comparison of protein sequence profiles, which are derived from multiple sequence alignment (MSA) of sequence homologs in a protein family. A sequence profile is usually represented as a position-specific scoring matrix (PSSM) or an HMM (Hidden Markov Model) and accordingly PSSM-PSSM or HMM-HMM comparison is used for homolog detection. This paper presents a new homology detection method MRFalign, consisting of three key components: 1) a Markov Random Fields (MRF) representation of a protein family; 2) a scoring function measuring similarity of two MRFs; and 3) an efficient ADMM (Alternating Direction Method of Multipliers) algorithm aligning two MRFs. Compared to HMM that can only model very short-range residue correlation, MRFs can model long-range residue interaction pattern and thus, encode information for the global 3D structure of a protein family. Consequently, MRF-MRF comparison for remote homology detection shall be much more sensitive than HMM-HMM or PSSM-PSSM comparison. Experiments confirm that MRFalign outperforms several popular HMM or PSSM-based methods in terms of both alignment accuracy and remote homology detection and that MRFalign works particularly well for mainly beta proteins. For example, tested on the benchmark SCOP40 (8353 proteins) for homology detection, PSSM-PSSM and HMM-HMM succeed on 48% and 52% of proteins, respectively, at superfamily level, and on 15% and 27% of proteins, respectively, at fold level. In contrast, MRFalign succeeds on 57.3% and 42.5% of proteins at superfamily and fold level, respectively. This study implies that long-range residue interaction patterns are very helpful for sequence-based homology detection. The software is available for download at http://raptorx.uchicago.edu/download/.Comment: Accepted by both RECOMB 2014 and PLOS Computational Biolog

    CLP-based protein fragment assembly

    Full text link
    The paper investigates a novel approach, based on Constraint Logic Programming (CLP), to predict the 3D conformation of a protein via fragments assembly. The fragments are extracted by a preprocessor-also developed for this work- from a database of known protein structures that clusters and classifies the fragments according to similarity and frequency. The problem of assembling fragments into a complete conformation is mapped to a constraint solving problem and solved using CLP. The constraint-based model uses a medium discretization degree Ca-side chain centroid protein model that offers efficiency and a good approximation for space filling. The approach adapts existing energy models to the protein representation used and applies a large neighboring search strategy. The results shows the feasibility and efficiency of the method. The declarative nature of the solution allows to include future extensions, e.g., different size fragments for better accuracy.Comment: special issue dedicated to ICLP 201

    3D-PP: A tool for discovering conserved three-dimensional protein patterns

    Get PDF
    Discovering conserved three-dimensional (3D) patterns among protein structures may provide valuable insights into protein classification, functional annotations or the rational design of multi-target drugs. Thus, several computational tools have been developed to discover and compare protein 3D-patterns. However, most of them only consider previously known 3D-patterns such as orthosteric binding sites or structural motifs. This fact makes necessary the development of new methods for the identification of all possible 3D-patterns that exist in protein structures (allosteric sites, enzyme-cofactor interaction motifs, among others). In this work, we present 3D-PP, a new free access web server for the discovery and recognition all similar 3D amino acid patterns among a set of proteins structures (independent of their sequence similarity). This new tool does not require any previous structural knowledge about ligands, and all data are organized in a high-performance graph database. The input can be a text file with the PDB access codes or a zip file of PDB coordinates regardless of the origin of the structural data: X-ray crystallographic experiments or in silico homology modeling. The results are presented as lists of sequence patterns that can be further analyzed within the web page. We tested the accuracy and suitability of 3D-PP using two sets of proteins coming from the Protein Data Bank: (a) Zinc finger containing and (b) Serotonin target proteins. We also evaluated its usefulness for the discovering of new 3D-patterns, using a set of protein structures coming from in silico homology modeling methodologies, all of which are overexpressed in different types of cancer. Results indicate that 3D-PP is a reliable, flexible and friendly-user tool to identify conserved structural motifs, which could be relevant to improve the knowledge about protein function or classification. The web server can be freely utilized at https://appsbio.utalca.cl/3d-pp/.Peer ReviewedPostprint (published version

    Charge environments around phosphorylation sites in proteins

    Get PDF
    Background: Phosphorylation is a central feature in many biological processes. Structural analyses have identified the importance of charge-charge interactions, for example mediating phosphorylation-driven allosteric change and protein binding to phosphopeptides. Here, we examine computationally the prevalence of charge stabilisation around phosphorylated sites in the structural database, through comparison with locations that are not phosphorylated in the same structures. Results: A significant fraction of phosphorylated sites appear to be electrostatically stabilised, largely through interaction with sidechains. Some examples of stabilisation across a subunit interface are evident from calculations with biological units. When considering the immediately surrounding environment, in many cases favourable interactions are only apparent after conformational change that accompanies phosphorylation. A simple calculation of potential interactions at longer-range, applied to non-phosphorylated structures, recovers the separation exhibited by phosphorylated structures. In a study of sites in the Phospho.ELM dataset, for which structural annotation is provided by non-phosphorylated proteins, there is little separation of the known phospho-acceptor sites relative to background, even using the wider interaction radius. However, there are differences in the distributions of patch polarity for acceptor and background sites in the Phospho.ELM dataset. Conclusion: In this study, an easy to implement procedure is developed that could contribute to the identification of phospho-acceptor sites associated with charge-charge interactions and conformational change. Since the method gives information about potential anchoring interactions subsequent to phosphorylation, it could be combined with simulations that probe conformational change. Our analysis of the Phospho.ELM dataset also shows evidence for mediation of phosphorylation effects through (i) conformational change associated with making a solvent inaccessible phospho-acceptor site accessible, and (ii) modulation of protein-protein interactions
    • 

    corecore