15,838 research outputs found

    The Phyre2 web portal for protein modeling, prediction and analysis

    Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit a large number of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 h after submission.

    PSPP: A Protein Structure Prediction Pipeline for Computing Clusters

    BACKGROUND: Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster. METHODOLOGY/PRINCIPAL FINDINGS: The pipeline consists of a Perl core that integrates more than 20 individual software packages and databases, most of which are freely available from other research laboratories. The query protein sequences are first divided into domains either by domain boundary recognition or Bayesian statistics. The structures of the individual domains are then predicted using template-based modeling or ab initio modeling. The predicted models are scored with a statistical potential and an all-atom force field. The top-scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP) fold database. Furthermore, secondary structure, solvent accessibility, transmembrane helices, and structural disorder are predicted. The results are generated in text, tab-delimited, and hypertext markup language (HTML) formats. So far, the pipeline has been used to study viral and bacterial proteomes. CONCLUSIONS: The standalone pipeline that we introduce here, unlike protein structure prediction Web servers, allows users to devote their own computing assets to process a potentially unlimited number of queries as well as perform resource-intensive ab initio structure prediction.

    Alignment of helical membrane protein sequences using AlignMe

    Few sequence alignment methods have been designed specifically for integral membrane proteins, even though these important proteins have distinct evolutionary and structural properties that might affect their alignments. Existing approaches typically consider membrane-related information either by using membrane-specific substitution matrices or by assigning distinct penalties for gap creation in transmembrane and non-transmembrane regions. Here, we ask whether favoring matching of predicted transmembrane segments within a standard dynamic programming algorithm can improve the accuracy of pairwise membrane protein sequence alignments. We tested various strategies using a specifically designed program called AlignMe. An updated set of homologous membrane protein structures, called HOMEP2, was used as a reference for optimizing the gap penalties. The best of the membrane-protein optimized approaches were then tested on an independent reference set of membrane protein sequence alignments from the BAliBASE collection. When secondary structure (S) matching was combined with evolutionary information (using a position-specific substitution matrix (P)), in an approach we called AlignMePS, the resultant pairwise alignments were typically among the most accurate over a broad range of sequence similarities when compared to available methods. Matching transmembrane predictions (T), in addition to evolutionary information and secondary-structure predictions, in an approach called AlignMePST, generally reduces the accuracy of the alignments of closely related proteins in the BAliBASE set relative to AlignMePS, but may be useful in cases of extremely distantly related proteins for which sequence information is less informative. The open source AlignMe code is available at https://sourceforge.net/projects/alignme/, and at http://www.forrestlab.org, along with an online server and the HOMEP2 data set.
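
The idea of favoring matches between predicted transmembrane segments inside a standard dynamic programming alignment can be illustrated with a minimal sketch. This is not AlignMe's scoring scheme: the identity-based match/mismatch scores, the linear gap penalty, the `tm_bonus` value, the `"T"` label for predicted transmembrane residues and the function name `align_with_tm_bonus` are all assumptions chosen for illustration.

```python
def align_with_tm_bonus(seq_a, seq_b, tm_a, tm_b,
                        match=1.0, mismatch=-1.0, gap=-2.0, tm_bonus=0.5):
    """Global (Needleman-Wunsch) alignment with an extra reward when both
    aligned residues are predicted transmembrane ('T' in tm_a and tm_b).

    All scoring parameters are illustrative assumptions, not AlignMe values.
    Returns (score, aligned_a, aligned_b).
    """
    if len(seq_a) != len(tm_a) or len(seq_b) != len(tm_b):
        raise ValueError("each TM string must be as long as its sequence")

    n, m = len(seq_a), len(seq_b)
    # score[i][j]: best score for aligning seq_a[:i] with seq_b[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    # move[i][j]: which step produced score[i][j] (used for traceback)
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], move[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        score[0][j], move[0][j] = j * gap, "left"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if tm_a[i - 1] == "T" and tm_b[j - 1] == "T":
                pair += tm_bonus  # favor matching predicted TM residues
            # Ties between equal scores are broken arbitrarily (by label).
            score[i][j], move[i][j] = max(
                (score[i - 1][j - 1] + pair, "diag"),
                (score[i - 1][j] + gap, "up"),
                (score[i][j - 1] + gap, "left"),
            )

    # Follow the stored moves back from the bottom-right corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "diag":
            i, j = i - 1, j - 1
            out_a.append(seq_a[i])
            out_b.append(seq_b[j])
        elif step == "up":
            i -= 1
            out_a.append(seq_a[i])
            out_b.append("-")
        else:
            j -= 1
            out_a.append("-")
            out_b.append(seq_b[j])
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `align_with_tm_bonus("ACDE", "ADE", "----", "---")` aligns the shorter sequence with a single gap. Raising `tm_bonus` makes the alignment increasingly prefer pairing residues that are both labeled `"T"`, which is the trade-off the abstract evaluates.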

    Secretory RING finger proteins function as effectors in a grapevine galling insect.

    Background: All eukaryotes share a conserved network of processes regulated by the proteasome and fundamental to growth, development, or perception of the environment, leading to complex but often predictable responses to stress. As a specialized component of the ubiquitin-proteasome system (UPS), the RING finger domain mediates protein-protein interactions and displays considerable versatility in regulating many physiological processes in plants. Many pathogenic organisms co-opt the UPS through RING-type E3 ligases, but little is known about how insects modify these integral networks to generate novel plant phenotypes. Results: Using a combination of transcriptome sequencing and genome annotation of a grapevine galling species, Daktulosphaira vitifoliae, we identified 138 putatively secretory protein RING-type (SPRINGs) E3 ligases that showed structural and evolutionary signatures of genes under rapid evolution. Moreover, the majority of the SPRINGs were more highly expressed in the feeding stage than in the non-feeding egg stage, in contrast to the non-secretory RING genes. Phylogenetic analyses indicated that the SPRINGs formed clusters, likely resulting from species-specific gene duplication and conforming to features of arthropod host-manipulating (effector) genes. To test the hypothesis that these SPRINGs evolved to manipulate cellular processes within the plant host, we examined SPRING interactions with grapevine proteins using the yeast two-hybrid assay. An insect SPRING interacted with two plant proteins, a cellulose synthase, CSLD5, and a ribosomal protein, RPS4B, suggesting secretion reprograms host immune signaling, cell division, and stress response in favor of the insect. Plant UPS gene expression during gall development linked numerous processes to novel organogenesis. Conclusions: Taken together, D. vitifoliae SPRINGs represent a novel gene expansion that evolved to interact with Vitis hosts. Thus, a pattern is emerging for gall-forming insects to manipulate plant development through UPS targeting.

    Glabralysins, potential new β-pore-forming toxin family members from the schistosomiasis vector snail Biomphalaria glabrata

    Biomphalaria glabrata is a freshwater Planorbidae snail. In its environment, this mollusk faces numerous microorganisms or pathogens, and has developed sophisticated innate immune mechanisms to survive. The mechanisms of recognition are quite well understood in Biomphalaria glabrata, but immune effectors have seldom been described. In this study, we analyzed a new family of potential immune effectors and characterized five new genes that were named Glabralysins. The five Glabralysin genes showed different genomic structures, and the high degree of amino acid identity between the Glabralysins and the presence of the conserved ETX/MTX2 domain support the hypothesis that they are pore-forming toxins. In addition, tertiary structure prediction confirms that they are structurally related to a subset of Cry toxins from Bacillus thuringiensis, including Cry23, Cry45, and Cry51. Finally, we investigated their gene expression profiles in snail tissues and demonstrated a mosaic transcription. We highlight the specificity in Glabralysin expression following immune stimulation with bacteria, yeast or trematode parasites. Interestingly, one Glabralysin was found to be expressed in immune-specialized hemocytes, and two others were induced following parasite exposure.

    Resuscitation-promoting factors possess a lysozyme-like domain

    The novel bacterial cytokine family – resuscitation-promoting factors (Rpfs) – share a conserved domain of uncharacterized function. Predicting the structure of this domain suggests that Rpfs possess a lysozyme-like domain. The model highlights the good conservation of residues involved in catalysis and substrate binding. A lysozyme-like function makes sense for this domain in light of the experimental characterization of the biological function of Rpfs.

    Purification and functional characterisation of rhiminopeptidase A, a novel aminopeptidase from the venom of Bitis gabonica rhinoceros

    This study describes the discovery and characterisation of a novel aminopeptidase A from the venom of B. g. rhinoceros and highlights its potential biological importance. Similar to mammalian aminopeptidases, rhiminopeptidase A might be capable of playing roles in altering the blood pressure and brain function of victims. Furthermore, it could have additional effects on the biological functions of other host proteins by cleaving their N-terminal amino acids. This study points towards the importance of complete analysis of individual components of snake venom in order to develop effective therapies for snake bites.

    Novel Strategies for Model-Building of G Protein-Coupled Receptors

    The G protein-coupled receptors still constitute the most densely populated protein family, encompassing numerous disease-relevant drug targets. Consequently, medicinal chemistry is expected to pursue targets from that protein family, for which hits need to be generated and subsequently optimized towards viable clinical candidates for a variety of therapeutic areas. For the purpose of rationalizing structure-activity relationships within such optimization programs, structural information derived from the ligand's as well as the macromolecule's perspective is essential. While it is relatively straightforward to define pharmacophore hypotheses based on comparative modelling of structurally and biologically characterized low-molecular weight ligands, a deeper understanding of the underlying molecular recognition event remains challenging, since the amount of experimentally derived structural data available on GPCRs is extremely scarce when compared to, e.g., soluble enzymes. In this context, the protein modelling methodologies introduced, developed, optimized, and applied in this thesis provide structural models that are capable of assisting in the development of structural hypotheses on ligand-receptor complexes. As such they provide a valuable structural framework not only for a more detailed insight into ligand-GPCR interaction, but also for guiding the design process towards next-generation compounds which should display enhanced affinity. The model building procedure developed in this thesis systematically follows a hierarchical approach, sequentially generating a 1D topology, followed by a 2D topology that is finally converted into a 3D topology. The determination of a 1D topology is based on a compartmentalization of the linear amino acid sequence of a GPCR of interest into the extracellular, intracellular, and transmembrane sequence stretches. The entire chapter 3 of this study elaborates on the strengths and weaknesses of applying automated prediction tools for the purpose of identifying the transmembrane sequence domains. Once a 1D topology has been derived, a type of in-plane projection structure for the seven transmembrane helices can be derived with the aid of calculated vectorial property moments, yielding the 2D topology. Thorough bioinformatics studies revealed that only a consensus approach based on a conceptual combination of different methods employing a carefully made selection of parameter sets gave reliable results, emphasizing the danger of fully automating a GPCR modelling procedure. Chapter 4 describes a procedure to further expand the 2D topological findings into 3D space, exemplified by the human CCK-B receptor protein. This particular GPCR was chosen as the receptor of interest, since an enormous experimentally derived and structurally relevant data set was available. Within the computational refinement procedure of constructed GPCR models, major emphasis was laid on the explicit treatment of a non-isotropic solvent environment during molecular mechanics (i.e. energy minimization and molecular dynamics simulations) calculations. The majority of simulations were therefore carried out in a tri-phasic solvent box accounting for a central lipid environment, flanked by two aqueous compartments, mimicking the extracellular and cytoplasmic space. Chapter 5 introduces the reference compound set, comprising low-molecular weight compounds modulating CCK receptors, that was used for validation purposes of the generated models of the receptor protein. Chapter 6 describes how the generated model of the CCK-B receptor was subjected to intensive docking studies employing compound series introduced in chapter 5. It turned out that by applying the DRAGHOME methodology viable structural hypotheses on putative receptor-ligand complexes could be generated. Based on the methodology pursued in this thesis, a detailed model of the receptor binding site could be devised that accounts for known structure-activity relationships as well as for results obtained by site-directed mutagenesis studies in a qualitative manner. The overall study presented in this thesis aims primarily to deliver a feasibility study on generating model structures of GPCRs by a conceptual combination of tailor-made bioinformatics techniques with the toolbox of protein modelling, exemplified by the human CCK-B receptor. The generated structures should be envisioned as models only, not necessarily providing a detailed image of reality. However, consistent models, when verified and refined against experimental data, deliver an extremely useful structural contextual platform on which different scientific disciplines such as medicinal chemistry, molecular biology, and biophysics can effectively communicate.
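
One widely used example of a vectorial property moment of the kind mentioned for the 2D topology step is the helical hydrophobic moment. The sketch below is only an illustration under stated assumptions: the abstract does not say which properties or parameters the thesis uses, so the Kyte-Doolittle hydropathy scale, the 100° rotation per residue of an ideal α-helix, and the function name `hydrophobic_moment` are choices made here, not the thesis's method.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale; any per-residue
# property could be substituted).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(helix, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Sum per-residue property vectors around the helix axis.

    Residue k is placed at angle k * angle_deg in the plane perpendicular
    to the helix axis and weighted by its property value. Returns
    (magnitude, direction_deg); the direction indicates which face of the
    helix carries the net property, measured from the first residue.
    """
    x = y = 0.0
    for k, residue in enumerate(helix):
        theta = math.radians(k * angle_deg)
        value = scale[residue]
        x += value * math.cos(theta)
        y += value * math.sin(theta)
    magnitude = math.hypot(x, y)
    direction_deg = math.degrees(math.atan2(y, x)) % 360.0
    return magnitude, direction_deg
```

In a helix-bundle projection, the direction of such a moment is often taken as a hint for how to rotate each helix, for example so that its most hydrophobic face points towards the lipid. As the abstract cautions, a single automated measure of this kind should be combined with other evidence rather than trusted alone.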

    Membrane Topology and Predicted RNA-Binding Function of the ‘Early Responsive to Dehydration (ERD4)’ Plant Protein

    Functional annotation of uncharacterized genes is the main focus of computational methods in the post-genomic era. These tools search for similarity between proteins on the premise that those sharing sequence or structural motifs usually perform related functions, and are thus particularly useful for membrane proteins. Early responsive to dehydration (ERD) genes are rapidly induced in response to dehydration stress in a variety of plant species. In the present work we characterized the function of the Brassica juncea ERD4 gene using computational approaches. The ERD4 protein of unknown function possesses the ubiquitous DUF221 domain (residues 312–634) and is conserved in all plant species. We suggest that the protein is localized in the chloroplast membrane with at least nine transmembrane helices. We detected a globular domain of 165 amino acid residues (183–347) in plant ERD4 proteins and expect this to be positioned inside the chloroplast. The structural-functional annotation of the globular domain was arrived at using fold recognition methods, which suggested the presence in its sequence of two tandem RNA-recognition motif (RRM) domains, each folded into a βαββαβ topology. The structure-based sequence alignment with the known RNA-binding proteins revealed conservation of two non-canonical ribonucleoprotein sub-motifs in both the putative RNA-recognition domains of the ERD4 protein. The function of the highly conserved ERD4 protein may thus be associated with its RNA-binding ability during the stress response. This is the first functional annotation of the ERD4 family of proteins, and it can be useful in designing experiments to unravel crucial aspects of the stress tolerance mechanism.

    The PD-(D/E)XK superfamily revisited: identification of new members among proteins involved in DNA metabolism and functional predictions for domains of (hitherto) unknown function

    BACKGROUND: The PD-(D/E)XK nuclease superfamily, initially identified in type II restriction endonucleases and later in many enzymes involved in DNA recombination and repair, is one of the most challenging targets for protein sequence analysis and structure prediction. Typically, the sequence similarity between these proteins is so low that most of the relationships between known members of the PD-(D/E)XK superfamily were identified only after the corresponding structures were determined experimentally. Thus, it is tempting to speculate that among the uncharacterized protein families, there are potential nucleases that remain to be discovered, but their identification requires more sensitive tools than traditional PSI-BLAST searches. RESULTS: The low degree of amino acid conservation hampers the identification of new members of the PD-(D/E)XK superfamily based solely on sequence comparisons to known members. Therefore, we used HHsearch, a recently developed method for sensitive detection of remote similarities between protein families represented as profile Hidden Markov Models enhanced by secondary structure. We carried out a comparison of known families of PD-(D/E)XK nucleases to the database comprising the COG and PFAM profiles corresponding to both functionally characterized and uncharacterized protein families to detect significant similarities. The initial candidates for new nucleases were subsequently verified by sequence-structure threading, comparative modeling, and identification of potential active site residues. CONCLUSION: In this article, we report identification of the PD-(D/E)XK nuclease domain in numerous proteins implicated in interactions with DNA but with unknown structure and mechanism of action (such as putative recombinase RmuC, DNA competence factor CoiA, a DNA-binding protein SfsA, a large human protein predicted to be a DNA repair enzyme, predicted archaeal transcription regulators, and the head completion protein of phage T4) and in proteins for which no function was assigned to date (such as YhcG, various phage proteins, novel candidates for restriction enzymes). Our results contribute to the reduction of "white spaces" on the sequence-structure-function map of the protein universe and will help to jump-start the experimental characterization of new nucleases, of which many may be of importance for the complete understanding of mechanisms that govern the evolution and stability of the genome.