MODBASE, a database of annotated comparative protein structure models and associated resources.
MODBASE (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by MODPIPE, an automated modeling pipeline that relies primarily on MODELLER for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE currently contains 5,152,695 reliable models for domains in 1,593,209 unique protein sequences; only models based on statistically significant alignments and/or models assessed to have the correct fold are included. MODBASE also allows users to calculate comparative models on demand, through an interface to the MODWEB modeling server (http://salilab.org/modweb). Other resources integrated with MODBASE include databases of multiple protein structure alignments (DBAli), structurally defined ligand binding sites (LIGBASE), predicted ligand binding sites (AnnoLyze), structurally defined binary domain interfaces (PIBASE) and annotated single nucleotide polymorphisms and somatic mutations found in human proteins (LS-SNP, LS-Mut). MODBASE models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/).
DBAli tools: mining the protein structure space
The DBAli tools use a comprehensive set of structural alignments in the DBAli database to leverage the structural information deposited in the Protein Data Bank (PDB). These tools include (i) the DBAlit program that allows users to input the 3D coordinates of a protein structure for comparison by MAMMOTH against all chains in the PDB; (ii) the AnnoLite and AnnoLyze programs that annotate a target structure based on its stored relationships to other structures; (iii) the ModClus program that clusters structures by sequence and structure similarities; (iv) the ModDom program that identifies domains as recurrent structural fragments and (v) an implementation of the COMPARER method in the SALIGN command in MODELLER that creates a multiple structure alignment for a set of related protein structures. Thus, the DBAli tools, which are freely accessible via the World Wide Web at http://salilab.org/DBAli/, allow users to mine the protein structure space by establishing relationships between protein structures and their functions.
A Kernel for Open Source Drug Discovery in Tropical Diseases
Open source drug discovery, a promising alternative avenue to conventional patent-based drug development, has so far remained elusive with few exceptions. A major stumbling block has been the absence of a critical mass of preexisting work that volunteers can improve through a series of granular contributions. This paper introduces the results from a newly assembled computational pipeline for identifying protein targets for drug discovery in ten organisms that cause tropical diseases. We have also experimentally tested two promising targets for their binding to commercially available drugs, validating one and invalidating the other. The resulting kernel provides a base of drug targets and lead candidates around which an open source community can nucleate. We invite readers to donate their judgment and in silico and in vitro experiments to develop these targets to the point where drug optimization can begin.
A Review of 2011 for PLoS Computational Biology
A Review of 2011 for PLoS Computational Biology
SARA: a server for function annotation of RNA structures
Recent interest in non-coding RNA transcripts has resulted in a rapid increase of deposited RNA structures in the Protein Data Bank. However, a characterization and functional classification of the RNA structure and function space have only been partially addressed. Here, we introduce the SARA program for pair-wise alignment of RNA structures as a web server for structure-based RNA function assignment. The SARA server relies on the SARA program, which aligns two RNA structures based on a unit-vector root-mean-square approach. The likely accuracy of the SARA alignments is assessed by three different P-values estimating the statistical significance of the sequence, secondary structure and tertiary structure identity scores, respectively. Our benchmarks, which relied on a set of 419 RNA structures with known SCOR structural class, indicate that at a negative logarithm of mean P-value greater than or equal to 2.5, SARA can assign the correct or a similar SCOR class to 81.4% and 95.3% of the benchmark set, respectively. The SARA server is freely accessible via the World Wide Web at http://sgu.bioinfo.cipf.es/services/SARA/.
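The cutoff described in the SARA abstract can be illustrated with a minimal Python sketch. It is not the SARA implementation: the function names are hypothetical, and it assumes that "mean P-value" is the arithmetic mean of the three P-values (sequence, secondary structure, tertiary structure) and that the logarithm is natural. The abstract does not state the base, so it is left as a parameter.

```python
import math

# Cutoff quoted in the abstract: negative log of mean P-value >= 2.5.
THRESHOLD = 2.5


def neg_log_mean_p(p_values, log=math.log):
    """Return -log(mean(p_values)).

    Assumptions (not confirmed by the abstract): arithmetic mean of the
    P-values, and natural logarithm unless another `log` is passed in.
    """
    if not p_values or any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P-values must be non-empty and lie in (0, 1]")
    mean_p = sum(p_values) / len(p_values)
    return -log(mean_p)


def passes_cutoff(p_values, threshold=THRESHOLD, log=math.log):
    """True if the alignment meets the significance cutoff."""
    return neg_log_mean_p(p_values, log) >= threshold


if __name__ == "__main__":
    # Hypothetical P-values for sequence, secondary and tertiary structure.
    example = [0.01, 0.02, 0.03]
    print(round(neg_log_mean_p(example), 3), passes_cutoff(example))
```

Passing `log=math.log10` shows how strongly the outcome depends on the assumed base: the same three P-values pass under the natural logarithm but fail under base 10.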
Comparative analysis of homology models of the Ah receptor ligand binding domain: Verification of structure-function predictions by site-directed mutagenesis of a nonfunctional receptor
The aryl hydrocarbon receptor (AHR) is a ligand-dependent transcription factor that mediates the biological and toxic effects of a wide variety of structurally diverse chemicals, including the toxic environmental contaminant 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). While significant interspecies differences in AHR ligand binding specificity, selectivity, and response have been observed, the structural determinants responsible for those differences have not been determined, and homology models of the AHR ligand-binding domain (LBD) are available for only a few species. Here we describe the development and comparative analysis of homology models of the LBD of 16 AHRs from 12 mammalian and nonmammalian species and identify the specific residues contained within their ligand binding cavities. The ligand-binding cavity of the fish AHR exhibits differences from those of mammalian and avian AHRs, suggesting a slightly different TCDD binding mode. Comparison of the internal cavity in the LBD model of zebrafish (zf) AHR2, which binds TCDD with high affinity, to that of zfAHR1a, which does not bind TCDD, revealed that the latter has a dramatically shortened binding cavity due to the side chains of three residues (Tyr296, Thr386, and His388) that reduce the amount of internal space available to TCDD. Mutagenesis of two of these residues in zfAHR1a to those present in zfAHR2 (Y296H and T386A) restored the ability of zfAHR1a to bind TCDD and to exhibit TCDD-dependent binding to DNA. These results demonstrate the importance of these two amino acids and highlight the predictive potential of comparative analysis of homology models from diverse species. The availability of these AHR LBD homology models will facilitate in-depth comparative studies of AHR ligand binding and ligand-dependent AHR activation and provide a novel avenue for examining species-specific differences in AHR responsiveness. © 2013 American Chemical Society
Quaternary structure of a G-protein coupled receptor heterotetramer in complex with Gi and Gs
Background: G-protein-coupled receptors (GPCRs), in the form of monomers or homodimers that bind heterotrimeric G proteins, are fundamental in the transfer of extracellular stimuli to intracellular signaling pathways. Different GPCRs may also interact to form heteromers that are novel signaling units. Despite the exponential growth in the number of solved GPCR crystal structures, the structural properties of heteromers remain unknown. Results: We used single-particle tracking experiments in cells expressing functional adenosine A1-A2A receptors fused to fluorescent proteins to show the loss of Brownian movement of the A1 receptor in the presence of the A2A receptor, and a preponderance of cell surface 2:2 receptor heteromers (dimer of dimers). Using computer modeling, aided by bioluminescence resonance energy transfer assays to monitor receptor homomerization and heteromerization and G-protein coupling, we predict the interacting interfaces and propose a quaternary structure of the GPCR tetramer in complex with two G proteins. Conclusions: The combination of results points to a molecular architecture formed by a rhombus-shaped heterotetramer, which is bound to two different interacting heterotrimeric G proteins (Gi and Gs). These novel results constitute an important advance in understanding the molecular intricacies involved in GPCR function.
ModBase, a database of annotated comparative protein structure models, and associated resources
ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence–structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains 10 355 444 reliable models for domains in 2 421 920 unique protein sequences. ModBase allows users to update comparative models on demand, and request modeling of additional sequences through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are available through the ModBase interface as well as the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the SALIGN server for multiple sequence and structure alignment (http://salilab.org/salign), the ModEval server for predicting the accuracy of protein structure models (http://salilab.org/modeval), the PCSS server for predicting which peptides bind to a given protein (http://salilab.org/pcss) and the FoXS server for calculating and fitting Small Angle X-ray Scattering profiles (http://salilab.org/foxs).
Structural genomics is the largest contributor of novel structural leverage
The Protein Structure Initiative (PSI) at the US National Institutes of Health (NIH) is funding four large-scale centers for structural genomics (SG). These centers systematically target many large families without structural coverage, as well as very large families with inadequate structural coverage. Here, we report a few simple metrics that demonstrate how successfully these efforts optimize structural coverage: while the PSI-2 (2005-now) contributed more than 8% of all structures deposited into the PDB, it contributed over 20% of all novel structures (i.e. structures for protein sequences with no structural representative in the PDB on the date of deposition). The structural coverage of the protein universe represented by today’s UniProt (v12.8) has increased linearly from 1992 to 2008; structural genomics has contributed significantly to the maintenance of this growth rate. Success in increasing novel leverage (defined in Liu et al. in Nat Biotechnol 25:849–851, 2007) has resulted from systematic targeting of large families. PSI’s per-structure contribution to novel leverage was over 4-fold higher than that for non-PSI structural biology efforts during the past 8 years. If the success of the PSI continues, it may take just another ~15 years to cover most sequences in the current UniProt database.
FLORA: a novel method to predict protein function from structure in diverse superfamilies
Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures purely by using patterns of structural conservation of all residues.