98 research outputs found

    CACHE (Critical Assessment of Computational Hit-finding Experiments): A public–private partnership benchmarking initiative to enable the development of computational methods for hit-finding

    One aspirational goal of computational chemistry is to predict potent and drug-like binders for any protein, such that only those that bind are synthesized. In this Roadmap, we describe the launch of Critical Assessment of Computational Hit-finding Experiments (CACHE), a public benchmarking project to compare and improve small-molecule hit-finding algorithms through cycles of prediction and experimental testing. Participants will predict small-molecule binders for new and biologically relevant protein targets representing different prediction scenarios. Predicted compounds will be tested rigorously in an experimental hub, and all predicted binders as well as all experimental screening data, including the chemical structures of experimentally tested compounds, will be made publicly available and not subject to any intellectual property restrictions. The ability of a range of computational approaches to find novel binders will be evaluated, compared and openly published. CACHE will launch three new benchmarking exercises every year. The outcomes will be better prediction methods, new small-molecule binders for target proteins of importance for fundamental biology or drug discovery and a major technological step towards achieving the goal of Target 2035, a global initiative to identify pharmacological probes for all human proteins.

    The Protein Model Portal

    Structural Genomics has been successful in determining the structures of many unique proteins in a high-throughput manner. Still, the number of known protein sequences is much larger than the number of experimentally solved protein structures. Homology (or comparative) modeling methods make use of experimental protein structures to build models for evolutionarily related proteins. Thereby, experimental structure determination efforts and homology modeling complement each other in the exploration of the protein structure space. One of the challenges in using model information effectively has been to access all models available for a specific protein in heterogeneous formats at different sites using various incompatible accession code systems. Often, structure models for hundreds of proteins can be derived from a given experimentally determined structure, using a variety of established methods. This has been done by all of the PSI centers, and by various independent modeling groups. The goal of the Protein Model Portal (PMP) is to provide a single portal which gives access to the various models that can be leveraged from PSI targets and other experimental protein structures. A single interface allows all existing pre-computed models across these various sites to be queried simultaneously, and provides links to interactive services for template selection, target-template alignment, model building, and quality assessment. The current release of the portal consists of 7.6 million model structures provided by different partner resources (CSMP, JCSG, MCSG, NESG, NYSGXRC, JCMM, ModBase, SWISS-MODEL Repository). The PMP is available at http://www.proteinmodelportal.org and from the PSI Structural Genomics Knowledgebase.

    PRIMO: an interactive homology modeling pipeline

    The development of automated servers to predict the three-dimensional structure of proteins has seen much progress over the years. These servers make calculations simpler, but largely exclude users from the process. In this study, we present the PRotein Interactive MOdeling (PRIMO) pipeline for homology modeling of protein monomers. The pipeline eases the multi-step modeling process and reduces the workload required of the user, while still allowing the user to engage at every step. Default parameters are given for each step, which can either be modified or supplemented with additional external input. PRIMO has been designed for users of varying levels of experience with homology modeling. The pipeline incorporates a user-friendly interface that makes it easy to alter parameters used during modeling.

    Optimizing structural modeling for a specific protein scaffold: knottins or inhibitor cystine knots

    Background: Knottins are small, diverse and stable proteins with important drug design potential. They can be classified into 30 families which cover a wide range of sequences (1621 sequenced), three-dimensional structures (155 solved) and functions (>10). Inter-knottin similarity lies mainly between 15% and 40% sequence identity and 1.5 to 4.5 Å backbone deviations, although they all share a tightly knotted disulfide core. This important variability is likely to arise from the highly diverse loops which connect the successive knotted cysteines. The prediction of structural models for all knottin sequences would open new directions for the analysis of interaction sites and provide a better understanding of the structural and functional organization of proteins sharing this scaffold.
    Results: We have designed an automated modeling procedure for predicting the three-dimensional structure of knottins. The different steps of the homology modeling pipeline were carefully optimized relative to a test set of knottins with known structures: template selection and alignment, extraction of structural constraints and model building, model evaluation and refinement. After optimization, the accuracy of predicted models was shown to lie between 1.50 and 1.96 Å from native structures at 50% and 10% maximum sequence identity levels, respectively. These average model deviations represent an improvement varying between 0.74 and 1.17 Å over basic homology modeling derived from a single template. A database of 1621 structural models for all known knottin sequences was generated and is freely accessible from our web server at http://knottin.cbs.cnrs.fr. Models can also be interactively constructed from any knottin sequence using the structure prediction module Knoter1D3D available from our protein analysis toolkit PAT at http://pat.cbs.cnrs.fr.
    Conclusions: This work explores different directions for a systematic homology modeling of a diverse family of protein sequences. In particular, we have shown that the accuracy of the models constructed at a low level of sequence identity can be improved by 1) a careful optimization of the modeling procedure, 2) the combination of multiple structural templates and 3) the use of conserved structural features as modeling restraints.
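
The model accuracies quoted in this abstract are deviations in Å between a predicted model and the native structure. The abstract does not say exactly how they were computed, so the following is only a minimal sketch of one common choice, the root-mean-square deviation (RMSD) over matched atoms. It assumes the two coordinate sets are already superimposed and listed in the same atom order; the function name `rmsd` and the toy coordinates are illustrative and do not come from the paper.

```python
from math import sqrt


def rmsd(model, native):
    """Root-mean-square deviation between two matched coordinate lists.

    Assumption: both structures are already superimposed and contain the
    same atoms in the same order (no alignment is performed here).
    """
    if not model or len(model) != len(native):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    # Sum of squared distances between corresponding atoms.
    total = sum(
        (a - b) ** 2
        for p, q in zip(model, native)
        for a, b in zip(p, q)
    )
    return sqrt(total / len(model))


# Toy coordinates (in Å): the model is the native shifted by 1 Å along z.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(model, native))
```

In practice an optimal superposition step would normally precede this calculation; it is left out here to keep the sketch short.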

    Target 2035 - an update on private sector contributions

    Target 2035, an international federation of biomedical scientists from the public and private sectors, is leveraging ‘open’ principles to develop a pharmacological tool for every human protein. These tools are important reagents for scientists studying human health and disease and will facilitate the development of new medicines. It is therefore not surprising that pharmaceutical companies are joining Target 2035, contributing both knowledge and reagents to study novel proteins. Here, we present a brief progress update on Target 2035 and highlight some of industry's contributions.

    ModBase, a database of annotated comparative protein structure models, and associated resources

    ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence–structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains 10 355 444 reliable models for domains in 2 421 920 unique protein sequences. ModBase allows users to update comparative models on demand, and request modeling of additional sequences through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are available through the ModBase interface as well as the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the SALIGN server for multiple sequence and structure alignment (http://salilab.org/salign), the ModEval server for predicting the accuracy of protein structure models (http://salilab.org/modeval), the PCSS server for predicting which peptides bind to a given protein (http://salilab.org/pcss) and the FoXS server for calculating and fitting Small Angle X-ray Scattering profiles (http://salilab.org/foxs).

    Target 2035 - update on the quest for a probe for every protein

    Twenty years after the publication of the first draft of the human genome, our knowledge of the human proteome is still fragmented. The challenge of translating the wealth of new knowledge from genomics into new medicines is that proteins, and not genes, are the primary executors of biological function. Therefore, much of how biology works in health and disease must be understood through the lens of protein function. Accordingly, a subset of human proteins has been at the heart of research interests of scientists over the centuries, and we have accumulated varying degrees of knowledge about approximately 65% of the human proteome. Nevertheless, a large proportion of proteins in the human proteome (∼35%) remains uncharacterized, and less than 5% of the human proteome has been successfully targeted for drug discovery. This highlights the profound disconnect between our ability to obtain genetic information and the subsequent development of effective medicines. Target 2035 is an international federation of biomedical scientists from the public and private sectors, which aims to address this gap by developing and applying new technologies to create, by 2035, chemogenomic libraries, chemical probes, and/or biological probes for the entire human proteome.

    Quantitative metric profiles capture three-dimensional temporospatial architecture to discriminate cellular functional states

    Background: Computational analysis of tissue structure reveals sub-visual differences in tissue functional states by extracting quantitative signature features that establish a diagnostic profile. Incomplete and/or inaccurate profiles contribute to misdiagnosis.
    Methods: In order to create more complete tissue structure profiles, we adapted our cell-graph method for extracting quantitative features from histopathology images to now capture temporospatial traits of three-dimensional collagen hydrogel cell cultures. Cell-graphs were proposed to characterize the spatial organization between the cells in tissues by exploiting graph theory, wherein the nuclei of the cells constitute the nodes and the approximate adjacency of cells is represented with edges. We chose 11 different cell types representing non-tumorigenic, pre-cancerous, and malignant states from multiple tissue origins.
    Results: We built cell-graphs from the cellular hydrogel images and computed a large set of features describing the structural characteristics captured by the graphs over time. Using three-mode tensor analysis, we identified the five most significant features (metrics) that capture the compactness, clustering, and spatial uniformity of the 3D architectural changes for each cell type throughout the time course. Importantly, four of these metrics are also the discriminative features for our histopathology data from our previous studies.
    Conclusions: Together, these descriptive metrics provide rigorous quantitative representations of image information that other image analysis methods do not. Examining the changes in these five metrics allowed us to easily discriminate between all 11 cell types, whereas differences from visual examination of the images are not as apparent. These results demonstrate that application of the cell-graph technique to 3D image data yields discriminative metrics that have the potential to improve the accuracy of image-based tissue profiles, and thus improve the detection and diagnosis of disease.
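
The cell-graph idea described in this abstract (nuclei as nodes, approximate adjacency as edges) can be illustrated with a small sketch. The abstract does not state the adjacency rule or name the five metrics, so everything specific below is an assumption made for illustration: edges are drawn with a simple Euclidean distance cutoff, and average degree and average clustering coefficient stand in as example graph features. The function names and toy coordinates are hypothetical.

```python
from itertools import combinations
from math import dist


def build_cell_graph(nuclei, cutoff):
    """Nuclei are nodes; two nodes share an edge if they lie within `cutoff`.

    Assumption: a plain distance cutoff approximates cell adjacency.
    """
    adj = {i: set() for i in range(len(nuclei))}
    for i, j in combinations(range(len(nuclei)), 2):
        if dist(nuclei[i], nuclei[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean clustering coefficient over all nodes."""
    return sum(clustering_coefficient(adj, n) for n in adj) / len(adj)


# Toy 3D nuclei centroids: three cells close together and one far away.
nuclei = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (5, 5, 5)]
adj = build_cell_graph(nuclei, cutoff=1.5)
print(average_degree(adj), average_clustering(adj))
```

Computing such features for each time point of a culture would give one feature-by-time table per cell type, which is the kind of input a multi-mode analysis could then work on.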
