176 research outputs found

    Knowledge-based annotation of small molecule binding sites in proteins

    Background: The study of protein-small molecule interactions is vital for understanding protein function and for practical applications in drug discovery. To benefit from the rapidly increasing structural data, it is essential to improve the tools that enable large scale binding site prediction with greater emphasis on their biological validity.
    Results: We have developed a new method for the annotation of protein-small molecule binding sites, using inference by homology, which allows us to extend annotation onto protein sequences without experimental data available. To ensure biological relevance of binding sites, our method clusters similar binding sites found in homologous protein structures based on their sequence and structure conservation. Binding sites which appear evolutionarily conserved among non-redundant sets of homologous proteins are given higher priority. After binding sites are clustered, position specific score matrices (PSSMs) are constructed from the corresponding binding site alignments. Together with other measures, the PSSMs are subsequently used to rank binding sites to assess how well they match the query and to better gauge their biological relevance. The method also facilitates a succinct and informative representation of observed and inferred binding sites from homologs with known three-dimensional structures, thereby providing the means to analyze conservation and diversity of binding modes. Furthermore, the chemical properties of small molecules bound to the inferred binding sites can be used as a starting point in small molecule virtual screening. The method was validated by comparison to other binding site prediction methods and to a collection of manually curated binding site annotations. We show that our method achieves a sensitivity of 72% at predicting biologically relevant binding sites and can accurately discriminate those sites that bind biological small molecules from non-biological ones.
    Conclusions: A new algorithm has been developed to predict binding sites with high accuracy in terms of their biological validity. It also provides a common platform for function prediction, knowledge-based docking and for small molecule virtual screening. The method can be applied even for a query sequence without structure. The method is available at http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.cgi.
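The PSSM step mentioned in the abstract above can be illustrated with a minimal sketch. It builds a log-odds matrix from a toy, gap-free binding-site alignment and scores a query fragment against it. The alignment, the uniform background frequencies, the pseudocount, and the function names `build_pssm` and `score` are all assumptions made for illustration; the abstract does not describe how IBIS constructs or applies its matrices.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(alignment, pseudocount=1.0):
    """Build a log-odds PSSM (one dict per column) from a gap-free alignment."""
    if not alignment:
        raise ValueError("alignment must not be empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        # Smoothed residue frequency, converted to a log2 odds score.
        column = {
            aa: math.log2(((counts.get(aa, 0) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm

def score(pssm, fragment):
    """Sum the per-position log-odds scores of a query fragment."""
    if len(fragment) != len(pssm):
        raise ValueError("fragment length must match the PSSM length")
    return sum(column[aa] for column, aa in zip(pssm, fragment))

# Hypothetical binding-site residues taken from four homologs.
site_alignment = ["GKSGT", "GKTGS", "GKSGS", "GKSAT"]
pssm = build_pssm(site_alignment)
conserved = score(pssm, "GKSGT")   # resembles the aligned sites
unrelated = score(pssm, "WWWWW")   # shares no residues with them
```

A query fragment that matches the conserved positions scores higher than an unrelated one, which is the property a ranking step would rely on.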

    Inferred Biomolecular Interaction Server: a web server to analyze and predict protein interacting partners and binding sites

    IBIS is the NCBI Inferred Biomolecular Interaction Server. This server organizes, analyzes and predicts interaction partners and locations of binding sites in proteins. IBIS provides annotations for different types of binding partners (protein, chemical, nucleic acid and peptides), and facilitates the mapping of a comprehensive biomolecular interaction network for a given protein query. IBIS reports interactions observed in experimentally determined structural complexes of a given protein, and at the same time IBIS infers binding sites/interacting partners by inspecting protein complexes formed by homologous proteins. Similar binding sites are clustered together based on their sequence and structure conservation. To emphasize biologically relevant binding sites, several algorithms are used for verification in terms of evolutionary conservation, biological importance of binding partners, size and stability of interfaces, as well as evidence from the published literature. IBIS is updated regularly and is freely accessible via http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.html

    Homology Inference of Protein-Protein Interactions via Conserved Binding Sites

    The coverage and reliability of protein-protein interactions determined by high-throughput experiments still need to be improved, especially for higher organisms. The question therefore persists of how interactions can be verified and predicted by computational approaches using available data on protein structural complexes. Recently we developed an approach called IBIS (Inferred Biomolecular Interaction Server) to predict and annotate protein-protein binding sites and interaction partners, which is based on the assumption that the structural location and sequence patterns of protein-protein binding sites are conserved between close homologs. In this study, we first confirmed the high accuracy of our method and found that its accuracy depends critically on the use of all available data on structures of homologous complexes, compared to approaches where only a non-redundant set of complexes is employed. Second, we showed that there is a trade-off between specificity and sensitivity when the prediction employs only evolutionarily conserved binding site clusters or clusters supported by only one observation (singletons). Finally, we addressed the question of identifying the biologically relevant interactions using the homology inference approach and demonstrated that a large majority of crystal packing interactions can be correctly identified and filtered by our algorithm. At the same time, about half of the biological interfaces that are not present in the protein crystallographic asymmetric unit can be reconstructed by IBIS from homologous complexes without prior knowledge of the crystal parameters of the query protein.

    An overview of the PubChem BioAssay resource

    The PubChem BioAssay database (http://pubchem.ncbi.nlm.nih.gov) is a public repository for biological activities of small molecules and small interfering RNAs (siRNAs) hosted by the US National Institutes of Health (NIH). It archives experimental descriptions of assays and biological test results and makes the information freely accessible to the public. A PubChem BioAssay data entry includes an assay description, a summary and detailed test results. Each assay record is linked to the molecular target, whenever possible, and is cross-referenced to other National Center for Biotechnology Information (NCBI) database records. ‘Related BioAssays’ are identified by examining the assay target relationship and activity profile of commonly tested compounds. A key goal of PubChem BioAssay is to make the biological activity information easily accessible through the NCBI information retrieval system, Entrez, and various web-based PubChem services. An integrated suite of data analysis tools is available to optimize the utility of the chemical structure and biological activity information within PubChem, enabling researchers to aggregate, compare and analyze biological test results contributed by multiple organizations. In this work, we describe the PubChem BioAssay database, including the data model, bioassay deposition and the utilities that PubChem provides for searching, downloading and analyzing the biological activity information contained therein.

    CDD: a Conserved Domain Database for protein classification

    The Conserved Domain Database (CDD) is the protein classification component of NCBI's Entrez query and retrieval system. CDD is linked to other Entrez databases such as Proteins, Taxonomy and PubMed®, and can be accessed at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd. CD-Search, which is available at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi, is a fast, interactive tool to identify conserved domains in new protein sequences. CD-Search results for protein sequences in Entrez are pre-computed to provide links between proteins and domain models, and computational annotation visible upon request. Protein–protein queries submitted to NCBI's BLAST search service at http://www.ncbi.nlm.nih.gov/BLAST are scanned for the presence of conserved domains by default. While CDD started out as essentially a mirror of publicly available domain alignment collections, such as SMART, Pfam and COG, we have continued an effort to update, and in some cases replace these models with domain hierarchies curated at the NCBI. Here, we report on the progress of the curation effort and associated improvements in the functionality of the CDD information retrieval system.

    Investigating the correlations among the chemical structures, bioactivity profiles and molecular targets of small molecules

    Motivation: Most of the previous data mining studies based on the NCI-60 dataset, due to its intrinsic cell-based nature, can hardly provide insights into the molecular targets for screened compounds. On the other hand, the abundant information on compound–target associations in PubChem can offer extensive experimental evidence of molecular targets for tested compounds. Therefore, by taking advantage of the data from both public repositories, one may simultaneously investigate the correlations between the bioactivity profiles of small molecules from the NCI-60 dataset (cellular level) and their patterns of interactions with relevant protein targets from PubChem (molecular level).

    Testing gravitational-wave searches with numerical relativity waveforms: Results from the first Numerical INJection Analysis (NINJA) project

    The Numerical INJection Analysis (NINJA) project is a collaborative effort between members of the numerical relativity and gravitational-wave data analysis communities. The purpose of NINJA is to study the sensitivity of existing gravitational-wave search algorithms using numerically generated waveforms and to foster closer collaboration between the numerical relativity and data analysis communities. We describe the results of the first NINJA analysis which focused on gravitational waveforms from binary black hole coalescence. Ten numerical relativity groups contributed numerical data which were used to generate a set of gravitational-wave signals. These signals were injected into a simulated data set, designed to mimic the response of the Initial LIGO and Virgo gravitational-wave detectors. Nine groups analysed this data using search and parameter-estimation pipelines. Matched filter algorithms, un-modelled-burst searches and Bayesian parameter-estimation and model-selection algorithms were applied to the data. We report the efficiency of these search methods in detecting the numerical waveforms and measuring their parameters. We describe preliminary comparisons between the different search methods and suggest improvements for future NINJA analyses. Comment: 56 pages, 25 figures; various clarifications; accepted to CQ

    Real Potential

    There's a student in my philosophy class who has real potential. I might express this thought in any of the following ways: "She is potentially a philosopher"; "She is a potential philosopher"; "She has the potential to be a philosopher." The first way uses a cognate of "potential" as an adverb to modify "is." The second way uses "potential" as an adjective to modify "philosopher." However, the third way uses "potential" as a noun to refer to something that the student has. What kind of thing is this potential? One worry about even asking this question is that this nominalization of the adjective "potential" suggests a metaphysical picture that is an artifact of language. This is even more strongly suggested by the less ambiguous nominalization "potentiality." Once we have the term "potentiality," we have a new kind of entity to countenance, and questions about its nature arise. One might argue that, just because we use the word "potentiality," we should not think that it refers to a thing that someone can have. There is something disingenuous about such an argument. It proceeds as if the adverbial and adjectival uses of "potential" are unproblematic, and questions only arise with the nominalization. But it is not obvious what it means to potentially be something, or what it means to be a potential something. To say that someone is potentially a philosopher is to talk about a way of being that falls short of actuality. And a potential philosopher is not a kind of philosopher at all. So what is it? Each of the three above formulations is a modal claim. If there is anything philosophically puzzling about a potentiality claim, it is not going to go away by translating it into an equivalent modal claim. In this chapter I defend the existence of potentialities against anti-realist arguments, and make a proposal as to their nature. The proposal, in short, is that potentialities are properties, specifically dispositions, though more needs to be said about properties and dispositions. I will do this in Part I. In Part II, I will address two lines of argument against potentialities: that they are reducible, and that they are causally inert.

    Search for gravitational wave bursts in LIGO's third science run

    We report on a search for gravitational wave bursts in data from the three LIGO interferometric detectors during their third science run. The search targets subsecond bursts in the frequency range 100-1100 Hz for which no waveform model is assumed, and has a sensitivity in terms of the root-sum-square (rss) strain amplitude of hrss ~ 10^{-20} / sqrt(Hz). No gravitational wave signals were detected in the 8 days of analyzed data. Comment: 12 pages, 6 figures. Amaldi-6 conference proceedings to be published in Classical and Quantum Gravit
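For readers unfamiliar with the quantity quoted above, the root-sum-square strain amplitude is conventionally defined as hrss = sqrt(∫ |h(t)|^2 dt), which is why it carries units of 1/sqrt(Hz). The sketch below approximates that integral with a Riemann sum over a sampled waveform. The sine-Gaussian test burst, its parameters, the sample rate, and the function names are illustrative assumptions, not values or code from the search itself.

```python
import math

def h_rss(samples, sample_rate):
    """Approximate sqrt(integral of h(t)^2 dt) with a Riemann sum."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    dt = 1.0 / sample_rate
    return math.sqrt(sum(h * h for h in samples) * dt)

def sine_gaussian(h0, f0, tau, sample_rate, duration):
    """Sample h(t) = h0 * sin(2*pi*f0*t) * exp(-(t/tau)^2), centred on t = 0."""
    n = int(duration * sample_rate)
    times = [(i - n // 2) / sample_rate for i in range(n)]
    return [
        h0 * math.sin(2.0 * math.pi * f0 * t) * math.exp(-((t / tau) ** 2))
        for t in times
    ]

# Hypothetical subsecond burst inside the 100-1100 Hz band.
burst = sine_gaussian(h0=1e-19, f0=250.0, tau=0.005, sample_rate=16384, duration=0.2)
amplitude = h_rss(burst, 16384)
```

Because the integrand is squared strain integrated over time, a short burst with a large peak and a longer burst with a small peak can have the same hrss.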

    Identifying Compound-Target Associations by Combining Bioactivity Profile Similarity Search and Public Databases Mining

    Molecular target identification is of central importance to drug discovery. Here, we developed a computational approach, named bioactivity profile similarity search (BASS), for associating targets to small molecules by using the known target annotations of related compounds from public databases. To evaluate BASS, a bioactivity profile database was constructed using 4296 compounds that were commonly tested in the US National Cancer Institute 60 human tumor cell line anticancer drug screen (NCI-60). Each compound was used as a query to search against the entire bioactivity profile database, and reference compounds with similar bioactivity profiles above a threshold of 0.75 were considered as neighbor compounds of the query. Potential targets were subsequently linked to the identified neighbor compounds by using the known targets o
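The neighbor search described in the abstract above can be sketched as follows. The abstract (truncated here) gives the 0.75 threshold but not the similarity measure, so Pearson correlation is used purely as an assumption; the compound identifiers, activity values, target labels, and function names are likewise hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length activity profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / math.sqrt(var_x * var_y)

def find_neighbors(query_id, profiles, threshold=0.75):
    """Return (compound, similarity) pairs above the threshold, best first."""
    query = profiles[query_id]
    hits = []
    for compound_id, profile in profiles.items():
        if compound_id == query_id:
            continue
        similarity = pearson(query, profile)
        if similarity > threshold:
            hits.append((compound_id, similarity))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

def candidate_targets(neighbors, known_targets):
    """Collect annotated targets of neighbors, keeping the best similarity."""
    targets = {}
    for compound_id, similarity in neighbors:
        for target in known_targets.get(compound_id, ()):
            targets[target] = max(similarity, targets.get(target, 0.0))
    return targets

# Hypothetical activity values across five cell lines.
profiles = {
    "cmpd_A": [5.0, 6.0, 7.0, 5.5, 6.5],
    "cmpd_B": [5.1, 6.2, 6.9, 5.4, 6.6],
    "cmpd_C": [7.0, 6.0, 5.0, 6.5, 5.5],
}
known_targets = {"cmpd_B": ["target_X"], "cmpd_C": ["target_Y"]}

neighbors = find_neighbors("cmpd_A", profiles)
suggested = candidate_targets(neighbors, known_targets)
```

In this toy data only `cmpd_B` tracks the query profile closely enough to pass the threshold, so only its annotated target is proposed for `cmpd_A`.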