
    WENDI: A tool for finding non-obvious relationships between compounds and biological properties, genes, diseases and scholarly publications

    Background: In recent years, there has been a huge increase in the amount of publicly available and proprietary information pertinent to drug discovery. However, there is a distinct lack of data mining tools available to harness this information, and in particular for knowledge discovery across multiple information sources. At Indiana University we have an ongoing project with Eli Lilly to develop web-service based tools for integrative mining of chemical and biological information. In this paper, we report on the first of these tools, called WENDI (Web Engine for Non-obvious Drug Information), that attempts to find non-obvious relationships between a query compound and scholarly publications, biological properties, genes and diseases using multiple information sources. Results: We have created an aggregate web service that takes a query compound as input, calls multiple web services for computation and database search, and returns an XML file that aggregates this information. We have also developed a client application that provides an easy-to-use interface to this web service. Both the service and client are publicly available. Conclusions: Initial testing indicates this tool is useful in identifying potential biological applications of compounds that are not obvious, and in identifying corroborating and conflicting information from multiple sources. We encourage feedback on the tool to help us refine it further. We are now developing further tools based on this model.

    Investigating the correlations among the chemical structures, bioactivity profiles and molecular targets of small molecules

    Motivation: Most of the previous data mining studies based on the NCI-60 dataset, due to its intrinsic cell-based nature, can hardly provide insights into the molecular targets for screened compounds. On the other hand, the abundant information on compound–target associations in PubChem can offer extensive experimental evidence of molecular targets for tested compounds. Therefore, by taking advantage of the data from both public repositories, one may simultaneously investigate the correlations between the bioactivity profiles of small molecules from the NCI-60 dataset (cellular level) and their patterns of interactions with relevant protein targets from PubChem (molecular level)

    pdCSM-cancer: Using Graph-Based Signatures to Identify Small Molecules with Anticancer Properties.

    The development of new, effective, and safe drugs to treat cancer remains a challenging and time-consuming task due to limited hit rates, restraining subsequent development efforts. Despite the impressive progress of quantitative structure-activity relationship and machine learning-based models that have been developed to predict molecule pharmacodynamics and bioactivity, they have had mixed success at identifying compounds with anticancer properties against multiple cell lines. Here, we have developed a novel predictive tool, pdCSM-cancer, which uses a graph-based signature representation of the chemical structure of a small molecule in order to accurately predict molecules likely to be active against one or multiple cancer cell lines. pdCSM-cancer represents the most comprehensive anticancer bioactivity prediction platform developed to date, comprising trained and validated models on experimental data of the growth inhibition concentration (GI50%) effects, including over 18,000 compounds, on 9 tumor types and 74 distinct cancer cell lines. Across 10-fold cross-validation, it achieved Pearson's correlation coefficients of up to 0.74 and comparable performance of up to 0.67 across independent, non-redundant blind tests. Leveraging the insights from these cell line-specific models, we developed a generic predictive model to identify molecules active in at least 60 cell lines. Our final model achieved an area under the receiver operating characteristic curve (AUC) of up to 0.94 on 10-fold cross-validation and up to 0.94 on independent non-redundant blind tests, outperforming alternative approaches. We believe that our predictive tool will provide a valuable resource for optimizing and enriching screening libraries for the identification of effective and safe anticancer molecules.
To provide a simple and integrated platform to rapidly screen for potential biologically active molecules with favorable anticancer properties, we made pdCSM-cancer freely available online at http://biosig.unimelb.edu.au/pdcsm_cancer

    CancerResource: a comprehensive database of cancer-relevant proteins and compound interactions supported by experimental knowledge

    During the development of methods for cancer diagnosis and treatment, a vast amount of information is generated. Novel cancer target proteins have been identified and many compounds that activate or inhibit cancer-relevant target genes have been developed. This knowledge is based on an immense number of experimentally validated compound–target interactions in the literature, and excerpts from literature text mining are spread over numerous data sources. Our own analysis shows that the overlap between important existing repositories such as Comparative Toxicogenomics Database (CTD), Therapeutic Target Database (TTD), Pharmacogenomics Knowledge Base (PharmGKB) and DrugBank as well as between our own literature mining for cancer-annotated entries is surprisingly small. In order to provide an easy overview of interaction data, it is essential to integrate this information into a single, comprehensive data repository. Here, we present CancerResource, a database that integrates cancer-relevant relationships of compounds and targets from (i) our own literature mining and (ii) external resources complemented with (iii) essential experimental and supporting information on genes and cellular effects. In order to facilitate an overview of existing and supporting information, a series of novel information connections have been established. CancerResource addresses the spectrum of research on compound–target interactions in natural sciences as well as in individualized medicine; CancerResource is available at: http://bioinformatics.charite.de/cancerresource/

    HLA class I and II genotype of the NCI-60 cell lines

    Sixty cancer cell lines have been extensively characterized and used by the National Cancer Institute's Developmental Therapeutics Program (NCI-60) since the early 1990s as screening tools for anti-cancer drug development. An extensive database has been accumulated that could be used to select individual cell lines for specific experimental designs based on their global genetic and biological profile. However, information on the human leukocyte antigen (HLA) genotype of these cell lines is scant and mostly antiquated since it was derived from serological typing. We, therefore, re-typed the NCI-60 panel of cell lines by high-resolution sequence-based typing. This information may be used to: 1) identify and verify the identity of the same cell lines at various institutions; 2) check for possible contaminant cell lines in culture; 3) adopt individual cell lines for experiments in which knowledge of HLA molecule expression is relevant. Since genome-based typing does not guarantee actual surface protein expression, further characterization of relevant cell lines should be entertained to verify surface expression in experiments requiring correct antigen presentation

    Exploring a structural protein-drug interactome for new therapeutics in lung cancer

    The pharmacology of drugs is often defined by more than one protein target. This property can be exploited to use approved drugs to uncover new targets and signaling pathways in cancer. Towards enabling a rational approach to uncover new targets, we expand a structural protein-ligand interactome by scoring the interaction among 1000 FDA-approved drugs docked to 2500 pockets on protein structures of the human genome. This afforded a drug-target network whose properties compared favorably with previous networks constructed using experimental data. Among drugs with the highest degree and betweenness, two are cancer drugs and one is currently used for treatment of lung cancer. Comparison of predicted cancer and non-cancer targets reveals that the most cancer-specific compounds were also the most selective compounds. Analysis of compound flexibility, hydrophobicity, and size showed that the most selective compounds were low molecular weight fragment-like heterocycles. We use a previously developed screening approach using the cancer drug erlotinib as a template to screen other approved drugs that mimic its properties. Among the top 12 ranking candidates, four are cancer drugs, two of them kinase inhibitors (like erlotinib). Cellular studies using non-small cell lung cancer (NSCLC) cells revealed that several drugs inhibited lung cancer cell proliferation. We mined patient records at the Regenstrief Medical Record System to explore the possible association of exposure to three of these drugs with occurrence of lung cancer. Preliminary in vivo studies using the non-small cell lung cancer (NSCLC) xenograft model showed that losartan- and astemizole-treated mice had tumors that weighed 50 (p < 0.01) and 15 (p < 0.01) percent less than the treated controls. These results set the stage for further exploration of these drugs and for uncovering new drugs for lung cancer therapy

    Correlation between cell line chemosensitivity and protein expression pattern as new approach for the design of targeted anticancer small molecules

    BACKGROUND AND RATIONALE: Over the past few decades, several databases with a significant amount of biological data related to cancer cells and anticancer agents (e.g., National Cancer Institute database, NCI; Cancer Cell Line Encyclopedia, CCLE; Genomic and Drug Sensitivity in Cancer portal, GDSC) have been developed. The huge amount of heterogeneous biological data extractable from these databanks (above all, drug response and protein expression) provides a real foundation for predictive cancer chemogenomics, which investigates the relationships between genomic traits and the response of cancer cells to drug treatment in order to identify novel therapeutic molecules and targets. Very recently, many computational and statistical approaches have been proposed to integrate and correlate these heterogeneous biological data sequences (protein expression – drug response), with the aim of assigning the putative mechanism of action of anticancer small molecules with unknown biological target/s. The main limitation of all these computational methods is the need for experimental drug response data (after screening data). From this point of view, the possibility of predicting in silico the antiproliferative activity of new/untested small molecules against specific cell lines could enable correlations to be found between the predicted drug response and protein expression of the desired target from the very earliest stages of research. Such an approach could allow the selection of compounds whose molecular mechanisms are more likely to be connected with the target of interest before the in vitro assays, which would be a critical aid in the design of new targeted anticancer agents. RESULTS: In the present study, we aimed to develop a new computational protocol based on the correlation of drug activity and protein expression data to support the discovery of new targeted anticancer agents.
Compared with the approaches reported in the literature, the main novelty of the proposed protocol was the use of predicted antiproliferative activity data instead of experimental data. To this end, in the first phase of the research, the new in silico Antiproliferative Activity Predictor (AAP) tool, able to predict the anticancer activity (expressed as GI50) of new/untested small molecules against the NCI-60 panel, was developed. The ligand-based tool, which took advantage of the consolidated expertise of the research group in the manipulation of molecular descriptors, was adequately validated, and the reliability of the prediction was further confirmed by the analysis of an in-house database and subsequent evaluation of a set of molecules selected by the NCI for the one-dose/five-dose antiproliferative assays. In the second part of the study, a new computational method to correlate drug activity data and protein expression pattern data was proposed and evaluated by analysing several case studies of targeted drugs tested by NCI, confirming the reliability of the proposed method for the biological data analysis. In the last part of the project, the proposed correlation approach was applied to design new small molecules as selective inhibitors of Cdc25 phosphatase, a well-known protein involved in carcinogenic processes. By means of this approach, integrated with other classical ligand/structure-based techniques, it was possible to screen a large database of molecular structures and to select the ones with optimal relationship with the focused target. In vitro antiproliferative and enzymatic inhibition assays of the selected compounds led to the identification of new structurally heterogeneous inhibitors of Cdc25 proteins and confirmed the results of the in silico analysis.
CONCLUSIONS: Collectively, the obtained results showed that the correlation between protein expression pattern and chemosensitivity is an innovative, alternative, and effective method to identify new modulators for the selected targets. In contrast to traditional in silico methods, the proposed protocol allows for the selection of molecular structures with heterogeneous scaffolds, which are not strictly related to the binding sites and with chemical-physical features that may be more suitable for all the pathways involved in the overall mechanism. The biological assays further corroborate the robustness and the reliability of this new approach and encourage its application in the anticancer targeted drug discovery field

    Identifying Compound-Target Associations by Combining Bioactivity Profile Similarity Search and Public Databases Mining

    Molecular target identification is of central importance to drug discovery. Here, we developed a computational approach, named bioactivity profile similarity search (BASS), for associating targets to small molecules by using the known target annotations of related compounds from public databases. To evaluate BASS, a bioactivity profile database was constructed using 4296 compounds that were commonly tested in the US National Cancer Institute 60 human tumor cell line anticancer drug screen (NCI-60). Each compound was used as a query to search against the entire bioactivity profile database, and reference compounds with similar bioactivity profiles above a threshold of 0.75 were considered as neighbor compounds of the query. Potential targets were subsequently linked to the identified neighbor compounds by using the known targets o
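The neighbor search described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract states only the 0.75 threshold, so the use of Pearson correlation as the similarity measure, the function names (`pearson`, `bass_neighbors`, `candidate_targets`), and the toy compound IDs, profile values, and target labels are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def bass_neighbors(query_id, profiles, threshold=0.75):
    """Return IDs of compounds whose profile similarity to the query exceeds the threshold.

    Similarity is assumed to be Pearson correlation; the source names only the threshold.
    """
    query = profiles[query_id]
    return sorted(
        cid
        for cid, profile in profiles.items()
        if cid != query_id and pearson(query, profile) > threshold
    )


def candidate_targets(query_id, profiles, known_targets, threshold=0.75):
    """Collect the known targets of every neighbor compound of the query."""
    targets = set()
    for cid in bass_neighbors(query_id, profiles, threshold):
        targets.update(known_targets.get(cid, ()))
    return targets


# Hypothetical toy data: activity values across four cell lines per compound.
profiles = {
    "cpd_A": [5.0, 6.0, 7.0, 8.0],
    "cpd_B": [5.1, 6.2, 6.9, 8.3],  # tracks cpd_A closely
    "cpd_C": [8.0, 7.0, 6.0, 5.0],  # inverse of cpd_A
}
known_targets = {"cpd_B": {"TUBB"}, "cpd_C": {"EGFR"}}

print(bass_neighbors("cpd_A", profiles))
print(candidate_targets("cpd_A", profiles, known_targets))
```

In this toy setup only `cpd_B` passes the threshold for `cpd_A`, so its annotated target is the only candidate carried over to the query.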

    Selective Targeting of Tumorigenic Cancer Cell Lines by Microtubule Inhibitors

    For anticancer drug therapy, it is critical to kill those cells with the highest tumorigenic potential, even when they comprise a relatively small fraction of the overall tumor cell population. We have used the established NCI/DTP 60 cell line growth inhibition assay as a platform for exploring the relationship between chemical structure and growth inhibition in both tumorigenic and non-tumorigenic cancer cell lines. Using experimental measurements of “take rate” in ectopic implants as a proxy for tumorigenic potential, we identified eight chemical agents that appear to strongly and selectively inhibit the growth of the most tumorigenic cell lines. Biochemical assay data and structure-activity relationships indicate that these compounds act by inhibiting tubulin polymerization. Yet, their activity against tumorigenic cell lines is more selective than that of the other microtubule inhibitors in clinical use. Biochemical differences in the tubulin subunits that make up microtubules, or differences in the function of microtubules in mitotic spindle assembly or cell division, may be associated with the selectivity of these compounds