10 research outputs found

    Semantic Web for data harmonization in Chinese medicine

    Scientific studies investigating Chinese medicine together with Western medicine have been generating a large amount of data, which should preferably be shared under a global data standard. This article provides an overview of the Semantic Web and identifies some representative Semantic Web applications in Chinese medicine. The Semantic Web is proposed as a standard for representing Chinese medicine data and facilitating their integration with Western medicine data.

    Developing and validating predictive decision tree models from mining chemical structural fingerprints and high-throughput screening data in PubChem

    Background: Recent advances in high-throughput screening (HTS) techniques and readily available compound libraries, generated using combinatorial chemistry or derived from natural products, enable the testing of millions of compounds in a matter of days. Because of the amount of information produced by HTS assays, it is very challenging to mine HTS data for results of potential interest to drug development research. Computational approaches for the analysis of HTS results face great challenges due to the large quantity of information and the significant amount of erroneous data produced.
    Results: In this study, Decision Tree (DT) based models were developed to discriminate compound bioactivities by using their chemical structure fingerprints provided in the PubChem system (http://pubchem.ncbi.nlm.nih.gov). The DT models were examined for filtering biological activity data contained in four assays deposited in the PubChem Bioassay Database, including assays for 5HT1a agonists and antagonists and for HIV-1 RT-RNase H inhibitors. The 10-fold cross-validation (CV) sensitivity, specificity and Matthews Correlation Coefficient (MCC) for the models were 57.2-80.5%, 97.3-99.0% and 0.4-0.5, respectively. A further evaluation was performed for DT models built for two independent bioassays in which inhibitors of the same HIV RNase target were screened using different compound libraries; this experiment yielded enrichment factors of 4.4 and 9.7.
    Conclusion: Our results suggest that the designed DT models can be used as a virtual screening technique as well as a complement to traditional approaches for hit selection.
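The evaluation measures named in this abstract (sensitivity, specificity, MCC and enrichment factor) can be computed from simple counts. The following is a minimal sketch using the standard textbook definitions; the function names and the example counts are hypothetical and are not taken from the paper, which does not state the exact formulas it used.

```python
import math


def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts.

    tp/fn: active compounds predicted active/inactive.
    tn/fp: inactive compounds predicted inactive/active.
    """
    sensitivity = tp / (tp + fn)   # fraction of actives recovered
    specificity = tn / (tn + fp)   # fraction of inactives rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when a marginal total is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


def enrichment_factor(hits_selected, n_selected, hits_total, n_total):
    """Hit rate in the model-selected subset divided by the overall hit rate."""
    return (hits_selected / n_selected) / (hits_total / n_total)


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate an imbalanced HTS data set.
    sens, spec, mcc = screening_metrics(tp=60, fp=200, tn=9800, fn=40)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
    print(f"EF={enrichment_factor(10, 100, 20, 1000):.1f}")
```

With heavily imbalanced screening data, specificity can be very high while MCC stays moderate, which is consistent with the pattern of values the abstract reports.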

    Exploring the Ligand-Protein Networks in Traditional Chinese Medicine: Current Databases, Methods, and Applications

    Traditional Chinese medicine (TCM), which has thousands of years of clinical application in China and other Asian countries, is the pioneer of the "multicomponent-multitarget" approach and of network pharmacology. Although there is no doubt about its efficacy, it is difficult to elucidate a convincing underlying mechanism for TCM because of its complex composition and unclear pharmacology. Ligand-protein networks have been gaining significant value in drug discovery, while their application in TCM is still at an early stage. This paper first surveys TCM databases for virtual screening, which have been greatly expanded in size and data diversity in recent years. On that basis, different screening methods and strategies for identifying active ingredients and targets of TCM are outlined according to the amount of network information available, on the side of both ligand bioactivity and protein structure. Furthermore, successful in silico target identification attempts are discussed in detail, along with experiments exploring the ligand-protein networks of TCM. Finally, it is concluded that ligand-protein networks can be used not only to predict the protein targets of a small molecule, but also to explore the mode of action of TCM.

    Molecular Similarity and Xenobiotic Metabolism

    MetaPrint2D, a new software tool implementing a data-mining approach for predicting sites of xenobiotic metabolism, has been developed. The algorithm is based on a statistical analysis of the occurrences of atom-centred circular fingerprints in both substrates and metabolites. This approach has undergone extensive evaluation and been shown to be of comparable accuracy to current best-in-class tools, but is able to make much faster predictions, for the first time enabling chemists to explore the effects of structural modifications on a compound’s metabolism in a highly responsive and interactive manner. MetaPrint2D is able to assign a confidence score to the predictions it generates, based on the availability of relevant data and the degree to which a compound is modelled by the algorithm. In the course of the evaluation of MetaPrint2D, a novel metric for assessing the performance of site of metabolism predictions has been introduced. This overcomes the bias introduced by molecule size and the number of sites of metabolism inherent to the most commonly reported metrics used to evaluate site of metabolism predictions. This data mining approach to site of metabolism prediction has been augmented by a set of reaction type definitions to produce MetaPrint2D-React, enabling prediction of the types of transformations a compound is likely to undergo and the metabolites that are formed. This approach has been evaluated against both historical data and metabolic schemes reported in a number of recently published studies. Results suggest that the ability of this method to predict metabolic transformations is highly dependent on the relevance of the training set data to the query compounds. MetaPrint2D has been released as an open source software library, and both MetaPrint2D and MetaPrint2D-React are available for chemists to use through the Unilever Centre for Molecular Science Informatics website.----Boehringer-Ingelhie
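The idea of scoring atoms by how often their environment occurs at a site of metabolism can be illustrated with a toy occurrence-ratio calculation. This is only a sketch under stated assumptions: the string keys standing in for atom-centred circular fingerprints, the function names, and the simple ratio itself are all hypothetical, and this is not MetaPrint2D's actual scoring scheme or API.

```python
from collections import Counter


def occurrence_ratios(training):
    """Estimate, per atom environment, how often it was a reaction centre.

    training: iterable of (environment_key, is_reaction_centre) pairs, where
    environment_key is a hypothetical stand-in for an atom-centred circular
    fingerprint and is_reaction_centre marks a recorded site of metabolism.
    """
    seen = Counter()      # times the environment occurs in any substrate
    reacted = Counter()   # times it occurs at a site of metabolism
    for env, is_centre in training:
        seen[env] += 1
        if is_centre:
            reacted[env] += 1
    return {env: reacted[env] / seen[env] for env in seen}


def rank_sites(atom_envs, ratios):
    """Rank atom indices of a query molecule by occurrence ratio, highest first.

    Environments absent from the training data score 0.0 (an assumption).
    """
    scored = [(i, ratios.get(env, 0.0)) for i, env in enumerate(atom_envs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Invented training observations, for illustration only.
    training = [
        ("C-arom", False), ("C-arom", True),
        ("N-methyl", True), ("N-methyl", True),
        ("C=O", False),
    ]
    ratios = occurrence_ratios(training)
    print(rank_sites(["C=O", "N-methyl", "C-arom", "unknown"], ratios))
```

Because the scores come purely from counts in the training set, a query compound whose environments are rare or absent there gets uninformative scores, which matches the abstract's remark that performance depends on the relevance of the training data.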

    Plant extracts and natural products - Predictive structural and biodiversity-based analyses of uses, bioactivity, and 'research and development' potential

    The process of drug discovery and development over the last 30 years has been increasingly shaped by formulaic approaches, and natural products (integral to the drug discovery process and widely recognized as the most successful class of drug leads) have been significantly deprioritized by a struggling worldwide pharmaceutical industry. Alkaloids, historically the most important superclass of medically important secondary metabolites, have been used worldwide as a source of remedies to treat a wide variety of illnesses, yet there is a wide discrepancy between their historical and modern significance. To understand these trends from an insider’s perspective, 52 senior stakeholders in industry and academia were engaged to provide insights on a series of qualitative and quantitative aspects related to developments in the process of drug discovery from natural products. Stakeholders highlighted the dissonance between the perceived high potential of natural products as drug leads and overall industry and company level strategies. Many industry contacts were highly critical of prevalent company and industry-wide drug discovery strategies, indicating a high level of dissatisfaction within the industry. One promising strategy which respondents highlighted was virtual screening, which to a large extent has not been explored in natural products research strategies. Furthermore, the physicochemical features of 27,783 alkaloids from the Dictionary of Natural Products were cross-referenced to pharmacologically significant and other metrics from various databases, including the European Bioinformatics Institute’s ChEMBL and the Global Biodiversity Information Facility’s GBIF biodiversity data. The combined dataset revealed that a compound's likelihood of medicinal use can be linked to its host species’ abundance and was input into target-independent machine learning algorithms to predict likelihood of pharmaceutical use. The neural network model demonstrated an accuracy of >57% for all pharmaceutical alkaloids and 98% of all alkaloids. This study is the first to incorporate the biodiversity of host organisms in a machine learning scheme characterizing druglikeness and thus demonstrates the link between host species’ abundance and druglikeness. These findings yield new insights into cost-effective, real-world indicators of drug development potential across the diverse field of natural products.