3,320 research outputs found

    In silico generation of novel, drug-like chemical matter using the LSTM neural network

    The exploration of novel chemical spaces is one of the most important tasks of cheminformatics when supporting the drug discovery process. Properly designed and trained deep neural networks can provide a viable alternative to brute-force de novo approaches or various other machine-learning techniques for generating novel drug-like molecules. In this article we present a method to generate molecules using a long short-term memory (LSTM) neural network and provide an analysis of the results, including a virtual screening test. Using the network, one million drug-like molecules were generated in 2 hours. The molecules are novel, diverse (they contain numerous novel chemotypes), have good physicochemical properties and have good synthetic accessibility, even though these qualities were not specific constraints. Although novel, their structural features and functional groups remain closely within the drug-like space defined by the bioactive molecules from ChEMBL. Virtual screening using the profile QSAR approach confirms that the potential of these novel molecules to show bioactivity is comparable to the ChEMBL set from which they were derived. The molecule generator used in this study, written in Python, is available on request. Comment: some reference numbers were fixed in this version.

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.


    The use of a quantitative structure-activity relationship (QSAR) model to predict GABA-A receptor binding of newly emerging benzodiazepines

    The illicit market for new psychoactive substances is forever expanding. Benzodiazepines and their derivatives are one of a number of groups of these substances, and thus far their number has grown year upon year. For both forensic and clinical purposes it is important to be able to rapidly understand these emerging substances. However, as a consequence of the illicit nature of these compounds, there is a deficiency in the pharmacological data available for these ‘new’ benzodiazepines. In order to further understand the pharmacology of ‘new’ benzodiazepines we utilised a quantitative structure-activity relationship (QSAR) approach. A set of 69 benzodiazepine-based compounds was analysed to develop a QSAR training set with respect to published binding values to GABA-A receptors. The QSAR model returned an R² value of 0.90. The most influential factors were found to be the positioning of two H-bond acceptors, two aromatic rings and a hydrophobic group. A test set of nine random compounds was then selected for internal validation to determine the predictive ability of the model and gave an R² value of 0.86 when comparing the predicted binding values with their experimental data. The QSAR model was then used to predict the binding for 22 benzodiazepines that are classed as new psychoactive substances. This model will allow the binding activity of emerging benzodiazepines to be predicted rapidly and economically, compared with lengthy and expensive in vitro/in vivo analysis. This will enable forensic chemists and toxicologists to better understand recently developed compounds and to predict substances likely to emerge in the future.

    Visual and computational analysis of structure-activity relationships in high-throughput screening data

    Novel analytic methods are required to assimilate the large volumes of structural and bioassay data generated by combinatorial chemistry and high-throughput screening programmes in the pharmaceutical and agrochemical industries. This paper reviews recent work in visualisation and data mining that can be used to develop structure-activity relationships from such chemical/biological datasets.

    Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis

    This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals, and involved 910 different citing organizations from 54 different countries, thus demonstrating the widespread impact of the Group's work.

    Ontology of core data mining entities

    In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications/use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend.

    Development of models for predicting Torsade de Pointes cardiac arrhythmias using perceptron neural networks

    Blockage of some ion channels, in particular the hERG cardiac potassium channel, delays cardiac repolarization and can induce arrhythmia. In some cases it leads to a potentially life-threatening arrhythmia known as Torsade de Pointes (TdP). Therefore, recognizing drugs with TdP risk is essential. Candidate drugs that are determined not to cause cardiac ion channel blockage are more likely to pass successfully through clinical phases II and III trials (and preclinical work) and not be withdrawn even later from the marketplace due to cardiotoxic effects. The objective of the present study is to develop an SAR model that can be used as an early screen for torsadogenic (causing TdP arrhythmias) potential in drug candidates. The method uses descriptors composed of atomic NMR chemical shifts and corresponding interatomic distances, which are combined into a 3D abstract space matrix. The method is called 3D-SDAR (3-dimensional spectral data-activity relationship) and can be interrogated to identify molecular features responsible for the activity, which can in turn yield simplified hERG toxicophores. A dataset of 55 hERG potassium channel inhibitors collected from Kramer et al., consisting of 32 drugs with TdP risk and 23 with no TdP risk, was used for training the 3D-SDAR model. An ANN model with multilayer perceptron was used to define collinearities among the independent 3D-SDAR features. A composite model from 200 random iterations with 25% of the molecules in each case yielded the following figures of merit: training, 99.2%; internal test sets, 66.7%; external (blind validation) test set, 68.4%. In the external test set, 70.3% of positive TdP drugs were correctly predicted. Moreover, toxicophores were generated from TdP drugs. 3D-SDAR was successfully used to build a predictive model for torsadogenic and non-torsadogenic drugs. Comment: Accepted for publication in BMC Bioinformatics (Springer) July 201