In silico generation of novel, drug-like chemical matter using the LSTM neural network
The exploration of novel chemical spaces is one of the most important tasks
of cheminformatics when supporting the drug discovery process. Properly
designed and trained deep neural networks can provide a viable alternative to
brute-force de novo approaches or various other machine-learning techniques for
generating novel drug-like molecules. In this article we present a method to
generate molecules using a long short-term memory (LSTM) neural network and
provide an analysis of the results, including a virtual screening test. Using
the network, one million drug-like molecules were generated in 2 hours. The
molecules are novel, diverse (contain numerous novel chemotypes), have good
physicochemical properties and have good synthetic accessibility, even though
these qualities were not specific constraints. Although novel, their structural
features and functional groups remain closely within the drug-like space
defined by the bioactive molecules from ChEMBL. Virtual screening using the
profile QSAR approach confirms that the potential of these novel molecules to
show bioactivity is comparable to the ChEMBL set from which they were derived.
The molecule generator written in Python used in this study is available on
request. Comment: this version fixes some reference numbers.
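The abstract above describes the generated molecules as novel and diverse relative to the ChEMBL set. A minimal sketch of how such claims are often quantified follows. It assumes molecules are represented as canonical identifier strings (for example canonical SMILES), which the abstract does not state. The function name `novelty_and_uniqueness` and the example strings are hypothetical and are not taken from the authors' generator.

```python
def novelty_and_uniqueness(generated, reference):
    """Return (novelty, uniqueness) for a list of generated molecule strings.

    Assumption: each molecule is a canonical identifier string, so string
    equality stands in for structural identity.
    uniqueness = distinct generated molecules / all generated molecules
    novelty    = distinct generated molecules absent from the reference set
                 / distinct generated molecules
    """
    if not generated:
        raise ValueError("generated must contain at least one molecule")
    unique = set(generated)
    novel = unique - set(reference)
    uniqueness = len(unique) / len(generated)
    novelty = len(novel) / len(unique)
    return novelty, uniqueness


# Illustrative strings only; not data from the study.
example_generated = ["CCO", "CCO", "CCN", "c1ccccc1"]
example_reference = ["CCO"]
novelty, uniqueness = novelty_and_uniqueness(example_generated, example_reference)
```

String comparison only works if both sets were canonicalised the same way; in practice a cheminformatics toolkit would be used for that step.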
The use of a quantitative structure-activity relationship (QSAR) model to predict GABA-A receptor binding of newly emerging benzodiazepines
The illicit market for new psychoactive substances is forever expanding. Benzodiazepines and their derivatives are one of a number of groups of these substances, and thus far their number has grown year upon year. For both forensic and clinical purposes it is important to be able to rapidly understand these emerging substances. However, as a consequence of the illicit nature of these compounds, there is a deficiency in the pharmacological data available for these ‘new’ benzodiazepines. In order to further understand the pharmacology of ‘new’ benzodiazepines we utilised a quantitative structure-activity relationship (QSAR) approach. A set of 69 benzodiazepine-based compounds was analysed to develop a QSAR training set with respect to published binding values to GABA-A receptors. The QSAR model returned an R² value of 0.90. The most influential factors were found to be the positioning of two H-bond acceptors, two aromatic rings and a hydrophobic group. A test set of nine random compounds was then selected for internal validation to determine the predictive ability of the model and gave an R² value of 0.86 when comparing the predicted binding values with their experimental data. The QSAR model was then used to predict the binding for 22 benzodiazepines that are classed as new psychoactive substances. This model will allow rapid and economical prediction of the binding activity of emerging benzodiazepines, compared with lengthy and expensive in vitro/in vivo analysis. This will enable forensic chemists and toxicologists both to better understand recently developed compounds and to predict substances likely to emerge in the future.
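The R-squared values quoted above compare predicted binding values with experimental ones. As a hedged illustration, the sketch below computes the coefficient of determination as 1 - SS_res/SS_tot. The abstract does not say which definition the authors used (a squared Pearson correlation is another common choice), so this is an assumption, and the numbers in the example are invented for illustration.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    observed:  experimental values (e.g. measured binding affinities)
    predicted: model outputs for the same compounds, in the same order
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after applying the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Invented values, not data from the study.
experimental = [6.1, 7.4, 8.0, 8.9]
model_output = [6.3, 7.1, 8.2, 8.7]
score = r_squared(experimental, model_output)
```

With this definition, a model that always predicts the mean of the observations scores 0 and a perfect model scores 1.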
Visual and computational analysis of structure-activity relationships in high-throughput screening data
Novel analytic methods are required to assimilate the large volumes of structural and bioassay data generated by combinatorial chemistry and high-throughput screening programmes in the pharmaceutical and agrochemical industries. This paper reviews recent work in visualisation and data mining that can be used to develop structure-activity relationships from such chemical/biological datasets.
Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis
This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, these involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals, and involved 910 different citing organizations from 54 different countries, thus demonstrating the widespread impact of the Group's work.
Ontology of core data mining entities
In this article, we present OntoDM-core, an ontology of core data mining
entities. OntoDM-core defines the most essential data mining entities in a three-layered
ontological structure comprising a specification, an implementation and an application
layer. It provides a representational framework for the description of mining
structured data, and in addition provides taxonomies of datasets, data mining tasks,
generalizations, data mining algorithms and constraints, based on the type of data.
OntoDM-core is designed to support a wide range of applications/use cases, such as
semantic annotation of data mining algorithms, datasets and results; annotation of
QSAR studies in the context of drug discovery investigations; and disambiguation of
terms in text mining. The ontology has been thoroughly assessed following the practices
in ontology engineering, is fully interoperable with many domain resources and
is easy to extend.
Development of models for predicting Torsade de Pointes cardiac arrhythmias using perceptron neural networks
Blockage of some ion channels and in particular, the hERG cardiac potassium
channel delays cardiac repolarization and can induce arrhythmia. In some cases
it leads to a potentially life-threatening arrhythmia known as Torsade de
Pointes (TdP). Therefore recognizing drugs with TdP risk is essential.
Candidate drugs that are determined not to cause cardiac ion channel blockage
are more likely to pass successfully through phase II and III clinical trials
(and preclinical work) and not be withdrawn even later from the marketplace due
to cardiotoxic effects. The objective of the present study is to develop an SAR
model that can be used as an early screen for torsadogenic (causing TdP
arrhythmias) potential in drug candidates. The method is performed using
descriptors comprised of atomic NMR chemical shifts and corresponding
interatomic distances which are combined into a 3D abstract space matrix. The
method is called 3D-SDAR (3-dimensional spectral data-activity relationship)
and can be interrogated to identify molecular features responsible for the
activity, which can in turn yield simplified hERG toxicophores. A dataset of 55
hERG potassium channel inhibitors collected from Kramer et al. consisting of 32
drugs with TdP risk and 23 with no TdP risk was used for training the 3D-SDAR
model. An ANN model with a multilayer perceptron was used to define collinearities
among the independent 3D-SDAR features. A composite model from 200 random
iterations with 25% of the molecules in each case yielded the following figures
of merit: training, 99.2%; internal test sets, 66.7%; external (blind
validation) test set, 68.4%. In the external test set, 70.3% of positive TdP
drugs were correctly predicted. Moreover, toxicophores were generated from TdP
drugs. A 3D-SDAR was successfully used to build a predictive model for
drug-induced torsadogenic and non-torsadogenic drugs. Comment: Accepted for publication in BMC Bioinformatics (Springer) July 201
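The abstract above reports a composite model built from 200 random iterations with 25% of the molecules in each case, and gives the share of positive TdP drugs correctly predicted. The sketch below illustrates one plausible reading of that procedure. It assumes the 25% is a randomly drawn internal test set held out in each iteration, which the abstract does not state explicitly. The function names, the seed and the label encoding (1 = TdP risk, 0 = no TdP risk) are hypothetical, and no model training is shown.

```python
import random


def random_holdout_splits(n_items, n_iterations=200, holdout_fraction=0.25, seed=0):
    """Return a list of (train_indices, test_indices) random hold-out splits.

    Assumption: each iteration holds out `holdout_fraction` of the items
    as an internal test set and trains on the rest.
    """
    rng = random.Random(seed)
    n_holdout = round(n_items * holdout_fraction)
    indices = list(range(n_items))
    splits = []
    for _ in range(n_iterations):
        holdout = set(rng.sample(indices, n_holdout))
        train = [i for i in indices if i not in holdout]
        splits.append((train, sorted(holdout)))
    return splits


def sensitivity(true_labels, predicted_labels):
    """Fraction of true positives (label 1) that were predicted positive."""
    positives = [(t, p) for t, p in zip(true_labels, predicted_labels) if t == 1]
    if not positives:
        raise ValueError("no positive examples in true_labels")
    return sum(1 for _, p in positives if p == 1) / len(positives)


# 55 compounds, as in the training dataset described above.
splits = random_holdout_splits(55)
```

Each split would be used to fit one model, and the per-split predictions combined into the composite; how the authors combined them is not described in the abstract.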