
268 research outputs found

Machine learning for the prediction of drug-induced toxicity

Author: Schöning Verena
Publication date: 01/01/2019

Knowledge of the toxicological properties of compounds (e.g. drugs, chemicals, and contaminants) is crucial for drug development and for the definition of toxicological thresholds and exposure limits. However, toxicological testing, either in vitro or in vivo, is time-consuming, labour-intensive and expensive. An alternative to the classic experiments is the use of computational (in silico) approaches, such as machine learning. Machine learning assumes that substances with comparable structures or molecular features also exhibit comparable pharmacological or toxicological action. By comparing substances with known pharmacological or toxicological action to substances with unknown properties, models generated using machine learning methods can predict the action of the latter. The aim of this work was the development of predictive machine learning models for estimating the risk of hepatotoxicity and genotoxicity. These models were then applied to two different substance groups and the outcome was compared to available literature data. The acute hepatotoxic potential of over 600 different pyrrolizidine alkaloids (PAs) was evaluated using Random Forest and artificial Neural Networks. The qualitative hepatotoxicity predictions of both models were highly correlated. Furthermore, specific structural motifs showed different hepatotoxic potential. Overall, the results agreed well with published in vitro and in vivo data on the acute hepatotoxic properties of PAs. The genotoxic/mutagenic potential of PAs was addressed using six different machine learning methods (LAZAR (Lazy Structure-Activity Relationships), Support Vector Machines, Random Forest and two Deep Learning Networks). Even though the models achieved only low to moderate accuracy rates, the best model clearly showed structure-specific differences in the predicted genotoxic potential.
Furthermore, the acute hepatotoxic potential of 165 protein kinase inhibitors (PKIs) was predicted using Random Forest and artificial Neural Networks. The models confirmed clinical observations that PKIs in general have a high probability of inducing hepatotoxicity. Interestingly, however, there seemed to be a target-specific difference, with inhibitors of Janus kinases having the lowest hepatotoxic probability of 60-67%. The greatest challenge is the performance of the models, which has to be validated, e.g. by cross-validation, before a model can be used on the substances of interest. Although group statements could be easily obtained, due caution has to be taken when interpreting the results of predictive models for single compounds and, if possible, comparison to already published data is advisable as a form of external validation.
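The cross-validation step mentioned in this abstract can be illustrated with a minimal k-fold sketch. This is not the author's code: the function names `k_fold_indices` and `cross_validate`, and the `fit`/`predict` callables, are hypothetical, and the folds are taken in order without the shuffling or stratification a real toxicity study would normally use.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validate(features, labels, fit, predict, k=5):
    """Return overall accuracy of a model evaluated on held-out folds.

    fit(train_features, train_labels) -> model   (hypothetical callable)
    predict(model, feature_vector)    -> label   (hypothetical callable)
    """
    correct = 0
    for train, test in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, features[i]) == labels[i] for i in test)
    return correct / len(labels)
```

Each compound is predicted only by a model that never saw it during training, so the resulting accuracy is an estimate of performance on new substances.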

edoc

Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications

Author: Cedrón Francisco
Pastur-Romay L.A.
Pazos A.
Porto-Pazos Ana B.
Publication venue: 'MDPI AG'
Publication date: 01/01/2016

[Abstract] Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by advances in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs and their usefulness in Pharmacology and Bioinformatics is presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need for neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron–Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods.
Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; GRC2014/049
Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; R2014/039
Instituto de Salud Carlos III; PI13/0028

Multidisciplinary Digital Publishing Institute

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

PubMed Central

Integration of Spectroscopic and Mass Spectrometric Tools for the Analysis of Novel Psychoactive Substances in Forensic and Toxicology Applications

Author: Cooman Travon
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2022

Analytical methods for the detection of novel psychoactive substances are continuously revised due to their utility in the seized drug and toxicology realms. One method frequently employed for the preliminary identification of illicit materials is portable Raman spectroscopy. Even when a substance in possession of an offender is identified, conclusive evidence that it may have been consumed requires additional confirmatory work and further toxicological evaluation of a biological specimen. Often, the substance consumed is not detected in the analyzed specimen because of its extensive metabolism. It is therefore challenging to rule out the identity of the drug ingested if metabolic studies have not been performed on a particular substance. This research aims to evaluate portable Raman as a quick, safe, non-destructive method for drug analysis using the instrument’s built-in algorithms and in-house machine and deep learning algorithms. Furthermore, metabolic and toxicologic studies using zebrafish and human liver microsomes are used to elucidate selected opioids. In the first part of this research, a portable Raman instrument, the TacticID, was validated according to the United Nations Office on Drugs and Crime guidelines using 14 drugs and 15 cutting agents commonly encountered in seized drugs. Analysis was performed through glass and plastic packaging. In-house binary mixtures (n = 64) at ratios of 1:4, 1:7, 1:10, and 1:20 were evaluated and the results compared to direct analysis in real-time mass spectrometry (DART-MS). Whereas Raman performed better at detecting diluents, which made up the majority of each mixture, DART-MS gave higher identification rates for easily ionizable drugs, which were present in lower percentages. To complement the weaknesses of each technique, both methods were combined, resulting in 96% accuracy.
However, analysis of 15 authentic adjudicated cases resulted in 83% accuracy using the combined methods, demonstrating the usefulness of these methods as preliminary tests over traditional subjective techniques such as color tests. In instances where a portable Raman instrument is used for drug screening, its accuracy as a single technique is crucial. In this study, the instrument correctly identified both drug and diluent in 19% of binary mixtures. Therefore, machine learning methods were explored as alternatives to the instrument’s built-in hit quality index algorithm. The findings demonstrated that neural networks and convolutional neural networks were superior to the other algorithms, increasing the correct identification of both compounds to 65 and 64%, respectively. This work demonstrated how machine learning can help improve the accuracy of analytical instrument outputs, thereby increasing confidence in the compounds reported. In the second part of this research, zebrafish, which share 70% gene similarity with humans, were used as a toxicity model to provide information about drug effects on a living system. Fentanyl was selected as a model drug and zebrafish (0 – 96 hours post fertilization) were dosed at 0.01 – 100 µM. Major dose-dependent phenotypic effects included pericardial malformations and spine and yolk extension malformations, all of which inhibited the normal growth and development of the larvae. Additionally, the metabolism of fentanyl and valerylfentanyl was elucidated using zebrafish. Therefore, this work provided insight into the zebrafish model as an alternative to human toxicity and metabolism studies. The knowledge gained through this research will be used to understand the mechanisms by which these toxic and metabolic effects are observed.
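For readers unfamiliar with a hit quality index (HQI), the sketch below uses one commonly cited definition, the squared cosine similarity between a sample spectrum and a library spectrum. The abstract does not describe the TacticID's actual algorithm, so the formula choice, the function names `hit_quality_index` and `best_match`, and the toy spectra in the usage note are all assumptions for illustration.

```python
def hit_quality_index(sample, reference):
    """Squared cosine similarity of two equal-length intensity vectors.

    Returns a value in [0, 1]; 1 means identical spectral shape.
    This is one common HQI definition, assumed here for illustration.
    """
    if len(sample) != len(reference):
        raise ValueError("spectra must be sampled on the same axis")
    dot = sum(s * r for s, r in zip(sample, reference))
    norm = sum(s * s for s in sample) * sum(r * r for r in reference)
    if norm == 0:
        return 0.0  # an all-zero spectrum matches nothing
    return dot * dot / norm


def best_match(sample, library):
    """Return (name, hqi) of the library entry with the highest HQI."""
    scored = ((name, hit_quality_index(sample, ref)) for name, ref in library.items())
    return max(scored, key=lambda item: item[1])
```

With a made-up library such as `{"reference_a": [0, 1, 0, 2], "reference_b": [3, 0, 1, 0]}`, calling `best_match([0, 2, 0, 4], library)` selects `reference_a`, because the index ignores overall intensity and compares only spectral shape. A single best match of this kind names one compound per spectrum, which may be part of why two-component mixtures are difficult for it.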

The Research Repository @ WVU (West Virginia University)

Evolutionary Computation and QSAR Research

Author: Aguiar-Pulido Vanessa
Cruz-Monteagudo Maykel
Dorado Julián
Gestal M.
Munteanu Cristian-Robert
Rabuñal Juan R.
Publication venue: 'Bentham Science Publishers Ltd.'
Publication date: 01/01/2013

[Abstract] The successful high-throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. Virtual molecular filtering and screening rely greatly on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect and poor pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial intelligence. QSAR modeling relies on three main steps: codification of the molecular structure into molecular descriptors, selection of the variables relevant to the analyzed activity, and search for the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid the variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. Thus, this review explains the basics of genetic algorithms and genetic programming as evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build QSAR models, the current evolutionary feature selection methods and applications in QSAR, and the future trend towards joint or multi-task feature selection methods.
Instituto de Salud Carlos III, PIO52048
Instituto de Salud Carlos III, RD07/0067/0005
Ministerio de Industria, Comercio y Turismo; TSI-020110-2009-53
Galicia. Consellería de Economía e Industria; 10SIN105004P

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas


Improvements in Molecular Mechanics Sampling and Energy Models

Author: Bylund Joseph
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2014

The process of bringing drugs to market continues to be a slow and expensive affair. Despite recent advances in technology, the cost, both in monetary terms and in terms of time between target identification and arrival of a new drug on the market, continues to increase. High throughput screening is a first step towards testing a large number of possible bioactive compounds very quickly. However, the space of possible small molecules is limitless, and high throughput screening is limited both by the size of available libraries and the cost of running such a large number of experiments. Therefore, advancements in computational drug screening are necessary in order to maintain the current rate of progress in modern medicine. Computational drug design, or computer assisted drug design, offers a possible way of addressing some of the shortfalls of conventional high throughput screening. Using computational methods, it is possible to estimate parameters such as the binding affinity of any small molecule, even those not currently present in any small molecule library, without having to first invest in the often slow and expensive process of finding a synthetic pathway. Computational methods can be used to screen similar molecules, or mutations in small molecule space, seeking to increase binding affinity to the protein target, and thereby efficacy, while simultaneously minimizing binding affinity to other proteins, decreasing cross reactivity, and reducing toxicity and harmful side effects. Computational biology methods of drug research can be broadly classified in a number of different ways. However, one of the most common classifications is according to the methods used to identify possible drug compounds and later optimize those leads. The first broad category is informatics or artificial intelligence based approaches.
In these approaches, artificial intelligence methods such as neural networks, support vector machines, and quantitative structure-activity relationships (QSAR) are used to identify chemical or structural properties that contribute heavily to binding affinity. The next category, ligand based approaches, is very useful when there are a large number of known binders for a specific family of proteins. In this approach, the ligands are clustered using a metric of chemical similarity, and new compounds which occupy a similar chemical space are likely to also bind strongly to the protein of interest. The final class of methods of computational drug design, and the method explored in this thesis, is the diverse class known as structural methods. These approaches in the most general sense make use of a sampling method to sample a number of protein or protein-small-molecule interaction conformations, and an energy model or scoring function to measure dimensions which would be very difficult and/or expensive to measure experimentally. In this thesis, a number of different sampling methods that are applicable to different questions in computational biology are presented. Additionally, an improved algorithm for evaluating implicit solvent effects is presented, and a number of improvements in performance, reliability and utility of the molecular mechanics program used are discussed.
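The "metric of chemical similarity" used in ligand based approaches is not named in this abstract. A widely used choice is the Tanimoto (Jaccard) coefficient on structural fingerprints, sketched below under that assumption. Fingerprints are modelled as sets of integer feature identifiers, and the function names are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as iterables of feature ids.

    Returns |A & B| / |A | B|, a value in [0, 1]; 1 means identical feature sets.
    """
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints carry no evidence of similarity
    return len(a & b) / len(union)


def rank_by_similarity(query_fp, library):
    """Return (name, similarity) pairs for a dict of fingerprints, most similar first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Ranking a library of known binders against a new compound in this way is the simplest form of the "similar chemical space" argument: compounds near the top share the most structural features with the query.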

Columbia University Academic Commons

Database development and machine learning classification of medicinal chemicals and biomolecules

Author: PANKAJ KUMAR
Publication date: 11/08/2009

Ph.D., Doctor of Philosophy

ScholarBank@NUS

Molecular Similarity and Xenobiotic Metabolism

Author: Adams Samuel E.
Publication venue: University of Cambridge
Publication date: 01/01/2010

MetaPrint2D, a new software tool implementing a data-mining approach for predicting sites of xenobiotic metabolism, has been developed. The algorithm is based on a statistical analysis of the occurrences of atom-centred circular fingerprints in both substrates and metabolites. This approach has undergone extensive evaluation and been shown to be of comparable accuracy to current best-in-class tools, but is able to make much faster predictions, for the first time enabling chemists to explore the effects of structural modifications on a compound’s metabolism in a highly responsive and interactive manner. MetaPrint2D is able to assign a confidence score to the predictions it generates, based on the availability of relevant data and the degree to which a compound is modelled by the algorithm. In the course of the evaluation of MetaPrint2D, a novel metric for assessing the performance of site of metabolism predictions has been introduced. This overcomes the bias introduced by molecule size and the number of sites of metabolism inherent to the most commonly reported metrics used to evaluate site of metabolism predictions. This data mining approach to site of metabolism prediction has been augmented by a set of reaction type definitions to produce MetaPrint2D-React, enabling prediction of the types of transformations a compound is likely to undergo and the metabolites that are formed. This approach has been evaluated against both historical data and metabolic schemes reported in a number of recently published studies. Results suggest that the ability of this method to predict metabolic transformations is highly dependent on the relevance of the training set data to the query compounds. MetaPrint2D has been released as an open source software library, and both MetaPrint2D and MetaPrint2D-React are available for chemists to use through the Unilever Centre for Molecular Science Informatics website.
Boehringer-Ingelhie

CiteSeerX

Apollo (Cambridge)

Algorithms for pre-microrna classification and a GPU program for whole genome comparison

Author: Zhong Ling
Publication venue: Digital Commons @ NJIT
Publication date: 31/08/2016

MicroRNAs (miRNAs) are non-coding RNAs of approximately 22 nucleotides that are derived from precursor molecules. These precursor molecules, or pre-miRNAs, often fold into stem-loop hairpin structures. However, a large number of sequences with pre-miRNA-like hairpins can be found in genomes. It is a challenge to distinguish the real pre-miRNAs from other hairpin sequences with similar stem-loops (referred to as pseudo pre-miRNAs). The first part of this dissertation presents a new method, called MirID, for identifying and classifying microRNA precursors. MirID comprises three steps. Initially, a combinatorial feature mining algorithm is developed to identify suitable feature sets. Then, the feature sets are used to train support vector machines to obtain classification models, from which a classifier ensemble is constructed. Finally, an AdaBoost algorithm is adopted to further enhance the accuracy of the classifier ensemble. Experimental results on a variety of species demonstrate the good performance of the proposed approach and its superiority over existing methods. In the second part of this dissertation, a GPU (Graphics Processing Unit) program is developed for whole genome comparison. The goal of the research is to identify the commonalities and differences of two genomes from closely related organisms, via multiple sequence alignments, by using a seed-and-extend technique to choose reliable subsets of exact or near-exact matches, which are called anchors. A rigorous method named Smith-Waterman search is applied to find the anchors, but it takes days to months to map millions of bases for mammalian genome sequences. With GPU programming, which is designed to run hundreds of short functions, called threads, in parallel, a speedup of up to 100X is achieved over similar CPU executions.
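The Smith-Waterman search named in this abstract is the standard dynamic-programming method for local sequence alignment. The sketch below is a plain single-threaded version with a linear gap penalty; the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are illustrative assumptions, and nothing here reproduces the dissertation's GPU implementation.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; every cell is floored at 0 so alignments can restart anywhere.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

The matrix has one cell per pair of positions, so the cost grows with the product of the two sequence lengths, which is why the abstract reports very long CPU run times on mammalian genomes. Cells on the same anti-diagonal do not depend on each other, which is the usual basis for parallelising the fill step.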

Digital Commons @ New Jersey Institute of Technology (NJIT)

An Analysis of Global Gene Expression Resulting from Exposure to Energetic Materials

Author: McIntosh Vernon L, Jr.
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/08/2010

AN ANALYSIS OF GLOBAL GENE EXPRESSION RESULTING FROM EXPOSURE TO ENERGETIC MATERIALS
A Dissertation Presented for the Doctor of Philosophy Degree, University of Tennessee, Knoxville
VERNON LASHAWN MCINTOSH JR.
August 2010

Dedication

This dissertation is dedicated to my family. My mother and father, Debra and Vernon McIntosh, instilled in me the respect for academic excellence and the drive to maximize my potential. Early on, my younger brother Kyle started showing signs of a shared interest in biology; thus my desire to be a positive role model for him kept me motivated. Last but certainly not least, my loving wife and best friend Nichole has been there to offer love and support throughout my entire undergraduate and graduate degrees. It’s difficult to imagine making it this far without her (and that’s not just because she paid the bills).

Abstract

Characteristic transcriptional biomarkers have been identified for microbial cultures exposed to 2,4,6-trinitrotoluene (TNT), 2,6-dinitrotoluene (DNT), or triacetone-triperoxide (TATP). This study describes the generation of expression profiles for exposure to each compound, the functional significance of each response, and the identification of the characteristic alterations in gene expression associated with exposure to each compound. Expression profiles were generated from a total of three different candidate organisms: Escherichia coli, Saccharomyces cerevisiae, and Pseudomonas putida. Common to all three organisms, TNT exposure resulted in increased expression of genes involved in toxin resistance and drug efflux systems. The S. cerevisiae and E. coli expression profiles were both characterized by increased expression of genes involved in iron-sulfur cluster assembly, sulfur-containing amino acids, sulfate transport and assimilation, and the metabolism of nitrogen compounds.
Only E. coli and S. cerevisiae were used to generate DNT-induced expression profiles; both profiles exhibited high degrees of similarity with each organism’s respective TNT profile. This was especially true of the E. coli profile, where 25 of the 30 alterations were also observed after exposure to TNT. A computational discriminant functional analysis was performed to identify characteristic biomarkers for each exposure. For each compound a set of transcriptional biomarkers (10 or fewer) was developed. An additional set of biomarkers was developed encompassing both TNT and DNT exposure. These sets of genes serve as a transcriptional fingerprint for exposure to each respective compound. The sensitivity and specificity of each transcriptional fingerprint are sufficient to correctly identify exposure to energetic materials against a background of non-energetic compound exposures. This study makes several novel contributions to the greater body of scientific knowledge:
• This is the first documented study of the interactions of TATP in any biological system.
• This is the first comprehensive gene expression study of the TNT response by P. putida, E. coli or E. coli.
• This is the first application of computational class prediction in the development of biomarkers for exposure to energetic material

University of Tennessee, Knoxville: Trace