268 research outputs found

    Machine learning for the prediction of drug-induced toxicity

    Knowledge of the toxicological properties of compounds (e.g. drugs, chemicals, and contaminants) is crucial for drug development and for the definition of toxicological thresholds and exposure limits. However, toxicological testing, either in vitro or in vivo, is time-consuming, labour-intensive and expensive. An alternative to the classic experiments is the use of computational (in silico) approaches, such as machine learning. Machine learning assumes that substances with comparable structures or molecular features also exhibit comparable pharmacological or toxicological action. By comparing substances with known pharmacological or toxicological action to substances with unknown properties, models generated using machine learning methods are able to predict the action of the latter substances. The aim of this work was the development of predictive machine learning models for estimating the risk of hepatotoxicity and genotoxicity. These models were then applied to two different substance groups and the outcome was compared to available literature data. The acute hepatotoxic potential of over 600 different pyrrolizidine alkaloids (PAs) was evaluated using Random Forest and artificial Neural Networks. The qualitative hepatotoxicity predictions of both models were highly correlated. Furthermore, specific structural motifs showed different hepatotoxic potential. Overall, the results fitted well with already published in vitro and in vivo data on the acute hepatotoxic properties of PAs. The genotoxic/mutagenic potential of PAs was addressed using six different machine learning methods (LAZAR (Lazy Structure-Activity Relationships), Support Vector Machines, Random Forest and two Deep Learning Networks). Even though the models achieved only low to moderate accuracy rates, the best model clearly showed structure-specific differences in the predicted genotoxic potential. Furthermore, the acute hepatotoxic potential of 165 protein kinase inhibitors (PKIs) was predicted using Random Forest and artificial Neural Networks. The models confirmed clinical observations that PKIs in general have a high probability of inducing hepatotoxicity. However, interestingly, there seemed to be a target-specific difference, with inhibitors of Janus kinases having the lowest hepatotoxic probability of 60-67%. The greatest challenge is the performance of the models, which has to be validated, e.g. by cross-validation, before a model can be used on the substances of interest. Although group statements could be easily obtained, due caution has to be taken when interpreting the results of predictive models for single compounds and, if possible, comparison to already published data is advisable as a form of external validation.
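The abstract above names cross-validation as the check a model must pass before it is applied to new substances. The following is a minimal sketch of k-fold cross-validation, not the thesis's actual procedure: the function names, the fold count, and the majority-label learner are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Split sample indices into k shuffled, near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(samples, labels, k, fit, predict, seed=0):
    """Return the accuracy obtained on each held-out fold."""
    folds = k_fold_indices(len(samples), k, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        # Train only on samples that are NOT in the held-out fold
        train = [i for i in range(len(samples)) if i not in held]
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, samples[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies

# Hypothetical stand-in for a real learner (e.g. a Random Forest):
# it ignores the features and always predicts the majority training label.
def fit_majority(train_samples, train_labels):
    return max(set(train_labels), key=train_labels.count)

def predict_majority(model, sample):
    return model
```

Any learner can be plugged in through the `fit` and `predict` arguments; the per-fold accuracies give an estimate of how the model might behave on compounds it has not seen.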

    Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications

    [Abstract] Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphics processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron–Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods. Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; GRC2014/049. Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; R2014/039. Instituto de Salud Carlos III; PI13/0028.

    Integration of Spectroscopic and Mass Spectrometric Tools for the Analysis of Novel Psychoactive Substances in Forensic and Toxicology Applications

    Analytical methods aiming for the detection of novel psychoactive substances are continuously revised due to their utility in the seized drug and toxicology realms. One method frequently employed for the preliminary identification of illicit materials is portable Raman spectroscopy. Even when a substance in possession of an offender is identified, conclusive evidence that it may have been consumed requires additional confirmatory work and further toxicological evaluation of a biological specimen. Many times, the substance consumed may not be detected in the analyzed specimen due to its extensive metabolism. It is therefore challenging to rule out the identity of the drug ingested if metabolic studies have not been performed on a particular substance. This research aims to evaluate portable Raman as a quick, safe, non-destructive method for drug analysis using the instrument’s built-in algorithms and in-house machine and deep learning algorithms. Furthermore, metabolic and toxicologic studies using zebrafish and human liver microsomes are used to elucidate selected opioids. In the first part of this research, a portable Raman instrument, TacticID, was validated according to the United Nations Office on Drugs and Crime guidelines using 14 drugs and 15 cutting agents commonly encountered in seized drugs. Analysis was performed through glass and plastic packaging. In-house binary mixtures (n = 64) at ratios of 1:4, 1:7, 1:10, and 1:20 were evaluated and the results compared to direct analysis in real-time mass spectrometry (DART-MS). Whereas Raman performed better at detecting diluents, which made up the majority of the mixtures, DART-MS resulted in higher identification rates for easily ionizable drugs, which were present in lower percentages. To compensate for the weaknesses of each technique, both methods were combined, resulting in 96% accuracy. However, analysis of 15 authentic adjudicated cases resulted in 83% accuracy using the combined methods, demonstrating the usefulness of these methods as preliminary tests over traditional subjective techniques such as color tests. In instances where a portable Raman instrument is used for drug screening, its accuracy as a single technique is crucial. In this study, the instrument correctly identified both drug and diluent in 19% of the binary mixtures. Therefore, machine learning methods were explored as alternatives to the instrument’s built-in hit quality index algorithm. The findings in this research demonstrated that neural networks and convolutional neural networks were superior to the other algorithms, increasing the correct identification of both compounds to 65 and 64%, respectively. This work demonstrated how machine learning can help improve the accuracy of analytical instrument outputs, thereby increasing confidence in the compounds reported. In the second part of this research, zebrafish, which share 70% gene similarity with humans, were used as a toxicity model to provide information about drug effects on a living system. Fentanyl was selected as a model drug and zebrafish (0 – 96 hours post fertilization) were dosed at 0.01 – 100 µM. Major dose-dependent phenotypic effects included pericardial, spine, and yolk extension malformations, all of which inhibited the normal growth and development of the larvae. Additionally, the metabolism of fentanyl and valerylfentanyl was elucidated using zebrafish. Therefore, this work provided insight into the zebrafish model as an alternative to human toxicity and metabolism studies. The knowledge gained through this research will be used to understand the mechanisms by which these toxic and metabolic effects are observed.

    Evolutionary Computation and QSAR Research

    [Abstract] The successful high throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. Virtual molecular filtering and screening rely greatly on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect and poor pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial intelligence. QSAR modeling relies on three main steps: codification of the molecular structure into molecular descriptors, selection of relevant variables in the context of the analyzed activity, and search for the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid the variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. Thus, this review explains the basics of genetic algorithms and genetic programming as evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build QSAR models, the current evolutionary feature selection methods and applications in QSAR, and the future trend towards joint or multi-task feature selection methods. Instituto de Salud Carlos III, PIO52048. Instituto de Salud Carlos III, RD07/0067/0005. Ministerio de Industria, Comercio y Turismo; TSI-020110-2009-53. Galicia. Consellería de Economía e Industria; 10SIN105004P.
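The variable selection step described in the abstract above can be illustrated with a small genetic algorithm that evolves a boolean mask over descriptors. This is only a sketch under stated assumptions, not a method taken from the review: `ga_select`, its parameters, and `toy_fitness` are hypothetical, and the toy fitness simply rewards two descriptors that are declared informative. In a real QSAR setting the fitness would more plausibly be the validated performance of a model built on the selected descriptors.

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Evolve a boolean descriptor mask.

    Returns (best_mask, history), where history holds the best fitness seen
    at the start of each generation plus the final best fitness.
    Assumes n_features >= 2 and pop_size >= 3.
    """
    rng = random.Random(seed)
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_population = [ranked[0][:]]  # elitism: the best mask survives unchanged
        while len(next_population) < pop_size:
            # Tournament selection: each parent is the fittest of 3 random masks
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
    best = max(population, key=fitness)
    return best, history + [fitness(best)]

# Hypothetical toy fitness: descriptors 0 and 3 are "informative",
# every other selected descriptor is penalised as noise.
INFORMATIVE = {0, 3}

def toy_fitness(mask):
    return sum(1.0 if i in INFORMATIVE else -0.5
               for i, selected in enumerate(mask) if selected)

best_mask, history = ga_select(8, toy_fitness)
```

Because the best mask is always carried over, the best fitness never decreases from one generation to the next; the penalty on uninformative descriptors is what pushes the search towards small subsets.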

    Molecular Similarity and Xenobiotic Metabolism

    MetaPrint2D, a new software tool implementing a data-mining approach for predicting sites of xenobiotic metabolism, has been developed. The algorithm is based on a statistical analysis of the occurrences of atom-centred circular fingerprints in both substrates and metabolites. This approach has undergone extensive evaluation and been shown to be of comparable accuracy to current best-in-class tools, but is able to make much faster predictions, for the first time enabling chemists to explore the effects of structural modifications on a compound’s metabolism in a highly responsive and interactive manner. MetaPrint2D is able to assign a confidence score to the predictions it generates, based on the availability of relevant data and the degree to which a compound is modelled by the algorithm. In the course of the evaluation of MetaPrint2D a novel metric for assessing the performance of site of metabolism predictions has been introduced. This overcomes the bias introduced by molecule size and the number of sites of metabolism inherent to the most commonly reported metrics used to evaluate site of metabolism predictions. This data mining approach to site of metabolism prediction has been augmented by a set of reaction type definitions to produce MetaPrint2D-React, enabling prediction of the types of transformations a compound is likely to undergo and the metabolites that are formed. This approach has been evaluated against both historical data and metabolic schemes reported in a number of recently published studies. Results suggest that the ability of this method to predict metabolic transformations is highly dependent on the relevance of the training set data to the query compounds. MetaPrint2D has been released as an open source software library, and both MetaPrint2D and MetaPrint2D-React are available for chemists to use through the Unilever Centre for Molecular Science Informatics website. ----Boehringer-Ingelhie

    Algorithms for pre-microrna classification and a GPU program for whole genome comparison

    MicroRNAs (miRNAs) are non-coding RNAs of approximately 22 nucleotides that are derived from precursor molecules. These precursor molecules, or pre-miRNAs, often fold into stem-loop hairpin structures. However, a large number of sequences with pre-miRNA-like hairpins can be found in genomes. It is a challenge to distinguish the real pre-miRNAs from other hairpin sequences with similar stem-loops (referred to as pseudo pre-miRNAs). The first part of this dissertation presents a new method, called MirID, for identifying and classifying microRNA precursors. MirID comprises three steps. Initially, a combinatorial feature mining algorithm is developed to identify suitable feature sets. Then, the feature sets are used to train support vector machines to obtain classification models, based on which a classifier ensemble is constructed. Finally, an AdaBoost algorithm is adopted to further enhance the accuracy of the classifier ensemble. Experimental results on a variety of species demonstrate the good performance of the proposed approach and its superiority over existing methods. In the second part of this dissertation, a GPU (Graphics Processing Unit) program is developed for whole genome comparison. The goal of the research is to identify the commonalities and differences of two genomes from closely related organisms via multiple sequence alignments, using a seed-and-extend technique to choose reliable subsets of exact or near-exact matches, which are called anchors. A rigorous method named Smith-Waterman search is applied for the anchor seeking, but it takes days or months to map millions of bases for mammalian genome sequences. With GPU programming, which is designed to run hundreds of short functions called threads in parallel, up to a 100x speedup is achieved over similar CPU executions.
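To make the cost of the Smith-Waterman search mentioned above concrete, here is a minimal CPU sketch of its local alignment score. It is not the dissertation's GPU program: the function name and the scoring values (match, mismatch, linear gap penalty) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

Every cell of the len(a) by len(b) table is visited once, so the running time grows with the product of the two sequence lengths. That quadratic cost is consistent with the long run times the abstract reports for mammalian genomes.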

    An Analysis of Global Gene Expression Resulting from Exposure to Energetic Materials

    A Dissertation Presented for the Doctor of Philosophy Degree, University of Tennessee, Knoxville. Vernon Lashawn McIntosh Jr., August 2010.
    Dedication: This dissertation is dedicated to my family. My mother and father, Debra and Vernon McIntosh, instilled in me the respect for academic excellence and the drive to maximize my potential. Early on, my younger brother Kyle started showing signs of a shared interest in biology, thus my desire to be a positive role model for him kept me motivated. Last but certainly not least, my loving wife and best friend Nichole has been there to offer love and support throughout my entire undergraduate and graduate degrees. It’s difficult to imagine making it this far without her (and that’s not just because she paid the bills).
    Abstract: Characteristic transcriptional biomarkers have been identified for microbial cultures exposed to 2,4,6-trinitrotoluene (TNT), 2,6-dinitrotoluene (DNT), or triacetone-triperoxide (TATP). This study describes the generation of expression profiles for exposure to each compound, the functional significance of each response, and the identification of the characteristic alterations in gene expression associated with exposure to each compound. Expression profiles were generated from a total of three different candidate organisms: Escherichia coli, Saccharomyces cerevisiae, and Pseudomonas putida. Common to all three organisms, TNT exposure resulted in increased expression of genes involved in toxin resistance and drug efflux systems. The S. cerevisiae and E. coli expression profiles were both characterized by increased expression of genes involved in iron-sulfur cluster assembly, sulfur-containing amino acids, sulfate transport and assimilation, and the metabolism of nitrogen compounds. Only E. coli and Saccharomyces were used to generate DNT-induced expression profiles; both profiles exhibited high degrees of similarity with each organism’s respective TNT profile. This was especially true of the E. coli profile, where 25 of the 30 alterations were also observed after exposure to TNT. A computational discriminant function analysis was performed to identify characteristic biomarkers for each exposure. For each compound a set of transcriptional biomarkers (10 or fewer) was developed. An additional set of biomarkers was developed encompassing both TNT and DNT exposure. These sets of genes serve as a transcriptional fingerprint for exposure to each respective compound. The sensitivity and specificity of each transcriptional fingerprint are sufficient to correctly identify exposure to energetic materials against a background of non-energetic compound exposures. This study makes several novel contributions to the greater body of scientific knowledge:
    • This is the first documented study of the interactions of TATP in any biological system.
    • This is the first comprehensive gene expression study of the TNT response by P. putida or E. coli.
    • This is the first application of computational class prediction in the development of biomarkers for exposure to energetic materials.