
    Machine Learning Toxicity Prediction: Latest Advances by Toxicity End Point

    Machine learning (ML) models to predict the toxicity of small molecules have garnered great attention and have become widely used in recent years. Computational toxicity prediction is particularly advantageous in the early stages of drug discovery, where it can filter out molecules with a high probability of failing in clinical trials. These efforts have been aided by the growing number of large toxicology databases. However, because the field is young, a deeper understanding of the scope and applicability of ML methods is still needed. Various kinds of toxic end points have been predicted in silico; acute oral toxicity, hepatotoxicity, cardiotoxicity, mutagenicity, and the 12 Tox21 end points are among the most commonly investigated. Machine learning methods exhibit different performances on different data sets due to dissimilar complexity, class distributions, or chemical space covered, which makes it hard to compare the performance of algorithms across toxic end points. The general pipeline to predict toxicity using ML has already been analyzed in various reviews. In this contribution, we focus on recent progress in the area and the outstanding challenges, providing a detailed description of the state-of-the-art models implemented for each toxic end point. The molecular representation, the algorithm, and the evaluation metric used in each research work are explained and analyzed. We also describe the end points that are usually predicted, their clinical relevance, the available databases, and the challenges they bring to the field.
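    A minimal sketch of the kind of pipeline the review surveys: Morgan fingerprints as the molecular representation, a random forest as the algorithm, and ROC-AUC as the evaluation metric. The `smiles` and `labels` lists are hypothetical placeholders standing in for a real end-point dataset such as one of the Tox21 assays.

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Placeholder molecules and toxic (1) / non-toxic (0) calls; a real
# study would load thousands of labelled compounds for one end point.
smiles = ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O", "CCN(CC)CC"]
labels = [0, 1, 0, 1]

def featurize(smi, n_bits=2048):
    """Encode a SMILES string as a radius-2 Morgan fingerprint bit vector."""
    mol = Chem.MolFromSmiles(smi)
    return list(AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=n_bits))

X = [featurize(s) for s in smiles]
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.5, random_state=0, stratify=labels)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("ROC-AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```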

    Inroads to Predict in Vivo Toxicology—An Introduction to the eTOX Project

    There is widespread awareness that the wealth of preclinical toxicity data that the pharmaceutical industry has generated in recent decades is not exploited as efficiently as it could be. Enhanced data availability for compound comparison ("read-across"), or for data mining to build predictive tools, should lead to a more efficient drug development process and contribute to the reduction of animal use (3Rs principle). Achieving these goals requires a consortium approach that brings together a sufficient number of relevant partners. The eTOX ("electronic toxicity") consortium represents such a project and is a public-private partnership within the framework of the European Innovative Medicines Initiative (IMI). The project aims at the development of in silico prediction systems for organ and in vivo toxicity. The backbone of the project will be a database of preclinical toxicity data for drug compounds and candidates extracted from previously unpublished legacy reports from thirteen pharmaceutical companies based or operating in Europe. The database will be enhanced by the incorporation of publicly available, high-quality toxicology data. Seven academic institutes and five small-to-medium-sized enterprises (SMEs) contribute their expertise in data gathering, database curation, data mining, chemoinformatics, and predictive systems development. The outcome of the project will be a predictive system contributing to early potential hazard identification and risk assessment during the drug development process. The concept and strategy of the eTOX project are described here, together with current achievements and future deliverables.
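    An illustrative read-across sketch under simple assumptions: the toxicity label of a query compound is inferred from its most similar neighbour in a reference set, measured by Tanimoto similarity of Morgan fingerprints. The reference data below is a hypothetical toy set; eTOX itself relies on a curated database of legacy preclinical reports.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

# Hypothetical reference compounds with known outcomes.
reference = {"CCO": "non-toxic", "CC(=O)O": "non-toxic",
             "c1ccc2cc3ccccc3cc2c1": "toxic"}

def fingerprint(smi):
    return AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smi), 2, nBits=2048)

def read_across(query_smiles, min_similarity=0.3):
    """Return the label of the most similar reference compound, if similar enough."""
    query_fp = fingerprint(query_smiles)
    best = max(reference, key=lambda s: DataStructs.TanimotoSimilarity(query_fp, fingerprint(s)))
    sim = DataStructs.TanimotoSimilarity(query_fp, fingerprint(best))
    if sim < min_similarity:
        return None, "out of domain", sim
    return best, reference[best], sim

print(read_across("CCCO"))  # propanol: expected to match ethanol
```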

    Associating adverse drug effects with protein targets by integrating adverse event, in vitro bioactivity, and pharmacokinetic data

    Adverse drug effects are unintended and undesirable effects of medicines, causing attrition of molecules in drug development and harm to patients. To anticipate potential adverse effects early, drug candidates are commonly screened for pharmacological activity against a panel of protein targets. However, there is a lack of large-scale, quantitative information on the links between routinely screened proteins and the reporting of adverse events (AEs). This work describes a systematic analysis of associations between AEs observed in humans and the bioactivities of drugs, taking drug plasma concentrations into account. In the first chapter, post-marketing drug-AE associations are derived from the United States Food and Drug Administration Adverse Event Reporting System using disproportionality methods, with Propensity Score Matching applied to reduce confounding. The resulting drug-AE associations are compared to those from the Side Effect Resource, which are primarily derived from clinical trials. The analysis reveals that the datasets generally share less than 10% of reported AEs for the same drug and have different distributions of AEs across System Organ Classes (SOCs). Using the drugs from the two AE datasets, the second chapter integrates corresponding bioactivities, i.e. measured potencies and affinities from the ChEMBL database and ligand-based target predictions obtained with the tool PIDGIN, with drug plasma concentrations compiled from the literature, such as Cmax. Compared to a constant bioactivity cut-off of 1 µM, using the ratio of the unbound drug plasma concentration to the drug potency, i.e. Cmax/XC50, results in different binary activity calls for protein targets. Whether deriving activity calls in this way selects targets with greater relevance to human AEs is investigated in the third chapter, which computes relationships between targets and AEs using different measures of statistical association. Using the Cmax/XC50 ratio yields higher Likelihood Ratios and Positive Predictive Values (PPVs) for target-AE associations previously reported in the context of secondary pharmacology screening, at the cost of lower recall, possibly due to the smaller size of the dataset with available plasma concentrations. Furthermore, a large-scale quantitative assessment of bioactivities as indicators of AEs reveals a trade-off between the PPV and how many AE-associated drugs can potentially be detected from in vitro screening, although using combinations of targets can improve the detection rate in ~40% of cases at limited cost to the PPV. The work highlights the AEs most strongly related to bioactivities and their SOC distribution. Overall, this thesis contributes to knowledge of the relationships between in vitro bioactivities and empirical evidence of AEs in humans. The results can inform the selection of proteins for secondary pharmacology screening and the development of computational models to predict AEs.
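    A sketch of the two activity-call schemes the thesis compares (a constant potency cut-off versus the unbound-Cmax-to-XC50 ratio) and of the association statistics it reports. All numeric values below are hypothetical; the thesis draws XC50 data from ChEMBL and Cmax values from the literature.

```python
def constant_cutoff_call(xc50_um, cutoff_um=1.0):
    """Active if the measured potency is below a fixed 1 µM threshold."""
    return xc50_um <= cutoff_um

def exposure_ratio_call(cmax_unbound_um, xc50_um, ratio_cutoff=1.0):
    """Active if unbound Cmax / XC50 reaches the chosen ratio cut-off."""
    return (cmax_unbound_um / xc50_um) >= ratio_cutoff

# PPV and positive likelihood ratio from a 2x2 contingency table of
# target activity calls vs. reported adverse events.
def ppv(tp, fp):
    return tp / (tp + fp)

def positive_likelihood_ratio(tp, fn, fp, tn):
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity / (1.0 - specificity)

# Same potency, different call once exposure is considered.
print(constant_cutoff_call(xc50_um=0.5))                       # True
print(exposure_ratio_call(cmax_unbound_um=0.05, xc50_um=0.5))  # False
print(ppv(tp=30, fp=20), positive_likelihood_ratio(tp=30, fn=10, fp=20, tn=140))
```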

    Machine Learning Small Molecule Properties in Drug Discovery

    Machine learning (ML) is a promising approach for predicting small molecule properties in drug discovery. Here, we provide a comprehensive overview of various ML methods introduced for this purpose in recent years. We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity). We discuss existing popular datasets and molecular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks. We also highlight the challenges of predicting and optimizing multiple properties during the hit-to-lead and lead optimization stages of drug discovery, and briefly explore multi-objective optimization techniques that can be used to balance diverse properties while optimizing lead candidates. Finally, we assess techniques for explaining model predictions, which are especially important for critical decision-making in drug discovery. Overall, this review provides insights into the landscape of ML models for small molecule property prediction in drug discovery. So far, there are multiple diverse approaches, but their performances are often comparable, and neural networks, while more flexible, do not always outperform simpler models. This shows that the availability of high-quality training data remains crucial for training accurate models. There is also a need for standardized benchmarks, additional performance metrics, and best practices to enable richer comparisons between the many techniques and models.
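    A small sketch of the comparison the review draws: under the same cross-validation protocol, a flexible neural network does not automatically beat a simpler model. The synthetic regression set below is a stand-in for, e.g., a solubility dataset; model sizes and hyperparameters are arbitrary choices for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

# Synthetic data standing in for a descriptor matrix and a property.
X, y = make_regression(n_samples=300, n_features=100, noise=10.0, random_state=0)

rf = RandomForestRegressor(n_estimators=200, random_state=0)
mlp = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=2000, random_state=0)

# Identical 5-fold splits and metric make the two models comparable.
for name, model in [("random forest", rf), ("neural network", mlp)]:
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R^2 = {scores.mean():.3f}")
```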

    Applicability domains of neural networks for toxicity prediction

    In this paper, the term "applicability domain" refers to the range of chemical compounds for which a statistical quantitative structure-activity relationship (QSAR) model can accurately predict toxicity. This is a crucial concept in the development and practical use of these models. First, a multidisciplinary review is provided regarding the theory and practice of applicability domains in the context of toxicity problems using the classical QSAR model. Then, the advantages and improved performance of neural networks (NNs), which are among the most promising machine learning algorithms, are reviewed. Within the domain of medicinal chemistry, nine NN-based toxicity prediction methods were compared against 29 alternative artificial intelligence (AI) techniques. Similarly, seven NN-based toxicity prediction methodologies were compared to six other AI techniques within the realm of food safety, 11 NN-based methodologies were compared to 16 different AI approaches in the environmental sciences, and four NN-based toxicity prediction methodologies were compared to nine alternative AI techniques in the field of industrial hygiene. Across the reviewed approaches, models trained on the descriptors and behaviors of known toxic compounds were found to have difficulty extrapolating to untested chemical compounds. Different methods can be used to delimit the domain, such as distance-based approaches and consensus-based decision methods. Additionally, the importance of model validation has been highlighted within a regulatory context according to the Organisation for Economic Co-operation and Development (OECD) principles: to predict the toxicity of potential new drugs in medicinal chemistry, to determine the limits of detection for harmful substances in food, to predict the toxicity limits of chemicals in the environment, and to predict exposure limits to harmful substances in the workplace. Despite its importance, thorough application of the applicability domain in toxicity models is still restricted in medicinal chemistry and is virtually overlooked in other scientific domains. Consequently, only a small proportion of toxicity studies in medicinal chemistry consider the applicability domain in their mathematical models, thereby limiting their predictive power to untested drugs. The applicability of these models is crucial, yet it has not been sufficiently assessed in toxicity prediction or in related areas such as food science, environmental science, and industrial hygiene. Thus, this review sheds light on the prevalent use of neural networks in toxicity prediction and serves as a valuable resource for researchers and practitioners across these multifaceted domains, which could be extended to other fields in future research.
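    A minimal distance-based applicability-domain check of the kind the review discusses: a test compound is considered inside the domain if its mean distance to its k nearest training neighbours does not exceed a threshold calibrated on the training set itself. The descriptor matrices below are hypothetical placeholders, and the 95th-percentile threshold is one common but arbitrary choice.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 10))           # descriptors of training compounds
X_test = np.vstack([rng.normal(size=(5, 10)),  # compounds like the training set
                    rng.normal(loc=8.0, size=(5, 10))])  # clear outliers

k = 5
nn = NearestNeighbors(n_neighbors=k).fit(X_train)

# Calibrate: mean k-NN distance of each training compound to its peers
# (skipping itself at distance 0), threshold at the 95th percentile.
train_dist, _ = nn.kneighbors(X_train, n_neighbors=k + 1)
threshold = np.percentile(train_dist[:, 1:].mean(axis=1), 95)

# Flag test compounds whose neighbourhood is sparser than the threshold.
test_dist, _ = nn.kneighbors(X_test)
inside_domain = test_dist.mean(axis=1) <= threshold
print(inside_domain)  # the shifted compounds should come out False
```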

    Integration of Data Quality, Kinetics and Mechanistic Modelling into Toxicological Assessment of Cosmetic Ingredients

    In our modern society we are exposed to many natural and synthetic chemicals. The assessment of chemicals with regard to human safety is difficult but nevertheless of high importance. Beside clinical studies, which are restricted to potential pharmaceuticals, most toxicity data relevant for regulatory decision-making are based on in vivo studies. Due to the ban on animal testing of cosmetic ingredients in the European Union, alternative approaches, such as in vitro and in silico tests, have become more prevalent. In this thesis, existing non-testing approaches (i.e. approaches requiring no additional experiments), such as QSAR models, have been extended, and new non-testing approaches, such as structural alert systems supported by in vitro data, have been created. The main focus of the thesis is the determination of data quality, the improvement of modelling performance, and the support of Adverse Outcome Pathways (AOPs) with definitions of structural alerts and physico-chemical properties. There was also a clear focus on the transparency of models: approaches using algorithmic feature selection, machine learning, etc. have been avoided, and structural alert systems have been written in an understandable and transparent manner. Besides the methodological aspects of this work, cosmetically relevant example models have been chosen, e.g. for skin penetration and hepatic steatosis. Interpretations of the models, as well as possibilities for adjustment and extension, are discussed thoroughly. As models usually do not depict reality flawlessly, consensus approaches combining various non-testing approaches and in vitro tests should be used to support decision-making in the regulatory context. For example, within read-across it is feasible to use supporting information from QSAR models, docking, in vitro tests, etc. By applying a variety of models, results should lead to conclusions that are more usable and acceptable within toxicology. Within this thesis (and associated publications), novel methodologies for assessing and employing statistical data quality and for screening potential liver toxicants have been described. Furthermore, computational tools, such as models for skin permeability and dermal absorption, have been created.
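    A sketch of the transparent structural-alert screening the thesis favours over black-box models: each alert is a human-readable SMARTS pattern with a named rationale. The two alerts below are common illustrative examples, not the thesis's actual alert set.

```python
from rdkit import Chem

# Hypothetical alert set; a real system would carry literature
# references and mechanistic rationale for each pattern.
ALERTS = {
    "aromatic nitro group": "[c][N+](=O)[O-]",
    "epoxide": "C1OC1",
}

def screen(smiles):
    """Return the list of structural alerts matched by a molecule."""
    mol = Chem.MolFromSmiles(smiles)
    return [name for name, smarts in ALERTS.items()
            if mol.HasSubstructMatch(Chem.MolFromSmarts(smarts))]

print(screen("O=[N+]([O-])c1ccccc1"))  # nitrobenzene -> ['aromatic nitro group']
print(screen("CCO"))                   # ethanol -> []
```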