595 research outputs found

    In silico modeling of chemical and biological interactions at different scales

    In the past decades, government, society and industry at large have taken a keen interest in the impact, at different scales, that exposure to chemicals has on humans and the environment. Many countries' governments have imposed regulations under which it has become important to establish the potential effects of these chemical entities with respect to human health and environmental endpoints. Given the time taken by traditional tests, their cost, and the large number of chemicals to be evaluated, there has been a rapid growth in the number of computational models that link the structure of chemicals to their biological activity. To extend the knowledge base that currently exists in Structure-Activity Relationship (SAR) models for chemicals, a similar approach was used to develop a new model and generate sets of metabolic triggers that can be used together with Q(SAR) methods. This thesis presents SAR rules for prediction of mutagenicity in vitro, along with metabolic triggers for prediction of mutagenicity in vitro and in vivo. This gives a preliminary indication of whether a chemical exhibits the same mutagenic behaviour in vitro and in vivo. Along with chemical compounds, nanoparticles are also being used increasingly across different classes of consumer products. Since, in a physiological context, the protein corona constitutes the interface between the nanoparticle and cells, it plays a fundamental role in nanoparticle-cell association. In this thesis, the physicochemical properties of the protein corona were used to develop a model to predict cell association. Lastly, this thesis focuses on drug resistance in bacteria, which has become a matter of global concern. With bacteria growing resistant to antibiotics faster than new antibiotics are discovered, information is needed on the response that new bacterial proteins would have to currently available antibiotics, based on their similarity to known antibiotic-resistant proteins. An alignment-free method was developed to improve the resistance profile classification of bacterial proteins based on their physicochemical properties.

    Computational methods for prediction of in vitro effects of new chemical structures

    Background: With a constant increase in the number of new chemicals synthesized every year, it becomes important to employ the most reliable and fast in silico screening methods to predict their safety and activity profiles. In recent years, in silico prediction methods have received great attention in an attempt to reduce animal experiments for the evaluation of various toxicological endpoints, complementing the theme of replace, reduce and refine. Various computational approaches have been proposed for the prediction of compound toxicity, ranging from quantitative structure-activity relationship modeling to molecular similarity-based methods and machine learning. Within the “Toxicology in the 21st Century” screening initiative, a crowd-sourcing platform was established for the development and validation of computational models to predict the interference of chemical compounds with nuclear receptor and stress response pathways, based on a training set containing more than 10,000 compounds tested in high-throughput screening assays. Results: Here, we present the results of various molecular similarity-based and machine-learning-based methods over an independent evaluation set containing 647 compounds as provided by the Tox21 Data Challenge 2014. The Random Forest approach based on MACCS molecular fingerprints and a subset of 13 molecular descriptors, selected based on statistical and literature analysis, performed best in terms of the area under the receiver operating characteristic curve. Further, we compared the individual and combined performance of different methods. In retrospect, we also discuss the reasons behind the superior performance of an ensemble approach, combining a similarity search method with the Random Forest algorithm, compared to individual methods, while explaining the intrinsic limitations of the latter. Conclusions: Our results suggest that, although prediction methods were optimized individually for each modelled target, an ensemble of similarity and machine-learning approaches provides promising performance, indicating its broad applicability in toxicity prediction.
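
    The ensemble idea in this abstract, a similarity search combined with a machine-learning score, can be sketched roughly as follows. This is only an illustration under stated assumptions: fingerprints are represented as sets of on-bit indices, the similarity score is taken as the highest Tanimoto coefficient to any known active, and the two scores are combined by a weighted average. The abstract does not state the combination rule, and the function names, fingerprints and the `model_prob` value are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_score(query, actives):
    """Highest Tanimoto similarity between the query and any known active."""
    return max((tanimoto(query, fp) for fp in actives), default=0.0)


def ensemble_score(sim, model_prob, weight=0.5):
    """Weighted average of a similarity score and a model probability.

    The averaging rule is an assumption made for this sketch.
    """
    return weight * sim + (1.0 - weight) * model_prob


# Hypothetical toy fingerprints (not real MACCS keys).
known_actives = [{1, 4, 7, 9}, {2, 4, 8}]
query = {1, 4, 7, 10}

sim = similarity_score(query, known_actives)
# model_prob stands in for the output of a trained classifier such as a Random Forest.
combined = ensemble_score(sim, model_prob=0.4)
print(sim, combined)
```

    In practice the classifier probability would come from a model trained on real fingerprints and descriptors; the placeholder value here only shows how the two scores might be merged.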

    Machine Learning Toxicity Prediction: Latest Advances by Toxicity End Point

    Machine learning (ML) models to predict the toxicity of small molecules have garnered great attention and have become widely used in recent years. Computational toxicity prediction is particularly advantageous in the early stages of drug discovery in order to filter out molecules with a high probability of failing in clinical trials. This has been helped by the increase in the number of large toxicology databases available. However, being an area of recent application, a greater understanding of the scope and applicability of ML methods is still necessary. There are various kinds of toxic end points that have been predicted in silico. Acute oral toxicity, hepatotoxicity, cardiotoxicity, mutagenicity, and the 12 Tox21 data end points are among the most commonly investigated. Machine learning methods exhibit different performances on different data sets due to dissimilar complexity, class distributions, or chemical space covered, which makes it hard to compare the performance of algorithms over different toxic end points. The general pipeline to predict toxicity using ML has already been analyzed in various reviews. In this contribution, we focus on the recent progress in the area and the outstanding challenges, giving a detailed description of the state-of-the-art models implemented for each toxic end point. The type of molecular representation, the algorithm, and the evaluation metric used in each research work are explained and analyzed. A detailed description of end points that are usually predicted, their clinical relevance, the available databases, and the challenges they bring to the field are also highlighted. Affiliations: Cavasotto, Claudio Norberto: Universidad Austral. Facultad de Ciencias Biomédicas. Instituto de Investigaciones en Medicina Traslacional. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones en Medicina Traslacional; Argentina. Scardino, Valeria: Universidad Austral; Argentina.

    Predictive Toxicology: Modeling Chemical Induced Toxicological Response Combining Circular Fingerprints with Random Forest and Support Vector Machine

    This document is protected by copyright and was first published by Frontiers. All rights reserved. It is reproduced with permission. Modern drug discovery and toxicological research are under pressure, as the cost of developing and testing new chemicals for potential toxicological risk is rising. Extensive evaluation of chemical products for potential adverse effects is a challenging task, due to the large number of chemicals and the possible hazardous effects on human health. Safety regulatory agencies around the world are dealing with two major challenges: first, the growing number of chemicals introduced every year in household products and medicines that need to be tested, and second, the need to protect public welfare. Hence, alternative and more efficient toxicological risk assessment methods are in high demand. The Toxicology in the 21st Century (Tox21) consortium, a collaborative effort, was formed to develop and investigate alternative assessment methods. A collection of 10,000 compounds composed of environmental chemicals and approved drugs was screened for interference in biochemical pathways and released for crowdsourced data analysis. The physicochemical space covered by the Tox21 library was explored, measured by molecular weight (MW) and the octanol/water partition coefficient (cLogP). On average, chemical structures had a MW of 272.6 Daltons and a cLogP of 2.476. Next, relationships between assays were examined based on compounds' activity profiles across the assays, using the Pearson correlation coefficient r. A cluster was observed between the Androgen and Estrogen Receptors and their respective ligand-binding domains, indicating the presence of crosstalk among the receptors. The highest correlations observed were between NR.AR and NR.AR_LBD (r = 0.66) and between NR.ER and NR.ER_LBD (r = 0.5). Our approach to modeling the Tox21 data used circular molecular fingerprints combined with Random Forest and Support Vector Machine, modeling each assay independently. In all 12 sub-challenges our modeling approach achieved a ROC-AUC equal to or higher than 0.7, showing strong overall performance. The best performance was achieved in sub-challenges NR.AR_LBD, NR.ER_LBD and NR.PPAR_gamma, with ROC-AUC values of 0.756, 0.790, and 0.803, respectively. These results show that computational methods based on machine learning techniques are well suited to support, and play a critical role in, toxicological research.
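
    The assay-to-assay comparison described in this abstract relies on the Pearson correlation coefficient r between activity profiles. A minimal sketch of that calculation is given below, assuming each assay is represented as a list of 0/1 activity calls over the same compounds in the same order. The two profiles are invented toy data, not Tox21 measurements, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Toy activity calls (1 = active, 0 = inactive) for the same eight compounds.
assay_a = [1, 0, 0, 1, 1, 0, 0, 0]
assay_b = [1, 0, 0, 1, 0, 0, 1, 0]

r = pearson_r(assay_a, assay_b)
print(round(r, 3))
```

    Computing r for every pair of assays yields a correlation matrix, which could then be clustered to look for groups such as a receptor and its ligand-binding domain.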

    A nanoinformatics decision support tool for the virtual screening of gold nanoparticle cellular association using protein corona fingerprints

    The increasing use of nanoparticles (NPs) in a wide range of consumer and industrial applications has necessitated significant effort to address the challenge of characterizing and quantifying the underlying nanostructure-biological response relationships to ensure that these novel materials can be exploited responsibly and safely. Such efforts demand reliable experimental data not only in terms of the biological dose-response, but also regarding the physicochemical properties of the NPs and their interaction with the biological environment. The latter has not been extensively studied, as a large surface to bind biological macromolecules is a unique feature of NPs that is not relevant for chemicals or pharmaceuticals, and thus only limited data have been reported in the literature quantifying the protein corona formed when NPs interact with a biological medium and linking this with NP cellular association/uptake. In this work we report the development of a predictive model for the assessment of the biological response (cellular association, which can include both internalized NPs and those attached to the cell surface) of surface-modified gold NPs, based on their physicochemical properties and protein corona fingerprints, utilizing a dataset of 105 unique NPs. Cellular association was chosen as the end-point for the original experimental study due to its relevance to inflammatory responses, biodistribution, and toxicity in vivo. The validated predictive model is freely available online through the Enalos Cloud Platform (http://enalos.insilicotox.com/NanoProteinCorona/) to be used as part of a regulatory or NP safe-by-design decision support system. This online tool will allow the virtual screening of NPs, based on a list of the significant NP descriptors, identifying those NPs that would warrant further toxicity testing on the basis of predicted NP cellular association.

    Machine Learning Approaches for Improving Prediction Performance of Structure-Activity Relationship Models

    In silico bioactivity prediction studies are designed to complement in vivo and in vitro efforts to assess the activity and properties of small molecules. In silico methods such as Quantitative Structure-Activity/Property Relationship (QSAR) are used to correlate the structure of a molecule to its biological property in drug design and toxicological studies. In this body of work, I started with two in-depth reviews of the application of machine learning based approaches and feature reduction methods to QSAR, and then investigated solutions to three common challenges faced in machine learning based QSAR studies. First, to improve the prediction accuracy of learning from imbalanced data, the Synthetic Minority Over-sampling Technique (SMOTE) and Edited Nearest Neighbor (ENN) algorithms, combined with bagging as an ensemble strategy, were evaluated. Friedman's aligned ranks test and the subsequent Bergmann-Hommel post hoc test showed that this method significantly outperformed other conventional methods. SMOTEENN with bagging became less effective when the imbalance ratio (IR) exceeded a certain threshold (e.g., >40). The ability to separate the few active compounds from the vast amounts of inactive ones is of great importance in computational toxicology. Deep neural networks (DNN) and random forest (RF), representing deep and shallow learning algorithms, respectively, were chosen to carry out structure-activity relationship-based chemical toxicity prediction. Results suggest that DNN significantly outperformed RF (p < 0.001, ANOVA) by 22-27% for four metrics (precision, recall, F-measure, and AUPRC) and by 11% for another (AUROC). Lastly, current features used for QSAR based machine learning are often very sparse and limited by the logic and mathematical processes used to compute them. Transformer embedding features (TEF) were developed as new continuous vector descriptors/features using the latent space embedding from multi-head self-attention. The significance of TEF as new descriptors was evaluated by applying them to tasks such as predictive modeling, clustering, and similarity search. An accuracy of 84% on the Ames mutagenicity test indicates that these new features have a correlation to biological activity. Overall, the findings in this study can be applied to improve the performance of machine learning based Quantitative Structure-Activity/Property Relationship (QSAR) efforts for enhanced drug discovery and toxicology assessments.
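
    Two ideas from this abstract lend themselves to a small illustration: the imbalance ratio (IR), taken here as the majority-class count divided by the minority-class count, and the interpolation step at the core of SMOTE. The sketch below is a simplification under stated assumptions: it creates one synthetic minority sample between a point and a given neighbour, and it omits neighbour search, the ENN cleaning step and bagging. The function names and data are hypothetical and this is not the thesis's implementation.

```python
import random


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count for binary 0/1 labels."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    minority, majority = sorted((positives, negatives))
    if minority == 0:
        raise ValueError("both classes must be present")
    return majority / minority


def smote_sample(point, neighbor, rng):
    """One synthetic sample on the line segment between a minority point and a neighbour."""
    gap = rng.random()  # interpolation factor in [0, 1)
    return [p + gap * (q - p) for p, q in zip(point, neighbor)]


labels = [1] * 2 + [0] * 10            # 2 actives, 10 inactives
ir = imbalance_ratio(labels)

rng = random.Random(0)                  # fixed seed for reproducibility
synthetic = smote_sample([0.0, 1.0], [1.0, 3.0], rng)
print(ir, synthetic)
```

    A full pipeline would normally rely on an established library implementation of SMOTE and ENN instead of this hand-rolled interpolation.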

    The benefits of in silico modeling to identify possible small-molecule drugs and their off-target interactions

    Accepted for publication in a future issue of Future Medicinal Chemistry. Research into the use of small molecules as drugs continues to be a key driver in the development of molecular databases, computer-aided drug design software and collaborative platforms. The evolution of computational approaches is driven by the essential criteria that a drug molecule has to fulfill, from affinity to targets to minimal side effects, while having adequate absorption, distribution, metabolism, and excretion (ADME) properties. A combination of ligand- and structure-based drug development approaches is already used to obtain consensus predictions of small molecule activities and their off-target interactions. Further integration of these methods into easy-to-use workflows informed by systems biology could realize the full potential of available data in drug discovery and reduce the attrition of drug candidates. Peer reviewed.

    PUFFIN: A Path-Unifying Feed-Forward Interfaced Network for Vapor Pressure Prediction

    Accurately predicting vapor pressure is vital for various industrial and environmental applications. However, obtaining accurate measurements for all compounds of interest is not possible due to the resource and labor intensity of experiments. The demand for resources and labor further multiplies when a temperature-dependent relationship for predicting vapor pressure is desired. In this paper, we propose PUFFIN (Path-Unifying Feed-Forward Interfaced Network), a machine learning framework that combines transfer learning with a new inductive bias node inspired by domain knowledge (the Antoine equation) to improve vapor pressure prediction. By leveraging inductive bias and transfer learning using graph embeddings, PUFFIN outperforms alternative strategies that do not use inductive bias or that use generic descriptors of compounds. The framework's incorporation of domain-specific knowledge to overcome the limitation of poor data availability shows its potential for broader applications in chemical compound analysis, including the prediction of other physicochemical properties. Importantly, our proposed machine learning framework is partially interpretable, because the inductive Antoine node yields network-derived Antoine equation coefficients. It would then be possible to directly incorporate the obtained analytical expression in process design software for better prediction and control of processes occurring in industry and the environment.
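
    The Antoine equation that inspires the inductive node has the form log10(P) = A - B / (C + T). The sketch below evaluates it for fixed coefficients, which is the step that would follow once coefficients are available, whether tabulated or network-derived. The coefficients shown are commonly tabulated values for water (P in mmHg, T in degrees Celsius, roughly 1 to 100 °C); they are an assumption for illustration and do not come from the PUFFIN paper. The function name is hypothetical.

```python
def antoine_pressure(temp_c, a, b, c):
    """Vapor pressure from the Antoine equation: log10(P) = A - B / (C + T)."""
    return 10.0 ** (a - b / (c + temp_c))


# Assumed textbook-style coefficients for water (mmHg, degrees Celsius).
WATER_COEFFS = (8.07131, 1730.63, 233.426)

# Near the normal boiling point the result should be close to atmospheric pressure.
p_boiling = antoine_pressure(100.0, *WATER_COEFFS)
print(round(p_boiling, 1), "mmHg")
```

    Because the equation is closed-form, a model that outputs A, B and C gives a temperature-dependent curve for each compound instead of a single-temperature prediction, which appears to be the interpretability point the abstract makes.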

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.
