198 research outputs found

    Predicting Skin Permeability by means of Computational Approaches : Reliability and Caveats in Pharmaceutical Studies

    © 2019 American Chemical Society. The skin is the main barrier between the internal body environment and the external one. The characteristics and properties of this barrier modify and affect drug delivery and chemical toxicity parameters. It is therefore not surprising that the permeability of many different compounds has been measured through several in vitro and in vivo techniques. Moreover, many different in silico approaches have been used to identify the correlation between the structure of the permeants and their permeability, to reproduce the skin behavior, and to predict the ability of specific chemicals to permeate this barrier. A significant number of issues, such as interlaboratory variability, experimental conditions, data set building rationales, and skin site of origin and hydration, still prevent us from obtaining a definitive predictive skin permeability model. This review aims to show the main advances and the principal approaches in computational methods used to predict this property, to highlight the main issues that have arisen, and to address the challenges for future research. Peer reviewed. Final Accepted Version.

    Quantitative Structure-Property Relationship Modeling & Computer-Aided Molecular Design: Improvements & Applications

    The objective of this work was to develop an integrated capability to design molecules with desired properties. An automated robust genetic algorithm (GA) module has been developed to facilitate the rapid design of new molecules. The generated molecules were scored for the relevant thermophysical properties using non-linear quantitative structure-property relationship (QSPR) models. The descriptor reduction and model development for the QSPR models were implemented using evolutionary algorithms (EA) and artificial neural networks (ANNs). QSPR models for octanol-water partition coefficients (Kow), melting points (MP), normal boiling points (NBP), Gibbs energy of formation, universal quasi-chemical (UNIQUAC) model parameters, and infinite-dilution activity coefficients of cyclohexane and benzene in various organic solvents were developed in this work. To validate the current design methodology, new chemical penetration enhancers (CPEs) for transdermal insulin delivery and new solvents for extractive distillation of the cyclohexane + benzene system were designed. In general, the use of non-linear QSPR models developed in this work provided predictions better than or as good as existing literature models. In particular, the current models for NBP, Gibbs energy of formation, UNIQUAC model parameters, and infinite-dilution activity coefficients have lower errors on external test sets than the literature models. The current models for MP and Kow are comparable with the best models in the literature. The GA-based design framework implemented in this work successfully identified new CPEs for transdermal delivery of insulin, with permeability values comparable to the best CPEs in the literature. Also, new solvents for extractive distillation of cyclohexane/benzene with selectivities two to four times that of the existing solvents were identified. These two case studies validate the ability of the current design framework to identify new molecules with desired target properties. Chemical Engineering.

    MI-NODES multiscale models of metabolic reactions, brain connectome, ecological, epidemic, world trade, and legal-social networks

    Complex systems and networks appear in almost all areas of reality. We find them from protein residue networks to Protein Interaction Networks (PINs). Chemical reactions form Metabolic Reaction Networks (MRNs) in living beings or atmospheric reaction networks in planets and moons. Networks of neurons appear in the worm C. elegans, in the human brain connectome, or in Artificial Neural Networks (ANNs). Infection spreading networks exist for contagious outbreaks in humans and, in malware epidemiology, for infection with viral software in internet or wireless networks. Social-legal networks with different rules evolved from swarm intelligence to hunter-gatherer societies or citation networks of the U.S. Supreme Court. In all these cases, we can see the same question: can we predict the links based on structural information? We propose to solve the problem using Quantitative Structure-Property Relationship (QSPR) techniques commonly used in chemoinformatics. To do so, we need software able to transform all types of networks/graphs, such as drug structure, drug-target interactions, protein structure, protein interactions, metabolic reactions, brain connectome, or social networks, into numerical parameters. Consequently, we need to process multitarget, multiscale, and multiplexing information in alignment-free mode. We then have to seek the QSPR model with Machine Learning techniques. MI-NODES is this type of software. Here we review the evolution of the software from chemoinformatics to bioinformatics and systems biology. This is an effort to develop a universal tool to study structure-property relationships in complex systems.

    Greedy and linear ensembles of machine learning methods outperform single approaches for QSPR regression problems

    The application of Machine Learning to cheminformatics is a large and active field of research, but few papers discuss whether ensembles of different Machine Learning methods can improve upon the performance of their component methodologies. Here we investigated a variety of methods, including kernel-based, tree, linear, neural networks, and both greedy and linear ensemble methods. These were all tested against a standardised methodology for regression with data relevant to the pharmaceutical development process. The investigation focused on QSPR problems within drug-like chemical space. We aimed to investigate which methods perform best, and how the ‘wisdom of crowds’ principle can be applied to ensemble predictors. It was found that no single method performs best for all problems, but that a dynamic, well-structured ensemble predictor would perform very well across the board, usually providing an improvement in performance over the best single method. Its use of weighting factors allows the greedy ensemble to acquire a bigger contribution from the better performing models, and this helps the greedy ensemble generally to outperform the simpler linear ensemble. The choice of data pre-processing methodology was also found to be crucial to the performance of each method. Postprint. Peer reviewed.
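The weighting idea in this abstract can be illustrated with a minimal sketch. It assumes a greedy forward selection with replacement (in the style of Caruana's ensemble selection), which may differ from the procedure used in the paper. The function names, the `rounds` parameter, and the toy data are hypothetical.

```python
from math import sqrt


def rmse(pred, target):
    """Root-mean-square error between two equally long sequences."""
    return sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def greedy_ensemble_weights(model_preds, target, rounds=10):
    """Assumed greedy scheme: in each round, add the model (repeats allowed)
    that gives the lowest validation RMSE for the averaged prediction.
    A model's weight is the fraction of rounds in which it was picked."""
    counts = {name: 0 for name in model_preds}
    running_sum = [0.0] * len(target)
    for step in range(1, rounds + 1):
        best_name, best_err = None, float("inf")
        for name, preds in model_preds.items():
            candidate = [(s + p) / step for s, p in zip(running_sum, preds)]
            err = rmse(candidate, target)
            if err < best_err:
                best_name, best_err = name, err
        counts[best_name] += 1
        running_sum = [s + p for s, p in zip(running_sum, model_preds[best_name])]
    return {name: c / rounds for name, c in counts.items()}


# Toy validation data (hypothetical): one accurate and one poor model.
target = [1.0, 2.0, 3.0, 4.0]
model_preds = {
    "good": [1.1, 1.9, 3.1, 3.9],
    "poor": [3.0, 3.0, 3.0, 3.0],
}
weights = greedy_ensemble_weights(model_preds, target)
```

By contrast, a plain linear ensemble in this sketch would give every model the same weight, so a poorly performing model would pull the averaged prediction away from the target.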

    Critical assessment of QSAR models of environmental toxicity against Tetrahymena pyriformis: focusing on applicability domain and overfitting by variable selection

    The estimation of the accuracy of predictions is a critical problem in QSAR modeling. The "distance to model" can be defined as a metric that defines the similarity between the training set molecules and the test set compound for the given property in the context of a specific model. It could be expressed in many different ways, e.g., using Tanimoto coefficient, leverage, correlation in space of models, etc. In this paper we have used mixtures of Gaussian distributions as well as statistical tests to evaluate six types of distances to models with respect to their ability to discriminate compounds with small and large prediction errors. The analysis was performed for twelve QSAR models of aqueous toxicity against T. pyriformis obtained with different machine-learning methods and various types of descriptors. The distances to model based on the standard deviation of predicted toxicity calculated from the ensemble of models afforded the best results. This distance also successfully discriminated molecules with small and large prediction errors for a mechanism-based model developed using log P and the Maximum Acceptor Superdelocalizability descriptors. Thus, the distance to model metric could also be used to augment mechanistic QSAR models by estimating their prediction errors. Moreover, the accuracy of prediction is mainly determined by the training set data distribution in the chemistry and activity spaces but not by the QSAR approaches used to develop the models. We have shown that incorrect validation of a model may result in the wrong estimation of its performance and suggested how this problem could be circumvented. The toxicity of 3182 and 48774 molecules from the EPA High Production Volume (HPV) Challenge Program and EINECS (European chemical Substances Information System), respectively, was predicted, and the accuracy of prediction was estimated. The developed models are available online at the http://www.qspr.org site.
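The best-performing distance to model named in this abstract, the standard deviation of predictions across an ensemble, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the reliability threshold, and the toy predictions are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def ensemble_distance_to_model(predictions):
    """Sample standard deviation of the per-model predictions for one
    compound. A larger spread means the models disagree more."""
    return stdev(predictions)


def predict_with_reliability(predictions, threshold):
    """Return (ensemble mean, distance to model, reliable flag).
    `threshold` is a hypothetical cut-off that a user would calibrate
    on compounds with known prediction errors."""
    distance = ensemble_distance_to_model(predictions)
    return mean(predictions), distance, distance <= threshold


# Toy per-model toxicity predictions for two compounds (hypothetical values).
close_agreement = [1.0, 1.1, 0.9]
wide_disagreement = [0.0, 2.0, 4.0]

result_close = predict_with_reliability(close_agreement, threshold=0.5)
result_wide = predict_with_reliability(wide_disagreement, threshold=0.5)
```

In this sketch, a compound on which the ensemble members disagree strongly is flagged as less reliable, which mirrors the use of the distance to discriminate small from large prediction errors.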

    Prediction of Properties of Liquid Systems Using Deep Learning

    Thesis (Ph.D.), Seoul National University Graduate School, College of Natural Sciences, Department of Chemistry, February 2020. 정연준 (Jung YounJoon).

    Recent advances in machine learning technologies and their chemical applications have led to the development of diverse structure-property relationship based prediction models for various chemical properties; the free energy of solvation is one of them and plays a dominant role as a fundamental measure of solvation chemistry. Here, we introduce a novel machine learning-based solvation model, which calculates the target solvation free energy from pairwise atomistic interactions. The novelty of our proposed solvation model lies in its rather simple architecture: two encoding functions extract vector representations of the atomic and the molecular features from the given chemical structure, while the inner product between two atomistic features calculates their interactions, instead of black-boxed perceptron networks. Cross-validation on 6,493 experimental measurements for 952 organic solutes and 147 organic solvents gives a mean unsigned error (MUE) of 0.2 kcal/mol. The scaffold-based split method gives 0.6 kcal/mol, which shows that the proposed model retains reasonable accuracy even for extrapolated cases. Moreover, the proposed model shows excellent transferability for enlarging the training data because it is not specific to any particular solvent. Analysis of the atomistic interaction map shows that the proposed model has great potential to reproduce group contributions to the solvation energy, which leads us to believe that the model not only provides the predicted target property but also gives more detailed physicochemical insights.

    Contents: 1. Introduction. 2. Delfos: Deep Learning Model for Prediction of Solvation Free Energies in Generic Organic Solvents (2.1. Methods: 2.1.1. Embedding of Chemical Contexts; 2.1.2. Encoder-Predictor Network. 2.2. Results and Discussions: 2.2.1. Computational Setup and Results; 2.2.2. Transferability of the Model for New Compounds; 2.2.3. Visualization of Attention Mechanism). 3. Group Contribution Method for the Solvation Energy Estimation with Vector Representations of Atom (3.1. Model Description: 3.1.1. Word Embedding; 3.1.2. Network Architecture. 3.2. Results and Discussions: 3.2.1. Computational Details; 3.2.2. Prediction Accuracy; 3.2.3. Model Transferability; 3.2.4. Group Contributions of Solvation Energy). 4. Empirical Structure-Property Relationship Model for Liquid Transport Properties. 5. Concluding Remarks. A. Analyzing Kinetic Trapping as a First-Order Dynamical Phase Transition in the Ensemble of Stochastic Trajectories (A1. Introduction; A2. Theory; A3. Lattice Gas Model; A4. Mathematical Model; A5. Dynamical Phase Transitions; A6. Conclusion). B. Reaction-Path Thermodynamics of the Michaelis-Menten Kinetics (B1. Introduction; B2. Reaction Path Thermodynamics; B3. Fixed Observation Time; B4. Conclusions). Doctor.
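The pairwise inner-product idea in this thesis abstract can be sketched in a few lines. The sketch assumes that each atom already has a feature vector, here hand-written toy values rather than the output of a trained encoder, and that the total energy is the plain sum of all solute-solvent atom pair interactions. Both the function names and the summation rule are illustrative assumptions, not the thesis implementation.

```python
def interaction_map(solute_atoms, solvent_atoms):
    """Matrix of inner products between every solute atom vector (rows)
    and every solvent atom vector (columns)."""
    return [
        [sum(a * b for a, b in zip(u, v)) for v in solvent_atoms]
        for u in solute_atoms
    ]


def solvation_energy(solute_atoms, solvent_atoms):
    """Assumed read-out: sum of all pairwise atomistic interactions."""
    return sum(sum(row) for row in interaction_map(solute_atoms, solvent_atoms))


# Toy two-dimensional atom features (hypothetical values).
solute = [[1.0, 0.0], [0.0, 2.0]]
solvent = [[1.0, 1.0], [-1.0, 0.0]]

pair_map = interaction_map(solute, solvent)   # one row per solute atom
energy = solvation_energy(solute, solvent)    # scalar sum over all pairs
```

Because every entry of the map belongs to one specific atom pair, summing a row gives that solute atom's share of the total. This is one way such a model could expose group-contribution-like information.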