186 research outputs found

    ADME prediction with KNIME: In silico aqueous solubility consensus model based on supervised recursive random forest approaches

    In-silico prediction of aqueous solubility plays an important role in drug discovery and development. For many years, the limited performance of in-silico solubility models has been attributed to the lack of high-quality solubility data for pharmaceutical molecules. However, some studies suggest that the poor accuracy of solubility prediction is not related to the quality of the experimental data and that more precise methodologies (algorithms and/or sets of descriptors) are required for predicting the aqueous solubility of pharmaceutical molecules. In this study, a large and diverse database was generated with aqueous solubility values collected from two public sources; two new recursive machine-learning approaches were developed for data cleaning and variable selection; and a consensus model based on regression and classification algorithms was created. The modeling protocol, which includes the curation of chemical and experimental data, was implemented in KNIME, with the aim of obtaining an automated workflow for the prediction of new databases. Finally, we compared several methods and models available in the literature with our consensus model, which showed results comparable to, or even better than, those of previously published models.

    Contribution of hydrogen bonding to drug bioavailability: chemoinformatics methods

    This review, based mainly on the authors' own publications, is devoted to methods for investigating “structure-bioavailability” relationships. The first part covers the classification of hydrogen bond descriptors, original 2D hydrogen bond thermodynamic descriptors, the program HYBOT, original 3D hydrogen bonding potentials, and original hydrogen bond surface area descriptors. The second part presents the results of applying these hydrogen bond descriptors to the prediction of bioavailability components such as lipophilicity, solubility in water and in physiological fluids, absorption, and blood-brain barrier permeability.

    The calculation of physicochemical descriptors and their application in predicting properties of drugs and other compounds.

    The work presented may be divided into two main sections. The first section focuses on the important aspect of compound descriptor determination. The method by which descriptors are obtained indirectly through compound solubility in organic solvents and direct water-solvent partition measurements is illustrated by example for drug compounds. This approach is extended through the derivation of gas-water and water-solvent partition equations for the n-alcohols, which in the future will be available for use in descriptor determination. Importantly, the equation coefficients are also interpreted to deduce various physicochemical properties of the homologous series of alcohols. An alternative method to assign descriptors is probed through reversed-phase HPLC. Measurements are recorded for a series of solutes on several bonded phases, and multivariate analysis is used to investigate the interrelationship between columns in an effort to isolate the most suitable phases. The second section is concerned with application of the Abraham General Solvation Equation to examine two processes of special interest in drug design: aqueous solubility and intestinal absorption. An algorithm to predict water solubility is obtained containing an additional cross-term, which is found to compensate at least partly for a melting point correction term. The amended equation is shown to be comparable in accuracy to commercially available packages for a test set of 268 structurally diverse compounds. Of further importance in drug delivery is the process of intestinal absorption. An extensive literature search provides evaluated absorption data for a large set of drug compounds and forms a strong basis for subsequent QSAR analysis. Intestinal absorption is found to be comparable in humans and rats, and predominantly dependent on the hydrogen-bonding capability of the drug. The mechanism of absorption is considered through transformation of the percent absorption data to an overall rate constant.
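
To make the descriptor-based approach in the abstract above concrete, here is a minimal sketch of how an Abraham-type linear free energy relationship is evaluated. It assumes the commonly cited form log SP = c + eE + sS + aA + bB + vV. The optional `k * A * B` cross-term is an assumption about the shape of the "additional cross-term" mentioned above, and all coefficient and descriptor values are hypothetical placeholders, not values from this work.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Abraham solute descriptors (values supplied by the caller)."""
    E: float  # excess molar refraction
    S: float  # dipolarity/polarizability
    A: float  # overall hydrogen-bond acidity
    B: float  # overall hydrogen-bond basicity
    V: float  # McGowan characteristic volume


def abraham_log_sp(d, c, e, s, a, b, v, k=0.0):
    """Evaluate log SP = c + eE + sS + aA + bB + vV (+ k*A*B).

    The k*A*B cross-term is an assumed form, included only to show
    where such a term would enter; k=0 gives the plain equation.
    """
    return (c + e * d.E + s * d.S + a * d.A + b * d.B + v * d.V
            + k * d.A * d.B)


# Hypothetical solute and coefficients, for illustration only.
solute = Descriptors(E=1.0, S=1.0, A=1.0, B=1.0, V=1.0)
value = abraham_log_sp(solute, c=0.5, e=1.0, s=-1.0, a=2.0, b=-2.0,
                       v=0.25, k=0.1)
print(round(value, 2))
```

The system coefficients (c, e, s, a, b, v) would normally come from a regression over solutes with known descriptors; the sketch only shows the evaluation step.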

    Blinded Predictions and Post Hoc Analysis of the Second Solubility Challenge Data: Exploring Training Data and Feature Set Selection for Machine and Deep Learning Models

    Accurate methods to predict solubility from molecular structure are highly sought after in the chemical sciences. To assess the state of the art, the American Chemical Society organized a "Second Solubility Challenge" in 2019, in which competitors were invited to submit blinded predictions of the solubilities of 132 drug-like molecules. In the first part of this article, we describe the development of two models that were submitted to the Blind Challenge in 2019 but which have not previously been reported. These models were based on computationally inexpensive molecular descriptors and traditional machine learning algorithms and were trained on a relatively small data set of 300 molecules. In the second part of the article, to test the hypothesis that predictions would improve with more advanced algorithms and higher volumes of training data, we compare these original predictions with those made after the deadline using deep learning models trained on larger solubility data sets consisting of 2999 and 5697 molecules. The results show that there are several algorithms that are able to obtain near state-of-the-art performance on the solubility challenge data sets, with the best model, a graph convolutional neural network, resulting in an RMSE of 0.86 log units. Critical analysis of the models reveals systematic differences between the performance of models using certain feature sets and training data sets. The results suggest that careful selection of high quality training data from relevant regions of chemical space is critical for prediction accuracy but that other methodological issues remain problematic for machine learning solubility models, such as the difficulty in modeling complex chemical spaces from sparse training data sets.
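
The abstract above reports model quality as an RMSE in log units. As a minimal sketch of what that metric measures, the function below computes the root-mean-square error between predicted and observed log-solubility values. The function name and the example numbers are hypothetical and are not data from the challenge.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical log-solubility values, for illustration only.
predicted_log_s = [-3.1, -4.5, -2.0, -5.2]
observed_log_s = [-3.6, -4.0, -2.9, -5.0]
print(rmse(predicted_log_s, observed_log_s))
```

Because the inputs are base-10 logarithms, an RMSE of 1 log unit corresponds to a typical error of a factor of ten in the solubility itself.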

    Aqueous solubility of drug-like compounds

    New effective experimental techniques in medicinal chemistry and pharmacology have resulted in a vast increase in the number of pharmacologically interesting compounds. However, the ability to produce drug candidates with optimal biopharmaceutical and pharmacokinetic properties can still be improved. A large fraction of typical drug candidates is poorly soluble in water, which results in low drug concentrations in gastrointestinal fluids and, as a result, unacceptably low drug absorption. Therefore, gaining knowledge to improve the solubility of compounds is an indispensable requirement for developing compounds with drug-like properties. The main objective of this thesis was to investigate whether computer-based models derived from calculated molecular descriptors and structural fragments can be used to predict aqueous solubility for drug-like compounds with similar structures. For this purpose, both experimental and computational studies were performed. In the experimental work, a novel crystallization method for weak acids and bases was developed, and a European patent was applied for. The crystalline materials obtained could be used for solubility measurements. A novel recognition method was developed to evaluate the tendency of compounds to form amorphous forms. This method could be used to ensure that only solubilities of crystalline materials were collected for the development of solubility prediction models. In the development of improved in silico solubility models, lipophilicity was confirmed as the major driving factor for solubility, and crystal-information-related descriptors as the second most important factor. Reasons for the limited precision of commercial solubility prediction tools were identified. A general solubility model of high accuracy was obtained for drug-like compounds in congeneric series when lipophilicity was used as a descriptor in combination with the structural fragments. 
Rules were derived from the solubility prediction models that could be used by chemists or interested scientists as a rough guideline on the contribution of structural fragments to solubility: aliphatic and polar fragments with high dipole moments are always considered solubility enhancing, and strong acids and bases usually have lower intrinsic solubility than neutral ones. In summary, an improved solubility prediction method for congeneric series was developed using high quality solubility results of drugs and drug precursors as input. The derived model tried to overcome the difficulties of commercially available solubility prediction tools by focusing on structurally related series, and it showed higher predictive power for drug-like compounds than commercially available tools. Parts of the results of this work were protected by a patent application, which was filed by F. Hoffmann-La Roche Ltd on August 30, 2005.

    Data mining methods for the prediction of intestinal absorption using QSAR

    Oral administration is the most common route for administration of drugs. With the growing cost of drug discovery, the development of Quantitative Structure-Activity Relationships (QSAR) as computational methods to predict oral absorption is highly desirable for reasons of cost-effectiveness. The aim of this research was to develop QSAR models that are highly accurate and interpretable for the prediction of oral absorption. The problems addressed in this investigation were datasets with unbalanced class distributions, feature selection, and the effects of solubility and permeability on oral absorption prediction. Firstly, oral absorption models were obtained by overcoming the problem of unbalanced class distributions in datasets using two techniques: under-sampling of compounds belonging to the majority class and the use of different misclassification costs for different types of misclassifications. Using these methods, models with higher accuracy were produced using regression and linear/non-linear classification techniques. Secondly, the use of several pre-processing feature selection methods in tandem with decision tree classification analysis, including misclassification costs, was found to produce models with better interpretability and higher predictive accuracy. These methods successfully selected the most important molecular descriptors and overcame the problem of unbalanced classes. Thirdly, the roles of solubility and permeability in oral absorption were also investigated. This involved expansion of oral absorption datasets and collection of in vitro and aqueous solubility data. This work found that the inclusion of predicted and experimental solubility in permeability models can improve model accuracy. However, the impact of solubility on oral absorption prediction was not as influential as expected. 
Finally, predictive models of permeability and solubility were built to predict a provisional Biopharmaceutic Classification System (BCS) class using two multi-label classification techniques, binary relevance and classifier chain. The classifier chain method was shown to have higher predictive accuracy by using predicted solubility as a molecular descriptor for permeability models, and hence better final provisional BCS prediction. Overall, this research has resulted in predictive and interpretable models that could be useful in a drug discovery context
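
As a minimal sketch of the first class-balancing technique named in the abstract above, the function below performs random under-sampling: every class is reduced to the size of the smallest class. The function name, labels, and data are hypothetical; the abstract does not specify how the under-sampling was implemented, so this shows only the general idea.

```python
import random
from collections import defaultdict


def undersample(samples, labels, seed=0):
    """Randomly reduce every class to the size of the smallest class.

    Returns a shuffled list of (sample, label) pairs.
    """
    rng = random.Random(seed)  # seeded so the selection is reproducible
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    smallest = min(len(members) for members in by_class.values())

    balanced = []
    for label, members in by_class.items():
        for sample in rng.sample(members, smallest):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced


# Hypothetical compound IDs: 8 highly absorbed, 2 poorly absorbed.
compound_ids = list(range(10))
absorption_class = ["high"] * 8 + ["low"] * 2
print(undersample(compound_ids, absorption_class))
```

Under-sampling discards majority-class data, which is one reason the abstract pairs it with the alternative of assigning different misclassification costs.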

    Prediction of partition coefficients for systems of micelles using DFT

    Doctoral Programme in Theoretical Chemistry and Computational Modelling (Programa de Doctorat en Química Teòrica i Modelització Computacional). A compound’s solvent−water partition coefficient (log P) measures the equilibrium ratio of the compound’s concentrations in a two-phase system, such as two solvents in contact or a system of micelles in an aqueous solution. In this thesis, the partition coefficients of three groups of small compounds (alcohols, ethers, and hydrocarbons) in 10 different solvents (benzene, cyclohexane, hexane, n-octane, toluene, carbon tetrachloride, heptane, trichloroethane, and octanol) were computed using DFT with the B3LYP functional and the 6-31G(d), 6-311+G**, and 6-311++G** basis sets. The partition coefficients of alcohol solutes in various solvents computed with the 6-31G(d) basis set correlate satisfactorily with experimental values. The partition coefficients of ether solutes in different solvents computed with the 6-311++G** basis set show high agreement with the experimental values. For hydrocarbon compounds in various solvents, the experimental data correlate highly with the partition coefficients computed with all three basis sets: 6-31G(d), 6-311+G**, and 6-311++G**. In addition, we have studied the correlation of the experimental partition coefficients in sodium dodecyl sulfate (SDS), hexadecyltrimethylammonium bromide (HTAB), sodium cholate (SC), and lithium perfluorooctanesulfonate (LPFOS) micelles with ab initio calculated partition coefficients in 15 different organic solvents. Specifically, the partition coefficients of a series of 63 molecules in an aqueous system of SDS, SC, HTAB, and LPFOS micelles are correlated with the partition coefficients in the heptane/water, cyclohexane/water, n-dodecane/water, pyridine/water, acetic acid/water, octanol/water, acetone/water, 1-propanol/water, 2-propanol/water, methanol/water, formic acid/water, diethyl sulfide/water, decan-1-ol/water, 1,2-ethanediol/water, and dimethyl sulfoxide/water systems. 
All calculations were performed using the Gaussian 16 quantum chemistry package. Molecular structures were generated in the more extended conformation using Avogadro, and the geometries of all molecules were optimized using Density Functional Theory (DFT) with the B3LYP and M06-2X functionals and the 6-31++G** basis set, with the continuum solvation model based on density (SMD). The results show that the partition coefficients calculated in the alcohol/water mixtures give the best correlation for predicting the experimental partition coefficients in SDS, SC, and LPFOS micelles. For HTAB micelle systems, a new selection of molecules is created, excluding those containing N atoms and urea groups. Interestingly, the partition coefficients of these selected molecules correlate strongly with the experimental partition coefficients. Finally, the partition coefficients of flexible molecules were studied by the same protocol for two solvent combinations, octanol/water and cyclohexane/water. The calculated values were compared with the experimental partition coefficients. The average partition coefficient in octanol showed a high correlation with the experimental data. However, for the 16 compounds in cyclohexane, the calculated partition coefficients do not show significant agreement with the experimental partition coefficients.
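
The abstract above does not give the working equation used to turn solvation calculations into partition coefficients. A minimal sketch, assuming the standard thermodynamic relation log P = (ΔG_solv,water − ΔG_solv,organic) / (RT ln 10), is shown below. The function name and the example free energies are hypothetical, and the energies are assumed to be in kcal/mol.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def log_p_from_solvation(dg_water, dg_organic, temperature=298.15):
    """Solvent/water log P from solvation free energies in kcal/mol.

    A more negative solvation free energy in the organic phase than in
    water gives a positive log P (the solute prefers the organic phase).
    """
    rt_ln10 = R_KCAL_PER_MOL_K * temperature * math.log(10)
    return (dg_water - dg_organic) / rt_ln10


# Hypothetical solvation free energies, for illustration only.
print(log_p_from_solvation(dg_water=-4.0, dg_organic=-6.5))
```

At 298.15 K the denominator is about 1.36 kcal/mol, so each 1.36 kcal/mol difference in solvation free energy shifts log P by one unit.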

    STUDIES OF SOLUBILIZATION OF POORLY WATER-SOLUBLE DRUGS DURING IN VITRO LIPOLYSIS OF A MODEL LIPID-BASED DRUG DELIVERY SYSTEM AND IN MIXED MICELLES

    Lipid-based drug delivery systems (LBDDSs) are becoming an increasingly popular approach to improve the oral absorption of poorly water-soluble drugs. Several possible mechanisms have been proposed to explain the means by which LBDDSs act in vivo to enhance absorption. The goal of the current dissertation is to provide a better understanding of one proposed mechanism: the capability of lipoidal components in LBDDS formulations to create and maintain a drug in a supersaturated state under simulated GI conditions. Moreover, molecular details of equilibrium solubilization of a drug in a series of model lipid assemblies were examined. The results of these studies will aid formulators in choosing the optimal LBDDS to improve oral absorption of poorly water-soluble drugs. Time-dependent solubilization behavior of progesterone, 17β-estradiol and nifedipine in a simple model LBDDS composed of Polysorbate 80 was assessed employing the in vitro dynamic lipolysis model. The results illustrated the extent to which the supersaturated state was dependent on the extent of lipolysis of Polysorbate 80 and the initial drug concentration. Area-under-the-curve supersaturation was proposed as a means of quantifying the time-dependent extent of supersaturation in LBDDSs in simulated intestinal conditions. Concurrently, a series of model mixed micellar solutions, composed of Polysorbate 80 and oleic acid, were prepared to represent the lipid assemblies produced during the lipolysis experiments. The ability of these aggregates to solubilize progesterone, 17β-estradiol and nifedipine was evaluated and the aggregate/water partition coefficients were determined. The Treinor model was found to successfully fit the partition coefficients of the drugs in a range of mixed micelles. The equilibrium solubility of drugs in the mixed micelles was calculated and compared to that found under lipolytic conditions. 
The best agreement between calculated and experimental conditions was observed for nifedipine. These studies have established a foundation for the evaluation of time-dependent extent of supersaturation with more complex LBDDS formulations exposed to lipolytic conditions
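
The abstract above proposes an area-under-the-curve supersaturation measure but does not define it. One plausible reading, sketched here as an assumption rather than the dissertation's actual definition, is the trapezoidal integral over time of the supersaturation ratio C(t) / C_eq. The function name and all numbers are hypothetical.

```python
def auc_supersaturation(times, concentrations, equilibrium_solubility):
    """Trapezoidal area under the supersaturation-ratio curve.

    The ratio C(t) / C_eq is an assumed definition of supersaturation;
    times and concentrations must be equal-length, time-ordered lists.
    """
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two matching time points")
    ratios = [c / equilibrium_solubility for c in concentrations]
    area = 0.0
    for i in range(1, len(times)):
        # Trapezoid between consecutive time points.
        area += 0.5 * (ratios[i - 1] + ratios[i]) * (times[i] - times[i - 1])
    return area


# Hypothetical time course (minutes, arbitrary concentration units).
print(auc_supersaturation([0, 10, 20], [2.0, 4.0, 2.0], 2.0))
```

A single number of this kind lets formulations be ranked by both how far and how long they keep the drug above its equilibrium solubility.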