947 research outputs found

    Prediction of aqueous intrinsic solubility of druglike molecules using Random Forest regression trained with Wiki-pS0 database

    The accurate prediction of the solubility of drugs is still problematic. It was thought for a long time that shortfalls had been due to the lack of high-quality solubility data from the chemical space of drugs. This study considers the quality of solubility data, particularly of ionizable drugs. A database is described, comprising 6355 entries of intrinsic solubility for 3014 different molecules, drawing on 1325 citations. In an earlier publication, many factors affecting the quality of the measurement had been discussed, and suggestions were offered to improve ways of extracting more reliable information from legacy data. Many of the suggestions have been implemented in this study. By correcting solubility for ionization (i.e., deriving intrinsic solubility, S0) and by normalizing temperature (by transforming measurements performed in the range 10-50 °C to 25 °C), it can now be estimated that the average interlaboratory reproducibility is 0.17 log unit. Empirical methods to predict solubility at best have hovered around the root mean square error (RMSE) of 0.6 log unit. Three prediction methods are compared here: (a) Yalkowsky’s general solubility equation (GSE), (b) Abraham solvation equation (ABSOLV), and (c) Random Forest regression (RFR) statistical machine learning. The latter two methods were trained using the new database. The RFR method outperforms the other two models, as anticipated. However, the ability to predict the solubility of drugs to the level of the quality of data is still out of reach. The data quality is not the limiting factor in prediction. The statistical machine learning methodologies are probably up to the task. Possibly what’s missing are solubility data from a few sparsely covered regions of the chemical space of drugs (particularly of research compounds). Also, new descriptors which can better differentiate the factors affecting solubility between molecules could be critical for narrowing the gap between the accuracy of the prediction models and that of the experimental data.
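
The correction of a measured solubility for ionization, mentioned in the abstract above, can be illustrated with the textbook Henderson-Hasselbalch relation for a monoprotic acid or base. This is only a minimal sketch under stated assumptions: the function name and arguments are hypothetical, the molecule has a single pKa, and no aggregation or salt precipitation occurs. The abstract does not describe the actual procedure used to build the database, and the temperature normalization is not covered here.

```python
import math


def intrinsic_log_s(log_s_at_ph, ph, pka, kind):
    """Estimate log S0 from a solubility measured at a given pH.

    Assumes a monoprotic compound obeying the Henderson-Hasselbalch
    relation (an illustrative simplification, not the database's method):
      acid: log S = log S0 + log10(1 + 10**(pH - pKa))
      base: log S = log S0 + log10(1 + 10**(pKa - pH))
    """
    if kind == "acid":
        ionization_term = math.log10(1.0 + 10.0 ** (ph - pka))
    elif kind == "base":
        ionization_term = math.log10(1.0 + 10.0 ** (pka - ph))
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_s_at_ph - ionization_term


# At pH = pKa half of the dissolved acid is ionized, so the measured
# solubility is twice S0 and log S0 is lower by log10(2).
example = intrinsic_log_s(-3.0, ph=4.0, pka=4.0, kind="acid")
```

Far from the pKa on the un-ionized side, the correction term tends to zero and the measured value is already close to log S0.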

    Anomalous salting-out, self-association and pKa effects in the practically-insoluble bromothymol blue

    Background and Purpose: The widely-used and practically insoluble diprotic acidic dye, bromothymol blue (BTB), is a neutral molecule in strongly acidic aqueous solutions. The Schill (1964) extensive solubility-pH measurement of bromothymol blue in 0.1 and 1.0 M NaCl solutions, with pH adjusted with HCl from 0.0 to 5.4, featured several unusual findings. The data suggest that the solubility of the neutral-form molecule in 1 M NaCl is more than 0.7 log unit lower than the solubility in pure water. This could be considered as uncharacteristically high for a salting-out effect. Also, the study reported two apparent values of pKa1, 1.48 and 1.00, in 0.1 M and 1.0 M NaCl solutions, respectively. The only other measured value found for pKa1 in the literature is -0.66 (Gupta and Cadwallader, 1968). Experimental Approach: It was reasoned that there can be only a single pKa1 for BTB. Also, it was hypothesized that salting-out alone might not account for such a large difference in solubility observed at the two levels of salt. A generalized mass action approach, incorporating activity corrections for charged species using the Stokes-Robinson hydration equation and for neutral species using the Setschenow equation, was selected to analyze the Schill solubility-pH data to seek a rationalization of these unusual results. Key Results: BTB reveals complex speciation chemistry in saturated aqueous solutions which had been poorly understood for many years. The appearance of two different values of pKa1 at different levels of NaCl and the anomalously high value of the empirical salting-out constant could be rationalized to normal values by invoking the formation of a very stable neutral dimer (log K2 = 10.0 ± 0.1 M⁻¹). A ‘normal’ salting-out constant, 0.25 M⁻¹, was then derived. It was also possible to estimate the ‘self-interaction’ constant. The data analysis in the present study critically depended on the pKa1 = -0.66 reported by Gupta and Cadwallader. Conclusion: A more reasonable salting-out constant and a consistent single value for pKa1 have been determined by considering a self-interacting (aggregation) model involving an uncharged form of the molecule, which is likely a zwitterion, as suggested by literature spectrophotometric studies.
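
To see why a 0.7 log unit drop looks too large for salting-out alone, the Setschenow equation named in the abstract can be written out in its usual form, log10(S_water / S_salt) = K_salt × C_salt. The sketch below uses a hypothetical function name and an arbitrary placeholder for the solubility in water. It only applies the simple Setschenow term and does not reproduce the generalized mass action analysis of the study.

```python
def setschenow_log_s(log_s_water, k_salt, c_salt):
    """Return log10 solubility of a neutral species in a salt solution.

    Uses the usual Setschenow form:
      log10(S_water / S_salt) = k_salt * c_salt
    k_salt is the salting-out constant (1/M); c_salt is the salt molarity.
    """
    return log_s_water - k_salt * c_salt


# Placeholder value for log S in pure water (illustration only).
log_s_water = -5.0

# With the 'normal' constant of 0.25 1/M quoted in the abstract,
# 1.0 M NaCl lowers log S by 0.25, well short of the >0.7 reported.
drop_at_1m = log_s_water - setschenow_log_s(log_s_water, 0.25, 1.0)
```

On this reading, the remaining difference is what the abstract attributes to the formation of the neutral dimer.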

    Multi-lab intrinsic solubility measurement reproducibility in CheqSol and shake-flask methods

    This commentary compares 233 CheqSol intrinsic solubility values (log S0) reported in the Wiki-pS0 database for 145 different druglike molecules to the 838 log S0 values determined mostly by the saturation shake-flask (SSF) method for 124 of the molecules from the CheqSol set. The range of log S0 spans from -1.0 to -10.6 (log molar units), averaging at -3.8. The correlation plot between the two methods indicates r² = 0.96, RMSE = 0.34 log unit, and a slight bias of -0.07 log unit. The average interlaboratory standard deviation (SDi) is slightly better for the CheqSol set than that of the SSF set: SDi(CS) = 0.15 and SDi(SSF) = 0.24. The intralaboratory errors reported in the CheqSol method (0.05 log) need to be multiplied by a factor of 3 to match the expected interlaboratory errors for the method. The scale factor, in part, relates to the hidden systematic errors in the single-lab values. It is expected that improved standardizations in the ‘gold standard’ SSF method, as suggested in the recent ‘white paper’ on solubility measurement methodology, should bring the SDi of both methods to about 0.15 log unit. The multi-lab averaged log S0 (and the corresponding SDi) values could be helpful additions to existing training-set molecules used to predict the intrinsic solubility of drugs and druglike molecules.

    Do you know your r²?

    The prediction of solubility of drugs usually calls on the use of several open-source/commercially-available computer programs in the various calculation steps. Popular statistics to indicate the strength of the prediction model include the coefficient of determination (r²), Pearson’s linear correlation coefficient (rPearson), and the root-mean-square error (RMSE), among many others. When a program calculates these statistics, slightly different definitions may be used. This commentary briefly reviews the definitions of three types of r² and RMSE statistics (model validation, bias compensation, and Pearson) and how systematic errors due to shortcomings in solubility prediction models can be differently indicated by the choice of statistical indices. The indices we have employed in recently published papers on the prediction of solubility of druglike molecules were unclear, especially in cases of drugs from ‘beyond the Rule of 5’ chemical space, as simple prediction models showed distinctive ‘bias-tilt’ systematic-type scatter.
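
The abstract does not give the formulas, so the following is only a sketch of one plausible reading of the three kinds of statistics, using common textbook definitions. The function name, the dictionary keys, and the choice to divide by n rather than n - 1 are assumptions and may differ from the definitions reviewed in the commentary.

```python
import math


def fit_statistics(observed, predicted):
    """Return three flavours of r2 and RMSE (assumed textbook definitions)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    # Model validation: residuals are taken about the identity line.
    ss_res = sum(r ** 2 for r in residuals)
    r2_val = 1.0 - ss_res / ss_tot
    rmse_val = math.sqrt(ss_res / n)

    # Bias compensation: the mean residual (bias) is removed first.
    bias = sum(residuals) / n
    ss_res_bc = sum((r - bias) ** 2 for r in residuals)
    r2_bias = 1.0 - ss_res_bc / ss_tot
    rmse_bias = math.sqrt(ss_res_bc / n)

    # Pearson: scatter about the least-squares line of observed on
    # predicted, which removes both bias and tilt.
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    r2_pearson = (cov / math.sqrt(ss_tot * ss_pred)) ** 2
    rmse_pearson = math.sqrt(max(0.0, ss_tot * (1.0 - r2_pearson)) / n)

    return {
        "bias": bias,
        "r2_val": r2_val, "rmse_val": rmse_val,
        "r2_bias": r2_bias, "rmse_bias": rmse_bias,
        "r2_pearson": r2_pearson, "rmse_pearson": rmse_pearson,
    }


# A model that is off by a constant 1 log unit looks perfect by the
# Pearson and bias-compensated indices but not by the validation ones.
stats = fit_statistics([-2.0, -4.0, -6.0], [-3.0, -5.0, -7.0])
```

With these definitions, a constant offset leaves the Pearson and bias-compensated indices untouched, while a tilt in the scatter is hidden only by the Pearson index. That is one way the choice of index can mask systematic error.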

    Can small drugs predict the intrinsic aqueous solubility of ‘beyond Rule of 5’ big drugs?

    The aim of the study was to explore to what extent small molecules (mostly from the Rule of 5 chemical space) can be used to predict the intrinsic aqueous solubility, S0, of big molecules from beyond the Rule of 5 (bRo5) space. It was demonstrated that the General Solubility Equation (GSE) and the Abraham Solvation Equation (ABSOLV) underpredict solubility in systematic but slightly different ways. The Random Forest regression (RFR) method predicts solubility more accurately, albeit in the manner of a ‘black box.’ It was discovered that the GSE improves considerably in the case of big molecules when the coefficient of the log P term (octanol-water partition coefficient) in the equation is set to -0.4 instead of the traditional -1 value. The traditional GSE underpredicts solubility for molecules with experimental S0 < 50 µM. In contrast, the ABSOLV equation (trained with small molecules) underpredicts the solubility of big molecules in all cases tested. It was found that the errors in the ABSOLV-predicted solubilities of big molecules correlate linearly with the number of rotatable bonds, which suggests that flexibility may be an important factor in differentiating solubility of small from big molecules. Notably, most of the 31 big molecules considered have negative enthalpy of solution: these big molecules become less soluble with increasing temperature, which is compatible with ‘molecular chameleon’ behavior associated with intramolecular hydrogen bonding. The X-ray structures of many of these molecules reveal void spaces in their crystal lattices large enough to accommodate many water molecules when such solids are in contact with aqueous media. The water sorbed into crystals suspended in aqueous solution may enhance solubility by way of intra-lattice solute-water interactions involving the numerous H-bond acceptors in the big molecules studied. A ‘Solubility Enhancement–Big Molecules’ index was defined, which embodies many of the above findings.
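
The effect of changing the log P coefficient can be sketched with the commonly quoted form of the GSE, log S0 = 0.5 - 0.01 (mp - 25) - log P, with the melting point mp in °C. The function name is hypothetical. Only the log P coefficient is varied here, because that is the only change the abstract states. Keeping the intercept and melting-point constants at their traditional values is an assumption, and the study may have refit them.

```python
def gse_log_s0(log_p, melting_point_c, log_p_coeff=-1.0):
    """General Solubility Equation with an adjustable log P coefficient.

    Traditional form: log S0 = 0.5 - 0.01 * (mp - 25) - log P.
    The melting-point term is taken as zero for compounds that melt
    below 25 degrees C (the usual convention for liquids).
    """
    mp_term = 0.01 * max(melting_point_c - 25.0, 0.0)
    return 0.5 - mp_term + log_p_coeff * log_p


# Illustrative inputs only (not data from the study).
traditional = gse_log_s0(log_p=3.0, melting_point_c=125.0)
big_molecule = gse_log_s0(log_p=3.0, melting_point_c=125.0, log_p_coeff=-0.4)
```

For a lipophilic compound the smaller coefficient raises the predicted log S0, which is the direction needed to offset the underprediction described above.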

    Anomalous Solubility Behavior of Several Acidic Drugs

    The “anomalous solubility behavior at higher pH values” of several acidic drugs originally studied by Higuchi et al. in 1953 [1], but hitherto not fully rationalized, has been re-analyzed using a novel solubility-pH analysis computer program, pDISOL-X™. The program internally derives implicit solubility equations, given a set of proposed equilibria and constants (iteratively refined by weighted nonlinear regression), and does not require explicit Henderson-Hasselbalch equations. The re-analyzed original barbital, phenobarbital, oxytetracycline, and sulfathiazole solubility-pH data of Higuchi et al. are consistent with the presence of dimers in saturated solutions. In the case of barbital, phenobarbital, and sulfathiazole, anionic dimers were indicated, reaching peak concentrations near pH 8. However, oxytetracycline indicated a pronounced tendency to form a cationic dimer, peaking near pH 2. Under the conditions of the original study, only barbital indicated a slight tendency to form a salt precipitate at pH > 6.8, with a highly unusual stoichiometry (consistent with a slope of 0.55 in the log S – pH plot): K+ + A2H- + 3HA ⇌ KA5H4(s). Thus the “anomaly” in the Higuchi data can be rationalized by invoking specific aggregated species.

    Mechanistically transparent models for predicting aqueous solubility of rigid, slightly flexible, and very flexible drugs (MW<2000): Accuracy near that of random forest regression (Alex Avdeef)

    Yalkowsky’s General Solubility Equation (GSE), with its three fixed constants, is popular and easy to apply, but is not very accurate for polar, zwitterionic, or flexible molecules. This review examines the findings of a series of studies, where we have sought to come up with a better prediction model, by comparing the performances of the GSE to Abraham’s Solvation Equation (ABSOLV) and the Random Forest regression (RFR) machine-learning (ML) method. Large, well-curated aqueous intrinsic solubility databases are available. However, drugs may be sparsely distributed in chemical space, concentrated in clusters. Even a large database might overlook some regions. Test compounds from under-represented portions of space may be poorly predicted, as might be the case with the ‘loose’ set of 32 drugs in the Second Solubility Challenge (2020). There appears to be still a need for better coverage of drug space. Increasingly, current trends in predictions of solubility use calculated input descriptors, which may be an advantage for exploring properties of molecules yet to be synthesized. The risk may be that overall prediction approaches might be based on accumulated uncertainty. The increasing use of ML/AI methods can lead to accurate predictions, but such predictions may not readily suggest the strategies to pursue in selecting yet-to-be-synthesized compounds. Based on our latest findings, we recommend predictions based on both ‘grouped’ ABSOLV(GRP) and ‘Flexible Acceptor’ GSE(Ω,B) models with the provided best-fit parameters, where Ω is the Kier molecular flexibility index and B is the Abraham H-bond acceptor strength. For molecules with Ω < 11, the prudent choice is to pick the Consensus Model, the average of ABSOLV(GRP) and GSE(Ω,B). For more flexible molecules, GSE(Ω,B) is recommended.
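
The selection rule in the last two sentences of the abstract can be sketched as a small helper. The function and argument names are hypothetical. The two model predictions are taken as inputs, because the best-fit parameters of ABSOLV(GRP) and GSE(Ω,B) are not given in the abstract and are not reproduced here.

```python
def recommended_log_s0(kier_flexibility, log_s0_absolv_grp, log_s0_gse_flex):
    """Pick the recommended prediction from two precomputed log S0 values.

    kier_flexibility   : Kier molecular flexibility index (Omega)
    log_s0_absolv_grp  : prediction from the 'grouped' ABSOLV(GRP) model
    log_s0_gse_flex    : prediction from the 'Flexible Acceptor' GSE model

    Rule taken from the abstract: below Omega = 11 use the Consensus
    Model (plain average of the two); otherwise use GSE(Omega, B) alone.
    """
    if kier_flexibility < 11:
        return 0.5 * (log_s0_absolv_grp + log_s0_gse_flex)
    return log_s0_gse_flex


# Illustrative numbers only (not results from the review).
rigid_case = recommended_log_s0(4.2, -4.0, -5.0)
flexible_case = recommended_log_s0(15.0, -4.0, -5.0)
```

A molecule with Ω exactly equal to 11 falls into the flexible branch here, since the abstract states the consensus rule only for Ω < 11.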

    Cocrystal solubility-pH and drug solubilization capacity of sodium dodecyl sulfate – mass action model for data analysis and simulation to improve design of experiments

    This review discusses the disposition of the anionic surfactant, sodium dodecyl sulfate (SDS; i.e., sodium lauryl sulfate), to solubilize sparingly-soluble drugs above the surfactant critical micelle concentration (CMC), as quantitated by the solubilization capacity (k). A compilation of 101 published SDS k values of mostly poorly-soluble drug molecules was used to develop a prediction model as a function of the drug’s intrinsic solubility, S0, and its calculated H-bond acceptor/donor potential. In almost all cases, the surfactant was found to solubilize the neutral form of the drug. Using the mass action model, the k values were converted to drug-micelle stoichiometric binding constants, Kn, corresponding to drug-micelle equilibria in drug-saturated solutions. An in-depth case study (data from published sources) considered the micellization reactions as a function of pH of a weak base, B, (pKa 3.58, S0 52 μg/mL), where at pH 1 the BH.SDS salt was predicted to precipitate both below and above the CMC. At low SDS concentrations, two drug salts were predicted to co-precipitate: BH.Cl and BH.SDS. Solubility products of both were determined from the analysis of the reported solubility-surfactant data. Above the CMC, in a rare example, the charged form of the drug (BH+) appeared to be strongly solubilized by the surfactant. The constant for that reaction was also determined. At pH 7, the reactions were simpler, as only the neutral form of the drug was solubilized, to a significantly lesser extent than at pH 1. Case studies also featured examples of solubilization of solids in the form of cocrystals. For many cocrystal systems studied in aqueous solution, the anticipated supersaturated state is not long-lasting, as the drug component precipitates to a thermodynamically stable form, thus lowering the amount of the active ingredient available for intestinal absorption. Use of surfactant can prevent this. A recently-described method for predicting the solubility product of cocrystals (coupled with predicted k values described here) allowed for simulations of solubility-pH speciation profiles of cocrystal systems in the presence of SDS. Well in advance of any actual measurements, these simulations can be used to probe conditions favorable to the design of cocrystal experiments where SDS stabilizes cocrystal suspensions against drug precipitation over a predicted range of pH values.
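
As a rough illustration of what the solubilization capacity k expresses, the sketch below uses the commonly used linear relation in which the solubility of the neutral drug rises in proportion to the micellar surfactant concentration above the CMC. The function name, the units, and the numerical inputs are assumptions for illustration. The sketch ignores salt precipitation and solubilization of charged species, and it is not the mass action model described in the review.

```python
def total_solubility(s0, k, c_surfactant, cmc):
    """Total drug solubility (M) in a surfactant solution.

    Assumed linear model for a neutral drug:
      S_total = S0                                  below the CMC
      S_total = S0 + k * (c_surfactant - cmc)       above the CMC
    s0, c_surfactant and cmc are molar concentrations; k is the moles
    of drug solubilized per mole of micellar surfactant.
    """
    micellar_surfactant = max(c_surfactant - cmc, 0.0)
    return s0 + k * micellar_surfactant


# Placeholder inputs (not values from the review); the CMC must be
# supplied for the medium actually used.
below_cmc = total_solubility(s0=1.0e-4, k=0.05, c_surfactant=0.004, cmc=0.008)
above_cmc = total_solubility(s0=1.0e-4, k=0.05, c_surfactant=0.028, cmc=0.008)
```

Below the CMC the function simply returns S0, which reflects the assumption that only micellar surfactant contributes to solubilization.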