32 research outputs found

    Machine Learning with Physicochemical Relationships: Solubility Prediction in Organic Solvents and Water

    Solubility prediction remains a critical challenge in drug development, synthetic route and chemical process design, extraction and crystallisation. Here we report a successful approach to solubility prediction in organic solvents and water using a combination of machine learning (ANN, SVM, RF, ExtraTrees, Bagging and GP) and computational chemistry. Rationally translating the dissolution process into a numerical problem led to a small set of selected descriptors and to predictions that are independent of the machine learning method applied. These models gave significantly more accurate predictions than benchmarked open-access and commercial tools, achieving accuracy close to the expected level of noise in the training data (LogS ± 0.7). Finally, they reproduced the physicochemical relationship between solubility and molecular properties in different solvents, which led to rational approaches to improving the accuracy of each model

    Machine learning insights into predicting biogas separation in metal-organic frameworks

    Breakthroughs in efficient use of biogas fuel depend on successful separation of carbon dioxide/methane streams and identification of appropriate separation materials. In this work, machine learning models are trained to predict biogas separation properties of metal-organic frameworks (MOFs). Training data are obtained using grand canonical Monte Carlo simulations of experimental MOFs which have been carefully curated to ensure data quality and structural viability. The models show excellent performance in predicting gas uptake and classifying MOFs according to the trade-off between gas uptake and selectivity, with R² values consistently above 0.9 for the validation set. We make prospective predictions on an independent external set of hypothetical MOFs, and examine these predictions in comparison to the results of grand canonical Monte Carlo calculations. The best-performing trained models correctly filter out over 90% of low-performing unseen MOFs, illustrating their applicability to other MOF datasets

    AI4Green: An Open-Source ELN for Green and Sustainable Chemistry

    An Electronic Laboratory Notebook (ELN) combining features, including data archival, collaboration tools, and green and sustainability metrics for organic chemistry, is presented. AI4Green is a web-based application, available as open-source code and free to use. It offers the core functionality of an ELN, namely the ability to store reactions securely and share them among different members of a research team. As users plan their reactions and record them in the ELN, green and sustainable chemistry is encouraged by automatically calculating green metrics and color-coding hazards, solvents, and reaction conditions. The interface links a database constructed from data extracted from PubChem, enabling the automatic collation of information for reactions. The application's design facilitates the development of auxiliary sustainability applications, such as our Solvent Guide. As more reaction data are captured, subsequent work will include providing "intelligent" sustainability suggestions to the user

    ML meets MLn: machine learning in ligand promoted homogeneous catalysis

    The benefits of using machine learning approaches in the design, optimisation and understanding of homogeneous catalytic processes are being increasingly realised. We focus on the understanding and implementation of key concepts, which serve as conduits to more advanced chemical machine learning literature, much of which is (presently) outside the area of homogeneous catalysis. Potential pitfalls in the ‘workflow’ procedures needed in the machine learning process are identified and all the examples provided are in a chemical sciences context, including several from ‘real world’ catalyst systems. Finally, potential areas of expansion and impact for machine learning in homogeneous catalysis in the future are considered

    Activation of Fluoride Anion as Nucleophile in Water with Data-Guided Surfactant Selection

    A principal component surfactant_map was developed for 91 commonly accessible surfactants for use in surfactant-enabled organic reactions in water, an important approach for sustainable chemical processes. This map was built using 22 experimental and theoretical descriptors relevant to the physicochemical nature of these surfactant-enabled reactions, and advanced principal component analysis algorithms. It comprises all classes of surfactants, i.e. cationic, anionic, zwitterionic and neutral surfactants, including designer surfactants. The value of this surfactant_map was demonstrated in activating simple inorganic fluoride salts as effective nucleophiles in water, with the right surfactant. This led to the rapid development (screening 13-15 surfactants) of two fluorination reactions for β-bromosulfides and sulfonyl chlorides in water. The latter was demonstrated in generating a sulfonyl fluoride with sufficient purity for direct use in labelling of chymotrypsin, under physiological conditions

    Development of a healthy biscuit: an alternative approach to biscuit manufacture

    OBJECTIVE: Obesity (BMI >30) and related health problems, including coronary heart disease (CHD), are without question a public health concern. The purpose of this study was to modify a traditional biscuit by the addition of vitamin B6, vitamin B12, folic acid, vitamin C and prebiotic fibre, while reducing salt and sugar. DESIGN: Development and commercial manufacture of the functional biscuit was carried out in collaboration with a well-known and respected biscuit manufacturer of international reputation. The raw materials traditionally referred to as essential in biscuit manufacture, i.e. sugar and fat, were targeted for removal or reduction. In addition, salt was completely removed from the recipe. PARTICIPANTS: University students of both sexes (n = 25) agreed to act as subjects for the study. Ethical approval for the study was granted by the University ethics committee. The test was conducted as a single-blind crossover design, and the modified and traditional biscuits were presented to the subjects under the same experimental conditions in a random fashion. RESULTS: No difference was observed between the original and the modified product for taste and consistency (P > 0.05). The modified biscuit was acceptable to the consumer in terms of eating quality, flavour and colour. Commercial acceptability was therefore established. CONCLUSION: This study has confirmed that traditional high-fat and high-sugar biscuits, which are not associated with healthy diets by most consumers, can be modified to produce a healthy alternative that can be manufactured under strict commercial conditions

    Advanced analytics and AI : impact, implementation, and the future of work

    xviii, 286 p. ; 26 cm

    Solubility prediction in water and organic solvents through a combination of chemometrics and computational chemistry

    Accurate solubility prediction is crucial across a range of scientific disciplines including drug discovery, protein engineering, drug and agrochemical process design, biochemistry, route prediction, crystallisation, and extraction. We herein report a successful approach to predicting solubility, not only in water but also in organic solvents (ethanol, benzene, and acetone), using a combination of machine learning and computational chemistry. Our new approach, named Causal Structure Property Relationship (CSPR), allowed examination of the physical chemistry behind dissolution to choose a small number of chemically relevant descriptors to produce highly interpretable models. These models gave significantly more accurate predictions than leading open-source and commercial solubility prediction tools, achieving accuracy (60-80%) close to the expected level of noise in the training data (LogS ± 0.7). By reproducing the physicochemical relationship between solubility and molecular properties in different solvents, rational improvements to the models were explored. Subsequent improvements to the models included modifying the solvation energy and combining machine learning methods to provide a consensus prediction. A larger dataset in water provided the basis for the discussion of pKa and speciation in water. We conclude that gathering accurate solubility data across a range of solvents is crucial to expanding this work and promoting sustainable chemistry in the future. It is our hope that this methodology will be applied to other problems in chemistry and that our open-access datasets (the first of their kind for benzene and acetone) will stimulate further research in this field

    Homocysteine induced cardiovascular events: a consequence of long term anabolic-androgenic steroid (AAS) abuse

    Objectives: The long-term effects (>20 years) of anabolic-androgenic steroid (AAS) use on plasma concentrations of homocysteine (HCY), folate, testosterone, sex hormone binding globulin (SHBG), free androgen index, urea, creatinine, haematocrit (HCT), vitamin B12, and urinary testosterone/epitestosterone (T/E) ratio, were examined in a cohort of self-prescribing bodybuilders. Methods: Subjects (n = 40) were divided into four distinct groups: (1) AAS users still using AAS (SU; n = 10); (2) AAS users abstinent from AAS administration for 3 months (SA; n = 10); (3) non-drug using bodybuilding controls (BC; n = 10); and (4) sedentary male controls (SC; n = 10). Results: HCY levels were significantly higher in SU compared with BC and SC (p<0.01), and with SA (p<0.05). Fat-free mass was significantly higher in both groups of AAS users (p<0.01). Daily energy intake (kJ) and daily protein intake (g/day) were significantly higher in SU and SA (p<0.05) compared with BC and SC, but were unlikely to be responsible for the observed HCY increases. HCT concentrations were significantly higher in the SU group (p<0.01). A significant linear inverse relationship was observed in the SU group between SHBG and HCY (r = –0.828, p<0.01), indicating a possible influence of the sex hormones in determining HCY levels. Conclusions: With mounting evidence linking AAS to adverse effects on some clotting factors, the significantly higher levels of HCY and HCT observed in the SU group suggest that long-term AAS users have an increased risk of future thromboembolic events