28 research outputs found

    Testing the limits of SMILES-based de novo molecular generation with curriculum and deep reinforcement learning

    Deep reinforcement learning methods have been shown to be potentially powerful tools for de novo design. Recurrent-neural-network-based techniques are the most widely used methods in this space. In this work we examine the behaviour of recurrent-neural-network-based methods when there are few (or no) examples of molecules with the desired properties in the training data. We find that targeted molecular generation is usually possible, but the diversity of generated molecules is often reduced and it is not possible to control the composition of generated molecular sets. To help overcome these issues, we propose a new curriculum-learning-inspired recurrent iterative optimization procedure that enables the optimization of generated molecules for seen and unseen molecular profiles, and allows the user to control whether a molecular profile is explored or exploited. Using our method, we generate specific and diverse sets of molecules with up to 18 times more scaffolds than standard methods for the same sample size; however, our results also point to substantial limitations of one-dimensional molecular representations, as used in this space. We find that the success or failure of a given molecular optimization problem depends on the choice of simplified molecular-input line-entry system (SMILES)

    The ELF Honest Data Broker: Informatics enabling public-private collaboration in a precompetitive arena

    New precompetitive ways of working in the pharmaceutical industry are driving the development of new informatics systems to enable their execution and management. The European Lead Factory (ELF) is a precompetitive, 30-partner collaboration between academic groups, small–medium enterprises and pharmaceutical companies created to discover small molecule hits against novel biological targets. A unique HTS screening and triage workflow has been developed to balance the intellectual property and scientific requirements of all the partners. Here, we describe the ELF Honest Data Broker, a cloud-based informatics system providing the scientific triage tools, fine-grained permissions and management tools required to implement the workflow

    Serum Biomarker Profile Including CCL1, CXCL10, VEGF, and Adenosine Deaminase Activity Distinguishes Active From Remotely Acquired Latent Tuberculosis

    INTRODUCTION: There is an urgent medical need to differentiate active tuberculosis (ATB) from latent tuberculosis infection (LTBI) and prevent undertreatment and overtreatment. The aim of this study was to identify biomarker profiles that may support the differentiation between ATB and LTBI and to validate these signatures. MATERIALS AND METHODS: The discovery cohort included adult individuals classified in four groups: ATB (n = 20), LTBI without prophylaxis (untreated LTBI; n = 20), LTBI after completion of prophylaxis (treated LTBI; n = 20), and healthy controls (HC; n = 20). Their sera were analyzed for 40 cytokines/chemokines and activity of adenosine deaminase (ADA) isozymes. A prediction model was designed to differentiate ATB from untreated LTBI using sparse partial least squares (sPLS) and logistic regression analyses. Serum samples of two independent cohorts (national and international) were used for validation. RESULTS: sPLS regression analyses identified C-C motif chemokine ligand 1 (CCL1), C-reactive protein (CRP), C-X-C motif chemokine ligand 10 (CXCL10), and vascular endothelial growth factor (VEGF) as the most discriminating biomarkers. These markers and ADA2 activity were significantly increased in ATB compared to untreated LTBI (p ≤ 0.007). Combining CCL1, CXCL10, VEGF, and ADA2 activity yielded a sensitivity and specificity of 95% and 90%, respectively, in differentiating ATB from untreated LTBI. These findings were confirmed in the validation cohort including remotely acquired untreated LTBI participants. CONCLUSION: The biomarker signature of CCL1, CXCL10, VEGF, and ADA2 activity provides a promising tool for differentiating patients with ATB from non-treated LTBI individuals
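As a reading aid for the sensitivity and specificity figures quoted in the abstract above, the following minimal sketch shows how the two metrics are computed from confusion-matrix counts. The function name and the counts are hypothetical: they were chosen only because they are arithmetically consistent with 95% and 90% for two groups of 20, and they are not reported in the abstract.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: diseased cases correctly flagged (here: ATB classified as ATB)
    fn: diseased cases missed
    tn: non-diseased cases correctly cleared (here: untreated LTBI classified as LTBI)
    fp: non-diseased cases wrongly flagged
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical counts (an assumption, not data from the study):
# 19 of 20 ATB correctly identified, 18 of 20 untreated LTBI correctly identified.
sens, spec = sensitivity_specificity(tp=19, fn=1, tn=18, fp=2)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

With groups this small, a single reclassified participant shifts either metric by five percentage points, which is one reason the abstract reports validation in independent cohorts.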

    An Open Drug Discovery Competition: Experimental Validation of Predictive Models in a Series of Novel Antimalarials

    The Open Source Malaria (OSM) consortium is developing compounds that kill the human malaria parasite, Plasmodium falciparum, by targeting PfATP4, an essential ion pump on the parasite surface. The structure of PfATP4 has not been determined. Here, we describe a public competition created to develop a predictive model for the identification of PfATP4 inhibitors, thereby reducing project costs associated with the synthesis of inactive compounds. Competition participants could see all entries as they were submitted. In the final round, featuring private sector entrants specializing in machine learning methods, the best-performing models were used to predict novel inhibitors, of which several were synthesized and evaluated against the parasite. Half possessed biological activity, with one featuring a motif that the human chemists familiar with this series would have dismissed as "ill-advised". Since all data and participant interactions remain in the public domain, this research project "lives" and may be improved by others

    Global variation in diabetes diagnosis and prevalence based on fasting glucose and hemoglobin A1c

    Fasting plasma glucose (FPG) and hemoglobin A1c (HbA1c) are both used to diagnose diabetes, but these measurements can identify different people as having diabetes. We used data from 117 population-based studies and quantified, in different world regions, the prevalence of diagnosed diabetes, and whether those who were previously undiagnosed and detected as having diabetes in survey screening, had elevated FPG, HbA1c or both. We developed prediction equations for estimating the probability that a person without previously diagnosed diabetes, and at a specific level of FPG, had elevated HbA1c, and vice versa. The age-standardized proportion of diabetes that was previously undiagnosed and detected in survey screening ranged from 30% in the high-income western region to 66% in south Asia. Among those with screen-detected diabetes with either test, the age-standardized proportion who had elevated levels of both FPG and HbA1c was 29-39% across regions; the remainder had discordant elevation of FPG or HbA1c. In most low- and middle-income regions, isolated elevated HbA1c was more common than isolated elevated FPG. In these regions, the use of FPG alone may delay diabetes diagnosis and underestimate diabetes prevalence. Our prediction equations help allocate finite resources for measuring HbA1c to reduce the global shortfall in diabetes diagnosis and surveillance

    Global variations in diabetes mellitus based on fasting glucose and haemoglobin A1c

    Fasting plasma glucose (FPG) and haemoglobin A1c (HbA1c) are both used to diagnose diabetes, but may identify different people as having diabetes. We used data from 117 population-based studies and quantified, in different world regions, the prevalence of diagnosed diabetes, and whether those who were previously undiagnosed and detected as having diabetes in survey screening had elevated FPG, HbA1c, or both. We developed prediction equations for estimating the probability that a person without previously diagnosed diabetes, and at a specific level of FPG, had elevated HbA1c, and vice versa. The age-standardised proportion of diabetes that was previously undiagnosed, and detected in survey screening, ranged from 30% in the high-income western region to 66% in south Asia. Among those with screen-detected diabetes with either test, the age-standardised proportion who had elevated levels of both FPG and HbA1c was 29-39% across regions; the remainder had discordant elevation of FPG or HbA1c. In most low- and middle-income regions, isolated elevated HbA1c was more common than isolated elevated FPG. In these regions, the use of FPG alone may delay diabetes diagnosis and underestimate diabetes prevalence. Our prediction equations help allocate finite resources for measuring HbA1c to reduce the global gap in diabetes diagnosis and surveillance.

    Identification of a Second Binding Site in the Estrogen Receptor

    Theoretical study of the conformational isomerism of 2,4,6-substituted 1,3,5-trimethoxycalix[6]arenes

    For 2,4,6-trisubstituted 1,3,5-trimethoxycalix[6]arenes 1, two competing interconversion pathways have been postulated in the literature for the Cone/1,2,3Alternate exchange, viz. the “tert-butyl through the annulus” and “lower rim through the annulus” pathways. The two pathways were compared by molecular modeling using the conjugate peak refinement method. One variable-size atom (Sx) was introduced to represent the lower-rim substituents R, abstracting the “O−CH2−rigid group” motifs to one “O−CH2−Sx” group. Both postulated mechanisms of Cone → 1,2,3Alternate isomerization are plausible. For large lower-rim substituents (Sx ≥ ≈6 Å), the “tert-butyl through the annulus” mechanism is preferred over the “Sx through the annulus” mechanism. The calculated upper free energy barrier for the isomerization process is 17.5 kcal mol⁻¹, reasonably close to the experimental value of approximately 21 kcal mol⁻¹ (van Duynhoven et al. J. Am. Chem. Soc. 1994, 116, 5814)