6 research outputs found

    Site-selective and Enantiocomplementary C(sp<sup>3</sup>)–H Oxyfunctionalization for Synthesis of α-Hydroxy Acids

    No full text
    Oxyfunctionalization of abundant carboxylic acids represents a direct approach to synthesizing α-hydroxy acids, which are valuable intermediates of various active pharmaceutical ingredients. Although ideal, the transformation is yet to be accomplished. Herein, enantiocomplementary C(sp3)–H oxyfunctionalization for the synthesis of α-hydroxy acids was realized by a cooperative strategy of substrate engineering, homologue screening and protein engineering of α-ketoglutarate-dependent nonheme iron aryloxyalkanoate dioxygenases. The reaction provided concise synthetic routes toward three types of 67 α-hydroxy acids with high efficiency and selectivity (yield up to 90% and ee up to >99%). The distinctive complementary reactions add to a growing repertoire of biocatalytic oxyfunctionalization reactions.

    MolFilterGAN: a progressively augmented generative adversarial network for triaging AI-designed molecules

    No full text
    Artificial intelligence (AI)-based molecular design methods, especially deep generative models for generating novel molecule structures, have made it possible to explore unknown chemical space without relying on brute-force exploration. However, whether designed by AI or human experts, the molecules need to be synthesized and biologically evaluated, and the trial-and-error process remains a resource-intensive endeavor. Therefore, AI-based drug design methods face a major challenge of how to prioritize the molecular structures with potential for subsequent drug development. This study indicates that common filtering approaches based on traditional screening metrics fail to differentiate AI-designed molecules. To address this issue, we propose a novel molecular filtering method, MolFilterGAN, based on a progressively augmented generative adversarial network. Comparative analysis shows that MolFilterGAN outperforms conventional screening approaches based on drug-likeness or synthetic ability metrics. Retrospective analysis of AI-designed discoidin domain receptor 1 (DDR1) inhibitors shows that MolFilterGAN significantly increases the efficiency of molecular triaging. Further evaluation of MolFilterGAN on eight external ligand sets suggests that MolFilterGAN is useful in triaging or enriching bioactive compounds across a wide range of target types. These results highlight the importance of MolFilterGAN in evaluating molecules integrally and further accelerating molecular discovery, especially when combined with advanced AI generative models.

    Fine-tuning Large Language Models for Chemical Text Mining

    No full text
    Extracting knowledge from complex and diverse chemical texts is a pivotal task for both experimental and computational chemists. The task is still considered to be extremely challenging due to the complexity of the chemical language and scientific literature. This study explored the power of fine-tuned large language models (LLMs) on five intricate chemical text mining tasks: compound entity recognition, reaction role labelling, metal-organic framework (MOF) synthesis information extraction, nuclear magnetic resonance spectroscopy (NMR) data extraction, and the conversion of reaction paragraphs to action sequences. The fine-tuned LLMs demonstrated impressive performance, significantly reducing the need for repetitive and extensive prompt engineering experiments. For comparison, we guided GPT-3.5 and GPT-4 with prompt engineering and fine-tuned GPT-3.5 as well as other open-source LLMs such as Llama2, T5, and BART. The results showed that the fine-tuned GPT models excelled in all tasks. They achieved exact accuracy levels ranging from 69% to 95% on these tasks with minimal annotated data. They even outperformed task-adaptive pre-training and fine-tuning models that were based on a significantly larger amount of in-domain data. Given their versatility, robustness, and low-code capability, leveraging fine-tuned LLMs as flexible and effective toolkits for automated data acquisition could revolutionize chemical knowledge extraction.

    Artificial intelligence in drug design

    No full text