6 research outputs found

    AI-assisted curriculum learning

    Deep reinforcement learning is widely applied in de novo molecular design to generate molecules with desired properties. This technique often suffers from a sparse reward problem, because only a minority of the generated molecules have the target properties. With a sparse reward, the agent in a de novo design tool may fail to begin learning and waste much time exploring regions of the vast chemical space that are far from the target area. A recent study successfully applied curriculum learning to mitigate the sparse reward problem. However, a chemist must hand-craft a curriculum for the generative agents, which requires domain knowledge and is time-consuming, especially as tasks grow in complexity. This thesis applies an AI assistance framework to the curriculum design task, in which an AI assistant recommends actions. The AI assistant infers the private information of the chemist, including the design objective function and the chemist’s biases. Then, the AI tries to convince the chemist to adopt its advice. The chemist is free to choose an action after receiving advice. This setting presents a significant improvement in AI safety. We demonstrate this method with a simulated chemist in a de novo design task, where the generated molecules should be predicted to be active against the dopamine type 2 receptor (DRD2). Our experiments show that AI-assisted curriculum learning achieves a pronounced improvement on the sparse property (DRD2) and significantly outperforms unassisted curriculum learning.

    A Multi-Index Generative Adversarial Network for Tool Wear Detection with Imbalanced Data

    The scarcity of abnormal data leads to imbalanced data in the field of tool wear condition monitoring. In this paper, a novel multi-index generative adversarial network (MI-GAN) is proposed to detect tool wear conditions from imbalanced signal data. First, the generator in the MI-GAN is trained to produce fake normal signals, and the discriminator computes scores for testing signals and generated signals. Next, abnormal signals are detected from how well the generator imitates the testing signals, and the discriminator computes the scores of the testing signals and the generated signals. Subsequently, two indexes, i.e., the L2-norm and the temporal correlation coefficient (CORT), are put forward to measure the similarity between generated signals and testing signals. Finally, our decision-making function combines the L2-norm and CORT with the two discriminator scores to determine the tool conditions. Experimental results show that our method obtains 97% accuracy in tool wear detection based on imbalanced data without manual feature extraction, which outperforms traditional machine learning methods.
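
The two similarity indexes named in the abstract above can be sketched as follows. This is a minimal illustration, assuming the L2-norm is taken over the pointwise difference of the two signals and that CORT follows its standard definition as the normalized correlation of first differences. The function names `l2_distance` and `cort` are hypothetical, and the way MI-GAN combines these values with the discriminator scores is not specified in the abstract and is not shown here.

```python
import math


def l2_distance(x, y):
    """L2-norm of the pointwise difference between two equal-length signals."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cort(x, y):
    """Temporal correlation coefficient (standard first-difference form).

    Returns a value in [-1, 1]: close to 1 when both signals rise and fall
    together, close to -1 when they move in opposite directions.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("signals must have the same length (at least 2)")
    dx = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    dy = [y[i + 1] - y[i] for i in range(len(y) - 1)]
    denom = math.sqrt(sum(d * d for d in dx)) * math.sqrt(sum(d * d for d in dy))
    if denom == 0.0:
        # A constant signal has no temporal behaviour to correlate;
        # returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom
```

A generated signal that tracks a testing signal closely would give a small `l2_distance` and a `cort` near 1; a poor imitation would move both values in the opposite direction.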

    Human-in-the-loop assisted de novo molecular design

    A de novo molecular design workflow can be used together with technologies such as reinforcement learning to navigate the chemical space. A bottleneck in the workflow that remains to be solved is how to integrate human feedback in the exploration of the chemical space to optimize molecules. A human drug designer still needs to design the goal, expressed as a scoring function for the molecules that captures the designer’s implicit knowledge about the optimization task. Little support for this task exists and, consequently, a chemist usually resorts to iteratively building the objective function of multi-parameter optimization (MPO) in de novo design. We propose a principled approach that uses human-in-the-loop machine learning to help the chemist adapt the MPO scoring function to better match their goal. An advantage is that the method can learn the scoring function directly from the user’s feedback while they browse the output of the molecule generator, instead of the current manual tuning of the scoring function by trial and error. The proposed method uses a probabilistic model that captures the user’s idea of, and uncertainty about, the scoring function, and it uses active learning to interact with the user. We present two case studies: in the first use case, the parameters of an MPO are learned, and in the second use case, a non-parametric component of the scoring function is developed to capture human domain knowledge. The results show the effectiveness of the methods in two simulated example cases with an oracle, achieving significant improvement in fewer than 200 feedback queries, for the goals of a high QED score and identifying potent molecules for the DRD2 receptor, respectively. We further demonstrate the performance gains with a medicinal chemist interacting with the system.
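
The interaction loop described in the abstract above (a probabilistic model of the scoring function plus active learning) can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not the authors' method: the MPO score is modelled as a weighted arithmetic mean of component scores, uncertainty about the weights is represented by a handful of sampled weight vectors, and the next molecule shown to the user is the one whose score varies most across those samples. The names `mpo_score` and `most_uncertain` are hypothetical.

```python
from statistics import pvariance


def mpo_score(weights, components):
    """Weighted arithmetic mean of per-property scores (assumed to lie in [0, 1])."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * c for w, c in zip(weights, components)) / total


def most_uncertain(weight_samples, candidates):
    """Index of the candidate whose MPO score varies most across weight samples.

    weight_samples: list of weight vectors standing in for posterior samples.
    candidates: list of component-score vectors, one per candidate molecule.
    """
    variances = [
        pvariance([mpo_score(w, c) for w in weight_samples])
        for c in candidates
    ]
    return max(range(len(candidates)), key=variances.__getitem__)
```

Asking the user about the candidate with the most disagreement is one common active-learning heuristic (uncertainty sampling); after each answer the weight samples would be updated and the selection repeated.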
