104,756 research outputs found
Controlling an organic synthesis robot with machine learning to search for new reactivity
The discovery of chemical reactions is an inherently unpredictable and time-consuming process1. An attractive alternative is to predict reactivity, although relevant approaches, such as computer-aided reaction design, are still in their infancy2. Reaction prediction based on high-level quantum chemical methods is complex3, even for simple molecules. Although machine learning is powerful for data analysis4,5, its applications in chemistry are still being developed6. Inspired by strategies based on chemistsâ intuition7, we propose that a reaction system controlled by a machine learning algorithm may be able to explore the space of chemical reactions quickly, especially if trained by an expert8. Here we present an organic synthesis robot that can perform chemical reactions and analysis faster than they can be performed manually, as well as predict the reactivity of possible reagent combinations after conducting a small number of experiments, thus effectively navigating chemical reaction space. By using machine learning for decision making, enabled by binary encoding of the chemical inputs, the reactions can be assessed in real time using nuclear magnetic resonance and infrared spectroscopy. The machine learning system was able to predict the reactivity of about 1,000 reaction combinations with accuracy greater than 80 per cent after considering the outcomes of slightly over 10 per cent of the dataset. This approach was also used to calculate the reactivity of published datasets. Further, by using real-time data from our robot, these predictions were followed up manually by a chemist, leading to the discovery of four reactions
Machine learning activation energies of chemical reactions
Application of machine learning (ML) to the prediction of reaction activation barriers is a new and exciting field for these algorithms. The works covered here are specifically those in which ML is trained to predict the activation energies of homogeneous chemical reactions, where the activation energy is given by the energy difference between the reactants and transition state of a reaction. Particular attention is paid to works that have applied ML to directly predict reaction activation energies, the limitations that may be found in these studies, and where comparisons of different types of chemical features for ML models have been made. Also explored are models that have been able to obtain high predictive accuracies, but with reduced datasets, using the Gaussian process regression ML model. In these studies, the chemical reactions for which activation barriers are modeled include those involving small organic molecules, aromatic rings, and organometallic catalysts. Also provided are brief explanations of some of the most popular types of ML models used in chemistry, as a beginner's guide for those unfamiliar
Shift-Robust Molecular Relational Learning with Causal Substructure
Recently, molecular relational learning, whose goal is to predict the
interaction behavior between molecular pairs, got a surge of interest in
molecular sciences due to its wide range of applications. In this work, we
propose CMRL that is robust to the distributional shift in molecular relational
learning by detecting the core substructure that is causally related to
chemical reactions. To do so, we first assume a causal relationship based on
the domain knowledge of molecular sciences and construct a structural causal
model (SCM) that reveals the relationship between variables. Based on the SCM,
we introduce a novel conditional intervention framework whose intervention is
conditioned on the paired molecule. With the conditional intervention
framework, our model successfully learns from the causal substructure and
alleviates the confounding effect of shortcut substructures that are spuriously
correlated to chemical reactions. Extensive experiments on various tasks with
real-world and synthetic datasets demonstrate the superiority of CMRL over
state-of-the-art baseline models. Our code is available at
https://github.com/Namkyeong/CMRL.Comment: KDD 202
- âŠ