Barking up the right tree: An approach to search over molecule synthesis DAGs

Author: Bradshaw J
Hernández-Lobato JM
Kusner MJ
Paige B
Segler MHS
Publication venue: Advances in Neural Information Processing Systems
Publication date: 01/01/2020

When designing new molecules with particular properties, it is not only important what to make but crucially how to make it. These instructions form a synthesis directed acyclic graph (DAG), describing how a large vocabulary of simple building blocks can be recursively combined through chemical reactions to create more complicated molecules of interest. In contrast, many current deep generative models for molecules ignore synthesizability. We therefore propose a deep generative model that better represents the real world process, by directly outputting molecule synthesis DAGs. We argue that this provides sensible inductive biases, ensuring that our model searches over the same chemical space that chemists would also have access to, as well as interpretability. We show that our approach is able to model chemical space well, producing a wide range of diverse molecules, and allows for unconstrained optimization of an inherently constrained problem: maximize certain chemical properties such that discovered molecules are synthesizable.
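
To make the notion of a synthesis DAG concrete, here is a minimal sketch of the data structure the abstract describes: building blocks at the leaves, recursively combined by reaction steps into a target. It is not the authors' model; the `Node` class, the helper functions, and the molecule names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One molecule in a synthesis DAG (hypothetical illustration)."""
    name: str
    # Empty tuple: a purchasable building block. Otherwise: the reactant
    # nodes combined in a single reaction step to make this molecule.
    reactants: tuple = ()


def building_blocks(node):
    """Return the names of all building blocks the molecule ultimately needs."""
    if not node.reactants:
        return {node.name}
    blocks = set()
    for reactant in node.reactants:
        blocks |= building_blocks(reactant)
    return blocks


def synthesis_order(node, seen=None, order=None):
    """Post-order walk: each molecule is listed after all of its reactants."""
    seen = set() if seen is None else seen
    order = [] if order is None else order
    if node.name in seen:
        return order
    for reactant in node.reactants:
        synthesis_order(reactant, seen, order)
    seen.add(node.name)
    order.append(node.name)
    return order


# Toy example: block_a is used in two steps, so the graph is a DAG, not a tree.
a = Node("block_a")
b = Node("block_b")
c = Node("block_c")
intermediate = Node("intermediate_1", (a, b))
target = Node("target", (intermediate, a, c))

print(sorted(building_blocks(target)))
print(synthesis_order(target))
```

The post-order walk gives one valid order in which a chemist could carry out the steps, since every intermediate appears only after the molecules it is made from.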

arXiv.org e-Print Archive

Apollo (Cambridge)

A generative model for electron paths

Author: Bradshaw J
Hernández-Lobato JM
Kusner MJ
Paige B
Segler MHS
Publication venue: 7th International Conference on Learning Representations, ICLR 2019
Publication date: 01/01/2019

Chemical reactions can be described as the stepwise redistribution of electrons in molecules. As such, reactions are often depicted using “arrow-pushing” diagrams which show this movement as a sequence of arrows. We propose an electron path prediction model (ELECTRO) to learn these sequences directly from raw reaction data. Instead of predicting product molecules directly from reactant molecules in one shot, learning a model of electron movement has the benefits of (a) being easy for chemists to interpret, (b) incorporating constraints of chemistry, such as balanced atom counts before and after the reaction, and (c) naturally encoding the sparsity of chemical reactions, which usually involve changes in only a small number of atoms in the reactants. We design a method to extract approximate reaction paths from any dataset of atom-mapped reaction SMILES strings. Our model achieves excellent performance on an important subset of the USPTO reaction dataset, comparing favorably to the strongest baselines. Furthermore, we show that our model recovers a basic knowledge of chemistry without being explicitly trained to do so.

arXiv.org e-Print Archive

UCL Discovery

Apollo (Cambridge)

CUED - Cambridge University Engineering Department

A New Integer Linear Programming Formulation to the Inverse QSAR/QSPR for Acyclic Chemical Compounds Using Skeleton Trees

Author: A Kerber
C Rupakheti
H Fujiwara
H Ikebata
H Nagamochi
J Li
JL Reymond
K Roy
MHS Segler
MI Skvortsova
R Gómez-Bombarelli
T Akutsu
T Miyao
X Yang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/09/2020

33rd International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2020, Kitakyushu, Japan, September 22-25, 2020.

Computer-aided drug design is one of the important application areas of intelligent systems. Recently a novel method has been proposed for inverse QSAR/QSPR using both artificial neural networks (ANN) and mixed integer linear programming (MILP), where inverse QSAR/QSPR is a major approach for drug design. This method consists of two phases. In the first phase, a feature function f is defined so that each chemical compound G is converted into a vector f(G) of several descriptors of G, and a prediction function ψ is constructed with an ANN so that ψ(f(G)) takes a value nearly equal to a given chemical property π for many chemical compounds G in a data set. In the second phase, given a target value y∗ of the chemical property π, a chemical structure G∗ is inferred in the following way. An MILP M is formulated so that M admits a feasible solution (x∗, y∗) if and only if there exist vectors x∗, y∗ and a chemical compound G∗ such that ψ(x∗)=y∗ and f(G∗)=x∗. The method has been implemented for inferring acyclic chemical compounds. In this paper, we propose a new MILP for inferring acyclic chemical compounds by introducing a novel concept, the skeleton tree, and conduct computational experiments. The results suggest that the proposed method outperforms the existing method when the diameter of graphs is up to around 6 to 8. For an instance of inferring acyclic chemical compounds with 38 non-hydrogen atoms from C, O and S and diameter 6, our method was 5×10^4 times faster.
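
The two phases in the abstract above can be restated compactly. This is only a rewriting of the abstract's own notation, not the paper's formulation. The dataset symbol D and the notation π(G) for the property value of compound G are assumptions introduced here for readability.

```latex
% Phase 1 (learning): fit the ANN \psi on descriptor vectors f(G)
\psi\bigl(f(G)\bigr) \approx \pi(G) \quad \text{for many } G \in D

% Phase 2 (inference): given a target value y^*, the MILP M is feasible
% with solution (x^*, y^*) if and only if some compound G^* satisfies
f(G^*) = x^*, \qquad \psi(x^*) = y^*
```

Read this way, the MILP encodes both the trained network ψ and the feature function f as constraints, so that any feasible solution corresponds to a compound with the requested property value.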

Crossref

Kyoto University Research Information Repository

Automatic mapping of atoms across both simple and complex chemical reactions

Author: A Bøgevig
B Liu
ChW Plummer
CW Coley
DA Fooshee
EL First
G Li
GAP Gonzalez
H Kraut
HL Morgan
J Clemens
JD Crabtree
JJ McGregor
JN Wei
K Funatsu
L Chen
LI Palmer
LP Cordella
M Heinonen
M Latendresse
MF Lynch
MH Hopkins
MHS Segler
N Schneider
P Magnus
P Schwaller
PJ Hickford
R Körner
R Liu
S Szymkuć
SA Cook
SA Rahman
T Akutsu
T Klucznik
WL Chen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/03/2019

Mapping atoms across chemical reactions is important for substructure searches, automatic extraction of reaction rules, identification of metabolic pathways, and more. Unfortunately, the existing mapping algorithms can deal adequately only with relatively simple reactions but not those in which expert chemists would benefit from a computer's help. Here we report how a combination of algorithmics and expert chemical knowledge significantly improves the performance of atom mapping, allowing the machine to deal with even the most mechanistically complex chemical and biochemical transformations. The key feature of our approach is the use of few but judiciously chosen reaction templates that are used to generate plausible "intermediate" atom assignments which then guide a graph-theoretical algorithm towards the chemically correct isomorphic mappings. The algorithm performs significantly better than the available state-of-the-art reaction mappers, suggesting its uses in database curation, mechanism assignments, and, above all, machine extraction of reaction rules underlying modern synthesis-planning programs.

IBS Publications Repository

Crossref

Directory of Open Access Journals

ScholarWorks@UNIST

11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.

Author: Abel R
Achenbach J
Adikwu UM
Ain QU
Al-Yamori R
Alhalabi Z
Aniceto N
Ansideri F
Baker D
Balducci A
Banting L
Barilla J
Barrett I
Basu D
Baumann K
Bender A
Berg E
Bergström F
Bermudez M
Bietz S
Bodnarchuk MS
Boeckler FM
Bojarski AJ
Borbulevych OY
Buchholz M
Bulusu KC
Bureau R
Böckler FM
Böttcher S
Büttner FM
Cao Q
Cappel D
Cheeseright T
Clark RD
Clark T
Da Costa FB
Dahlgren M
De Graaf C
Demuth H-U
Dorfman R
Dubrucq K
Ecker GF
Edman K
Egelkraut-Holtus M
Eid S
Eigner-Pitto V
Engel J
Engkvist O
Epple M
Essex JW
Evers A
Exner TE
Fan T-P
Fechner U
Finkelmann AR
Firaha DS
Firth M
Fourches D
Fraaije JH
Frach R
Fraczkiewicz R
Freitas A
Friedrich N-O
Friesner R
Fu X
Fuchs JE
Fulle S
Furtado F
Garg P
Gervasio FL
Ghafourian T
Glen R
Gracia RS
Grebner C
Guallar V
Göller AH
Günther MB
Günther S
Güssregen S
Haensele E
Heidrich J
Heil J
Hennig S
Herrmann G
Hessler G
Hilbig M
Himmler H-J
Hoffgaard F
Hogner A
Hollóczki O
Horinek D
Hošek P
Husch T
Ibezim A
Ihlenfeldt WD
Jardin C
Judson P
Jäger C
Kalinowski L
Kalliokoski T
Kast SM
Kibies P
Kirchmair J
Kirchner B
Kireeva N
Klute W
Koch O
Koch P
Kohlbacher O
Kolb P
Korth M
Kos A
Kramer C
Krilov G
Krotzky T
Kuhn H
Kuhn MA
Kurczab R
Kühne R
Lange A
Lanig H
Laufer S
Levine Z
Li X
Lifongo LL
Lin T
Lisurek M
Lokajíček MV
Mackey M
Masek BB
Mathea M
Matter H
Mbah CJ
Mbaze LM
McWilliams L
Mervin L
Mervin LH
Mittal S
Mohamad-Zobir SZ
Montanari F
Moser D
Mrugalla F
Mullen R
Murray DC
Nagy S
Nahum O
Naß A
Nguyen QD
Nogueira MS
Ntie-Kang F
Nwodo NJ
Oliveira Santos JS-D
Oliveira TB
Omoto K
Onlia I
Ostroumov D
Owen RM
Panecka J
Patel H
Pervov VS
Petrov A
Pisaková H
Pleik S
Polokoff M
Pongratz T
Pretzel J
Proschak E
Pryde DC
Pöhner IA
Rarey M
Rauh D
Renner G
Richmond NJ
Rickmeyer T
Rippmann F
Ross GA
Ruff M
Rupp B
Saladino G
Saleh N
Sandmann A
Schall C
Schmidt D
Schmidt TC
Schmidt TJ
Schmidtke P
Schneider G
Schomburg KT
Schram J
Schulz R
Schütter C
Segler MHS
Senderowitz H
Shaikh N
Shea J-E
Sherman W
Sievers-Engler A
Simoben CV
Simr P
Sippl W
Smith S
Solovev VP
Soltanshahi F
Sommer K
Sotriffer CA
Spiwok V
Stehle T
Steinbrecher TB
Steudle A
Sticht H
Strohfeldt S
Sánchez-García E
Tautermann CS
Torda AE
Torella R
Truszkowski A
Turk S
Tyrchan C
Ulander J
Van den Broek K
Van Oeyen A
Volkamer A
Wade RC
Waldman M
Waller MP
Wang L
Warszycki D
Weber J
Wessjohann L
Westerhoff LM
Whitley DC
Wieczorek V
Wolber G
Yosipof A
Zdrazil B
Zielesny A
Zimmermann MO
Zoufir A
Śmieja M
Publication venue: Springer Science and Business Media LLC
Publication date: 25/03/2016

Spiral - Imperial College Digital Repository

Machine learning for molecular and materials science

Author: A Agrawal
A Aspuru-Guzik
A Franceschetti
A Jain
A Pulido
A Steane
A Tropsha
A Walsh
AO Oliynyk
AP Bartók
Aron Walsh
AW Harrow
B Liu
BM Lake
C Hansch
C Kuhn
CE Calderon
CM Handley
CRA Catlow
D Bonchev
D Fourches
Daniel W. Davies
DB Boyd
DJ Hand
DW Davies
E Kim
EJ Corey
F Brockherde
F Legrain
F-X Coudert
FA Faber
G Hautier
G Pilania
G Shakhnarovich
H Altae-Tran
Hugh Cartwright
IV Tetko
J Behler
J Biamonte
J Carrasquilla
J Hachmann
J Hill
J Schmidhuber
J Shawe-Taylor
J Wellendorff
JA Pople
JC Cole
JC Snyder
JE Saal
JGP Wicker
JS Smith
K Lejaeghere
KA Wilkinson
Keith T. Butler
KT Schütt
L Ward
LM Ghiringhelli
M Arita
M Pillong
M Reiher
M Rupp
M Schmidt
M Ziatdinov
MHS Segler
N Kireeva
N Mardirossian
NN Kiselyova
O Isayev
Olexandr Isayev
P Domingos
P Hohenberg
P Raccuglia
PAM Dirac
R Christensen
R Gómez-Bombarelli
S Curtarolo
S Szymkuć
SH Rudy
SJL Billinge
SV Kalinin
T Klucznik
T Moot
T Sterling
TI Oprea
V Dragone
V Dunjko
V Havu
VHC Albuquerque de
W Kohn
WWM Fleuren
Publication venue: Springer Science and Business Media LLC
Publication date: 26/07/2018

OPUS

Crossref

Pathways to cellular supremacy in biocomputing

Author: A Alaghi
A Church
A Condon
A Eldar
A Francesca Ceroni
A Goñi-Moreno
A Gyorgy
A Marais
A Pandi
A Scialdone
A Tamsir
A Urrios
AAK Nielsen
AE Friedland
AK Brödel
AM Turing
AW Harrow
AY Weiße
B Alberts
B Delépine
B Liu
B Scheres
B Wang
BJ MacLennan
C Lou
C Neill
CE Shannon
D Angluin
D David
D Liu
D Soloveichik
DJ Nicholson
DT Gillespie
E-M Nikolados
F Ceroni
F Ciocchetta
F Jacob
F Zhang
G Fiore
G-M Lin
GS Engel
H Abelson
H Niederholtmeyer
H Rubin
H Zhao
HT Siegelmann
J Ana Solopova
J Hartmanis
J Macia
J Naylor
J Preskill
J Ronald
J Sardanyés
J Von Neumann
J-C García-Betancur
JD Bekenstein
JJ Tabor
JL Peterson
K Brenner
K Mona
K Oishi
KM Esvelt
L Cai
L Lamport
L Qian
LJ Clarke
LM Adleman
M Adriana
M Amos
M Chavarría
M Tomazou
MA TerAvest
MB Elowitz
MHS Segler
MM Salek
N Goldman
N Kylilis
N Lambert
N Vladimirov
NS McCarty
O Kotte
P Dvorák
P Xu
PW Rothemund
PWK Rothemund
R Armstrong
R Chait
R Daniel
R Sarpeshkar
R Solé
RC Paton
RJ Lipton
S Ausländer
S Boixo
S Cardinale
S Kang
S Regot
S Slomovic
SK Aoki
SM Hoffer
SR Scott
SS Jang
SS Woo
T Danino
T Tschirhart
T Umedachi
TE Gorochowski
TJ Rudge
TK Lu
TS Gardner
V Chubukov
V de Lorenzo
W Heisenberg
W Ji
Y Chen
Y Yokobayashi
YY Chen
Z Swank
Á Goñi-Moreno
Publication venue: Springer Science and Business Media LLC
Publication date: 29/10/2019

Synthetic biology uses living cells as the substrate for performing human-defined computations. Many current implementations of cellular computing are based on the “genetic circuit” metaphor, an approximation of the operation of silicon-based computers. Although this conceptual mapping has been relatively successful, we argue that it fundamentally limits the types of computation that may be engineered inside the cell, and fails to exploit the rich and diverse functionality available in natural living systems. We propose the notion of “cellular supremacy” to focus attention on domains in which biocomputing might offer superior performance over traditional computers. We consider potential pathways toward cellular supremacy, and suggest application areas in which it may be found.

A.G.-M. was supported by the SynBio3D project of the UK Engineering and Physical Sciences Research Council (EP/R019002/1) and the European CSA on biological standardization BIOROBOOST (EU grant number 820699). T.E.G. was supported by a Royal Society University Research Fellowship (grant UF160357) and BrisSynBio, a BBSRC/EPSRC Synthetic Biology Research Centre (grant BB/L01386X/1). P.Z. was supported by the EPSRC Portabolomics project (grant EP/N031962/1). P.C. was supported by SynBioChem, a BBSRC/EPSRC Centre for Synthetic Biology of Fine and Specialty Chemicals (grant BB/M017702/1) and the ShikiFactory100 project of the European Union’s Horizon 2020 research and innovation programme under grant agreement 814408.

Crossref

Northumbria Research Link

Edinburgh Research Explorer

The University of Manchester - Institutional Repository

Digital.CSIC

Explore Bristol Research

Streamlining bioactive molecular discovery through integration and automation

Author: A Baranczak
A Buitrago Santanilla
A Nadin
AH Lipkus
B Desai
BJ Reizman
CA Lipinski
CW Coley
D Ghislieri
D Perera
D Reker
DC Blakemore
DE Scott
DG Brown
DJ Foley
DJ Newman
G Karageorgis
G Schneider
J Li
J Twilton
J Wang
JB Murray
K Troshin
KD Collins
L Guetzoyan
M Carpintero
M Mondal
M Werner
M Yoshida
MHS Segler
N Schneider
NJ Gesmundo
PG Polishchuk
PM Murray
R Macarron
RA Goodnow Jr
RA Maplestone
RD Firn
RD Taylor
SD Pickett
SD Roughley
SM Paul
SR Langdon
SV Ley
SYF Hawkes
T Cernak
T Qin
TW Cooper
W Czechtizky
WP Walters
YJJ Hwang
YL Huang
Publication venue: Springer Science and Business Media LLC
Publication date: 27/07/2018

The discovery of bioactive small molecules is generally driven via iterative design–make–purify–test cycles. Automation is routinely harnessed at individual stages of these cycles to increase the productivity of drug discovery. Here, we describe recent progress to automate and integrate two or more adjacent stages within discovery workflows. Examples of such technologies include microfluidics, liquid-handling robotics and affinity-selection mass spectrometry. The value of integrated technologies is illustrated in the context of specific case studies in which modulators of targets, such as protein kinases, nuclear hormone receptors and protein–protein interactions, were discovered. We note that to maximize impact on the productivity of discovery, each of the integrated stages would need to have both high and matched throughput. We also consider the longer-term goal of realizing the fully autonomous discovery of bioactive small molecules through the integration and automation of all stages of discovery.

Crossref

Edinburgh Research Explorer

White Rose Research Online

A model to search for synthesizable molecules

Author: Bradshaw J
Hernández-Lobato JM
Kusner MJ
Paige B
Segler MHS
Publication date: 01/01/2019

Deep generative models are able to suggest new organic molecules by generating strings, trees, and graphs representing their structure. While such models allow one to generate molecules with desirable properties, they give no guarantees that the molecules can actually be synthesized in practice. We propose a new molecule generation model, mirroring a more realistic real-world process, where (a) reactants are selected, and (b) combined to form more complex molecules. More specifically, our generative model proposes a bag of initial reactants (selected from a pool of commercially-available molecules) and uses a reaction model to predict how they react together to generate new molecules. We first show that the model can generate diverse, valid and unique molecules due to the useful inductive biases of modeling reactions. Furthermore, our model allows chemists to interrogate not only the properties of the generated molecules but also the feasibility of the synthesis routes. We conclude by using our model to solve retrosynthesis problems, predicting a set of reactants that can produce a target product.

arXiv.org e-Print Archive

UCL Discovery

CUED - Cambridge University Engineering Department

Generating molecules via chemical reactions

Author: Bradshaw J
Hernández-Lobato JM
Kusner MJ
Paige B
Segler MHS
Publication venue: Deep Generative Models for Highly Structured Data, DGS@ICLR 2019 Workshop
Publication date: 01/01/2019

© Deep Generative Models for Highly Structured Data, DGS@ICLR 2019 Workshop. All rights reserved.

Over the last few years exciting work in deep generative models has produced models able to suggest new organic molecules by generating strings, trees, and graphs representing their structure. While such models are able to generate molecules with desirable properties, their utility in practice is limited due to the difficulty in knowing how to synthesize these molecules. We therefore propose a new molecule generation model, mirroring a more realistic real-world process, where reactants are selected and combined to form more complex molecules. More specifically, our generative model proposes a bag of initial reactants (selected from a pool of commercially-available molecules) and uses a reaction model to predict how they react together to generate new molecules. Modeling the entire process of constructing a molecule during generation offers a number of advantages. First, we show that such a model has the ability to generate a wide, diverse set of valid and unique molecules due to the useful inductive biases of modeling reactions. Second, modeling synthesis routes rather than final molecules offers practical advantages to chemists who are not only interested in new molecules but also suggestions on stable and safe synthetic routes. Third, we demonstrate the capabilities of our model to also solve one-step retrosynthesis problems, predicting a set of reactants that can produce a target product.

UCL Discovery

CUED - Cambridge University Engineering Department