
92 research outputs found

Constant Size Molecular Descriptors For Use With Machine Learning

Author: Carr S.
Christopher R. Collins
David J. Yaron
Duvenaud D. K.
Geoffrey J. Gordon
Koch W.
Montavon G.
O. Anatole von Lilienfeld
Pedregosa F.
Wong E.
Publication date: 23/01/2017

A set of molecular descriptors whose length is independent of molecular size is developed for machine learning models that target thermodynamic and electronic properties of molecules. These features are evaluated by monitoring performance of kernel ridge regression models on well-studied data sets of small organic molecules. The features include connectivity counts, which require only the bonding pattern of the molecule, and encoded distances, which summarize distances between both bonded and non-bonded atoms and so require the full molecular geometry. In addition to having constant size, these features summarize information regarding the local environment of atoms and bonds, such that models can take advantage of similarities resulting from the presence of similar chemical fragments across molecules. Combining these two types of features leads to models whose performance is comparable to or better than the current state of the art. The features introduced here have the advantage of leading to models that may be trained on smaller molecules and then used successfully on larger molecules. Comment: 18 pages, 5 figures

arXiv.org e-Print Archive

Crossref

edoc

Compositional inductive biases in function learning.

Author: Duvenaud D
Gershman SJ
Schulz E
Speekenbrink M
Tenenbaum JB
Publication date: 01/12/2017

How do people recognize and learn about complex functional structure? Taking inspiration from other areas of cognitive science, we propose that this is achieved by harnessing compositionality: complex structure is decomposed into simpler building blocks. We formalize this idea within the framework of Bayesian regression using a grammar over Gaussian process kernels, and compare this approach with other structure learning approaches. Participants consistently chose compositional (over non-compositional) extrapolations and interpolations of functions. Experiments designed to elicit priors over functional patterns revealed an inductive bias for compositional structure. Compositional functions were perceived as subjectively more predictable than non-compositional functions, and exhibited other signatures of predictability, such as enhanced memorability and reduced numerosity. Taken together, these results support the view that the human intuitive theory of functions is inherently compositional.

UCL Discovery

MPG.PuRe

Convolutional Networks on Graphs for Learning Molecular Fingerprints.

Author: Adams Ryan Prescott
Aguilera-Iparraguire Jorge
Aspuru-Guzik Alan
Duvenaud David
Gomez-Bombarelli Rafael
Hirzel Timothy D.
Maclaurin Dougal
Publication venue: Neural Information Processing Systems Foundation, Inc.
Publication date: 03/11/2015

We introduce a convolutional neural network that operates directly on graphs. These networks allow end-to-end learning of prediction pipelines whose inputs are graphs of arbitrary size and shape. The architecture we present generalizes standard molecular feature extraction methods based on circular fingerprints. We show that these data-driven features are more interpretable, and have better predictive performance on a variety of tasks.

arXiv.org e-Print Archive

Harvard University - DASH

Autonomous discovery in the chemical sciences part II: Outlook

Author: Coley C. W.
Duvenaud D. K.
Ekins S.
Fayyad U.
Frawley W. J.
Gomez-Perez A.
Haghighatlari M.
Hofer T. S.
Lammey R.
Langley P.
Mockus J.
Norman T. C.
Schütt K. T.
Williams A. J.
Zheng S.
Publication venue: Wiley
Publication date: 30/03/2020

This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this second part, we reflect on a selection of exemplary studies. It is increasingly important to articulate what the role of automation and computation has been in the scientific process and how that has or has not accelerated discovery. One can argue that even the best automated systems have yet to "discover" despite being incredibly useful as laboratory assistants. We must carefully consider how they have been and can be applied to future problems of chemical discovery in order to effectively design and interact with future autonomous platforms. The majority of this article defines a large set of open research directions, including improving our ability to work with complex data, build empirical models, automate both physical and computational experiments for validation, select experiments, and evaluate whether we are making progress toward the ultimate goal of autonomous discovery. Addressing these practical and methodological challenges will greatly advance the extent to which autonomous systems can make meaningful discoveries. Comment: Revised version available at 10.1002/anie.20190998

arXiv.org e-Print Archive

DSpace@MIT

Crossref

Assessing the impact of a health intervention via user-generated Internet content

Author: A Culotta
A Monto
A Signorini
AC Hayward
AE Hoerl
AM Presanis
B Efron
B Matérn
B O’Hara
C Chew
CE Rasmussen
D Lazer
DJ Smith
DK Duvenaud
DM Morens
DR Olson
Elad Yom-Tov
G Boivin
GJ Milinovich
GJD Smith
H Zou
Ingemar J. Cox
J Bollen
J Ginsberg
JG Petrie
KE Jones
M Baguelin
MA Oliver
MJ Paul
ML Cohen
MT Osterholm
N Cristianini
P Zhao
PM Polgreen
R Tibshirani
RG Pebody
Richard Pebody
S Binder
S Briand
S Cook
T Hastie
V Lampos
Vasileios Lampos
Publication venue: Springer Nature
Publication date: 01/01/2015

Assessing the effect of a health-oriented intervention by traditional epidemiological methods is commonly based only on population segments that use healthcare services. Here we introduce a complementary framework for evaluating the impact of a targeted intervention, such as a vaccination campaign against an infectious disease, through a statistical analysis of user-generated content submitted on web platforms. Using supervised learning, we derive a nonlinear regression model for estimating the prevalence of a health event in a population from Internet data. This model is applied to identify control location groups that correlate historically with the areas where a specific intervention campaign has taken place. We then determine the impact of the intervention by inferring a projection of the disease rates that could have emerged in the absence of a campaign. Our case study focuses on the influenza vaccination program that was launched in England during the 2013/14 season, and our observations consist of millions of geo-located search queries to the Bing search engine and posts on Twitter. The impact estimates derived from the application of the proposed statistical framework support conventional assessments of the campaign.

Crossref

Springer - Publisher Connector

UCL Discovery

Copenhagen University Research Information System

Assessing the impact of a health intervention via user-generated Internet content

Author: A Culotta
A Monto
A Signorini
AC Hayward
AE Hoerl
AM Presanis
B Efron
B Matérn
B O’Hara
C Chew
CE Rasmussen
D Lazer
DJ Smith
DK Duvenaud
DM Morens
DR Olson
Elad Yom-Tov
G Boivin
GJ Milinovich
GJD Smith
H Zou
Ingemar J. Cox
J Bollen
J Ginsberg
JG Petrie
KE Jones
M Baguelin
MA Oliver
MJ Paul
ML Cohen
MT Osterholm
N Cristianini
P Zhao
PM Polgreen
R Tibshirani
RG Pebody
Richard Pebody
S Binder
S Briand
S Cook
T Hastie
V Lampos
Vasileios Lampos
Publication venue: Springer Science and Business Media LLC

Crossref

XGraphBoost: Extracting Graph Neural Network-Based Features for a Better Prediction of Molecular Properties

Author: Duvenaud D. K.
Publication venue: American Chemical Society (ACS)

Crossref

Additive Gaussian Processes

Author: Duvenaud D.
Nickisch H.
Rasmussen C.
Publication date: 01/01/2012

We introduce a Gaussian process model of functions which are additive. An additive function is one which decomposes into a sum of low-dimensional functions, each depending on only a subset of the input variables. Additive GPs generalize both Generalized Additive Models and the standard GP models which use squared-exponential kernels. Hyperparameter learning in this model can be seen as Bayesian Hierarchical Kernel Learning (HKL). We introduce an expressive but tractable parameterization of the kernel function, which allows efficient evaluation of all input interaction terms, whose number is exponential in the input dimension. The additional structure discoverable by this model results in increased interpretability, as well as state-of-the-art predictive power in regression tasks.
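
As a rough illustration of the kind of kernel this abstract describes, the sketch below builds an additive kernel from one-dimensional squared-exponential base kernels and sums all interaction terms of each order using the Newton-Girard identities for elementary symmetric polynomials, which avoids enumerating the exponentially many subsets. The function names, the choice of base kernel, and the `lengthscales` and `order_variances` parameters are assumptions made for illustration; none of them are taken from the abstract.

```python
import numpy as np


def base_kernels(x, y, lengthscales):
    """One-dimensional squared-exponential kernel value for each input dimension."""
    x, y, lengthscales = (np.asarray(a, dtype=float) for a in (x, y, lengthscales))
    return np.exp(-0.5 * ((x - y) / lengthscales) ** 2)


def elementary_symmetric(z, max_order):
    """Return [e_0, ..., e_max_order] of z via the Newton-Girard identities.

    e_n is the sum, over all size-n subsets of z, of the product of the
    subset's entries. The recursion n * e_n = sum_k (-1)^(k-1) e_(n-k) p_k
    uses only the power sums p_k, so no subsets are enumerated.
    """
    z = np.asarray(z, dtype=float)
    power_sums = [np.sum(z ** k) for k in range(1, max_order + 1)]
    e = [1.0]
    for n in range(1, max_order + 1):
        total = 0.0
        for k in range(1, n + 1):
            total += (-1) ** (k - 1) * e[n - k] * power_sums[k - 1]
        e.append(total / n)
    return e


def additive_kernel(x, y, lengthscales, order_variances):
    """Weighted sum over interaction orders: sum_n var_n * e_n(k_1, ..., k_D).

    order_variances[n - 1] is a hypothetical weight for the order-n terms.
    """
    z = base_kernels(x, y, lengthscales)
    e = elementary_symmetric(z, len(order_variances))
    return float(sum(v * e_n for v, e_n in zip(order_variances, e[1:])))
```

Putting all the weight on the first order gives a plain sum of one-dimensional kernels, which corresponds to a generalized-additive-style model. Putting all the weight on the highest order gives the product of all the one-dimensional kernels, which is the usual multivariate squared-exponential kernel.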

MPG.PuRe

Additive Gaussian processes

Author: Duvenaud D
Nickisch H
Rasmussen CE
Publication date: 01/12/2011

CUED - Cambridge University Engineering Department

Warped Mixtures for Nonparametric Cluster Shapes

Author: Duvenaud D
Ghahramani Z
Iwata T

A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters. To produce more appropriate clusterings, we introduce a model which warps a latent mixture of Gaussians to produce nonparametric cluster shapes. The possibly low-dimensional latent mixture model allows us to summarize the properties of the high-dimensional clusters (or density manifolds) describing the data. The number of manifolds, as well as the shape and dimension of each manifold, is automatically inferred. We derive a simple inference scheme for this model which analytically integrates out both the mixture parameters and the warping function. We show that our model is effective for density estimation, performs better than infinite Gaussian mixture models at recovering the true number of clusters, and produces interpretable summaries of high-dimensional datasets.

CUED - Cambridge University Engineering Department