Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions

Deane, Charlotte; Klarner, Leo; Morris, Garrett M.; Reutlinger, Michael; Rudner, Tim G. J.; Schindler, Torsten; Teh, Yee Whye

Publication date: 14 July 2023

Abstract

Accelerating the discovery of novel and more effective therapeutics is an important pharmaceutical problem in which deep learning is playing an increasingly significant role. However, real-world drug discovery tasks are often characterized by a scarcity of labeled data and significant covariate shift, a setting that poses a challenge to standard deep learning methods. In this paper, we present Q-SAVI, a probabilistic model able to address these challenges by encoding explicit prior knowledge of the data-generating process into a prior distribution over functions, presenting researchers with a transparent and probabilistically principled way to encode data-driven modeling preferences. Building on a novel, gold-standard bioactivity dataset that facilitates a meaningful comparison of models in an extrapolative regime, we explore different approaches to induce data shift and construct a challenging evaluation setup. We then demonstrate that using Q-SAVI to integrate contextualized prior knowledge of drug-like chemical space into the modeling process affords substantial gains in predictive accuracy and calibration, outperforming a broad range of state-of-the-art self-supervised pre-training and domain adaptation techniques.

Comment: Published in the Proceedings of the 40th International Conference on Machine Learning (ICML 2023).

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2307.15073

Last time updated on 04/08/2023