Using Attribution to Decode Dataset Bias in Neural Network Models for Chemistry
Deep neural networks have achieved state-of-the-art accuracy at classifying
molecules with respect to whether they bind to specific protein targets. A key
breakthrough would occur if these models could reveal the fragment
pharmacophores that are causally involved in binding. Extracting chemical
details of binding from the networks could potentially lead to scientific
discoveries about the mechanisms of drug actions. But doing so requires shining
light into the black box that is the trained neural network model, a task that
has proved difficult across many domains. Here we show how the binding
mechanism learned by deep neural network models can be interrogated, using a
recently described attribution method. We first work with carefully constructed
synthetic datasets, in which the 'fragment logic' of binding is fully known. We
find that networks that achieve perfect accuracy on held-out test datasets
still learn spurious correlations due to biases in the datasets, and we are
able to exploit this non-robustness to construct adversarial examples that fool
the model. The dataset bias makes these models unreliable for accurately
revealing information about the mechanisms of protein-ligand binding. In light
of our findings, we prescribe a test that checks for dataset bias given a
hypothesis. If the test fails, it indicates that the model must be
simplified or regularized, that the training dataset requires
augmentation, or both.
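
The abstract does not spell out how the test is computed, so the following is only a minimal sketch of one possible reading: score how well per-atom attributions rank the atoms named by a binding hypothesis above all other atoms, and flag the model when that ranking is poor. The function names, the use of a ROC-AUC-style score, and the 0.9 threshold are illustrative assumptions, not details taken from the source.

```python
def attribution_auc(attributions, in_hypothesis):
    """ROC-AUC-style score: how often a hypothesized atom outranks another atom.

    attributions  -- per-atom attribution scores from any attribution method
    in_hypothesis -- per-atom booleans, True if the atom belongs to the
                     hypothesized binding fragment (hypothetical input format)
    """
    pos = [a for a, h in zip(attributions, in_hypothesis) if h]
    neg = [a for a, h in zip(attributions, in_hypothesis) if not h]
    if not pos or not neg:
        raise ValueError("need atoms both inside and outside the hypothesis")
    # Count (hypothesized, other) pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def passes_bias_test(attributions, in_hypothesis, threshold=0.9):
    """Assumed pass criterion: the score must reach the chosen threshold."""
    return attribution_auc(attributions, in_hypothesis) >= threshold


# Attribution concentrated on the hypothesized atoms (first two): passes.
mask = [True, True, False, False, False]
print(passes_bias_test([0.8, 0.7, 0.1, -0.2, 0.05], mask))
# Attribution concentrated elsewhere, suggesting a spurious correlation: fails.
print(passes_bias_test([0.1, 0.05, 0.9, 0.6, 0.0], mask))
```

Under this reading, a model can classify held-out molecules correctly and still fail the check, because the score depends on where the attribution falls rather than on the predicted label.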