Improving Classical Substructure-Based Virtual Screening to Handle Extrapolation Challenges

Abstract

Target-oriented substructure-based virtual screening (sSBVS) of molecules is a promising approach in drug discovery. Yet it is doubtful whether sSBVS is also suitable for extrapolation, that is, for detecting molecules that differ greatly from those used for training. Herein, we evaluate the predictive power of classic virtual screening methods, namely, similarity searching using the Tanimoto coefficient (MTC) and Naive Bayes (NB). As could be expected, these classic methods perform better in interpolation than in extrapolation tasks. Consequently, to enhance predictive ability in extrapolation tasks, we introduce the Shadow approach, which considers inclusion relations between substructures, in contrast to the classic sSBVS methods, which assume independence between substructures. Specifically, we discard contributions from substructures that are included in ("shaded" by) other substructures that are themselves included in the molecule of interest. Indeed, the Shadow classifier significantly outperforms both MTC (p-value = 3.1 × 10⁻¹⁶) and NB (p-value = 3.5 × 10⁻⁹) in detecting hits that share low similarity with the training active molecules.
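
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes molecules are described by sets of substructure keys, scores a query by its highest Tanimoto coefficient against the training actives, and then drops "shaded" substructures before any scoring would take place. The function names and the toy fragment strings are hypothetical, and plain substring containment stands in for a real substructure-inclusion test on molecular graphs.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two sets of substructure keys."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_tanimoto(query, actives):
    """Highest Tanimoto coefficient between the query and any training active."""
    return max((tanimoto(query, active) for active in actives), default=0.0)


def unshaded(substructures):
    """Drop every substructure contained in a larger one from the same molecule.

    Assumption: `s in t` (substring containment) is only a toy stand-in for
    a genuine substructure-inclusion check.
    """
    return {
        s for s in substructures
        if not any(s != t and s in t for t in substructures)
    }


# Toy data (hypothetical fragments, not taken from the paper).
query = {"C", "CC", "CCO", "N"}
actives = [{"C", "CC", "CCO"}, {"N", "CN"}]

print(max_tanimoto(query, actives))  # 0.75
print(sorted(unshaded(query)))       # ['CCO', 'N']
```

In this toy case "C" and "CC" are both shaded by "CCO", so only "CCO" and "N" would contribute to a score; a method that treats substructures as independent would count all four.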
