Improving Classical Substructure-Based Virtual Screening to Handle Extrapolation Challenges
Abstract
Target-oriented substructure-based virtual screening (sSBVS) of
molecules is a promising approach in drug discovery. Yet, there are
doubts as to whether sSBVS is also suitable for extrapolation, that is,
for detecting molecules that are very different from those used for
training. Herein, we evaluate the predictive power of classic virtual
screening methods, namely, similarity searching using the Tanimoto coefficient
(MTC) and Naive Bayes (NB). As could be expected, these classic methods
perform better in interpolation than in extrapolation tasks. Consequently,
to enhance the predictive ability for extrapolation tasks, we introduce
the Shadow approach, in which inclusion relations between substructures
are considered, as opposed to the classic sSBVS methods that assume
independence between substructures. Specifically, we discard contributions
from substructures included in (“shaded” by) others
which are, in turn, included in the molecule of interest. Indeed,
the Shadow classifier significantly outperforms both MTC (<i>pValue</i> = 3.1 × 10<sup>–16</sup>) and NB (<i>pValue</i> = 3.5 × 10<sup>–9</sup>) in detecting
hits sharing low similarity with the training active molecules.
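
The contrast between independent substructure scoring and the Shadow idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring function, so the additive weights, the `contained_in` inclusion map, and all function names below are hypothetical, and substructures are represented simply as strings. The sketch shows Tanimoto similarity against a set of training actives, and a filter that drops every substructure "shaded" by a larger substructure also present in the molecule.

```python
from typing import Dict, Iterable, Set


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Tanimoto coefficient between two substructure sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def max_tanimoto(query: Set[str], actives: Iterable[Set[str]]) -> float:
    """Highest Tanimoto similarity of the query to any training active."""
    return max((tanimoto(query, act) for act in actives), default=0.0)


def unshaded(present: Set[str], contained_in: Dict[str, Set[str]]) -> Set[str]:
    """Keep only substructures not included in another substructure
    that is itself present in the molecule (hypothetical inclusion map)."""
    return {s for s in present if not (contained_in.get(s, set()) & present)}


def additive_score(subs: Iterable[str], weights: Dict[str, float]) -> float:
    """Assumed additive scoring: sum of per-substructure weights."""
    return sum(weights.get(s, 0.0) for s in subs)


# Toy molecule with three nested substructures (illustrative labels only).
present = {"C", "CC", "CCO"}
contained_in = {"C": {"CC", "CCO"}, "CC": {"CCO"}}
weights = {"C": 0.5, "CC": 0.5, "CCO": 1.0}  # made-up weights

independent = additive_score(present, weights)                     # counts all three
shadow = additive_score(unshaded(present, contained_in), weights)  # counts only "CCO"
similarity = max_tanimoto(present, [{"C", "CC"}, {"C", "N"}])
```

In this toy case the independent score counts the nested fragments three times, while the shadow-filtered score keeps only the largest fragment, which is the kind of redundancy the inclusion-aware approach is meant to remove.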