Confidence Bands and Hypothesis Test Methods for Recall and Precision Curves at Extremely Small Fractions with Applications to Drug Discovery
In virtual screening for drug discovery, recall curves are used to assess the
performance of ranking algorithms, in which recall is a function of the
fraction of data prioritized for experimental testing. Unfortunately,
researchers almost never consider the uncertainty in the estimation of the
recall curve when benchmarking algorithms. We confirm that a recently developed
procedure for estimating pointwise confidence intervals for recall curves --
and closely related variants, such as precision curves -- can be applied to a
variety of simulated data sets representative of those typically encountered in
virtual screening. Since it is more desirable in benchmarks to present the
uncertainty of performance over a range of testing fractions, we extend the
pointwise confidence interval procedure to allow for the estimation of
confidence bands for these curves. We also present hypothesis test methods to
determine significant differences between the curves for competing algorithms.
We show these methods have high power to detect significant differences at a
range of small fractions typically tested, while maintaining control of the type I
error rate. These methods enable statistically rigorous comparisons of virtual
screening algorithms using a metric that quantifies the aspect of performance
that is of primary interest.
Comment: 41 pages, 7 figures, 13 supplementary figures
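
To make the metric concrete, the following is a minimal Python sketch of recall and precision at a given testing fraction, i.e. a point estimate on each curve described in the abstract. It is an illustration only and not the paper's procedure: the function names `recall_at_fraction` and `precision_at_fraction`, the rounding of the cutoff, and the handling of tied scores (original order is kept) are all assumptions, and no confidence interval or band is computed.

```python
def _top_k_hits(scores, labels, fraction):
    """Return (hits, k): actives found among the top-k ranked items."""
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(scores)
    # Assumption: the cutoff is the nearest integer to fraction * n, at least 1.
    k = max(1, round(fraction * n))
    # Rank by descending score; ties keep their original order (an assumption).
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in order[:k])
    return hits, k


def recall_at_fraction(scores, labels, fraction):
    """Fraction of all actives recovered in the top `fraction` of the ranking."""
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("recall is undefined when there are no actives")
    hits, _ = _top_k_hits(scores, labels, fraction)
    return hits / total_actives


def precision_at_fraction(scores, labels, fraction):
    """Fraction of the top-ranked items that are active."""
    hits, k = _top_k_hits(scores, labels, fraction)
    return hits / k


# Toy data: 10 compounds, 3 of them active (label 1).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
print(recall_at_fraction(scores, labels, 0.2))
print(precision_at_fraction(scores, labels, 0.2))
```

Evaluating these functions over a grid of fractions traces out the empirical recall and precision curves; the uncertainty of those estimates is what the confidence bands and hypothesis tests in the paper address.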