Predictive Value Generalization Bounds
In this paper, we study a bi-criterion framework for assessing scoring
functions in the context of binary classification. The positive and negative
predictive values (ppv and npv, respectively) are conditional probabilities of
the true label matching a classifier's predicted label. The usual
classification error rate is a linear combination of these probabilities, and
therefore, concentration inequalities for the error rate do not yield
confidence intervals for the two separate predictive values. We study
generalization properties of scoring functions with respect to predictive
values by deriving new distribution-free large deviation and uniform
convergence bounds. The latter bound is stated in terms of a measure of
function class complexity that we call the order coefficient; we relate this
combinatorial quantity to the VC-subgraph dimension.
Comment: 20 pages, 3 figures
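
The relation between the error rate and the two predictive values described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function name `predictive_values`, the toy labels, and the choice to weight by the empirical fraction of positive predictions are assumptions made here for illustration. It computes empirical ppv and npv and checks that the error rate equals 1 - (q * ppv + (1 - q) * npv), where q is the fraction of predicted positives. Because the weights depend on q, a bound on the error rate alone does not pin down ppv and npv separately.

```python
def predictive_values(y_true, y_pred):
    """Return empirical (ppv, npv, error_rate, q) for binary labels in {0, 1}.

    q is the fraction of examples predicted positive. Hypothetical helper,
    written only to illustrate the definitions in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    if tp + fp == 0 or tn + fn == 0:
        # ppv or npv is undefined when one label is never predicted.
        raise ValueError("both labels must be predicted at least once")

    ppv = tp / (tp + fp)   # P(true = 1 | predicted = 1)
    npv = tn / (tn + fn)   # P(true = 0 | predicted = 0)
    error = (fp + fn) / n  # usual classification error rate
    q = (tp + fp) / n      # P(predicted = 1)
    return ppv, npv, error, q


# Toy data (assumed, not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]

ppv, npv, error, q = predictive_values(y_true, y_pred)
# The error rate is determined by ppv and npv once q is known.
reconstructed = 1.0 - (q * ppv + (1.0 - q) * npv)
print(ppv, npv, error, reconstructed)
```

Two classifiers with the same error rate can have quite different (ppv, npv) pairs, which is why the abstract treats the two predictive values as a separate, bi-criterion target.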