In machine learning, a question of great interest is understanding which
examples are challenging for a model to classify. Identifying atypical examples
helps inform safe deployment of models, isolates examples that require further
human inspection, and provides interpretability into model behavior. In this
work, we propose Variance of Gradients (VOG) as a proxy metric for detecting
outliers in the data distribution. We provide quantitative and qualitative
evidence that VOG is a meaningful way to rank data by difficulty and to surface
a tractable subset of the most challenging examples for human-in-the-loop
auditing. Data points with high VOG scores are more difficult for the model to
classify and over-index on examples that require memorization.

Comment: Accepted to Workshop on Human Interpretability in Machine Learning
(WHI), ICML, 202