In high dimensions, most machine learning methods are brittle to even a small
fraction of structured outliers. To address this, we introduce a new
meta-algorithm that takes as input a base learner, such as least squares or
stochastic gradient descent, and hardens it against
outliers. Our method, Sever, possesses strong theoretical guarantees yet is
also highly scalable: beyond running the base learner itself, it only
requires computing the top singular vector of a certain n×d matrix.
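To make the mechanism concrete, here is a minimal Python sketch of this filtering loop, assuming a least-squares base learner. The function names, the fixed removal fraction, and the iteration count are our illustrative choices, not the paper's exact interface.

```python
import numpy as np

def sever(X, y, fit_base_learner, frac_to_remove=0.05, n_iter=4):
    """Illustrative sketch of a Sever-style outer loop (assumptions above).

    fit_base_learner maps (X, y) to a parameter vector w.
    Each round: fit the learner, form the n x d matrix of per-point loss
    gradients, center it, and discard the points whose gradients have the
    largest squared projection onto its top singular vector.
    """
    active = np.arange(len(y))  # indices of points currently trusted
    for _ in range(n_iter):
        w = fit_base_learner(X[active], y[active])
        # Per-point gradient of the squared loss: (x_i . w - y_i) * x_i
        residuals = X[active] @ w - y[active]
        G = residuals[:, None] * X[active]        # n x d gradient matrix
        G_centered = G - G.mean(axis=0)
        # Top right singular vector of the centered gradient matrix
        _, _, Vt = np.linalg.svd(G_centered, full_matrices=False)
        scores = (G_centered @ Vt[0]) ** 2        # outlier score per point
        k = max(1, int(frac_to_remove * len(active)))
        active = active[np.argsort(scores)[:-k]]  # drop the k worst points
    return fit_base_learner(X[active], y[active])

# Example base learner: ordinary least squares via the pseudoinverse.
fit_ols = lambda X, y: np.linalg.pinv(X) @ y
```

In the paper the amount of data removed is tied to the assumed corruption fraction and the filter can be randomized; the fixed frac_to_remove above is a simplification for readability.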
We apply Sever to a drug design dataset and a spam classification dataset,
and find that in both cases it is substantially more robust than several
baselines. On the spam dataset, with 1% corruptions, we achieved 7.4%
test error, compared to 13.4%-20.5% for the baselines, and 3% error on
the uncorrupted dataset. Similarly, on the drug design dataset, with 10%
corruptions, we achieved a test error of 1.42 (mean squared error),
compared to 1.51-2.33 for the baselines, and 1.23 on the uncorrupted dataset.