PAC-Bayes Un-Expected Bernstein Inequality

Abstract

We present a new PAC-Bayesian generalization bound. Standard bounds contain a $\sqrt{L_n \cdot \operatorname{KL}/n}$ complexity term which dominates unless $L_n$, the empirical error of the learning algorithm's randomized predictions, vanishes. We manage to replace $L_n$ by a term which vanishes in many more situations, essentially whenever the employed learning algorithm is sufficiently stable on the dataset at hand. Our new bound consistently beats state-of-the-art bounds both on a toy example and on UCI datasets (with large enough $n$). Theoretically, unlike existing bounds, our new bound can be expected to converge to $0$ faster whenever a Bernstein/Tsybakov condition holds, thus connecting PAC-Bayesian generalization and excess risk bounds; for the latter it has long been known that faster convergence can be obtained under Bernstein conditions. Our main technical tool is a new concentration inequality which is like Bernstein's but with $X^2$ taken outside its expectation.
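To make the objects in the abstract concrete, here is a representative shape of a "standard" PAC-Bayesian bound (an illustrative sketch, not the paper's exact statement; constants and logarithmic factors are absorbed into $\lesssim$, with $\pi$ the prior, $\rho$ the posterior, $L(\rho)$ the expected loss and $L_n(\rho)$ the empirical loss of the randomized predictor): with probability at least $1-\delta$, simultaneously for all posteriors $\rho$,

\[
L(\rho) \;\lesssim\; L_n(\rho) \;+\; \sqrt{\frac{L_n(\rho)\,\bigl(\operatorname{KL}(\rho\,\|\,\pi) + \ln(1/\delta)\bigr)}{n}} \;+\; \frac{\operatorname{KL}(\rho\,\|\,\pi) + \ln(1/\delta)}{n}.
\]

Whenever $L_n(\rho)$ does not vanish, the square-root term dominates and the bound decays at the slow $1/\sqrt{n}$ rate, which is what the replacement of $L_n$ is meant to avoid. Schematically, the "un-expected" Bernstein inequality of the title replaces the second moment $\mathbb{E}[X^2]$ appearing in a standard Bernstein-type moment bound by the random quantity $X^2$ itself: for a random variable $X$ bounded above by $b$ almost surely and $0 < \eta < 1/b$,

\[
\mathbb{E}\Bigl[\exp\bigl(\eta\,(\mathbb{E}[X] - X) - c\,\eta^2 X^2\bigr)\Bigr] \;\le\; 1
\]

for a suitable constant $c$ depending on $\eta$ and $b$. (One valid choice, which can be checked via the elementary inequality $e^{-u - c u^2} \le 1 - u$ for all $u \le \eta b$ together with $1 - \eta\,\mathbb{E}[X] \le e^{-\eta\,\mathbb{E}[X]}$, is $c = \frac{-\ln(1-\eta b) - \eta b}{(\eta b)^2}$; the paper's exact constant and range of $\eta$ may differ.)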
