18 research outputs found

    PAC-Bayes bounds for stable algorithms with instance-dependent priors

    PAC-Bayes bounds have been proposed to get risk estimates based on a training sample. In this paper the PAC-Bayes approach is combined with stability of the hypothesis learned by a Hilbert space valued algorithm. The PAC-Bayes setting is used with a Gaussian prior centered at the expected output. Thus a novelty of our paper is using priors defined in terms of the data-generating distribution. Our main result estimates the risk of the randomized algorithm in terms of the hypothesis stability coefficients. We also provide a new bound for the SVM classifier, which is compared to other known bounds experimentally. Ours appears to be the first stability-based bound that evaluates to non-trivial values. Comment: 16 pages, discussion of theory and experiments in the main body, detailed proofs and experimental details in the appendice

    PAC-Bayes Analysis Beyond the Usual Bounds

    We focus on a stochastic learning model where the learner observes a finite set of training examples and the output of the learning process is a data-dependent distribution over a space of hypotheses. The learned data-dependent distribution is then used to make randomized predictions, and the high-level theme addressed here is guaranteeing the quality of predictions on examples that were not seen during training, i.e. generalization. In this setting the unknown quantity of interest is the expected risk of the data-dependent randomized predictor, for which upper bounds can be derived via a PAC-Bayes analysis, leading to PAC-Bayes bounds. Specifically, we present a basic PAC-Bayes inequality for stochastic kernels, from which one may derive extensions of various known PAC-Bayes bounds as well as novel bounds. We clarify the role of the requirements of fixed 'data-free' priors, bounded losses, and i.i.d. data. We highlight that those requirements were used to upper-bound an exponential moment term, while the basic PAC-Bayes theorem remains valid without those restrictions. We present three bounds that illustrate the use of data-dependent priors, including one for the unbounded square loss. Comment: In NeurIPS 2020. Version 3 is the final published paper. Note that this paper is an enhanced version of the short paper with the same title that was presented at the NeurIPS 2019 Workshop on Machine Learning with Guarantees. Important update: the PAC-Bayes type inequality for unbounded loss functions (Section 2.3) is ne

    PAC-Bayes Unexpected Bernstein Inequality

    We present a new PAC-Bayesian generalization bound. Standard bounds contain a $\sqrt{L_n \cdot \mathrm{KL}/n}$ complexity term which dominates unless $L_n$, the empirical error of the learning algorithm's randomized predictions, vanishes. We manage to replace $L_n$ by a term which vanishes in many more situations, essentially whenever the employed learning algorithm is sufficiently stable on the dataset at hand. Our new bound consistently beats state-of-the-art bounds both on a toy example and on UCI datasets (with large enough $n$). Theoretically, unlike existing bounds, our new bound can be expected to converge to 0 faster whenever a Bernstein/Tsybakov condition holds, thus connecting PAC-Bayesian generalization and excess risk bounds; for the latter it has long been known that faster convergence can be obtained under Bernstein conditions. Our main technical tool is a new concentration inequality which is like Bernstein's but with $X^2$ taken outside its expectation