8 research outputs found
PAC-Bayesian Bound for the Conditional Value at Risk
Conditional Value at Risk (CVaR) is a family of "coherent risk measures"
which generalize the traditional mathematical expectation. Widely used in
mathematical finance, it is garnering increasing interest in machine learning,
e.g., as an alternate approach to regularization, and as a means for ensuring
fairness. This paper presents a generalization bound for learning algorithms
that minimize the CVaR of the empirical loss. The bound is of PAC-Bayesian type
and is guaranteed to be small when the empirical CVaR is small. We achieve this
by reducing the problem of estimating CVaR to that of merely estimating an
expectation. This then enables us, as a by-product, to obtain concentration
inequalities for CVaR even when the random variable in question is unbounded.
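As a concrete illustration (not code from the paper), the empirical CVaR that such a bound controls can be computed via the standard Rockafellar-Uryasev representation; `empirical_cvar` below is a hypothetical helper, written assuming `alpha * n` is an integer:

```python
def empirical_cvar(losses, alpha):
    """Empirical CVaR at level alpha via the Rockafellar-Uryasev form
    t + mean((Z - t)_+) / (1 - alpha), with t the empirical alpha-quantile.
    When alpha * n is an integer this equals the mean of the worst
    (1 - alpha) fraction of the losses."""
    xs = sorted(losses)
    n = len(xs)
    k = int(alpha * n)              # index of the empirical alpha-quantile
    t = xs[k] if k < n else xs[-1]  # empirical VaR at level alpha
    tail_excess = sum(max(z - t, 0.0) for z in xs) / n
    return t + tail_excess / (1.0 - alpha)
```

Taking `alpha = 0` recovers the plain empirical mean, illustrating the sense in which CVaR generalizes the traditional mathematical expectation.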
A PAC-Bayesian Perspective on Structured Prediction with Implicit Loss Embeddings
Many practical machine learning tasks can be framed as structured prediction problems, where several output variables are predicted and considered interdependent. Recent theoretical advances in structured prediction have focused on obtaining fast-rate convergence guarantees, especially in the Implicit Loss Embedding (ILE) framework. PAC-Bayes has gained interest recently for its capacity to produce tight risk bounds for distributions over predictors. This work proposes a novel PAC-Bayes perspective on the ILE structured prediction framework. We present two generalization bounds, on the risk and excess risk, which yield insights into the behavior of ILE predictors. Two learning algorithms are derived from these bounds. The algorithms are implemented and their behavior analyzed, with source code available at https://github.com/theophilec/PAC-Bayes-ILE-Structured-Prediction.
Generalization Bounds: Perspectives from Information Theory and PAC-Bayes
A fundamental question in theoretical machine learning is generalization.
Over the past decades, the PAC-Bayesian approach has been established as a
flexible framework to address the generalization capabilities of machine
learning algorithms, and design new ones. Recently, it has garnered increased
interest due to its potential applicability for a variety of learning
algorithms, including deep neural networks. In parallel, an
information-theoretic view of generalization has developed, wherein the
relation between generalization and various information measures has been
established. This framework is intimately connected to the PAC-Bayesian
approach, and a number of results have been independently discovered in both
strands. In this monograph, we highlight this strong connection and present a
unified treatment of generalization. We present techniques and results that the
two perspectives have in common, and discuss the approaches and interpretations
that differ. In particular, we demonstrate how many proofs in the area share a
modular structure, through which the underlying ideas can be intuited. We pay
special attention to the conditional mutual information (CMI) framework;
analytical studies of the information complexity of learning algorithms; and
the application of the proposed methods to deep learning. This monograph is
intended to provide a comprehensive introduction to information-theoretic
generalization bounds and their connection to PAC-Bayes, serving as a
foundation from which the most recent developments are accessible. It is aimed
broadly towards researchers with an interest in generalization and theoretical
machine learning.
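To make the flavor of such bounds concrete, here is a minimal sketch (an illustration, not taken from the monograph) of the classical McAllester-style PAC-Bayes bound, with a hypothetical `kl_diag_gauss` helper for the common case of diagonal-Gaussian posterior and prior:

```python
import math

def mcallester_bound(emp_risk, kl, n, delta=0.05):
    """McAllester-style PAC-Bayes bound: with probability >= 1 - delta over
    an i.i.d. sample of size n, the expected risk under the posterior rho
    is at most emp_risk + sqrt((KL(rho || pi) + ln(2*sqrt(n)/delta)) / (2n))."""
    return emp_risk + math.sqrt((kl + math.log(2 * math.sqrt(n) / delta)) / (2 * n))

def kl_diag_gauss(mu_q, sig_q, mu_p, sig_p):
    """KL divergence between diagonal Gaussians N(mu_q, diag(sig_q^2))
    and N(mu_p, diag(sig_p^2)), summed over coordinates."""
    return sum(
        math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
        for mq, sq, mp, sp in zip(mu_q, sig_q, mu_p, sig_p)
    )
```

The bound tightens as the sample size grows and as the posterior stays close to the prior (small KL term), which is the common thread between the PAC-Bayesian and information-theoretic perspectives discussed above.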
Adaptivity in Online and Statistical Learning
Many modern machine learning algorithms, though successful, are still based on heuristics. In a typical application, such heuristics may manifest in the choice of a specific neural network structure, its number of parameters, or the learning rate during training. Relying on these heuristics is not ideal from a computational perspective (often involving multiple runs of the algorithm), and can also lead to over-fitting in some cases. This motivates the following question: for which machine learning tasks/settings do there exist efficient algorithms that automatically adapt to the best parameters? Characterizing the settings where this is the case and designing corresponding (parameter-free) algorithms within the online learning framework constitutes one of this thesis' primary goals. Towards this end, we develop algorithms for constrained and unconstrained online convex optimization that can automatically adapt to various parameters of interest such as the Lipschitz constant, the curvature of the sequence of losses, and the norm of the comparator. We also derive new performance lower bounds characterizing the limits of adaptivity for algorithms in these settings. Part of systematizing the choice of machine learning methods also involves having "certificates" for the performance of algorithms. In the statistical learning setting, this translates to having (tight) generalization bounds. Adaptivity can manifest here through data-dependent bounds that become small whenever the problem is "easy". In this thesis, we provide such data-dependent bounds for the expected loss (the standard risk measure) and other risk measures. We also explore how such bounds can be used in the context of risk-monotonicity.
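As a minimal sketch of the kind of parameter adaptivity described above (an illustration under stated assumptions, not one of the thesis' own algorithms), scalar-stepsize AdaGrad adapts its learning rate to the gradients actually observed, rather than to a hand-tuned Lipschitz constant; `adagrad_norm` is a hypothetical helper:

```python
import math

def adagrad_norm(grad_fn, x0, steps=100, eta=1.0, eps=1e-8):
    """Scalar-stepsize AdaGrad ("AdaGrad-Norm"): the step size shrinks as
    eta / sqrt(sum of squared gradient norms), so the method adapts to the
    gradient magnitudes without a hand-tuned Lipschitz constant."""
    x = list(x0)
    g_sq = 0.0
    for _ in range(steps):
        g = grad_fn(x)                       # gradient at the current point
        g_sq += sum(gi * gi for gi in g)     # accumulate squared norms
        lr = eta / (math.sqrt(g_sq) + eps)   # adaptive step size
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x
```

On a simple quadratic such as f(x) = x^2, the step sizes settle automatically at a stable scale and the iterates converge without any manual tuning of the learning rate.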