Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz Dynamic Risk Measures
We study finite episodic Markov decision processes incorporating dynamic risk
measures to capture risk sensitivity. To this end, we present two model-based
algorithms applied to \emph{Lipschitz} dynamic risk measures, a wide range of
risk measures that subsumes spectral risk measures, optimized certainty
equivalents, and distortion risk measures, among others. We establish both regret
upper bounds and lower bounds. Notably, our upper bounds demonstrate optimal
dependencies on the number of actions and episodes, while reflecting the
inherent trade-off between risk sensitivity and sample complexity.
Additionally, we substantiate our theoretical results through numerical
experiments.
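To make the families named in the abstract above concrete: conditional value-at-risk (CVaR) is a standard example of a risk measure usually counted among them. The sketch below is not taken from the paper. It assumes a loss convention (larger is worse) and a static, single-step setting, and evaluates the Rockafellar-Uryasev variational formula, CVaR_alpha(X) = min_t { t + E[(X - t)^+] / (1 - alpha) }, on an empirical sample. The function name `empirical_cvar` is hypothetical.

```python
def empirical_cvar(losses, alpha):
    """CVaR at level alpha of the empirical distribution of `losses`.

    Assumption: losses (larger is worse); reward-based conventions flip signs.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0 <= alpha < 1:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(losses)

    def objective(t):
        # t + E[(X - t)^+] / (1 - alpha), with E taken over the samples.
        excess = sum(max(x - t, 0.0) for x in losses) / n
        return t + excess / (1.0 - alpha)

    # The objective is convex and piecewise linear in t with kinks at the
    # samples, so its minimum is attained at one of them.
    return min(objective(t) for t in losses)


cvar_half = empirical_cvar([1.0, 2.0, 3.0, 4.0], alpha=0.5)
cvar_zero = empirical_cvar([1.0, 2.0, 3.0, 4.0], alpha=0.0)
print(cvar_half, cvar_zero)
```

With alpha = 0 the value reduces to the plain mean of the sample, and as alpha grows it weights the worst outcomes more heavily, which is the kind of risk sensitivity the abstract trades off against sample complexity.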
Online and Stochastic Gradient Methods for Non-decomposable Loss Functions
Modern applications in sensitive domains such as biometrics and medicine
frequently require the use of non-decomposable loss functions such as
precision@k, F-measure, etc. Compared to point loss functions such as
hinge-loss, these offer much more fine-grained control over prediction, but at
the same time present novel challenges in terms of algorithm design and
analysis. In this work we initiate a study of online learning techniques for
such non-decomposable loss functions with an aim to enable incremental learning
as well as design scalable solvers for batch problems. To this end, we propose
an online learning framework for such loss functions. Our model enjoys several
nice properties, chief amongst them being the existence of efficient online
learning algorithms with sublinear regret and online to batch conversion
bounds. Our model is a provable extension of existing online learning models
for point loss functions. We instantiate two popular losses, prec@k and pAUC,
in our model and prove sublinear regret bounds for both of them. Our proofs
require a novel structural lemma over ranked lists which may be of independent
interest. We then develop scalable stochastic gradient descent solvers for
non-decomposable loss functions. We show that for a large family of loss
functions satisfying a certain uniform convergence property (that includes
prec@k, pAUC, and F-measure), our methods provably converge to the empirical
risk minimizer. Such uniform convergence results were not known for these
losses and we establish these using novel proof techniques. We then use
extensive experimentation on real-life and benchmark datasets to establish that
our method can be orders of magnitude faster than a recently proposed cutting
plane method.
Comment: 25 pages, 3 figures. To appear in the proceedings of the 28th Annual
Conference on Neural Information Processing Systems, NIPS 201
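As a small illustration of why a loss such as prec@k is called non-decomposable, the sketch below uses one common definition (the fraction of the k highest-scored items that are positive), which may differ in detail from the one used in the paper. The function name `precision_at_k` and the toy data are assumptions made for illustration only.

```python
def precision_at_k(scores, labels, k):
    """Fraction of the k highest-scored items whose label is 1."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    # Sort indices by descending score; ties keep their original order.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    return sum(labels[i] for i in ranked[:k]) / k


labels = [1, 0, 1, 0]
before = precision_at_k([0.9, 0.8, 0.3, 0.1], labels, k=2)
# Raising only the third item's score pushes the second item out of the top 2.
after = precision_at_k([0.9, 0.8, 0.95, 0.1], labels, k=2)
print(before, after)
```

Changing a single score alters how another, untouched item is counted, so the measure cannot be written as a sum of independent per-example terms the way hinge loss can.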
Optimal PAC Bounds Without Uniform Convergence
In statistical learning theory, determining the sample complexity of
realizable binary classification for VC classes was a long-standing open
problem. The results of Simon and Hanneke established sharp upper bounds in
this setting. However, the reliance of their argument on the uniform
convergence principle limits its applicability to more general learning
settings such as multiclass classification. In this paper, we address this
issue by providing optimal high probability risk bounds through a framework
that surpasses the limitations of uniform convergence arguments.
Our framework converts the leave-one-out error of permutation invariant
predictors into high probability risk bounds. As an application, by adapting
the one-inclusion graph algorithm of Haussler, Littlestone, and Warmuth, we
propose an algorithm that achieves an optimal PAC bound for binary
classification. Specifically, our result shows that certain aggregations of
one-inclusion graph algorithms are optimal, addressing a variant of a classic
question posed by Warmuth.
We further instantiate our framework in three settings where uniform
convergence is provably suboptimal. For multiclass classification, we prove an
optimal risk bound that scales with the one-inclusion hypergraph density of the
class, addressing the suboptimality of the analysis of Daniely and
Shalev-Shwartz. For partial hypothesis classification, we determine the optimal
sample complexity bound, resolving a question posed by Alon, Hanneke, Holzman,
and Moran. For realizable bounded regression with absolute loss, we derive an
optimal risk bound that relies on a modified version of the scale-sensitive
dimension, refining the results of Bartlett and Long. Our rates surpass
standard uniform convergence-based results due to the smaller complexity
measure in our risk bound.
Comment: 27 page
Universal Convexification via Risk-Aversion
We develop a framework for convexifying a fairly general class of
optimization problems. Under additional assumptions, we analyze the
suboptimality of the solution to the convexified problem relative to the
original nonconvex problem and prove additive approximation guarantees. We then
develop algorithms based on stochastic gradient methods to solve the resulting
optimization problems and show bounds on convergence rates. We show a simple
application of this framework to supervised learning, where one can perform
integration explicitly and can use standard (non-stochastic) optimization
algorithms with better convergence guarantees. We then extend this framework to
apply to a general class of discrete-time dynamical systems. In this context,
our convexification approach falls under the well-studied paradigm of
risk-sensitive Markov Decision Processes. We derive the first known model-based
and model-free policy gradient optimization algorithms with guaranteed
convergence to the optimal solution. Finally, we present numerical results
validating our formulation in different applications.