Complexity regularization via localized random penalties
In this article, model selection via penalized empirical loss minimization in
nonparametric classification problems is studied. Data-dependent penalties are
constructed, which are based on estimates of the complexity of a small subclass
of each model class, containing only those functions with small empirical loss.
The penalties are novel since those considered in the literature are typically
based on the entire model class. Oracle inequalities using these penalties are
established, and the advantage of the new penalties over those based on the
complexity of the whole model class is demonstrated.
Comment: Published by the Institute of Mathematical Statistics (http://www.imstat.org) in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/00905360400000046
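The selection principle behind the abstract, picking the model that minimizes empirical loss plus a complexity penalty, can be sketched on toy data. The penalty function below is a hypothetical illustrative term, not the paper's construction (the paper's penalties are data-dependent complexity estimates computed on a small subclass of low-empirical-loss functions):

```python
import math
import random

random.seed(0)

n = 200
# Toy data: 5 features, the label depends on feature 0 only.
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(n)]
y = [1 if x[0] > 0 else 0 for x in X]

def empirical_loss(k):
    """0-1 loss of the best threshold classifier on feature k (a small model class)."""
    best = 1.0
    for t in (row[k] for row in X):
        loss = sum((1 if row[k] > t else 0) != label for row, label in zip(X, y)) / n
        best = min(best, loss)
    return best

def penalty(k):
    """Hypothetical complexity term, for illustration only: larger model
    index incurs a larger penalty."""
    return math.sqrt((k + 1) / n)

# Penalized model selection: minimize empirical loss + penalty over model classes.
scores = [empirical_loss(k) + penalty(k) for k in range(5)]
selected = min(range(5), key=lambda k: scores[k])
```

The informative feature wins because its empirical loss is near zero, while uninformative features pay both a high loss and a higher penalty.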
Strategies for prediction under imperfect monitoring
We propose simple randomized strategies for sequential prediction under
imperfect monitoring, that is, when the forecaster does not have access to the
past outcomes but rather to a feedback signal. The proposed strategies are
consistent in the sense that they achieve, asymptotically, the best possible
average reward. It was Rustichini (1999) who first proved the existence of such
consistent predictors. The forecasters presented here offer the first
constructive proof of consistency. Moreover, the proposed algorithms are
computationally efficient. We also establish upper bounds for the rates of
convergence. In the case of deterministic feedback, these rates are optimal up
to logarithmic terms.
Comment: Journal version of a COLT conference paper
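The classical full-monitoring building block for such randomized strategies is the exponentially weighted average forecaster. The sketch below shows that baseline only, not the paper's imperfect-monitoring construction (under a feedback signal, the true rewards in the update would be replaced by estimates built from the signal); the reward process is a made-up example:

```python
import math
import random

random.seed(1)

K = 3                                     # number of actions
T = 500                                   # rounds
eta = math.sqrt(2 * math.log(K) / T)      # standard learning-rate choice

weights = [1.0] * K
total_reward = 0.0

for t in range(T):
    # Illustrative rewards: action 2 is slightly better on average.
    rewards = [random.random() * (1.2 if i == 2 else 1.0) for i in range(K)]
    z = sum(weights)
    probs = [w / z for w in weights]
    # Randomized play: draw an action from the exponential-weights distribution.
    action = random.choices(range(K), weights=probs)[0]
    total_reward += rewards[action]
    # Full-information update; imperfect monitoring would use estimated rewards.
    weights = [w * math.exp(eta * r) for w, r in zip(weights, rewards)]

avg_reward = total_reward / T
```

The average reward approaches that of the best fixed action as the weights concentrate on it.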
Online Multi-task Learning with Hard Constraints
We discuss multi-task online learning when a decision maker has to deal
simultaneously with M tasks. The tasks are related, which is modeled by
imposing that the M-tuple of actions taken by the decision maker needs to
satisfy certain constraints. We give natural examples of such restrictions and
then discuss a general class of tractable constraints, for which we introduce
computationally efficient ways of selecting actions, essentially by reducing to
an on-line shortest path problem. We briefly discuss "tracking" and "bandit"
versions of the problem and extend the model in various ways, including
non-additive global losses and uncountably infinite sets of tasks.
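A brute-force baseline for the constrained multi-task setting is exponential weights run directly over the set of admissible M-tuples. The instance below (M tasks, at most B of them allowed to play action 1, made-up per-task losses) enumerates that set explicitly; the point of the paper's reduction to an on-line shortest path problem is precisely to avoid this exponential enumeration:

```python
import itertools
import math
import random

random.seed(2)

M, B = 4, 2
# Constrained action set: binary M-tuples with at most B ones.
actions = [a for a in itertools.product([0, 1], repeat=M) if sum(a) <= B]

eta = 0.5
weights = {a: 1.0 for a in actions}

for t in range(300):
    # Hypothetical per-task losses: playing 0 always costs 0.5; playing 1 is
    # cheap on the first B tasks and expensive on the rest.
    loss = [[0.5, random.random() * 0.3 if i < B else 0.5 + random.random() * 0.5]
            for i in range(M)]
    total = sum(weights.values())
    # Randomized play: the decision maker's M-tuple this round.
    chosen = random.choices(actions, weights=[weights[a] / total for a in actions])[0]
    # Full-information exponential-weights update over the constrained set.
    for a in actions:
        tuple_loss = sum(loss[i][a[i]] for i in range(M))
        weights[a] *= math.exp(-eta * tuple_loss)

best = max(weights, key=weights.get)
```

After enough rounds the weight concentrates on the admissible tuple with the smallest cumulative loss.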
Minimax Policies for Combinatorial Prediction Games
We address the online linear optimization problem when the actions of the
forecaster are represented by binary vectors. Our goal is to understand the
magnitude of the minimax regret for the worst possible set of actions. We study
the problem under three different assumptions for the feedback: full
information, and the partial information models of the so-called "semi-bandit",
and "bandit" problems. We consider both -, and -type of
restrictions for the losses assigned by the adversary.
We formulate a general strategy using Bregman projections on top of a
potential-based gradient descent, which generalizes the ones studied in the
series of papers Gyorgy et al. (2007), Dani et al. (2008), Abernethy et al.
(2008), Cesa-Bianchi and Lugosi (2009), Helmbold and Warmuth (2009), Koolen et
al. (2010), Uchiya et al. (2010), Kale et al. (2010) and Audibert and Bubeck
(2010). We provide simple proofs that recover most of the previous results. We
propose new upper bounds for the semi-bandit game. Moreover we derive lower
bounds for all three feedback assumptions. With the only exception of the
bandit game, the upper and lower bounds are tight, up to a constant factor.
Finally, we answer a question asked by Koolen et al. (2010) by showing that the
exponentially weighted average forecaster is suboptimal against $L_\infty$
adversaries.
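A key ingredient in the semi-bandit setting is building unbiased per-coordinate loss estimates from the observed coordinates alone, via importance weighting. The sketch below illustrates only that estimation step, on a made-up action set and a fixed sampling distribution (real algorithms update the distribution each round):

```python
import random

random.seed(3)

d = 3
# Hypothetical set of binary action vectors and a fixed sampling distribution.
actions = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
p = [0.25, 0.25, 0.25, 0.25]
losses = [0.2, 0.5, 0.8]   # true per-coordinate losses, unknown to the learner

# q[i]: probability that coordinate i is observed under semi-bandit feedback,
# i.e. the total probability of actions whose i-th coordinate is 1.
q = [sum(pa for pa, a in zip(p, actions) if a[i]) for i in range(d)]

T = 20000
est = [0.0] * d
for _ in range(T):
    a = random.choices(actions, weights=p)[0]
    for i in range(d):
        if a[i]:                           # coordinate i is revealed this round
            est[i] += losses[i] / q[i]     # importance-weighted, unbiased estimate
est = [e / T for e in est]                 # averages converge to the true losses
```

Dividing by the observation probability q[i] makes each estimate unbiased, which is what the regret analyses of semi-bandit strategies rely on.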
The on-line shortest path problem under partial monitoring
The on-line shortest path problem is considered under various models of
partial monitoring. Given a weighted directed acyclic graph whose edge weights
can change in an arbitrary (adversarial) way, a decision maker has to choose in
each round of a game a path between two distinguished vertices such that the
loss of the chosen path (defined as the sum of the weights of its composing
edges) be as small as possible. In a setting generalizing the multi-armed
bandit problem, after choosing a path, the decision maker learns only the
weights of those edges that belong to the chosen path. For this problem, an
algorithm is given whose average cumulative loss in n rounds exceeds that of
the best path, matched off-line to the entire sequence of the edge weights, by
a quantity that is proportional to 1/\sqrt{n} and depends only polynomially on
the number of edges of the graph. The algorithm can be implemented with linear
complexity in the number of rounds n and in the number of edges. An extension
to the so-called label efficient setting is also given, in which the decision
maker is informed about the weights of the edges corresponding to the chosen
path at a total of m << n time instances. Another extension is shown where the
decision maker competes against a time-varying path, a generalization of the
problem of tracking the best expert. A version of the multi-armed bandit
setting for shortest path is also discussed where the decision maker learns
only the total weight of the chosen path but not the weights of the individual
edges on the path. Applications to routing in packet switched networks along
with simulation results are also presented.
Comment: 35 pages
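Both the off-line comparator and the per-round computations in this setting rest on shortest-path computations in a weighted DAG, which take linear time in the number of edges via dynamic programming over a topological order. A minimal sketch on a hypothetical four-node graph:

```python
# Shortest path in a weighted DAG by dynamic programming (hypothetical graph).
edges = {                     # adjacency: node -> list of (next_node, weight)
    's': [('a', 1.0), ('b', 4.0)],
    'a': [('b', 1.0), ('t', 5.0)],
    'b': [('t', 1.0)],
    't': [],
}
topo = ['s', 'a', 'b', 't']   # a topological order of the DAG

INF = float('inf')
dist = {v: INF for v in topo}
dist['s'] = 0.0
for v in topo:                # relax outgoing edges in topological order: O(V + E)
    for w, c in edges[v]:
        dist[w] = min(dist[w], dist[v] + c)
```

Here the best s-to-t path is s -> a -> b -> t with total weight 3.0; in the on-line problem the edge weights change every round and the same recursion is re-run on the current (or estimated) weights.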
Moment inequalities for functions of independent random variables
A general method for obtaining moment inequalities for functions of
independent random variables is presented. It is a generalization of the
entropy method which has been used to derive concentration inequalities for
such functions [Boucheron, Lugosi and Massart Ann. Probab. 31 (2003)
1583-1614], and is based on a generalized tensorization inequality due to
Latala and Oleszkiewicz [Lecture Notes in Math. 1745 (2000) 147-168]. The new
inequalities prove to be a versatile tool in a wide range of applications. We
illustrate the power of the method by showing how it can be used to
effortlessly re-derive classical inequalities including Rosenthal and
Kahane-Khinchine-type inequalities for sums of independent random variables,
moment inequalities for suprema of empirical processes and moment inequalities
for Rademacher chaos and U-statistics. Some of these corollaries are apparently
new. In particular, we generalize Talagrand's exponential inequality for
Rademacher chaos of order 2 to any order. We also discuss applications for
other complex functions of independent random variables, such as suprema of
Boolean polynomials which include, as special cases, subgraph counting problems
in random graphs.
Comment: Published at http://dx.doi.org/10.1214/009117904000000856 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)
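For orientation, the second-moment case of such bounds is the classical Efron-Stein inequality, which the entropy method and its generalization extend to higher moments (a standard statement, not a result of the paper itself). Writing $Z = f(X_1,\dots,X_n)$ for independent $X_1,\dots,X_n$, and $Z_i' = f(X_1,\dots,X_i',\dots,X_n)$ where $X_i'$ is an independent copy of $X_i$:

```latex
\operatorname{Var}(Z) \;\le\; \frac{1}{2}\sum_{i=1}^{n} \mathbb{E}\!\left[(Z - Z_i')^2\right]
```

The moment inequalities of the paper control $\mathbb{E}\,(Z-\mathbb{E}Z)_+^q$ for general $q \ge 2$ by analogous sums of coordinate-wise differences.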