A Stochastic View of Optimal Regret through Minimax Duality
We study the regret of optimal strategies for online convex optimization
games. Using von Neumann's minimax theorem, we show that the optimal regret in
this adversarial setting is closely related to the behavior of the empirical
minimization algorithm in a stochastic process setting: it is equal to the
maximum, over joint distributions of the adversary's action sequence, of the
difference between a sum of minimal expected losses and the minimal empirical
loss. We show that the optimal regret has a natural geometric interpretation,
since it can be viewed as the gap in Jensen's inequality for a concave
functional--the minimizer over the player's actions of expected loss--defined
on a set of probability distributions. We use this expression to obtain upper
and lower bounds on the regret of an optimal strategy for a variety of online
learning problems. Our method provides upper bounds without the need to
construct a learning algorithm; the lower bounds provide explicit optimal
strategies for the adversary.
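The identity described in words above can be written out as follows. This is a sketch in assumed notation, not taken from the listing itself: $T$ is the number of rounds, $\mathcal{A}$ the player's action set, $\ell$ the loss, $Z_1,\dots,Z_T$ the adversary's action sequence, $P$ a joint distribution over that sequence, and $V_T$ the optimal (minimax) regret.

```latex
% Assumed notation: V_T, \mathcal{A}, \ell, Z_t and P are illustrative symbols.
% Optimal regret = max over joint distributions P of
%   (sum of minimal conditional expected losses) - (minimal empirical loss).
V_T \;=\; \sup_{P}\; \mathbb{E}_{(Z_1,\dots,Z_T)\sim P}
\left[
  \sum_{t=1}^{T} \inf_{a_t \in \mathcal{A}}
    \mathbb{E}\bigl[\ell(a_t, Z_t) \,\big|\, Z_1,\dots,Z_{t-1}\bigr]
  \;-\;
  \inf_{a \in \mathcal{A}} \sum_{t=1}^{T} \ell(a, Z_t)
\right]
```

The first term is the "sum of minimal expected losses" and the second is the "minimal empirical loss"; since $p \mapsto \inf_{a} \mathbb{E}_{Z \sim p}\,\ell(a, Z)$ is concave in the distribution $p$, the difference can be read as a gap in Jensen's inequality.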
A Distributionally Robust Approach to Regret Optimal Control using the Wasserstein Distance
This paper proposes a distributionally robust approach to regret optimal
control of discrete-time linear dynamical systems with quadratic costs subject
to stochastic additive disturbance on the state process. The underlying
probability distribution of the disturbance process is unknown, but assumed to
lie in a given ball of distributions defined in terms of the type-2 Wasserstein
distance. In this framework, strictly causal linear disturbance feedback
controllers are designed to minimize the worst-case expected regret. The regret
incurred by a controller is defined as the difference between the cost it
incurs in response to a realization of the disturbance process and the cost
incurred by the optimal noncausal controller which has perfect knowledge of the
disturbance process realization at the outset. Building on a well-established
duality theory for optimal transport problems, we show how to equivalently
reformulate this minimax regret optimal control problem as a tractable
semidefinite program. The equivalent dual reformulation also allows us to
characterize a worst-case distribution achieving the worst-case expected regret
in relation to the distribution at the center of the Wasserstein ball. Comment: 6 page
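The design problem described above can be summarized in symbols. This is a sketch under assumed notation, none of which appears in the listing: $w$ is the stacked disturbance sequence, $u = Kw$ a strictly causal linear disturbance feedback policy, $J(u; w)$ the quadratic cost, $P_0$ the distribution at the center of the ball, $\rho$ its radius, and $W_2$ the type-2 Wasserstein distance.

```latex
% Assumed notation: K, J, w, P_0, \rho are illustrative symbols.
% Regret of policy K on a disturbance realization w:
%   cost of the causal policy minus cost of the optimal noncausal controller.
\mathrm{Regret}(K; w) \;=\; J(Kw;\, w) \;-\; \min_{u}\, J(u;\, w)

% Distributionally robust design: minimize the worst-case expected regret
% over the type-2 Wasserstein ball of radius \rho centered at P_0.
\min_{K \ \text{strictly causal}} \;\;
\sup_{P \,:\, W_2(P,\, P_0) \,\le\, \rho} \;
\mathbb{E}_{w \sim P}\bigl[\mathrm{Regret}(K; w)\bigr]
```

The inner minimum over $u$ is unconstrained by causality, which is what gives the benchmark controller "perfect knowledge of the disturbance process realization at the outset".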
Algorithm Engineering in Robust Optimization
Robust optimization is a young and emerging field of research that has received
a considerable increase in interest over the last decade. In this paper, we
argue that the algorithm engineering methodology fits the field of robust
optimization very well and yields a rewarding new perspective on both the
current state of research and open research directions.
To this end we go through the algorithm engineering cycle of design and
analysis of concepts, development and implementation of algorithms, and
theoretical and experimental evaluation. We show that many ideas of algorithm
engineering have already been applied in publications on robust optimization.
Most work on robust optimization is devoted to the analysis of concepts and the
development of algorithms; some papers evaluate a particular concept in case
studies, and work comparing concepts is only just beginning. A remaining
drawback in many papers on robustness is the missing link that feeds the
results of the experiments back into the design.
Connections Between Mirror Descent, Thompson Sampling and the Information Ratio
The information-theoretic analysis by Russo and Van Roy (2014) in combination
with minimax duality has proved a powerful tool for the analysis of online
learning algorithms in full and partial information settings. In most
applications there is a tantalising similarity to the classical analysis based
on mirror descent. We make a formal connection, showing that the
information-theoretic bounds in most applications can be derived from existing
techniques for online convex optimisation. Besides this, for $k$-armed
adversarial bandits we provide an efficient algorithm with regret that matches
the best information-theoretic upper bound and improve the best known regret
guarantees for online linear optimisation on $\ell_p$-balls and bandits with
graph feedback.