118 research outputs found
Non-stationary Stochastic Optimization
We consider a non-stationary variant of a sequential stochastic optimization
problem, in which the underlying cost functions may change along the horizon.
We propose a measure, termed variation budget, that controls the extent of said
change, and study how restrictions on this budget impact achievable
performance. We identify sharp conditions under which it is possible to achieve
long-run-average optimality and more refined performance measures such as rate
optimality that fully characterize the complexity of such problems. In doing
so, we also establish a strong connection between two rather disparate strands
of literature: adversarial online convex optimization; and the more traditional
stochastic approximation paradigm (couched in a non-stationary setting). This
connection is the key to deriving well-performing policies in the latter, by
leveraging the structure of optimal policies in the former. Finally, tight bounds
on the minimax regret allow us to quantify the "price of non-stationarity,"
which mathematically captures the added complexity embedded in a temporally
changing environment versus a stationary one.
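As a rough illustration of the two quantities this abstract refers to, the variation budget and the regret against a changing benchmark can be written as below. The notation is an assumption for illustration only (the abstract itself introduces no symbols): $f_t$ is the cost function in period $t$, $\mathcal{X}$ the feasible set, $X_t$ the action chosen by the policy, and $V_T$ the budget.

```latex
% Variation budget: total period-to-period change of the cost functions
% is capped by V_T (notation assumed, not taken from the abstract).
\[
  \sum_{t=2}^{T} \sup_{x \in \mathcal{X}} \bigl| f_t(x) - f_{t-1}(x) \bigr| \;\le\; V_T .
\]
% Regret against a dynamic benchmark that picks the minimizer in every period.
\[
  \mathcal{R}_T \;=\; \mathbb{E}\!\left[ \sum_{t=1}^{T} f_t(X_t) \right]
  \;-\; \sum_{t=1}^{T} \min_{x \in \mathcal{X}} f_t(x) .
\]
```

Under this reading, long-run-average optimality corresponds to $\mathcal{R}_T / T \to 0$, which is only plausible when $V_T$ grows sublinearly in $T$.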
Non-stationary stochastic optimization of an oscillating water column
A non-stationary stochastic optimization methodology
is applied to an OWC (oscillating water column) to find the design
that maximizes the wave energy extraction. Different temporal cycles
are considered to represent the long-term variability of the wave
climate at the site in the optimization problem. The results of the
non-stationary stochastic optimization problem are compared against
those obtained by a stationary stochastic optimization problem. The
comparative analysis reveals that the proposed non-stationary
optimization provides designs with a better fit to reality. However,
the stationarity assumption can be adequate when looking at averaged
system response.
Learning to Optimize under Non-Stationarity
We introduce algorithms that achieve state-of-the-art dynamic regret
bounds for the non-stationary linear stochastic bandit setting. This setting
captures natural applications such as dynamic pricing and ads allocation in a
changing environment. We show how the difficulty posed by the non-stationarity
can be overcome by a novel marriage between stochastic and adversarial bandit
learning algorithms. Defining d, B_T, and T as the problem dimension, the
variation budget, and the total time horizon, respectively, our main
contributions are the tuned Sliding Window UCB (SW-UCB) algorithm with
optimal dynamic regret, and the tuning-free bandit-over-bandit (BOB) framework
built on top of the SW-UCB algorithm with best dynamic regret.
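The sliding-window idea mentioned in this abstract can be sketched in a few lines. The sketch below is a deliberate simplification, not the paper's algorithm: it uses a finite set of arms rather than the linear bandit model, and the function name `sliding_window_ucb`, the `reward_fn` callback, and the confidence constant 2 are all hypothetical choices made for illustration. The only point it shows is that estimates are built from the most recent `window` observations, so stale data is forgotten when the environment drifts.

```python
import math
import random
from collections import deque


def sliding_window_ucb(reward_fn, n_arms, horizon, window, seed=0):
    """Simplified K-armed sliding-window UCB (illustrative sketch only).

    reward_fn(t, arm, rng) -> float is a hypothetical callback returning
    the reward of `arm` in round `t`. Returns the list of arms played.
    """
    rng = random.Random(seed)
    history = deque()          # (arm, reward) pairs currently in the window
    counts = [0] * n_arms      # plays per arm inside the window
    sums = [0.0] * n_arms      # reward totals per arm inside the window
    choices = []

    for t in range(horizon):
        # Any arm with no observation left in the window is played first.
        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            arm = untried[0]
        else:
            n_obs = len(history)
            # Windowed mean plus an exploration bonus (constant 2 is assumed).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(n_obs) / counts[a]),
            )

        reward = reward_fn(t, arm, rng)
        history.append((arm, reward))
        counts[arm] += 1
        sums[arm] += reward

        # Forget the oldest observation once the window is full.
        if len(history) > window:
            old_arm, old_reward = history.popleft()
            counts[old_arm] -= 1
            sums[old_arm] -= old_reward

        choices.append(arm)

    return choices
```

A shorter window tracks changes faster but explores more often. In the abstract, choosing this trade-off is what "tuned" refers to, and the BOB framework is described as removing the need to tune it by hand.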