118 research outputs found

    Non-stationary Stochastic Optimization

    We consider a non-stationary variant of a sequential stochastic optimization problem, in which the underlying cost functions may change along the horizon. We propose a measure, termed the variation budget, that controls the extent of this change, and study how restrictions on this budget impact achievable performance. We identify sharp conditions under which it is possible to achieve long-run-average optimality, as well as more refined performance measures, such as rate optimality, that fully characterize the complexity of such problems. In doing so, we also establish a strong connection between two rather disparate strands of literature: adversarial online convex optimization, and the more traditional stochastic approximation paradigm (couched in a non-stationary setting). This connection is the key to deriving well-performing policies in the latter by leveraging the structure of optimal policies in the former. Finally, tight bounds on the minimax regret allow us to quantify the "price of non-stationarity," which mathematically captures the added complexity embedded in a temporally changing environment versus a stationary one.
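For orientation, one common way to formalize a variation budget of this kind is sketched below. The abstract does not give symbols, so the notation is an assumption: $f_t$ is the cost function at period $t$, $\mathcal{X}$ the feasible set, $x_t$ the decision played, and $V_T$ the budget.

```latex
% Assumed notation; the abstract itself does not fix these symbols.
% Variation budget: total period-to-period change in the cost functions is bounded by V_T.
\sum_{t=2}^{T} \sup_{x \in \mathcal{X}} \bigl| f_t(x) - f_{t-1}(x) \bigr| \;\le\; V_T

% Regret against a dynamic oracle that picks the best action in every period.
R_T \;=\; \mathbb{E}\!\left[ \sum_{t=1}^{T} f_t(x_t) \right] \;-\; \sum_{t=1}^{T} \min_{x \in \mathcal{X}} f_t(x)
```

Under this reading, long-run-average optimality means that $R_T / T \to 0$, and the natural question is how fast $V_T$ may grow with $T$ while that remains achievable.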

    Non-stationary stochastic optimization of an oscillating water column

    A non-stationary stochastic optimization methodology is applied to an oscillating water column (OWC) to find the design that maximizes wave energy extraction. Different temporal cycles are considered in the optimization problem to represent the long-term variability of the wave climate at the site. The results of the non-stationary stochastic optimization problem are compared against those obtained from a stationary stochastic optimization problem. The comparative analysis reveals that the proposed non-stationary optimization provides designs with a better fit to reality. However, the stationarity assumption can be adequate when looking at the averaged system response.

    Learning to Optimize under Non-Stationarity

    We introduce algorithms that achieve state-of-the-art \emph{dynamic regret} bounds for the non-stationary linear stochastic bandit setting. This setting captures natural applications such as dynamic pricing and ads allocation in a changing environment. We show how the difficulty posed by non-stationarity can be overcome by a novel marriage between stochastic and adversarial bandit learning algorithms. Defining $d$, $B_T$, and $T$ as the problem dimension, the \emph{variation budget}, and the total time horizon, respectively, our main contributions are the tuned Sliding Window UCB (\texttt{SW-UCB}) algorithm with optimal $\widetilde{O}(d^{2/3}(B_T+1)^{1/3}T^{2/3})$ dynamic regret, and the tuning-free bandit-over-bandit (\texttt{BOB}) framework built on top of the \texttt{SW-UCB} algorithm with best $\widetilde{O}(d^{2/3}(B_T+1)^{1/4}T^{3/4})$ dynamic regret.
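The sliding-window idea can be illustrated with a minimal sketch: fit a ridge regression on only the most recent observations, then pick the action with the highest optimistic score. This is not the paper's tuned SW-UCB. The function name `sw_ucb_choose`, the finite action set, and the values of `window`, `lam`, and `beta` are all assumptions made for illustration.

```python
import numpy as np

def sw_ucb_choose(actions, past_x, past_y, window, lam=1.0, beta=1.0):
    """Pick an action index with a sliding-window linear UCB rule (illustrative).

    actions: (K, d) array of candidate feature vectors.
    past_x:  list of previously played feature vectors, each of shape (d,).
    past_y:  list of rewards observed for those plays.
    window:  number of most recent observations to use (must be >= 1; assumed tuning).
    lam:     ridge regularization strength (assumed value).
    beta:    confidence-width multiplier (assumed value).
    """
    d = actions.shape[1]
    # Keep only the last `window` observations; older data is forgotten.
    xs = np.asarray(past_x[-window:], dtype=float).reshape(-1, d)
    ys = np.asarray(past_y[-window:], dtype=float)
    # Regularized least-squares estimate of the reward parameter.
    V = lam * np.eye(d) + xs.T @ xs
    V_inv = np.linalg.inv(V)
    theta_hat = V_inv @ (xs.T @ ys)
    # Exploration bonus: sqrt(a^T V^{-1} a) for each candidate action a.
    widths = np.sqrt(np.einsum("kd,de,ke->k", actions, V_inv, actions))
    return int(np.argmax(actions @ theta_hat + beta * widths))

# Toy history in which the better arm switches from arm 0 to arm 1.
actions = np.eye(2)
xs = [actions[0]] * 20 + [actions[1]] * 20 + [actions[0]] * 10 + [actions[1]] * 10
ys = [1.0] * 20 + [0.0] * 20 + [0.0] * 10 + [1.0] * 10

recent = sw_ucb_choose(actions, xs, ys, window=20)    # sees only post-switch data
full = sw_ucb_choose(actions, xs, ys, window=len(xs)) # averages over stale data
print(recent, full)
```

In this toy history the short window selects arm 1, which is the better arm after the switch, while the full-history estimate still favors arm 0. Choosing the window length is the tuning problem that the abstract's bandit-over-bandit framework is meant to avoid.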