Adaptive Control of Unknown Time Varying Dynamical Systems with Regret Guarantees

Abstract

The study of online control of unknown time-varying dynamical systems is a relatively under-explored topic. In this work, we present regret guarantees with respect to a stronger notion of system variability than that of \cite{minasyan2021online}, and thus provide sub-linear regret guarantees for a much broader range of scenarios. Specifically, we give regret guarantees with respect to the number of changes, rather than the average squared deviation used in \cite{minasyan2021online}. The online control algorithm we propose continuously updates its estimate to track the changes and employs an online optimizer to simultaneously optimize the control policy. We show that our algorithm can achieve sub-linear regret with respect to the number of changes under two settings: (i) a matched disturbance system with general convex cost functions, and (ii) a general system with linear cost functions. Specifically, a regret of $\Gamma_T^{1/5} T^{4/5}$ can be achieved, where $\Gamma_T$ is the number of changes in the underlying system and $T$ is the duration of the control episode.
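
As a quick check on when this rate is sub-linear, the following sketch divides the stated bound by $T$ to get the average regret per step. It assumes the rate holds exactly up to constant factors; any logarithmic or additional terms in the full result are not reflected here, and the choice $\Gamma_T = \sqrt{T}$ is purely illustrative.

```latex
\[
  \frac{\Gamma_T^{1/5}\, T^{4/5}}{T}
  \;=\; \left(\frac{\Gamma_T}{T}\right)^{1/5}
  \;\longrightarrow\; 0
  \quad\Longleftrightarrow\quad
  \Gamma_T = o(T).
\]
% Illustrative case (assumed, not from the source): \Gamma_T = \sqrt{T}
\[
  \Gamma_T = T^{1/2}
  \;\Longrightarrow\;
  \Gamma_T^{1/5}\, T^{4/5} = T^{1/10}\, T^{4/5} = T^{9/10}.
\]
```

Under this reading, the bound remains sub-linear whenever the number of changes grows more slowly than the episode length, and it becomes linear only when the system changes on a constant fraction of the time steps.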
