The online control of unknown time-varying dynamical systems is a
relatively under-explored topic. In this work, we present regret guarantees with
respect to a stronger notion of system variability than that of
\cite{minasyan2021online} and thus provide sub-linear regret guarantees for a
much broader range of scenarios. Specifically, we give regret guarantees in
terms of the number of changes, rather than the average squared deviation used in
\cite{minasyan2021online}. The online control algorithm we propose continuously
updates its estimate to track the changes and employs an online optimizer to
simultaneously optimize the control policy. We show that our algorithm
achieves sub-linear regret with respect to the number of changes in two
settings: (i) matched-disturbance systems with general convex cost functions, and
(ii) general systems with linear cost functions. Specifically, a regret of
$\Gamma_T^{1/5} T^{4/5}$ can be achieved, where $\Gamma_T$ is the number of
changes in the underlying system and $T$ is the duration of the control
episode.
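
As a quick check on when this bound is meaningful, write $\Gamma_T$ for the
number of changes and divide the stated regret by the horizon $T$, ignoring
constants and any logarithmic factors:
\[
  \frac{\Gamma_T^{1/5}\, T^{4/5}}{T} = \left(\frac{\Gamma_T}{T}\right)^{1/5}.
\]
The bound on the average regret therefore vanishes exactly when
$\Gamma_T = o(T)$, that is, when the number of changes grows more slowly than
the episode length. For instance, assuming $\Gamma_T = \sqrt{T}$ (an
illustrative choice, not a case singled out above), the bound becomes
$T^{9/10}$.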