This paper is concerned with understanding and countering the effects of
database attacks on a learning-based linear quadratic adaptive controller. Such
an attack targets neither sensors nor actuators; it poisons only the learning
algorithm and parameter estimator that are part of the regulation scheme. We
focus on the adaptive optimal control algorithm introduced by Abbasi-Yadkori
and Szepesvari and provide a regret analysis in the presence of attacks, as well
as modifications that mitigate their effects. A core step of this algorithm is
self-regularized online least-squares estimation, which determines a tight
confidence set that contains the true system parameters with high probability.
In the absence of malicious data injection, this set provides an appropriate
estimate of the parameters for control design. However, in the presence
of an attack, this confidence set is no longer reliable. Hence, we first tackle
the question of how to adjust the confidence set so that it can compensate for
the effect of the poisoned data. Then, we quantify the deleterious effect of
this type of attack on the optimality of the control policy by providing a measure
that we call attack regret.
Comment: 10 pages
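
The regularized least-squares step mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the example system matrices, the noise level, the function names (`simulate`, `ridge_estimate`, `weighted_error`, `in_confidence_set`), and in particular the corruption model, which simply adds a constant offset to some stored next-state records. The confidence radius `beta` is left as a user-supplied parameter rather than computed from any specific bound.

```python
import numpy as np


def simulate(T=500, noise=0.1, seed=0):
    """Roll out a hypothetical linear system x_{t+1} = A x_t + B u_t + w_t.

    Returns the regressors Z (rows z_t = [x_t, u_t]), the next states X,
    and the true parameter matrix Theta with x_{t+1}^T = z_t^T Theta + w_t^T.
    """
    rng = np.random.default_rng(seed)
    A = np.array([[0.9, 0.1], [0.0, 0.8]])  # assumed example dynamics
    B = np.array([[0.0], [1.0]])            # assumed example input matrix
    theta_true = np.vstack([A.T, B.T])      # shape (3, 2)
    x = np.zeros(2)
    Z, X = [], []
    for _ in range(T):
        u = rng.normal(size=1)              # exploratory random input
        z = np.concatenate([x, u])
        x = A @ x + B @ u + noise * rng.normal(size=2)
        Z.append(z)
        X.append(x)
    return np.array(Z), np.array(X), theta_true


def ridge_estimate(Z, X, lam=1.0):
    """Regularized least squares: Theta_hat = (Z^T Z + lam I)^{-1} Z^T X."""
    V = Z.T @ Z + lam * np.eye(Z.shape[1])
    theta_hat = np.linalg.solve(V, Z.T @ X)
    return theta_hat, V


def weighted_error(theta, theta_hat, V):
    """Ellipsoidal distance trace((Theta_hat - Theta)^T V (Theta_hat - Theta))."""
    D = theta_hat - theta
    return float(np.trace(D.T @ V @ D))


def in_confidence_set(theta, theta_hat, V, beta):
    """Membership test for an ellipsoid of user-supplied radius beta."""
    return weighted_error(theta, theta_hat, V) <= beta


if __name__ == "__main__":
    Z, X, theta_true = simulate()
    theta_hat, V = ridge_estimate(Z, X)
    print("clean weighted error:", weighted_error(theta_true, theta_hat, V))

    # Hypothetical poisoning: offset every fifth stored next-state record.
    X_bad = X.copy()
    X_bad[::5] += 5.0
    theta_bad, V_bad = ridge_estimate(Z, X_bad)
    print("poisoned weighted error:", weighted_error(theta_true, theta_bad, V_bad))
```

In this toy setting the ellipsoidal distance between the estimate and the true parameters grows once the stored records are corrupted, so an ellipsoid sized for clean data may no longer contain the true parameters. This is only meant to convey why the confidence set would need adjusting; it does not reproduce the paper's attack model or its adjusted set.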