Learning-based attacks in cyber-physical systems
We introduce the problem of learning-based attacks in a simple abstraction of
cyber-physical systems---the case of a discrete-time, linear, time-invariant
plant that may be subject to an attack that overrides the sensor readings and
the controller actions. The attacker attempts to learn the dynamics of the
plant and subsequently override the controller's actuation signal, to destroy
the plant without being detected. The attacker can feed fictitious sensor
readings to the controller using its estimate of the plant dynamics and mimic
the legitimate plant operation. The controller, on the other hand, is
constantly on the lookout for an attack; once the controller detects an attack,
it immediately shuts the plant off. In the case of scalar plants, we derive an
upper bound on the attacker's deception probability for any measurable control
policy when the attacker uses an arbitrary learning algorithm to estimate the
system dynamics. We then derive lower bounds for the attacker's deception
probability for both scalar and vector plants by assuming a specific
authentication test that inspects the empirical variance of the system
disturbance. We also show how the controller can improve the security of the
system by superimposing a carefully crafted privacy-enhancing signal on top of
the "nominal control policy." Finally, for nonlinear scalar dynamics that
belong to the Reproducing Kernel Hilbert Space (RKHS), we investigate the
performance of attacks based on nonlinear Gaussian-process (GP) learning
algorithms.
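To make the variance-based authentication idea concrete, here is a minimal sketch for a scalar plant. The dead-beat nominal policy, the parameter values, and the fixed detection threshold are illustrative assumptions, not the paper's construction:

```python
import random
import statistics

def variance_test(a_hat=None, a=0.9, sigma=1.0, T=2000, threshold=0.3, seed=0):
    """Scalar plant x' = a*x + u + w with w ~ N(0, sigma^2).
    The controller applies the (illustrative) dead-beat policy u = -a * reported
    state and checks whether the empirical variance of its one-step prediction
    residuals matches the known disturbance variance sigma^2.
    If a_hat is given, an attacker overrides the sensor readings with a
    fictitious trajectory simulated from its imperfect estimate a_hat."""
    rng = random.Random(seed)
    x = 0.0    # true plant state
    xf = 0.0   # attacker's fictitious state (what the controller sees)
    residuals = []
    for _ in range(T):
        reported = xf if a_hat is not None else x
        u = -a * reported                       # nominal control policy
        predicted = a * reported + u            # controller's one-step prediction
        x = a * x + u + rng.gauss(0.0, sigma)   # true dynamics evolve
        if a_hat is not None:
            # the attacker mimics the plant using a_hat in place of a
            xf = a_hat * xf + u + rng.gauss(0.0, sigma)
            reported_next = xf
        else:
            reported_next = x
        residuals.append(reported_next - predicted)
    v = statistics.pvariance(residuals)
    return abs(v - sigma ** 2) > threshold   # True means an attack is flagged
```

Under no attack the residuals are exactly the disturbance, so their empirical variance stays near sigma^2; an attacker whose estimate of the dynamics is off injects extra residual variance that the test flags, which is why the deception probability hinges on how accurately the dynamics are learned.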
Data-Driven H-infinity Control with a Real-Time and Efficient Reinforcement Learning Algorithm: An Application to Autonomous Mobility-on-Demand Systems
Reinforcement learning (RL) is a class of artificial intelligence algorithms
being used to design adaptive optimal controllers through online learning. This
paper presents a model-free, real-time, data-efficient Q-learning-based
algorithm to solve the H-infinity control problem for linear discrete-time
systems. The computational complexity is shown to reduce from O(q^3) in the
literature to O(q^2) in the proposed algorithm, where q
is quadratic in the sum of the sizes of the state variables, control inputs, and
disturbance. An adaptive optimal controller is designed and the parameters of
the action and critic networks are learned online without the knowledge of the
system dynamics, making the proposed algorithm completely model-free. Moreover,
sufficiently exciting probing noise is needed only in the first iteration and
does not affect the proposed algorithm. With no need for an initial stabilizing
policy,
the algorithm converges to the closed-form solution obtained by solving the
Riccati equation. A simulation study is performed by applying the proposed
algorithm to real-time control of an autonomous mobility-on-demand (AMoD)
system for a real-world case study to evaluate the effectiveness of the
proposed algorithm.
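For reference, the closed-form benchmark mentioned above can be computed for a scalar plant by iterating the game-theoretic Riccati equation. The plant and weight values below are hypothetical, and this model-based iteration stands in for, rather than reproduces, the model-free Q-learning scheme:

```python
def hinf_riccati(a=1.1, b=1.0, e=1.0, q=1.0, r=1.0, gamma=5.0, iters=200):
    """Iterate the scalar game-theoretic (H-infinity) Riccati equation for
    x' = a*x + b*u + e*w with stage cost q*x^2 + r*u^2 - gamma^2 * w^2.
    Returns the coefficient P of the quadratic cost-to-go P*x^2."""
    P = q
    for _ in range(iters):
        # 2x2 "denominator" block of the Q-function in the (u, w) directions
        m11 = r + b * b * P
        m12 = b * e * P
        m22 = e * e * P - gamma * gamma
        det = m11 * m22 - m12 * m12
        v1 = a * b * P    # cross terms coupling the state with u and w
        v2 = a * e * P
        # v * inv(M) * v' for the symmetric 2x2 matrix M = [[m11, m12], [m12, m22]]
        correction = (v1 * v1 * m22 - 2.0 * v1 * v2 * m12 + v2 * v2 * m11) / det
        P = q + a * a * P - correction
    return P
```

For gamma large enough the iteration is a contraction and P converges to the fixed point of the game Riccati equation; the model-free algorithm in the abstract is said to learn the same solution from data, without knowledge of a, b, or e.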
On the Utility of Model Learning in HRI
Fundamental to robotics is the debate between model-based and model-free learning: should the robot build an explicit model of the world, or learn a policy directly? In the context of HRI, part of the world to be modeled is the human. One option is for the robot to treat the human as a black box and learn a policy for how they act directly. But the robot can also model the human as an agent, and rely on a “theory of mind” to guide or bias the learning (grey box). We contribute a characterization of the performance of these methods under the optimistic case of having an ideal theory of mind, as well as under different scenarios in which the assumptions behind the robot's theory of mind for the human are wrong, as they inevitably will be in practice. We find that there is a significant sample-complexity advantage to theory-of-mind methods and that they are more robust to covariate shift, but that when enough interaction data is available, black-box approaches eventually dominate.
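The sample-complexity gap can be illustrated with a toy noisily-rational (Boltzmann) human model. The option features, the fixed interaction log, and the grid-search maximum-likelihood fit below are all illustrative assumptions rather than the paper's setup:

```python
import math

FEATURES = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical per-option features
THETA_TRUE = 1.0                        # the human's true (hidden) weight

def boltzmann(theta):
    """Noisily-rational human: P(option k) proportional to exp(theta * f_k)."""
    w = [math.exp(theta * f) for f in FEATURES]
    z = sum(w)
    return [wi / z for wi in w]

# A small, fixed interaction log standing in for the low-data regime.
choices = [4, 4, 3, 4, 2, 4]

# Black box: estimate the human's policy directly as empirical frequencies.
black = [choices.count(k) / len(choices) for k in range(len(FEATURES))]

# Grey box ("theory of mind"): assume the Boltzmann model and fit only the
# single parameter theta by maximum likelihood over a grid.
def loglik(theta):
    p = boltzmann(theta)
    return sum(math.log(p[c]) for c in choices)

theta_hat = max((t / 100.0 for t in range(301)), key=loglik)
grey = boltzmann(theta_hat)

def tv(p, q):
    """Total-variation distance between two choice distributions."""
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))

true_policy = boltzmann(THETA_TRUE)
```

With only six observations the one-parameter grey-box fit is already close to the true policy, while the frequency table still assigns zero probability to unseen options; with enough data the black-box estimate catches up, and unlike the grey box it cannot be misled by a wrong choice model.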