12 research outputs found
Data-Driven Control and Data-Poisoning attacks in Buildings: the KTH Live-In Lab case study
This work investigates the feasibility of using input-output data-driven
control techniques for building control and their susceptibility to
data-poisoning techniques. The analysis is performed on a digital replica of
the KTH Live-In Lab, a validated non-linear model representing one of the KTH
Live-In Lab building testbeds. This work is motivated by recent trends showing
a surge of interest in using data-based techniques to control cyber-physical
systems. We also analyze the susceptibility of these controllers to
data-poisoning methods, a particular type of machine learning threat geared
towards finding imperceptible attacks that can undermine the performance of the
system under consideration. We consider the Virtual Reference Feedback Tuning
(VRFT), a popular data-driven control technique, and show its performance on
the KTH Live-In Lab digital replica. We then demonstrate how poisoning attacks
can be crafted and illustrate the impact of such attacks. Numerical experiments
reveal the feasibility of using data-driven control methods for finding
efficient control laws. However, a subtle change in the datasets can
significantly deteriorate the performance of VRFT.
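The VRFT idea described in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the first-order toy plant (`simulate_plant`), the first-order reference model with pole `a`, the PI controller structure, and the helper name `vrft_pi`. The sketch computes the virtual reference that would have produced the measured output through the reference model, then fits the controller gains by least squares so that the controller maps the virtual error to the recorded input.

```python
import random


def simulate_plant(u, b=0.9, c=0.5):
    """Hypothetical first-order plant y[t+1] = b*y[t] + c*u[t], starting at rest."""
    y = [0.0]
    for u_t in u:
        y.append(b * y[-1] + c * u_t)
    return y


def vrft_pi(u, y, a=0.6):
    """Fit PI gains (kp, ki) from one batch of input-output data, VRFT style.

    Assumed reference model: y_d[t+1] = a*y_d[t] + (1 - a)*r[t].
    Assumed controller:      u[t] = kp*e[t] + ki*sum(e[0..t]).
    """
    n = len(u)
    # Virtual reference: the r that would reproduce the measured y
    # if the closed loop behaved exactly like the reference model.
    r_virtual = [(y[t + 1] - a * y[t]) / (1.0 - a) for t in range(n)]
    # Virtual tracking error seen by the (not yet known) controller.
    e = [r_virtual[t] - y[t] for t in range(n)]
    # Running sum of the error, i.e. the integral regressor of the PI law.
    s, acc = [], 0.0
    for e_t in e:
        acc += e_t
        s.append(acc)
    # Least squares for u ~ kp*e + ki*s via the 2x2 normal equations.
    a11 = sum(x * x for x in e)
    a12 = sum(x * z for x, z in zip(e, s))
    a22 = sum(z * z for z in s)
    b1 = sum(x * v for x, v in zip(e, u))
    b2 = sum(z * v for z, v in zip(s, u))
    det = a11 * a22 - a12 * a12
    kp = (b1 * a22 - a12 * b2) / det
    ki = (a11 * b2 - a12 * b1) / det
    return kp, ki


if __name__ == "__main__":
    rng = random.Random(0)
    u = [rng.uniform(-1.0, 1.0) for _ in range(200)]  # exciting input signal
    y = simulate_plant(u)
    print(vrft_pi(u, y))
```

In this noiseless toy setting the controller that matches the reference model exactly happens to be a PI controller, with kp = (1 - a)b/c and ki = (1 - a)(1 - b)/c, so the fit should recover those gains. Because the gains are computed directly from the recorded `u` and `y`, any modification of that dataset changes the resulting controller, which is the attack surface the abstract refers to.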
Recurrent Equilibrium Networks: Flexible Dynamic Models with Guaranteed Stability and Robustness
This paper introduces recurrent equilibrium networks (RENs), a new class of
nonlinear dynamical models for applications in machine learning, system
identification and control. The new model class has "built in" guarantees of
stability and robustness: all models in the class are contracting - a strong
form of nonlinear stability - and models can satisfy prescribed incremental
integral quadratic constraints (IQC), including Lipschitz bounds and
incremental passivity. RENs are otherwise very flexible: they can represent all
stable linear systems, all previously-known sets of contracting recurrent
neural networks and echo state networks, all deep feedforward neural networks,
and all stable Wiener/Hammerstein models. RENs are parameterized directly by a
vector in R^N, i.e. stability and robustness are ensured without parameter
constraints, which simplifies learning since generic methods for unconstrained
optimization can be used. The performance and robustness of the new model set
are evaluated on benchmark nonlinear system identification problems, and the
paper also presents applications in data-driven nonlinear observer design and
control with stability guarantees. Comment: Journal submission, extended version of conference paper (v1 of this
arxiv preprint)
Comparative Evaluation for Effectiveness Analysis of Policy Based Deep Reinforcement Learning Approaches
Deep Reinforcement Learning (DRL) has proven to be a very strong technique, with results in various applications in recent years. In particular, achievements in robotics suggest that much more progress will be made in this field. Policy choices and parameter settings play an active role in the success of DRL. In this study, the policies used in recent DRL studies are analyzed. Policies used in the literature are grouped under three headings: value-based, policy-based and actor-critic. In addition, the problem of collaborative agents moving a common target under Newton's law of motion is presented. Training is carried out in a frictionless environment with two agents and one object using four different policies. The agents try to push the object out of the area it is in by colliding with it. A two-dimensional surface is used during the training phase. After training, each policy is reported separately and its success is observed. Test results are discussed in Section 5. Thus, the policies used in deep reinforcement learning approaches are both described and tested in an application.
Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies
In light of the burgeoning success of reinforcement learning (RL) in diverse
real-world applications, considerable focus has been directed towards ensuring
RL policies are robust to adversarial attacks during test time. Current
approaches largely revolve around solving a minimax problem to prepare for
potential worst-case scenarios. While effective against strong attacks, these
methods often compromise performance in the absence of attacks or the presence
of only weak attacks. To address this, we study policy robustness under the
well-accepted state-adversarial attack model, extending our focus beyond only
worst-case attacks. We first formalize this task at test time as a regret
minimization problem and establish its intrinsic hardness in achieving
sublinear regret when the baseline policy is from a general continuous policy
class, \Pi. This finding prompts us to \textit{refine} the baseline policy
class prior to test time, aiming for efficient adaptation within a finite
policy class \Tilde{\Pi}, which can resort to an adversarial bandit
subroutine. In light of the importance of a small, finite \Tilde{\Pi}, we
propose a novel training-time algorithm to iteratively discover
\textit{non-dominated policies}, forming a near-optimal and minimal
\Tilde{\Pi}, thereby ensuring both robustness and test-time efficiency.
Empirical validation on MuJoCo corroborates the superiority of our approach
in terms of natural and robust performance, as well as adaptability to various
attack scenarios. Comment: International Conference on Learning Representations (ICLR) 2024,
spotlight
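To make the "adversarial bandit subroutine over a finite policy class" mentioned in the abstract above more concrete, here is a minimal sketch of the standard EXP3 algorithm choosing, episode by episode, which policy from a small finite set to deploy. This is a generic textbook routine and not the paper's algorithm; the function name `exp3_select_policies`, the callback `episode_return`, and the assumption that returns are scaled to [0, 1] are all hypothetical choices made for illustration.

```python
import math
import random


def exp3_select_policies(episode_return, num_policies, num_rounds, gamma=0.1, seed=0):
    """Run EXP3 over a finite set of policies and return per-policy deployment counts.

    episode_return(t, k) is a hypothetical callback giving the return, scaled
    to [0, 1], of deploying policy k in episode t (possibly under attack).
    """
    rng = random.Random(seed)
    weights = [1.0] * num_policies
    counts = [0] * num_policies
    for t in range(num_rounds):
        total = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        probs = [(1.0 - gamma) * w / total + gamma / num_policies for w in weights]
        k = rng.choices(range(num_policies), weights=probs)[0]
        reward = episode_return(t, k)
        # Importance-weighted estimate keeps the reward estimate unbiased
        # even though only the deployed policy's return is observed.
        estimate = reward / probs[k]
        weights[k] *= math.exp(gamma * estimate / num_policies)
        # Rescale so the largest weight is 1; this leaves probs unchanged
        # and avoids floating-point overflow on long runs.
        largest = max(weights)
        weights = [w / largest for w in weights]
        counts[k] += 1
    return counts


if __name__ == "__main__":
    # Toy setting: policy 2 performs best under the (fixed) attack.
    print(exp3_select_policies(lambda t, k: 1.0 if k == 2 else 0.2, 3, 2000))
```

The cost of such a routine grows with the number of candidate policies, which is consistent with the abstract's emphasis on keeping the refined policy class small and finite.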
A Survey on Reinforcement Learning Security with Application to Autonomous Driving
Reinforcement learning allows machines to learn from their own experience.
Nowadays, it is used in safety-critical applications, such as autonomous
driving, despite being vulnerable to attacks carefully crafted either to
prevent the reinforcement learning algorithm from learning an effective and
reliable policy, or to induce the trained agent to make a wrong decision. The
literature about the security of reinforcement learning is rapidly growing, and
some surveys have been proposed to shed light on this field. However, their
categorizations are insufficient for choosing an appropriate defense given the
kind of system at hand. In our survey, we not only overcome this limitation
by considering a different perspective, but we also discuss the applicability
of state-of-the-art attacks and defenses when reinforcement learning algorithms
are used in the context of autonomous driving.
Adversarial Patch Attacks on Deep Reinforcement Learning Algorithms
Adversarial patch attacks have been shown to cause deep neural networks to misclassify inputs as a target label even when the patch is small relative to the input image; however, their effectiveness has never been tested on deep reinforcement learning algorithms. We design algorithms that generate adversarial patches to attack two types of deep reinforcement learning algorithms: deep Q-networks (DQN) and proximal policy optimization (PPO). Our patch-generation algorithms consist of two parts: choosing the attack position and training the adversarial patch at that position. Under the same bound on total perturbation, adversarial patch attacks achieve results comparable to FGSM and PGD attacks on Atari and Procgen environments, for DQN and PPO respectively. In addition, we design a Context Re-Constructor to reconstruct the state when it is corrupted by the patch. Based on the reconstructed states, we can identify the patch position and then use mask defense and recover defense to defend against the adversarial patch. Lastly, we test the transferability of adversarial patches.
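For readers unfamiliar with the FGSM baseline named in the abstract above, the following is a minimal sketch of a one-step sign-gradient attack on an agent's observation. It is a toy under loud assumptions: the agent uses a linear Q-function (one weight vector per action) instead of a deep network, the state is a short list of floats instead of an image, and all function names are hypothetical. It does not implement the paper's patch attack.

```python
def q_values(weights, state):
    """Linear Q-function (an assumption): Q(s, a) = weights[a] . s."""
    return [sum(w_i * s_i for w_i, s_i in zip(w, state)) for w in weights]


def greedy_action(weights, state):
    """Index of the action with the highest Q-value."""
    q = q_values(weights, state)
    return max(range(len(q)), key=q.__getitem__)


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def fgsm_state_attack(weights, state, epsilon):
    """One FGSM step that pushes the runner-up action above the greedy one.

    Attack loss: Q(s, runner_up) - Q(s, best). For a linear Q-function its
    gradient with respect to the state is weights[runner_up] - weights[best].
    Each state entry moves by at most epsilon (an L-infinity bound).
    """
    q = q_values(weights, state)
    best = greedy_action(weights, state)
    runner_up = max((a for a in range(len(q)) if a != best), key=q.__getitem__)
    grad = [w_r - w_b for w_r, w_b in zip(weights[runner_up], weights[best])]
    return [s + epsilon * sign(g) for s, g in zip(state, grad)]


if __name__ == "__main__":
    weights = [[1.0, -1.0], [-1.0, 1.0]]  # two actions, two state features
    state = [0.3, 0.1]
    adversarial = fgsm_state_attack(weights, state, epsilon=0.2)
    print(greedy_action(weights, state), greedy_action(weights, adversarial))
```

FGSM spreads a small perturbation over every input dimension, whereas a patch attack concentrates its perturbation in one region; the abstract compares the two under the same bound on total perturbation.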