Search CORE

12 research outputs found

Data-Driven Control and Data-Poisoning attacks in Buildings: the KTH Live-In Lab case study

Author: Molinari Marco
Proutiere Alexandre
Russo Alessio
Publication date: 01/01/2021

This work investigates the feasibility of using input-output data-driven control techniques for building control and their susceptibility to data-poisoning techniques. The analysis is performed on a digital replica of the KTH Live-In Lab, a non-linear validated model representing one of the KTH Live-In Lab building testbeds. This work is motivated by recent trends showing a surge of interest in using data-based techniques to control cyber-physical systems. We also analyze the susceptibility of these controllers to data-poisoning methods, a particular type of machine learning threat geared towards finding imperceptible attacks that can undermine the performance of the system under consideration. We consider Virtual Reference Feedback Tuning (VRFT), a popular data-driven control technique, and show its performance on the KTH Live-In Lab digital replica. We then demonstrate how poisoning attacks can be crafted and illustrate the impact of such attacks. Numerical experiments reveal the feasibility of using data-driven control methods for finding efficient control laws. However, a subtle change in the datasets can significantly deteriorate the performance of VRFT.

arXiv.org e-Print Archive

Publikationer från KTH

Digitala Vetenskapliga Arkivet - Academic Archive On-line
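
The VRFT idea named in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a hypothetical first-order linear plant `y[t+1] = p*y[t] + b*u[t]`, a first-order reference model with pole `a`, and a PI controller class `u[t] - u[t-1] = theta1*e[t] + theta2*e[t-1]`. The KTH Live-In Lab replica itself is non-linear and is not modelled.

```python
import random

def simulate_plant(u, p=0.9, b=0.1):
    """Hypothetical plant used only to generate data: y[t+1] = p*y[t] + b*u[t]."""
    y = [0.0]
    for u_t in u:
        y.append(p * y[-1] + b * u_t)
    return y  # length len(u) + 1

def vrft_pi(u, y, a):
    """Fit PI gains from one batch of input-output data (no plant model used).

    Reference model (assumed): y[t+1] = a*y[t] + (1 - a)*r[t].
    """
    n = len(u)
    # Virtual error: e[t] = r_v[t] - y[t], where r_v is the reference that
    # would make the reference model reproduce the measured output.
    e = [(y[t + 1] - a * y[t]) / (1 - a) - y[t] for t in range(n)]
    # Least squares for u[t] - u[t-1] = theta1*e[t] + theta2*e[t-1],
    # solved through the 2x2 normal equations.
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(1, n):
        du = u[t] - u[t - 1]
        s11 += e[t] * e[t]
        s12 += e[t] * e[t - 1]
        s22 += e[t - 1] * e[t - 1]
        b1 += e[t] * du
        b2 += e[t - 1] * du
    det = s11 * s22 - s12 * s12
    return (s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det

rng = random.Random(0)
u = [rng.uniform(-1.0, 1.0) for _ in range(200)]  # open-loop excitation
y = simulate_plant(u)
theta1, theta2 = vrft_pi(u, y, a=0.5)
print(theta1, theta2)
```

For this noise-free toy plant the fitted gains should be close to `(1 - a)/b = 5` and `-p*(1 - a)/b = -4.5`, the controller that matches the reference model exactly. Because the gains are a direct function of the recorded `u` and `y`, modifying those records changes the controller, which is the dependence a data-poisoning attack targets.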

Recurrent Equilibrium Networks: Flexible Dynamic Models with Guaranteed Stability and Robustness

Author: Manchester Ian R.
Revay Max
Wang Ruigang
Publication date: 13/07/2021

This paper introduces recurrent equilibrium networks (RENs), a new class of nonlinear dynamical models for applications in machine learning, system identification and control. The new model class has "built-in" guarantees of stability and robustness: all models in the class are contracting - a strong form of nonlinear stability - and models can satisfy prescribed incremental integral quadratic constraints (IQC), including Lipschitz bounds and incremental passivity. RENs are otherwise very flexible: they can represent all stable linear systems, all previously-known sets of contracting recurrent neural networks and echo state networks, all deep feedforward neural networks, and all stable Wiener/Hammerstein models. RENs are parameterized directly by a vector in R^N, i.e. stability and robustness are ensured without parameter constraints, which simplifies learning since generic methods for unconstrained optimization can be used. The performance and robustness of the new model set is evaluated on benchmark nonlinear system identification problems, and the paper also presents applications in data-driven nonlinear observer design and control with stability guarantees. Comment: Journal submission, extended version of conference paper (v1 of this arxiv preprint

arXiv.org e-Print Archive

Comparative Evaluation for Effectiveness Analysis of Policy Based Deep Reinforcement Learning Approaches

Author: Karaköse Mehmet
Tan Ziya
Publication venue: 'Asian Online Journals'
Publication date: 18/06/2021

Deep Reinforcement Learning (DRL) has proven to be a very strong technique, with results in various applications in recent years. In particular, achievements in robotics studies suggest that much more progress will be made in this field. Undoubtedly, policy choices and parameter settings play an active role in the success of DRL. In this study, the policies used in recent DRL studies are analyzed. Policies used in the literature are grouped under three headings: value-based, policy-based and actor-critic. In addition, a problem is presented in which collaborative agents move a common target according to Newton's law of motion. Training is carried out in a frictionless environment with two agents and one object using four different policies. The agents try to push the object by colliding with it and to move it out of the area it is in. A two-dimensional surface is used during the training phase. After training, the results for each policy are reported separately and its success is observed. Test results are discussed in section 5. Thus, the policies used in deep reinforcement learning approaches are both described and tested in an application.

International Journal of Computer and Information Technology

Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies

Author: Deng Chenghao
Huang Furong
Liang Yongyuan
Liu Xiangyu
Sun Yanchao
Publication date: 19/02/2024

In light of the burgeoning success of reinforcement learning (RL) in diverse real-world applications, considerable focus has been directed towards ensuring RL policies are robust to adversarial attacks during test time. Current approaches largely revolve around solving a minimax problem to prepare for potential worst-case scenarios. While effective against strong attacks, these methods often compromise performance in the absence of attacks or the presence of only weak attacks. To address this, we study policy robustness under the well-accepted state-adversarial attack model, extending our focus beyond only worst-case attacks. We first formalize this task at test time as a regret minimization problem and establish its intrinsic hardness in achieving sublinear regret when the baseline policy is from a general continuous policy class, $\Pi$. This finding prompts us to refine the baseline policy class $\Pi$ prior to test time, aiming for efficient adaptation within a finite policy class $\tilde{\Pi}$, which can resort to an adversarial bandit subroutine. In light of the importance of a small, finite $\tilde{\Pi}$, we propose a novel training-time algorithm to iteratively discover non-dominated policies, forming a near-optimal and minimal $\tilde{\Pi}$, thereby ensuring both robustness and test-time efficiency. Empirical validation on MuJoCo corroborates the superiority of our approach in terms of natural and robust performance, as well as adaptability to various attack scenarios. Comment: International Conference on Learning Representations (ICLR) 2024, spotlight

arXiv.org e-Print Archive
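
The abstract above mentions selecting among a finite policy class at test time with an adversarial bandit subroutine, but does not say which one. As a hedged illustration only, the sketch below uses EXP3, a standard adversarial bandit algorithm, with each arm standing for one candidate policy. The function name `exp3`, the exploration rate `gamma`, and the toy reward function are assumptions introduced here; rewards are assumed to lie in [0, 1].

```python
import math
import random

def exp3(num_arms, reward_fn, rounds, gamma=0.1, seed=0):
    """EXP3 over a finite set of arms; returns how often each arm was played.

    reward_fn(arm, t) must return a reward in [0, 1]. In the setting of the
    abstract, an arm would index one policy in the finite class and the
    reward would be a normalised episode return (both assumptions here).
    """
    rng = random.Random(seed)
    weights = [1.0] * num_arms
    pulls = [0] * num_arms
    for t in range(rounds):
        total = sum(weights)
        # Mix the weight distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / num_arms for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        reward = reward_fn(arm, t)
        # Importance-weighted estimate: only the played arm is updated.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)
        # Rescale so the weights stay bounded over long runs.
        largest = max(weights)
        weights = [w / largest for w in weights]
        pulls[arm] += 1
    return pulls

# Toy stand-in for three candidate policies: arm 2 performs best.
pulls = exp3(3, lambda arm, t: 0.9 if arm == 2 else 0.1, rounds=2000)
print(pulls)
```

The cost of exploring grows with the number of arms, which is consistent with the abstract's emphasis on keeping the finite policy class small.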

A Survey on Reinforcement Learning Security with Application to Autonomous Driving

Author: Biggio Battista
Demetrio Luca
Demontis Ambra
Fang Chengfang
Grosse Kathrin
Lin Hsiao-Ying
Pintor Maura
Roli Fabio
Publication date: 12/12/2022

Reinforcement learning allows machines to learn from their own experience. Nowadays, it is used in safety-critical applications, such as autonomous driving, despite being vulnerable to attacks carefully crafted either to prevent the reinforcement learning algorithm from learning an effective and reliable policy or to induce the trained agent to make a wrong decision. The literature about the security of reinforcement learning is rapidly growing, and some surveys have been proposed to shed light on this field. However, their categorizations are insufficient for choosing an appropriate defense given the kind of system at hand. In our survey, we not only overcome this limitation by considering a different perspective, but also discuss the applicability of state-of-the-art attacks and defenses when reinforcement learning algorithms are used in the context of autonomous driving.

arXiv.org e-Print Archive

Adversarial Patch Attacks on Deep Reinforcement Learning Algorithms

Author: Tong Peizhen
Publication venue: Washington University Open Scholarship
Publication date: 15/05/2023

Adversarial patch attacks have been shown to cause deep neural networks to misclassify inputs as a target label even when the patch is small relative to the input image; however, the effectiveness of adversarial patch attacks has never been tested on deep reinforcement learning algorithms. We design algorithms to generate adversarial patches to attack two types of deep reinforcement learning algorithms: deep Q-networks (DQN) and proximal policy optimization (PPO). Our patch-generation algorithms consist of two parts: choosing the attack position and training the adversarial patch at that position. Under the same bound on total perturbation, adversarial patch attacks achieve results comparable to FGSM and PGD attacks on Atari and Procgen environments, for DQN and PPO respectively. In addition, we design a Context Re-Constructor to reconstruct the state when it is corrupted by the patch. Based on the reconstructed states, we can identify the patch position and then use mask defense and recover defense to defend against the adversarial patch. Lastly, we test the transferability of adversarial patches.

Washington University St. Louis: Open Scholarship
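
The abstract above uses FGSM as a comparison baseline. For readers unfamiliar with it, the following minimal sketch shows the one-step update `x_adv = x + eps * sign(grad_x loss)`. The linear softmax policy, the weight matrix `W`, and the observation `x` are hypothetical stand-ins chosen so that the gradient can be written by hand; they are not the DQN or PPO networks, nor the patch attack, studied in the thesis.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [v / s for v in exps]

def loss_and_grad(W, x, action):
    """Cross-entropy of `action` under a linear softmax policy, and its
    gradient with respect to the observation x."""
    logits = [sum(w * x_j for w, x_j in zip(row, x)) for row in W]
    p = softmax(logits)
    loss = -math.log(p[action])
    # d loss / d logits = p - onehot(action); chain rule through logits = W x.
    dlogits = [p_k - (1.0 if k == action else 0.0) for k, p_k in enumerate(p)]
    grad = [sum(dlogits[k] * W[k][j] for k in range(len(W)))
            for j in range(len(x))]
    return loss, grad

def fgsm(W, x, action, eps):
    """One FGSM step: move each input coordinate by eps in the direction
    that increases the loss of the originally chosen action."""
    _, grad = loss_and_grad(W, x, action)
    return [x_j + eps * ((g > 0) - (g < 0)) for x_j, g in zip(x, grad)]

W = [[1.0, -0.5, 0.2], [-0.3, 0.8, 0.1]]  # hypothetical 2-action policy
x = [0.6, 0.1, -0.2]                      # hypothetical observation
clean_loss, _ = loss_and_grad(W, x, action=0)
x_adv = fgsm(W, x, action=0, eps=0.1)
adv_loss, _ = loss_and_grad(W, x_adv, action=0)
print(clean_loss, adv_loss)
```

Each coordinate moves by at most `eps`, so the perturbation is bounded in the infinity norm. A patch attack, by contrast, concentrates its perturbation budget in one region of the input.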