
    Blackbox Attacks on Reinforcement Learning Agents Using Approximated Temporal Information

    Recent research on reinforcement learning (RL) has suggested that trained agents are vulnerable to maliciously crafted adversarial samples. In this work, we show how such samples can be generalised from White-box and Grey-box attacks to a strong Black-box case, where the attacker has no knowledge of the agents, their training parameters, or their training methods. We use sequence-to-sequence models to predict a single action or a sequence of future actions that a trained agent will make. First, we show that our approximation model, based on time-series information from the agent, consistently predicts RL agents' future actions with high accuracy in a Black-box setup on a wide range of games and RL algorithms. Second, we find that although adversarial samples are transferable from the target model to our RL agents, they often outperform random Gaussian noise only marginally. This highlights a serious methodological deficiency in previous work on such agents; random jamming should have been taken as the baseline for evaluation. Third, we propose a novel use for adversarial samples in Black-box attacks on RL agents: they can be used to trigger a trained agent to misbehave after a specific time delay. This appears to be a genuinely new type of attack. It potentially enables an attacker to use devices controlled by RL agents as time bombs.
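    To make the approximation step concrete, the sketch below shows one plausible shape for such a predictor: a sequence-to-sequence model that encodes a window of observed (state, action) pairs from the target agent and decodes a short sequence of its future actions. This is a minimal sketch assuming PyTorch, discrete actions, and fixed-length windows; the class name, dimensions, and decoding scheme are illustrative rather than the paper's actual architecture.

```python
# Hypothetical sketch: seq2seq predictor of a target agent's future actions.
# Assumptions: PyTorch, discrete actions, fixed-length observation windows.
import torch
import torch.nn as nn

class ActionPredictor(nn.Module):
    """Encode a window of past (observation, action) pairs, decode the next k actions."""

    def __init__(self, obs_dim, n_actions, hidden=128, horizon=4):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.LSTM(obs_dim + n_actions, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, act_seq_onehot):
        # obs_seq: (B, T, obs_dim); act_seq_onehot: (B, T, n_actions)
        x = torch.cat([obs_seq, act_seq_onehot], dim=-1)
        _, (h, c) = self.encoder(x)
        # Repeat the final encoder state as the context for each future step.
        ctx = h[-1].unsqueeze(1).repeat(1, self.horizon, 1)
        out, _ = self.decoder(ctx, (h, c))
        return self.head(out)  # (B, horizon, n_actions) logits

# Toy usage: the model would be trained on traces collected by watching the
# target agent play; argmax over the logits gives the approximated actions.
model = ActionPredictor(obs_dim=16, n_actions=6)
obs = torch.randn(8, 10, 16)
acts = torch.nn.functional.one_hot(torch.randint(0, 6, (8, 10)), 6).float()
print(model(obs, acts).shape)  # torch.Size([8, 4, 6])
```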

    Adversarial jamming attacks and defense strategies via adaptive deep reinforcement learning

    As the applications of deep reinforcement learning (DRL) in wireless communications grow, the sensitivity of DRL-based wireless communication strategies to adversarial attacks has started to draw increasing attention. To address this sensitivity and alleviate the resulting security concerns, in this paper we consider a victim user that performs DRL-based dynamic channel access and an attacker that executes DRL-based jamming attacks to disrupt the victim. Hence, both the victim and the attacker are DRL agents that can interact with each other, retrain their models, and adapt to the opponent's policies. In this setting, we first develop an adversarial jamming attack policy that aims at minimizing the accuracy of the victim's decision making on dynamic channel access. Subsequently, we devise three defense strategies against such an attacker, namely diversified defense with proportional-integral-derivative (PID) control, diversified defense with an imitation attacker, and defense via orthogonal policies. We design these strategies to maximize the attacked victim's accuracy and evaluate their performance.
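    The diversified defense with PID control can be pictured as a feedback loop wrapped around the victim's policy. The sketch below is a hypothetical rendering under assumed quantities: a textbook PID controller raises the probability that the victim deviates from its greedy DRL channel choice whenever the observed success rate falls below a setpoint. The control variable, gains, and helper names are assumptions, not the paper's formulation.

```python
# Hypothetical sketch: a PID controller modulating how often the victim
# deviates from its greedy DRL channel choice (control variable assumed).
import random

class PID:
    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, measurement, dt=1.0):
        error = self.setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def choose_channel(greedy_channel, n_channels, deviation_prob, rng=random):
    """Deviate from the policy's greedy channel with probability deviation_prob."""
    if rng.random() < deviation_prob:
        return rng.randrange(n_channels)
    return greedy_channel

# Toy loop: raise the deviation probability when the observed success rate
# drops below the setpoint (i.e., when the jammer appears to track the policy).
pid = PID(kp=0.5, ki=0.05, kd=0.1, setpoint=0.9)
deviation_prob = 0.0
for step in range(5):
    observed_success = 0.6  # placeholder measurement from the channel
    deviation_prob = min(max(deviation_prob + pid.update(observed_success), 0.0), 1.0)
    action = choose_channel(greedy_channel=2, n_channels=8, deviation_prob=deviation_prob)
    print(step, round(deviation_prob, 3), action)
```

    Driving the deviation probability with the tracking error is one simple way to diversify decisions only when the jammer seems to be locking onto the policy, while otherwise staying close to the learned channel-access strategy.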

    A Survey on Reinforcement Learning Security with Application to Autonomous Driving

    Reinforcement learning allows machines to learn from their own experience. Nowadays, it is used in safety-critical applications, such as autonomous driving, despite being vulnerable to attacks carefully crafted either to prevent the reinforcement learning algorithm from learning an effective and reliable policy, or to induce the trained agent to make a wrong decision. The literature on the security of reinforcement learning is growing rapidly, and some surveys have been proposed to shed light on this field. However, their categorizations are insufficient for choosing an appropriate defense given the kind of system at hand. In our survey, we not only overcome this limitation by considering a different perspective, but also discuss the applicability of state-of-the-art attacks and defenses when reinforcement learning algorithms are used in the context of autonomous driving.

    Exploiting Structure for Scalable and Robust Deep Learning

    Deep learning has seen great success training deep neural networks for complex prediction problems, such as large-scale image recognition, short-term time-series forecasting, and learning behavioral models for games with simple dynamics. However, neural networks have a number of weaknesses: 1) they are not sample-efficient and 2) they are often not robust against (adversarial) input perturbations. Hence, it is challenging to train neural networks for problems with exponential complexity, such as multi-agent games, complex long-term spatiotemporal dynamics, or noisy high-resolution image data. This thesis contributes methods to improve the sample efficiency, expressive power, and robustness of neural networks, by exploiting various forms of low-dimensional structure, such as spatiotemporal hierarchy and multi-agent coordination. We show the effectiveness of this approach in multiple learning paradigms: in both the supervised learning (e.g., imitation learning) and reinforcement learning settings. First, we introduce hierarchical neural networks that model both short-term actions and long-term goals from data, and can learn human-level behavioral models for spatiotemporal multi-agent games, such as basketball, using imitation learning. Second, in reinforcement learning, we show that behavioral policies with a hierarchical latent structure can efficiently learn forms of multi-agent coordination, which enables a form of structured exploration for faster learning. Third, we showcase tensor-train recurrent neural networks that can model high-order multiplicative structure in dynamical systems (e.g., Lorenz dynamics). We show that this model class gives state-of-the-art long-term forecasting performance with very long time horizons for both simulated and real-world traffic and climate data. Finally, we demonstrate two methods for neural network robustness: 1) stability training, a form of stochastic data augmentation to make neural networks more robust, and 2) neural fingerprinting, a method that detects adversarial examples by validating the network's behavior in the neighborhood of any given input. In sum, this thesis takes a step to enable machine learning for the next scale of problem complexity, such as rich spatiotemporal multi-agent games and large-scale robust predictions.
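    Of the two robustness methods, stability training is the simplest to illustrate. The sketch below shows a stability-training-style objective: the usual classification loss on clean inputs plus a penalty on how far the outputs drift under small Gaussian input noise. It assumes PyTorch and a toy classifier; the noise model, loss weighting, and network are placeholders rather than the thesis's exact setup.

```python
# Sketch of a stability-training-style objective (assumed noise model and
# weighting): task loss on clean inputs + KL penalty on output drift under noise.
import torch
import torch.nn as nn
import torch.nn.functional as F

def stability_loss(net, x, y, sigma=0.05, alpha=0.1):
    """Cross-entropy on clean inputs plus a penalty for output drift under Gaussian noise."""
    logits_clean = net(x)
    logits_noisy = net(x + sigma * torch.randn_like(x))
    task = F.cross_entropy(logits_clean, y)
    stability = F.kl_div(
        F.log_softmax(logits_noisy, dim=-1),
        F.softmax(logits_clean, dim=-1),
        reduction="batchmean",
    )
    return task + alpha * stability

# Toy usage with a small placeholder classifier.
net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
x = torch.randn(16, 1, 28, 28)
y = torch.randint(0, 10, (16,))
loss = stability_loss(net, x, y)
loss.backward()
print(float(loss))
```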

    Accelerated Policy Evaluation: Learning Adversarial Environments with Adaptive Importance Sampling

    The evaluation of rare but high-stakes events remains one of the main difficulties in obtaining reliable policies from intelligent agents, especially in large or continuous state/action spaces where limited scalability forces a prohibitively large number of testing iterations. On the other hand, a biased or inaccurate policy evaluation in a safety-critical system could potentially cause unexpected catastrophic failures during deployment. In this paper, we propose the Accelerated Policy Evaluation (APE) method, which simultaneously uncovers rare events and estimates the rare-event probability in Markov decision processes. APE treats the environment's nature as an adversarial agent and, through adaptive importance sampling, learns the zero-variance sampling distribution for policy evaluation. Moreover, APE is scalable to large discrete or continuous spaces by incorporating function approximators. We investigate the convergence properties of the proposed algorithms under suitable regularity conditions. Our empirical studies show that APE estimates the rare-event probability with smaller variance while using orders of magnitude fewer samples than baseline methods in both multi-agent and single-agent environments.
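    The estimator underlying such a method is importance sampling with a likelihood-ratio correction. The sketch below estimates a rare failure probability in a toy chain-style process by sampling from a biased proposal and reweighting each trajectory; the dynamics, proposal, and parameters are assumptions, and APE's adaptive update of the proposal toward the zero-variance distribution is omitted.

```python
# Sketch of importance-sampled rare-event estimation in a toy chain process.
# Assumptions: nominal "failure step" probability p, biased proposal q, fixed horizon.
import random

def estimate_failure_prob(p=0.05, q=0.5, horizon=20, threshold=5, n_rollouts=10_000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        position, weight = 0, 1.0
        for _ in range(horizon):
            step_up = rng.random() < q  # sample from the biased proposal
            weight *= (p / q) if step_up else ((1 - p) / (1 - q))  # likelihood ratio
            position += 1 if step_up else 0
            if position >= threshold:   # rare "failure" event reached
                total += weight         # count the event with its importance weight
                break
    return total / n_rollouts

print(estimate_failure_prob())
```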

    Assessment of the Robustness of Deep Neural Networks (DNNs)

    In the past decade, Deep Neural Networks (DNNs) have demonstrated outstanding performance in various domains. However, recently, some researchers have shown that DNNs are surprisingly vulnerable to adversarial attacks. For instance, adding a small, human-imperceptible perturbation to an input image can fool DNNs, causing the model to make an arbitrarily wrong prediction with high confidence. This raises serious concerns about the readiness of deep learning models, particularly in safety-critical applications, such as surveillance systems, autonomous vehicles, and medical applications. Hence, it is vital to investigate the performance of DNNs in an adversarial environment. In this thesis, we study the robustness of DNNs in three aspects: adversarial attacks, adversarial defence, and robustness verification. First, we address the robustness problems on video models and propose DeepSAVA, a sparse adversarial attack on video models. It adds human-imperceptible perturbations to the crucial frame of the input video to fool classifiers. Additionally, we construct a novel adversarial training framework based on the perturbations generated by DeepSAVA to increase the robustness of video classification models. The results show that DeepSAVA runs a relatively sparse attack on video models, yet achieves state-of-the-art performance in terms of attack success rate and adversarial transferability. Next, we address the challenges of robustness verification in two deep learning models: 3D point cloud models and cooperative multi-agent reinforcement learning models (c-MARLs). Robustness verification aims to provide solid proof of robustness, within an input space, against any adversarial attack. To verify the robustness of 3D point cloud models, we propose an efficient verification framework, 3DVerifier, which tackles the challenges of cross-non-linearity operations in multiplication layers and the high computational complexity of high-dimensional point cloud inputs. We use a linear relaxation function to bound the multiplication layer and combine forward and backward propagation to compute the certified bounds of the outputs of the point cloud models. To certify c-MARLs, we propose a novel certification method, the first to leverage a scalable approach for c-MARLs to determine actions with guaranteed certified bounds. The challenges of c-MARL certification are the uncertainty that accumulates as the number of agents increases and the potentially small impact that changing a single agent's action has on the global team reward. These challenges prevent existing algorithms from being used directly. We employ the false discovery rate (FDR) controlling procedure, which accounts for the importance of each agent, to certify per-state robustness, and we propose a tree-search-based algorithm to find a lower bound on the global reward under the minimal certified perturbation. The experimental results show that the obtained certification bounds are much tighter than those of state-of-the-art RL certification solutions. In summary, this thesis focuses on assessing the robustness of deep learning models that are widely applied in safety-critical systems but rarely studied by the community. This thesis not only investigates the motivation and challenges of assessing the robustness of these deep learning models but also proposes novel and effective approaches to tackle these challenges.
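    As one concrete ingredient of the c-MARL certification step, the false discovery rate control can be illustrated with the standard Benjamini-Hochberg procedure. The sketch below assumes one p-value per agent-level robustness test; how those p-values are produced, the per-agent importance weighting, and the tree search over the global reward are not reproduced here.

```python
# Sketch of the Benjamini-Hochberg FDR-controlling step (assumed inputs:
# one p-value per agent-level robustness test).
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Largest rank k with p_(k) <= k * alpha / m determines the rejections.
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    return sorted(order[:k_max])

# Toy usage: each p-value plays the role of one agent's per-state test.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.26, 0.57, 0.74]
print(benjamini_hochberg(p_vals, alpha=0.05))  # [0, 1]
```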