39 research outputs found

    Safety-guided deep reinforcement learning via online gaussian process estimation

    An important facet of reinforcement learning (RL) has to do with how the agent goes about exploring the environment. Traditional exploration strategies typically focus on efficiency and ignore safety. However, for practical applications, ensuring safety of the agent during exploration is crucial since performing an unsafe action or reaching an unsafe state could result in irreversible damage to the agent. The main challenge of safe exploration is that characterizing the unsafe states and actions is difficult for large continuous state or action spaces and unknown environments. In this paper, we propose a novel approach to incorporate estimations of safety to guide exploration and policy search in deep reinforcement learning. By using a cost function to capture trajectory-based safety, our key idea is to formulate the state-action value function of this safety cost as a candidate Lyapunov function and extend control-theoretic results to approximate its derivative using online Gaussian Process (GP) estimation. We show how to use these statistical models to guide the agent in unknown environments to obtain high-performance control policies with provable stability certificates. Accepted manuscript.

    DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck

    Deep reinforcement learning (DRL) agents are often sensitive to visual changes that were unseen in their training environments. To address this problem, we leverage the sequential nature of RL to learn robust representations that encode only task-relevant information from observations based on the unsupervised multi-view setting. Specifically, we introduce an auxiliary objective based on the multi-view information bottleneck (MIB) principle which quantifies the amount of task-irrelevant information and encourages learning representations that are both predictive of the future and less sensitive to task-irrelevant distractions. This enables us to train high-performance policies that are robust to visual distractions and can generalize to unseen environments. We demonstrate that our approach can achieve state-of-the-art performance on diverse visual control tasks on the DeepMind Control Suite, even when the background is replaced with natural videos. In addition, we show that our approach outperforms well-established baselines for generalization to unseen environments on the Procgen benchmark. Our code is open-sourced and available at https://github.com/JmfanBU/DRIBO. Comment: 27 pages

    Adversarial Training and Provable Robustness: A Tale of Two Objectives

    We propose a principled framework that combines adversarial training and provable robustness verification for training certifiably robust neural networks. We formulate the training problem as a joint optimization problem with both empirical and provable robustness objectives and develop a novel gradient-descent technique that can eliminate bias in stochastic multi-gradients. We perform both theoretical analysis on the convergence of the proposed technique and experimental comparison with state-of-the-art methods. Results on MNIST and CIFAR-10 show that our method can consistently match or outperform prior approaches for provable ℓ∞ robustness. Notably, we achieve 6.60% verified test error on MNIST at epsilon = 0.3, and 66.57% on CIFAR-10 with epsilon = 8/255. Comment: Accepted at AAAI 202

    Divide and Slide: Layer-Wise Refinement for Output Range Analysis of Deep Neural Networks

    In this article, we present a layer-wise refinement method for neural network output range analysis. While approaches such as nonlinear programming (NLP) can directly model the high nonlinearity brought by neural networks in output range analysis, they are known to be difficult to solve in general. We propose to use a convex polygonal relaxation (overapproximation) of the activation functions to cope with the nonlinearity. This allows us to encode the relaxed problem into a mixed-integer linear program (MILP), and control the tightness of the relaxation by adjusting the number of segments in the polygon. Starting with a segment number of 1 for each neuron, which coincides with a linear programming (LP) relaxation, our approach selects neurons layer by layer to iteratively refine this relaxation. To tackle the increase of the number of integer variables with tighter refinement, we bridge the propagation-based method and the programming-based method by dividing and sliding the layer-wise constraints. Specifically, given a sliding number s, for the neurons in layer l, we only encode the constraints of the layers between l - s and l. We show that our overall framework is sound and provides a valid overapproximation. Experiments on deep neural networks demonstrate significant improvement on output range analysis precision using our approach compared to the state-of-the-art.
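
    The sliding rule in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the function name `window_layers`, the 1-based layer numbering, and the choice to treat both ends of the range "between l - s and l" as inclusive are assumptions made only for illustration.

```python
from typing import List


def window_layers(layer: int, s: int, first_layer: int = 1) -> List[int]:
    """Return the layers whose constraints would be encoded for `layer`.

    Assumption (not stated in the abstract): layers are numbered from
    `first_layer`, and the window l - s .. l is inclusive at both ends.
    """
    if layer < first_layer:
        raise ValueError("layer must not precede the first layer")
    if s < 0:
        raise ValueError("sliding number s must be non-negative")
    # Clamp the window so it never reaches before the first layer.
    start = max(first_layer, layer - s)
    return list(range(start, layer + 1))


if __name__ == "__main__":
    # Windows for a hypothetical 6-layer network with sliding number 2.
    for l in range(1, 7):
        print(l, window_layers(l, s=2))
```

    Under these assumptions, a larger s keeps more preceding layers in each encoded subproblem, while a smaller s keeps each subproblem small.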

    POLAR: A Polynomial Arithmetic Framework for Verifying Neural-Network Controlled Systems

    We propose POLAR, a polynomial arithmetic framework that leverages polynomial overapproximations with interval remainders for bounded-time reachability analysis of neural network-controlled systems (NNCSs). Compared with existing arithmetic approaches that use standard Taylor models, our framework uses a novel approach to iteratively overapproximate the neuron output ranges layer-by-layer with a combination of Bernstein polynomial interpolation for continuous activation functions and Taylor model arithmetic for the other operations. This approach can overcome the main drawback in the standard Taylor model arithmetic, i.e., its inability to handle functions that cannot be well approximated by Taylor polynomials, and significantly improve the accuracy and efficiency of reachable states computation for NNCSs. To further tighten the overapproximation, our method keeps the Taylor model remainders symbolic under the linear mappings when estimating the output range of a neural network. We show that POLAR can be seamlessly integrated with existing Taylor model flowpipe construction techniques, and demonstrate that POLAR significantly outperforms the current state-of-the-art techniques on a suite of benchmarks.

    ReachNN: Reachability Analysis of Neural-Network Controlled Systems

    Applying neural networks as controllers in dynamical systems has shown great promise. However, it is critical yet challenging to verify the safety of such control systems with neural-network controllers in the loop. Previous methods for verifying neural network controlled systems are limited to a few specific activation functions. In this work, we propose a new reachability analysis approach based on Bernstein polynomials that can verify neural-network controlled systems with a more general form of activation functions, i.e., as long as they ensure that the neural networks are Lipschitz continuous. Specifically, we consider abstracting feedforward neural networks with Bernstein polynomials for a small subset of inputs. To quantify the error introduced by abstraction, we provide both theoretical error bound estimation based on the theory of Bernstein polynomials and more practical sampling-based error bound estimation, following a tight Lipschitz constant estimation approach based on forward reachability analysis. Compared with previous methods, our approach addresses a much broader set of neural networks, including heterogeneous neural networks that contain multiple types of activation functions. Experimental results on a variety of benchmarks show the effectiveness of our approach.
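
    To make the Bernstein-polynomial idea in the abstract above concrete, here is a minimal one-dimensional sketch. It approximates a scalar function (tanh, chosen only as an example) rather than a whole network, and the helper names `bernstein_approx` and `sampled_error` are hypothetical. The sampled error is only an empirical estimate on a grid. It is not a sound bound by itself, and the Lipschitz-based step that the abstract mentions is not reproduced here.

```python
import math
from typing import Callable


def bernstein_approx(f: Callable[[float], float], a: float, b: float,
                     degree: int) -> Callable[[float], float]:
    """Degree-`degree` Bernstein polynomial of f on the interval [a, b]."""
    if not a < b:
        raise ValueError("need a < b")
    if degree < 1:
        raise ValueError("degree must be at least 1")
    # Sample f at the degree + 1 equally spaced nodes of [a, b].
    nodes = [f(a + k * (b - a) / degree) for k in range(degree + 1)]

    def poly(x: float) -> float:
        t = (x - a) / (b - a)  # map [a, b] onto [0, 1]
        return sum(
            nodes[k] * math.comb(degree, k) * t ** k * (1 - t) ** (degree - k)
            for k in range(degree + 1)
        )

    return poly


def sampled_error(f: Callable[[float], float], poly: Callable[[float], float],
                  a: float, b: float, samples: int = 1001) -> float:
    """Largest |f(x) - poly(x)| over an evenly spaced grid (an estimate only)."""
    grid = (a + i * (b - a) / (samples - 1) for i in range(samples))
    return max(abs(f(x) - poly(x)) for x in grid)


if __name__ == "__main__":
    for n in (5, 50):
        p = bernstein_approx(math.tanh, -1.0, 1.0, n)
        print(n, sampled_error(math.tanh, p, -1.0, 1.0))
```

    Raising the degree shrinks the sampled error for this smooth example, at the cost of a polynomial with more terms.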

    ARCH-COMP22 category report: Artificial intelligence and neural network control systems (AINNCS) for continuous and hybrid systems plants

    This report presents the results of a friendly competition for formal verification of continuous and hybrid systems with artificial intelligence (AI) components. Specifically, machine learning (ML) components in cyber-physical systems (CPS), such as feedforward neural networks used as feedback controllers in closed-loop systems, are considered, which is a class of systems classically known as intelligent control systems, or in more modern and specific terms, neural network control systems (NNCS). We more broadly refer to this category as AI and NNCS (AINNCS). The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2022. In the fourth edition of this AINNCS category at ARCH-COMP, four tools were applied to solve 10 different benchmark problems. There are two new participants, CORA and POLAR, and two previous participants, JuliaReach and NNV. The goal of this report is to be a snapshot of the current landscape of tools and the types of benchmarks for which these tools are suited. The results of this iteration significantly outperform those of any previous year, demonstrating the continuous advancement of this community in the past decade.

    REGLO: Provable Neural Network Repair for Global Robustness Properties

    We present REGLO, a novel methodology for repairing pretrained neural networks to satisfy global robustness and individual fairness properties. A neural network is said to be globally robust with respect to a given input region if and only if all the input points in the region are locally robust. This notion of global robustness also captures the notion of individual fairness as a special case. We prove that any counterexample to a global robustness property must exhibit a corresponding large gradient. For ReLU networks, this result allows us to efficiently identify the linear regions that violate a given global robustness property. By formulating and solving a suitable robust convex optimization problem, REGLO then computes a minimal weight change that will provably repair these violating linear regions.

    POLAR-Express: Efficient and Precise Formal Reachability Analysis of Neural-Network Controlled Systems

    Neural networks (NNs) playing the role of controllers have demonstrated impressive empirical performance on challenging control problems. However, the potential adoption of NN controllers in real-life applications has been significantly impeded by the growing concerns over the safety of these neural-network controlled systems (NNCSs). In this work, we present POLAR-Express, an efficient and precise formal reachability analysis tool for verifying the safety of NNCSs. POLAR-Express uses Taylor model arithmetic to propagate Taylor models (TMs) layer-by-layer across a neural network to compute an over-approximation of the neural network. It can be applied to analyze any feed-forward neural network with continuous activation functions, such as ReLU, Sigmoid, and Tanh activation functions that cover the common benchmarks for NNCS reachability analysis. Compared with its earlier prototype POLAR, we develop a novel approach in POLAR-Express to propagate TMs more efficiently and precisely across ReLU activation functions, and provide parallel computation support for TM propagation, thus significantly improving the efficiency and scalability. In a comparison with six other state-of-the-art tools on a diverse set of common benchmarks, POLAR-Express achieves the best verification efficiency and tightness in the reachable set analysis. POLAR-Express is publicly available at https://github.com/ChaoHuang2018/POLAR_Tool
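
    The idea of propagating an over-approximation layer by layer can be sketched with plain interval arithmetic. This is a deliberately simpler stand-in, not Taylor model arithmetic and not the POLAR-Express implementation. The function names and the tiny two-layer ReLU network below are made up for illustration.

```python
from typing import List, Sequence, Tuple

Layer = Tuple[Sequence[Sequence[float]], Sequence[float]]  # (weights, bias)


def affine_bounds(weights, bias, lo, hi):
    """Interval image of x -> W x + b for x in the box [lo, hi]."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        low, high = b, b
        for w, l, h in zip(row, lo, hi):
            # A non-negative weight maps lower to lower; a negative one swaps.
            if w >= 0:
                low += w * l
                high += w * h
            else:
                low += w * h
                high += w * l
        out_lo.append(low)
        out_hi.append(high)
    return out_lo, out_hi


def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to both bounds directly."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


def propagate(layers: List[Layer], lo, hi):
    """Push a box through affine layers with ReLU between them (none at the end)."""
    for i, (weights, bias) in enumerate(layers):
        lo, hi = affine_bounds(weights, bias, lo, hi)
        if i < len(layers) - 1:
            lo, hi = relu_bounds(lo, hi)
    return lo, hi


if __name__ == "__main__":
    # Hypothetical network: 2 inputs, 2 hidden ReLU neurons, 1 output.
    net = [([[1.0, -1.0], [1.0, 1.0]], [0.0, 0.0]), ([[1.0, 1.0]], [0.0])]
    print(propagate(net, [0.0, 0.0], [1.0, 1.0]))
```

    For this toy network the computed output interval is [0, 3], while the true maximum over the unit box is 2. The result is sound but loose because interval arithmetic forgets the dependency between the two hidden neurons, which is the kind of precision loss that richer abstractions aim to reduce.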

    Citrus sinensis MYB Transcription Factor CsMYB85 Induce Fruit Juice Sac Lignification Through Interaction With Other CsMYB Transcription Factors

    Varieties of Citrus are commercially important fruits that are cultivated worldwide and are valued for being highly nutritious and having an appealing flavor. Lignification of citrus fruit juice sacs is a serious physiological disorder that occurs during postharvest storage, for which the underlying transcriptional regulatory mechanisms remain unclear. In this study, we identified and isolated a candidate MYB transcription factor, CsMYB85, that is involved in the regulation of lignin biosynthesis in Citrus sinensis, which has homologs in Arabidopsis and other plants. We found that during juice sac lignification, CsMYB85 expression levels increase significantly, and therefore suspected that this gene may control lignin biosynthesis during the lignification process. Our results indicated that CsMYB85 binds the CsMYB330 promoter, regulates its expression, and interacts with CsMYB308 in transgenic yeast and tobacco. A transient expression assay indicated that Cs4CL1 expression levels and lignin content significantly increased in fruit juice sacs overexpressing CsMYB85. At4CL1 expression levels and lignin content were also significantly increased in Arabidopsis overexpressing CsMYB85. We accordingly present convincing evidence for the participation of the CsMYB85 transcription factor in fruit juice sac lignification, and thereby provide new insights into the transcriptional regulation of this process in citrus fruits.