1,835 research outputs found

    Output Reachable Set Estimation and Verification for Multi-Layer Neural Networks

    Get PDF
    In this paper, the output reachable estimation and safety verification problems for multi-layer perceptron neural networks are addressed. First, a conception called maximum sensitivity in introduced and, for a class of multi-layer perceptrons whose activation functions are monotonic functions, the maximum sensitivity can be computed via solving convex optimization problems. Then, using a simulation-based method, the output reachable set estimation problem for neural networks is formulated into a chain of optimization problems. Finally, an automated safety verification is developed based on the output reachable set estimation result. An application to the safety verification for a robotic arm model with two joints is presented to show the effectiveness of proposed approaches.Comment: 8 pages, 9 figures, to appear in TNNL

    On the distance between two neural networks and the stability of learning

    Get PDF
    How far apart are two neural networks? This is a foundational question in their theory. We derive a simple and tractable bound that relates distance in function space to distance in parameter space for a broad class of nonlinear compositional functions. The bound distills a clear dependence on depth of the composition. The theory is of practical relevance since it establishes a trust region for first-order optimisation. In turn, this suggests an optimiser that we call Frobenius matched gradient descent---or Fromage. Fromage involves a principled form of gradient rescaling and enjoys guarantees on stability of both the spectra and Frobenius norms of the weights. We find that the new algorithm increases the depth at which a multilayer perceptron may be trained as compared to Adam and SGD and is competitive with Adam for training generative adversarial networks. We further verify that Fromage scales up to a language transformer with over 10⁞ parameters. Please find code & reproducibility instructions at: https://github.com/jxbz/fromag

    Revealing quantum chaos with machine learning

    Full text link
    Understanding properties of quantum matter is an outstanding challenge in science. In this paper, we demonstrate how machine-learning methods can be successfully applied for the classification of various regimes in single-particle and many-body systems. We realize neural network algorithms that perform a classification between regular and chaotic behavior in quantum billiard models with remarkably high accuracy. We use the variational autoencoder for autosupervised classification of regular/chaotic wave functions, as well as demonstrating that variational autoencoders could be used as a tool for detection of anomalous quantum states, such as quantum scars. By taking this method further, we show that machine learning techniques allow us to pin down the transition from integrability to many-body quantum chaos in Heisenberg XXZ spin chains. For both cases, we confirm the existence of universal W shapes that characterize the transition. Our results pave the way for exploring the power of machine learning tools for revealing exotic phenomena in quantum many-body systems.Comment: 12 pages, 12 figure

    NeuralSens: Sensitivity Analysis of Neural Networks

    Full text link
    Neural networks are important tools for data-intensive analysis and are commonly applied to model non-linear relationships between dependent and independent variables. However, neural networks are usually seen as "black boxes" that offer minimal information about how the input variables are used to predict the response in a fitted model. This article describes the \pkg{NeuralSens} package that can be used to perform sensitivity analysis of neural networks using the partial derivatives method. Functions in the package can be used to obtain the sensitivities of the output with respect to the input variables, evaluate variable importance based on sensitivity measures and characterize relationships between input and output variables. Methods to calculate sensitivities are provided for objects from common neural network packages in \proglang{R}, including \pkg{neuralnet}, \pkg{nnet}, \pkg{RSNNS}, \pkg{h2o}, \pkg{neural}, \pkg{forecast} and \pkg{caret}. The article presents an overview of the techniques for obtaining information from neural network models, a theoretical foundation of how are calculated the partial derivatives of the output with respect to the inputs of a multi-layer perceptron model, a description of the package structure and functions, and applied examples to compare \pkg{NeuralSens} functions with analogous functions from other available \proglang{R} packages.Comment: 28 pages, 12 figures, submitted to Journal of Statistical Software (JSS) https://www.jstatsoft.org/inde

    Ordered Counterfactual Explanation by Mixed-Integer Linear Optimization

    Full text link
    Post-hoc explanation methods for machine learning models have been widely used to support decision-making. One of the popular methods is Counterfactual Explanation (CE), also known as Actionable Recourse, which provides a user with a perturbation vector of features that alters the prediction result. Given a perturbation vector, a user can interpret it as an "action" for obtaining one's desired decision result. In practice, however, showing only a perturbation vector is often insufficient for users to execute the action. The reason is that if there is an asymmetric interaction among features, such as causality, the total cost of the action is expected to depend on the order of changing features. Therefore, practical CE methods are required to provide an appropriate order of changing features in addition to a perturbation vector. For this purpose, we propose a new framework called Ordered Counterfactual Explanation (OrdCE). We introduce a new objective function that evaluates a pair of an action and an order based on feature interaction. To extract an optimal pair, we propose a mixed-integer linear optimization approach with our objective function. Numerical experiments on real datasets demonstrated the effectiveness of our OrdCE in comparison with unordered CE methods.Comment: 20 pages, 5 figures, to appear in the 35th AAAI Conference on Artificial Intelligence (AAAI 2021

    Multilayer optical learning networks

    Get PDF
    A new approach to learning in a multilayer optical neural network based on holographically interconnected nonlinear devices is presented. The proposed network can learn the interconnections that form a distributed representation of a desired pattern transformation operation. The interconnections are formed in an adaptive and self-aligning fashioias volume holographic gratings in photorefractive crystals. Parallel arrays of globally space-integrated inner products diffracted by the interconnecting hologram illuminate arrays of nonlinear Fabry-Perot etalons for fast thresholding of the transformed patterns. A phase conjugated reference wave interferes with a backward propagating error signal to form holographic interference patterns which are time integrated in the volume of a photorefractive crystal to modify slowly and learn the appropriate self-aligning interconnections. This multilayer system performs an approximate implementation of the backpropagation learning procedure in a massively parallel high-speed nonlinear optical network

    ARCHITECTURE OPTIMIZATION, TRAINING CONVERGENCE AND NETWORK ESTIMATION ROBUSTNESS OF A FULLY CONNECTED RECURRENT NEURAL NETWORK

    Get PDF
    Recurrent neural networks (RNN) have been rapidly developed in recent years. Applications of RNN can be found in system identification, optimization, image processing, pattern reorganization, classification, clustering, memory association, etc. In this study, an optimized RNN is proposed to model nonlinear dynamical systems. A fully connected RNN is developed first which is modified from a fully forward connected neural network (FFCNN) by accommodating recurrent connections among its hidden neurons. In addition, a destructive structure optimization algorithm is applied and the extended Kalman filter (EKF) is adopted as a network\u27s training algorithm. These two algorithms can seamlessly work together to generate the optimized RNN. The enhancement of the modeling performance of the optimized network comes from three parts: 1) its prototype - the FFCNN has advantages over multilayer perceptron network (MLP), the most widely used network, in terms of modeling accuracy and generalization ability; 2) the recurrency in RNN network make it more capable of modeling non-linear dynamical systems; and 3) the structure optimization algorithm further improves RNN\u27s modeling performance in generalization ability and robustness. Performance studies of the proposed network are highlighted in training convergence and robustness. For the training convergence study, the Lyapunov method is used to adapt some training parameters to guarantee the training convergence, while the maximum likelihood method is used to estimate some other parameters to accelerate the training process. In addition, robustness analysis is conducted to develop a robustness measure considering uncertainties propagation through RNN via unscented transform. Two case studies, the modeling of a benchmark non-linear dynamical system and a tool wear progression in hard turning, are carried out to testify the development in this dissertation. The work detailed in this dissertation focuses on the creation of: (1) a new method to prove/guarantee the training convergence of RNN, and (2) a new method to quantify the robustness of RNN using uncertainty propagation analysis. With the proposed study, RNN and related algorithms are developed to model nonlinear dynamical system which can benefit modeling applications such as the condition monitoring studies in terms of robustness and accuracy in the future
    • 

    corecore