8,564 research outputs found
Maximum Resilience of Artificial Neural Networks
The deployment of Artificial Neural Networks (ANNs) in safety-critical
applications poses a number of new verification and certification challenges.
In particular, for ANN-enabled self-driving vehicles it is important to
establish properties about the resilience of ANNs to noisy or even maliciously
manipulated sensory input. We are addressing these challenges by defining
resilience properties of ANN-based classifiers as the maximal amount of input
or sensor perturbation which is still tolerated. This problem of computing
maximal perturbation bounds for ANNs is then reduced to solving mixed integer
optimization problems (MIP). A number of MIP encoding heuristics are developed
for drastically reducing MIP-solver runtimes, and using parallelization of
MIP-solvers results in an almost linear speed-up in the number (up to a certain
limit) of computing cores in our experiments. We demonstrate the effectiveness
and scalability of our approach by means of computing maximal resilience bounds
for a number of ANN benchmark sets ranging from typical image recognition
scenarios to the autonomous maneuvering of robots.Comment: Timestamp research work conducted in the project. version 2: fix some
typos, rephrase the definition, and add some more existing wor
Towards Fast Computation of Certified Robustness for ReLU Networks
Verifying the robustness property of a general Rectified Linear Unit (ReLU)
network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer
CAV17]. Although finding the exact minimum adversarial distortion is hard,
giving a certified lower bound of the minimum distortion is possible. Current
available methods of computing such a bound are either time-consuming or
delivering low quality bounds that are too loose to be useful. In this paper,
we exploit the special structure of ReLU networks and provide two
computationally efficient algorithms Fast-Lin and Fast-Lip that are able to
certify non-trivial lower bounds of minimum distortions, by bounding the ReLU
units with appropriate linear functions Fast-Lin, or by bounding the local
Lipschitz constant Fast-Lip. Experiments show that (1) our proposed methods
deliver bounds close to (the gap is 2-3X) exact minimum distortion found by
Reluplex in small MNIST networks while our algorithms are more than 10,000
times faster; (2) our methods deliver similar quality of bounds (the gap is
within 35% and usually around 10%; sometimes our bounds are even better) for
larger networks compared to the methods based on solving linear programming
problems but our algorithms are 33-14,000 times faster; (3) our method is
capable of solving large MNIST and CIFAR networks up to 7 layers with more than
10,000 neurons within tens of seconds on a single CPU core.
In addition, we show that, in fact, there is no polynomial time algorithm
that can approximately find the minimum adversarial distortion of a
ReLU network with a approximation ratio unless
=, where is the number of neurons in the network.Comment: Tsui-Wei Weng and Huan Zhang contributed equall
Mathematical optimization in deep learning
Mathematical Optimization plays a pillar role in Machine Learning (ML) and Neural Networks (NN) are amongst the most popular and effective ML architectures and are the subject of a very intense investigation. They have also been proven immensely powerful at solving prediction tasks in areas such as speech recognition, image classification, robotics and quantum physics. In this work we present the problem of training a Deep Neural Network (DNN), specifically the continuous optimization problem arising in Feed-Forward Networks
with Rectified Linear Unit (ReLU) activation. Then we will discuss the inverse problem, presenting a model for a trained DNN as a 0-1 Mixed Integer Linear Program (MILP). Some applications, such as feature visualization and the construction of adversarial examples will be outlined. Computational experiments are reported for both direct and inverse problem. The remainder of the text contains the AMPL codes used for solving the posed problems.La optimización matemática juega un papel fundamental en el aprendizaje automático (AA), y las redes neuronales (NN) se encuentran entre las estructuras más populares y efectivas dentro de este campo. Por ello, son objecto de una intensa investigación. Además, han demostrado ser inmensamente potentes resolviendo tareas de predicción en áreas como reconocimiento automático del habla, clasificación de imágenes, robótica y física cuántica. En este trabajo, se presenta el problema de entrenar una red neuronal profunda
(DNN), específicamente el problema de optimización continua que surge en las redes neuronales prealimentadas (FNN) con rectificador (ReLU) como función de activación. Posteriormente, se discutirá el problema inverso, presentaremos un modelo para una DNN que ya ha sido entrenada como un problema de programación lineal en enteros mixta. Describiremos algunas aplicaciones, como visualización de características y la construcción de ejemplos maliciosos. Se realizarán los experimentos computacionales para ambos problemas, el directo y el inverso. Los códigos de AMPL para los problemas planteados se encuentran al final del documento.Universidad de Sevilla. Doble Grado en Física y Matemática
Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods
Training neural networks is a challenging non-convex optimization problem,
and backpropagation or gradient descent can get stuck in spurious local optima.
We propose a novel algorithm based on tensor decomposition for guaranteed
training of two-layer neural networks. We provide risk bounds for our proposed
method, with a polynomial sample complexity in the relevant parameters, such as
input dimension and number of neurons. While learning arbitrary target
functions is NP-hard, we provide transparent conditions on the function and the
input for learnability. Our training method is based on tensor decomposition,
which provably converges to the global optimum, under a set of mild
non-degeneracy conditions. It consists of simple embarrassingly parallel linear
and multi-linear operations, and is competitive with standard stochastic
gradient descent (SGD), in terms of computational complexity. Thus, we propose
a computationally efficient method with guaranteed risk bounds for training
neural networks with one hidden layer.Comment: The tensor decomposition analysis is expanded, and the analysis of
ridge regression is added for recovering the parameters of last layer of
neural networ
On the Universal Approximation Property and Equivalence of Stochastic Computing-based Neural Networks and Binary Neural Networks
Large-scale deep neural networks are both memory intensive and
computation-intensive, thereby posing stringent requirements on the computing
platforms. Hardware accelerations of deep neural networks have been extensively
investigated in both industry and academia. Specific forms of binary neural
networks (BNNs) and stochastic computing based neural networks (SCNNs) are
particularly appealing to hardware implementations since they can be
implemented almost entirely with binary operations. Despite the obvious
advantages in hardware implementation, these approximate computing techniques
are questioned by researchers in terms of accuracy and universal applicability.
Also it is important to understand the relative pros and cons of SCNNs and BNNs
in theory and in actual hardware implementations. In order to address these
concerns, in this paper we prove that the "ideal" SCNNs and BNNs satisfy the
universal approximation property with probability 1 (due to the stochastic
behavior). The proof is conducted by first proving the property for SCNNs from
the strong law of large numbers, and then using SCNNs as a "bridge" to prove
for BNNs. Based on the universal approximation property, we further prove that
SCNNs and BNNs exhibit the same energy complexity. In other words, they have
the same asymptotic energy consumption with the growing of network size. We
also provide a detailed analysis of the pros and cons of SCNNs and BNNs for
hardware implementations and conclude that SCNNs are more suitable for
hardware.Comment: 9 pages, 3 figure
- …