Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients
While neuroevolution (evolving neural networks) has a successful track record
across a variety of domains from reinforcement learning to artificial life, it
is rarely applied to large, deep neural networks. A central reason is that
while random mutation generally works in low dimensions, a random perturbation
of thousands or millions of weights is likely to break existing functionality,
providing no learning signal even if some individual weight changes were
beneficial. This paper proposes a solution by introducing a family of safe
mutation (SM) operators that aim within the mutation operator itself to find a
degree of change that does not alter network behavior too much, but still
facilitates exploration. Importantly, these SM operators do not require any
additional interactions with the environment. The most effective SM variant
capitalizes on the intriguing opportunity to scale the degree of mutation of
each individual weight according to the sensitivity of the network's outputs to
that weight, which requires computing the gradient of outputs with respect to
the weights (instead of the gradient of error, as in conventional deep
learning). This safe mutation through gradients (SM-G) operator dramatically
increases the ability of a simple genetic algorithm-based neuroevolution method
to find solutions in high-dimensional domains that require deep and/or
recurrent neural networks (which tend to be particularly brittle to mutation),
including domains that require processing raw pixels. By improving our ability
to evolve deep neural networks, this new safer approach to mutation expands the
scope of domains amenable to neuroevolution.
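The core SM-G rule — scale each weight's perturbation by how sensitive the network's outputs are to that weight — can be sketched for a toy single-layer network, where the output-weight gradient has a closed form (a deep network would obtain it via backpropagation of outputs rather than error). The mutation scale and the sensitivity floor below are illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-layer network y = tanh(W x); the paper applies SM-G to deep
# and recurrent networks, but the per-weight scaling rule is the same.
W = rng.normal(size=(3, 5))
x = rng.normal(size=5)

# Sensitivity of the outputs to each weight. For this layer,
# d y_i / d W_ij = (1 - tanh(W x)_i ** 2) * x_j, so the per-weight
# gradient magnitudes form an outer product.
s = 1.0 - np.tanh(W @ x) ** 2
sens = np.abs(np.outer(s, x))          # same shape as W

# Safe mutation through gradients: divide a random perturbation by the
# sensitivity, so weights the outputs react to strongly move less.
# The 0.1 mutation scale and 1e-3 floor are illustrative assumptions.
delta = rng.normal(scale=0.1, size=W.shape)
W_mut = W + delta / np.maximum(sens, 1e-3)
```

Note that no extra environment rollouts are needed: the scaling uses only a gradient of outputs with respect to weights at the current parameters.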
Privacy Aware Offloading of Deep Neural Networks
Deep neural networks require large amounts of resources which makes them hard
to use on resource constrained devices such as Internet-of-things devices.
Offloading the computations to the cloud can circumvent these constraints but
introduces a privacy risk since the operator of the cloud is not necessarily
trustworthy. We propose a technique that obfuscates the data before sending it
to the remote computation node. The obfuscated data is unintelligible for a
human eavesdropper but can still be classified with a high accuracy by a neural
network trained on unobfuscated images.
Comment: ICML 2018 Privacy in Machine Learning and Artificial Intelligence workshop
SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels
We present Spline-based Convolutional Neural Networks (SplineCNNs), a variant
of deep neural networks for irregular structured and geometric input, e.g.,
graphs or meshes. Our main contribution is a novel convolution operator based
on B-splines that makes the computation time independent of the kernel size
due to the local support property of the B-spline basis functions. As a result,
we obtain a generalization of the traditional CNN convolution operator by using
continuous kernel functions parametrized by a fixed number of trainable
weights. In contrast to related approaches that filter in the spectral domain,
the proposed method aggregates features purely in the spatial domain. In
addition, SplineCNN allows full end-to-end training of deep architectures,
using only the geometric structure as input, instead of handcrafted feature
descriptors. For validation, we apply our method on tasks from the fields of
image graph classification, shape correspondence and graph node classification,
and show that it outperforms or is on par with state-of-the-art approaches while
being significantly faster and having favorable properties like domain
independence.
Comment: Presented at CVPR 2018
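The local-support property that decouples cost from kernel size can be illustrated in one dimension: with degree-1 B-splines (hat functions, i.e. linear interpolation), evaluating the continuous kernel at any pseudo-coordinate reads only two of the trainable weights, no matter how many there are. A minimal sketch (the function name and uniform grid over [0, 1] are illustrative, not the paper's API):

```python
def spline_kernel(u, weights):
    """Evaluate a continuous 1-D kernel at pseudo-coordinate u in [0, 1].

    The kernel is parametrized by `weights` on a uniform grid and
    interpolated with degree-1 B-spline basis functions. Because each
    basis function has local support, only two weights are touched per
    evaluation, regardless of len(weights).
    """
    m = len(weights)
    t = u * (m - 1)              # position on the control-point grid
    i = min(int(t), m - 2)       # left control point (clamped at the end)
    frac = t - i                 # barycentric coordinate in [0, 1]
    return (1.0 - frac) * weights[i] + frac * weights[i + 1]
```

Higher spline degrees enlarge the support to degree + 1 weights per evaluation, still independent of the total kernel size.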
Learned Primal-dual Reconstruction
We propose the Learned Primal-Dual algorithm for tomographic reconstruction.
The algorithm accounts for a (possibly non-linear) forward operator in a deep
neural network by unrolling a proximal primal-dual optimization method, but
where the proximal operators have been replaced with convolutional neural
networks. The algorithm is trained end-to-end, working directly from raw
measured data, and does not depend on any initial reconstruction such as
filtered back-projection (FBP).
We compare performance of the proposed method on low dose CT reconstruction
against FBP, TV, and deep learning based post-processing of FBP. For the
Shepp-Logan phantom we obtain >6dB PSNR improvement against all compared
methods. For human phantoms the corresponding improvement is 6.6dB over TV and
2.2dB over learned post-processing along with a substantial improvement in the
SSIM. Finally, our algorithm involves only ten forward-back-projection
computations, making the method feasible for time critical clinical
applications.
Comment: 11 pages, 5 figures
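The unrolled structure described above can be sketched with a stand-in linear forward operator. In the Learned Primal-Dual algorithm the two updates inside the loop are small CNNs trained end-to-end; here they are replaced by fixed hand-crafted steps (a Chambolle-Pock-style scheme for a least-squares data term with no prior), and the operator, step sizes, and variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in (linear) forward operator and consistent measured data.
A = rng.normal(size=(20, 10))
x_true = rng.normal(size=10)
g = A @ x_true

x = np.zeros(10)        # primal iterate (the reconstruction)
x_bar = x.copy()        # over-relaxed primal point
h = np.zeros(20)        # dual iterate, lives in measurement space
sigma = tau = 0.1       # illustrative step sizes

for _ in range(10):     # ten unrolled iterations, as in the paper
    # Dual update: proximal step for the least-squares data-fit term.
    # In Learned Primal-Dual this is a learned network acting on
    # (h, A x_bar, g).
    h = (h + sigma * (A @ x_bar - g)) / (1.0 + sigma)
    # Primal update: gradient-type step through the adjoint operator.
    # In Learned Primal-Dual this is a learned network acting on
    # (x, A^T h).
    x_new = x - tau * (A.T @ h)
    # Over-relaxation, standard in primal-dual schemes.
    x_bar = 2.0 * x_new - x
    x = x_new
```

The point of the sketch is the data flow: every iteration applies the forward operator and its adjoint exactly once each, which is why the full method needs only ten forward-back-projection computations.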
