Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: a Review
The paper characterizes classes of functions for which deep learning can be
exponentially better than shallow learning. Deep convolutional networks are a
special case of these conditions, though weight sharing is not the main reason
for their exponential advantage.
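The class of functions the review considers can be illustrated with a toy sketch (my own, not from the paper): a function with a binary-tree compositional structure over n inputs can be computed by a deep network with only n-1 two-input constituent units, whereas a generic shallow approximator must treat it as an unstructured n-dimensional function. The constituent function `g` below is an arbitrary smooth 2-input choice for illustration.

```python
import numpy as np

def g(a, b):
    # an arbitrary smooth 2-input constituent function (illustrative choice)
    return np.tanh(a + 2.0 * b)

def deep_compositional(x):
    """Evaluate a binary-tree compositional function, e.g. for 8 inputs:
    f(x1..x8) = g(g(g(x1,x2), g(x3,x4)), g(g(x5,x6), g(x7,x8))).
    A deep network mirroring this graph needs only n-1 constituent units,
    each depending on just 2 variables, regardless of n."""
    level = list(x)
    n_units = 0
    while len(level) > 1:
        level = [g(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        n_units += len(level)
    return level[0], n_units

x = np.linspace(-1.0, 1.0, 8)
value, units = deep_compositional(x)
print(units)  # 7 two-input units for 8 inputs (4 + 2 + 1)
```

A shallow network approximating the same f to a given accuracy has no access to this graph structure, which is the source of the exponential gap the review characterizes.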
The laminar integration of sensory inputs with feedback signals in human cortex
The cortex constitutes the largest area of the human brain. Yet we have only a basic understanding of how the cortex performs one vital function: the integration of sensory signals (carried by feedforward pathways) with internal representations (carried by feedback pathways). A multi-scale, multi-species approach is essential for understanding the site of integration, computational mechanism and functional role of this processing. To improve our knowledge we must rely on brain imaging with improved spatial and temporal resolution and paradigms which can measure internal processes in the human brain, and on the bridging of disciplines in order to characterize this processing at cellular and circuit levels. We highlight apical amplification as one potential mechanism for integrating feedforward and feedback inputs within pyramidal neurons in the rodent brain. We reflect on the challenges and progress in applying this model neuronal process to the study of human cognition. We conclude that cortical-layer specific measures in humans will be an essential contribution for better understanding the landscape of information in cortical feedback, helping to bridge the explanatory gap.
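The apical-amplification idea can be caricatured with a toy two-compartment rate model (a sketch of the general principle, not the authors' formulation): feedforward (basal) input gates whether the cell responds at all, while feedback (apical) input multiplicatively amplifies a response that the basal input has already licensed. The sigmoid shapes and the `gain` parameter are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pyramidal_rate(basal, apical, gain=2.0):
    """Toy apical-amplification model (illustrative, not from the paper):
    basal drive gates the output; apical input amplifies it multiplicatively."""
    drive = sigmoid(4.0 * (basal - 0.5))                 # feedforward gate
    amplification = 1.0 + gain * sigmoid(4.0 * (apical - 0.5))  # feedback gain
    return drive * amplification

# feedback alone barely drives the cell...
weak = pyramidal_rate(basal=0.1, apical=1.0)
# ...but strongly amplifies a response the feedforward input has gated on
strong = pyramidal_rate(basal=1.0, apical=1.0)
print(weak, strong)
```

The asymmetry (apical input modulates rather than drives) is what makes this a candidate mechanism for integrating feedback with sensory evidence rather than confusing the two.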
Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation
We introduce Equilibrium Propagation, a learning framework for energy-based
models. It involves only one kind of neural computation, performed in both the
first phase (when the prediction is made) and the second phase of training
(after the target or prediction error is revealed). Although this algorithm
computes the gradient of an objective function just like Backpropagation, it
does not need a special computation or circuit for the second phase, where
errors are implicitly propagated. Equilibrium Propagation shares similarities
with Contrastive Hebbian Learning and Contrastive Divergence while solving the
theoretical issues of both algorithms: our algorithm computes the gradient of a
well defined objective function. Because the objective function is defined in
terms of local perturbations, the second phase of Equilibrium Propagation
corresponds to only nudging the prediction (fixed point, or stationary
distribution) towards a configuration that reduces prediction error. In the
case of a recurrent multi-layer supervised network, the output units are
slightly nudged towards their target in the second phase, and the perturbation
introduced at the output layer propagates backward in the hidden layers. We
show that the signal 'back-propagated' during this second phase corresponds to
the propagation of error derivatives and encodes the gradient of the objective
function, when the synaptic update corresponds to a standard form of
spike-timing dependent plasticity. This work makes it more plausible that a
mechanism similar to Backpropagation could be implemented by brains, since
leaky integrator neural computation performs both inference and error
back-propagation in our model. The only local difference between the two phases
is whether synaptic changes are allowed or not.
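The two-phase procedure can be sketched on a tiny symmetric network (a minimal illustration of the idea, with arbitrary sizes, activation, and hyperparameters of my choosing): settle to a free fixed point, settle again with the outputs weakly nudged toward the target, and update each weight from the difference of the local correlations in the two phases, scaled by 1/beta.

```python
import numpy as np

def rho(s):  return 1.0 / (1.0 + np.exp(-s))   # activation
def drho(s): r = rho(s); return r * (1.0 - r)

rng = np.random.default_rng(0)
n = 5                                   # 2 clamped inputs, 2 hidden, 1 output
W = rng.normal(0, 0.1, (n, n))
W = (W + W.T) / 2                       # energy-based model needs symmetric W
np.fill_diagonal(W, 0)
inp, out = [0, 1], [4]

def settle(s, x, beta=0.0, target=None, steps=150, eps=0.1):
    """Gradient dynamics on the Hopfield-style energy; beta > 0 adds the
    weak nudge of the output toward its target (second phase)."""
    s = s.copy(); s[inp] = x
    for _ in range(steps):
        grad = -s + drho(s) * (W @ rho(s))       # -dE/ds
        if beta:
            grad[out] += beta * (target - s[out])
        s += eps * grad
        s[inp] = x                               # inputs stay clamped
    return s

beta, lr = 0.5, 0.05
x, target = np.array([1.0, 0.0]), np.array([1.0])
out_before = settle(np.zeros(n), x)[out][0]
for _ in range(100):
    s0 = settle(np.zeros(n), x)                  # free phase
    sb = settle(s0, x, beta, target)             # weakly clamped phase
    # contrastive update: (nudged - free) correlations, scaled by 1/beta
    W += lr / beta * (np.outer(rho(sb), rho(sb)) - np.outer(rho(s0), rho(s0)))
    np.fill_diagonal(W, 0)
out_after = settle(np.zeros(n), x)[out][0]
print(out_before, out_after)   # free-phase output moves toward the target
```

Note that both phases run the very same dynamics; the only differences are the small output nudge and whether weights change afterward, which is the point of the paper's plausibility argument.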
Power Optimizations in MTJ-based Neural Networks through Stochastic Computing
Artificial Neural Networks (ANNs) have found widespread applications in tasks
such as pattern recognition and image classification. However, hardware
implementations of ANNs using conventional binary arithmetic units are
computationally expensive, energy-intensive and have large area overheads.
Stochastic Computing (SC) is an emerging paradigm which replaces these
conventional units with simple logic circuits and is particularly suitable for
fault-tolerant applications. Spintronic devices, such as Magnetic Tunnel
Junctions (MTJs), are capable of replacing CMOS in memory and logic circuits.
In this work, we propose an energy-efficient use of MTJs, which exhibit
probabilistic switching behavior, as Stochastic Number Generators (SNGs), which
forms the basis of our NN implementation in the SC domain. Further, error
resilient target applications of NNs allow us to introduce Approximate
Computing, a framework wherein accuracy of computations is traded-off for
substantial reductions in power consumption. We propose approximating the
synaptic weights in our MTJ-based NN implementation, in ways brought about by
properties of our MTJ-SNG, to achieve energy-efficiency. We design an algorithm
that can perform such approximations within a given error tolerance in a
single-layer NN in an optimal way owing to the convexity of the problem
formulation. We then use this algorithm and develop a heuristic approach for
approximating multi-layer NNs. To give a perspective of the effectiveness of
our approach, a 43% reduction in power consumption was obtained with less than
1% accuracy loss on a standard classification problem, with 26% being brought
about by the proposed algorithm.Comment: Accepted in the 2017 IEEE/ACM International Conference on Low Power
Electronics and Desig
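The stochastic-computing substrate the paper builds on can be sketched in software (a model of the principle, not of the MTJ hardware): a stochastic number generator emits a bitstream whose fraction of 1s encodes a value in [0, 1], a single AND gate then multiplies two uncorrelated streams, and a MUX computes a scaled sum. In the paper the SNG role is played by the MTJ's probabilistic switching; here it is modeled as a Bernoulli source.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000  # bitstream length; accuracy improves as O(1/sqrt(N))

def sng(p, n=N):
    """Stochastic number generator: bitstream with P(bit = 1) = p.
    (Stand-in for an MTJ whose switching probability is set by the
    write current.)"""
    return (rng.random(n) < p).astype(np.uint8)

a, b = 0.8, 0.3
# a single AND gate multiplies two uncorrelated unipolar bitstreams
product_stream = sng(a) & sng(b)
estimate = product_stream.mean()

# a MUX with a p=0.5 select stream computes the scaled sum (a + b) / 2
sel = sng(0.5)
sum_stream = np.where(sel, sng(a), sng(b))
sum_estimate = sum_stream.mean()
print(estimate, sum_estimate)  # ≈ 0.24 and ≈ 0.55
```

The replacement of multipliers by single gates is what yields the area and power savings, at the cost of the stream-length/accuracy tradeoff that makes the approach suitable mainly for error-tolerant workloads like NN inference.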
Continuous-variable quantum neural networks
We introduce a general method for building neural networks on quantum
computers. The quantum neural network is a variational quantum circuit built in
the continuous-variable (CV) architecture, which encodes quantum information in
continuous degrees of freedom such as the amplitudes of the electromagnetic
field. This circuit contains a layered structure of continuously parameterized
gates which is universal for CV quantum computation. Affine transformations and
nonlinear activation functions, two key elements in neural networks, are
enacted in the quantum network using Gaussian and non-Gaussian gates,
respectively. The non-Gaussian gates provide both the nonlinearity and the
universality of the model. Due to the structure of the CV model, the CV quantum
neural network can encode highly nonlinear transformations while remaining
completely unitary. We show how a classical network can be embedded into the
quantum formalism and propose quantum versions of various specialized models
such as convolutional, recurrent, and residual networks. Finally, we present
numerous modeling experiments built with the Strawberry Fields software
library. These experiments, including a classifier for fraud detection, a
network which generates Tetris images, and a hybrid classical-quantum
autoencoder, demonstrate the capability and adaptability of CV quantum neural
networks.
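The correspondence between a CV layer and a classical layer can be made concrete with a classical shadow of the circuit (a sketch of the structure, not quantum simulation code): on phase-space means, the Gaussian block interferometer -> squeezing -> interferometer -> displacement acts as an affine map with the weight matrix in singular-value-style factored form, and the final non-Gaussian gate supplies the nonlinearity (here tanh stands in for it).

```python
import numpy as np

rng = np.random.default_rng(2)

def random_orthogonal(n):
    # interferometers act as orthogonal/unitary mixing of the modes
    q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return q

def cv_layer(x, params):
    """Classical shadow of one CV quantum layer (illustrative only):
    W = U2 @ diag(r) @ U1 mirrors interferometer-squeezing-interferometer,
    b mirrors displacement, tanh stands in for the non-Gaussian gate."""
    U1, r, U2, b = params
    W = U2 @ np.diag(r) @ U1
    return np.tanh(W @ x + b)

n = 4
params = (random_orthogonal(n),
          rng.uniform(0.5, 2.0, n),    # squeezing magnitudes
          random_orthogonal(n),
          rng.normal(size=n))          # displacements (bias)
y = cv_layer(rng.normal(size=n), params)
print(y.shape)
```

The quantum circuit differs crucially in that the full map stays unitary on the quantum state; the sketch only mirrors how the classical affine-plus-nonlinearity structure is recovered, which is what lets a classical network be embedded in the CV formalism.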
Sampling-based probabilistic inference emerges from learning in neural circuits with a cost on reliability
Neural responses in the cortex change over time both systematically, due to
ongoing plasticity and learning, and seemingly randomly, due to various sources
of noise and variability. Most previous work considered each of these
processes, learning and variability, in isolation -- here we study neural
networks exhibiting both and show that their interaction leads to the emergence
of powerful computational properties. We trained neural networks on classical
unsupervised learning tasks, in which the objective was to represent their
inputs in an efficient, easily decodable form, with an additional cost for
neural reliability which we derived from basic biophysical considerations. This
cost on reliability introduced a tradeoff between energetically cheap but
inaccurate representations and energetically costly but accurate ones. Despite
the learning tasks being non-probabilistic, the networks solved this tradeoff
by developing a probabilistic representation: neural variability represented
samples from statistically appropriate posterior distributions that would
result from performing probabilistic inference over their inputs. We provide an
analytical understanding of this result by revealing a connection between the
cost of reliability, and the objective for a state-of-the-art Bayesian
inference strategy: variational autoencoders. We show that the same cost leads
to the emergence of increasingly accurate probabilistic representations as
networks become more complex, from single-layer feed-forward, through
multi-layer feed-forward, to recurrent architectures. Our results provide
insights into why neural responses in sensory areas show signatures of
sampling-based probabilistic representations, and may inform future deep
learning algorithms and their implementation in stochastic low-precision
computing systems.
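The connection to variational inference can be checked analytically in the simplest possible setting (a 1-D toy of my own construction, not the paper's networks): for a linear Gaussian model, gradient descent on reconstruction error plus the reliability (KL) cost drives the code's variability to exactly the Bayesian posterior variance.

```python
import numpy as np

# Toy generative model: prior z ~ N(0,1), observation x = w*z + noise,
# noise ~ N(0, sig_x^2). Exact posterior variance is 1 / (1 + w^2/sig_x^2).
w, sig_x = 1.5, 0.8

# Variational "neural code" q(z|x) = N(mu, s^2). Collecting the terms of the
# negative ELBO that depend on s^2 gives  c * s^2 - 0.5 * log(s^2) + const,
# with c = w^2/(2 sig_x^2) + 0.5: reconstruction penalty vs reliability cost.
c = w**2 / (2 * sig_x**2) + 0.5
lam = 0.0                                   # lam = log s^2
for _ in range(3000):
    lam -= 0.1 * (c * np.exp(lam) - 0.5)    # gradient descent on the objective
s2 = np.exp(lam)

posterior_var = 1.0 / (1.0 + w**2 / sig_x**2)
print(s2, posterior_var)  # the learned variability matches the posterior
```

This is the 1-D skeleton of the paper's observation: a cost on reliability turns noise from a nuisance into a representation of posterior uncertainty, the same mechanism that the VAE objective formalizes.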