Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation
We introduce Equilibrium Propagation, a learning framework for energy-based
models. It involves only one kind of neural computation, performed in both the
first phase (when the prediction is made) and the second phase of training
(after the target or prediction error is revealed). Although this algorithm
computes the gradient of an objective function just like Backpropagation, it
does not need a special computation or circuit for the second phase, where
errors are implicitly propagated. Equilibrium Propagation shares similarities
with Contrastive Hebbian Learning and Contrastive Divergence while solving the
theoretical issues of both algorithms: our algorithm computes the gradient of a
well defined objective function. Because the objective function is defined in
terms of local perturbations, the second phase of Equilibrium Propagation
corresponds to only nudging the prediction (fixed point, or stationary
distribution) towards a configuration that reduces prediction error. In the
case of a recurrent multi-layer supervised network, the output units are
slightly nudged towards their target in the second phase, and the perturbation
introduced at the output layer propagates backward in the hidden layers. We
show that the signal 'back-propagated' during this second phase corresponds to
the propagation of error derivatives and encodes the gradient of the objective
function, when the synaptic update corresponds to a standard form of
spike-timing dependent plasticity. This work makes it more plausible that a
mechanism similar to Backpropagation could be implemented by brains, since
leaky integrator neural computation performs both inference and error
back-propagation in our model. The only local difference between the two phases
is whether synaptic changes are allowed or not.
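
The two-phase procedure described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: a three-unit chain (clamped input x, one hidden unit h, one output unit y), a tanh activation, a Hopfield-style energy E, a squared-error cost C, and the hypothetical names `relax` and `ep_gradient`. The same relaxation runs in both phases, and the gradient estimate is the difference of the energy's weight derivatives at the two fixed points, divided by the nudging strength beta.

```python
import math

def rho(s):
    # Smooth activation; tanh is an assumption made for this sketch.
    return math.tanh(s)

def rho_prime(s):
    return 1.0 - math.tanh(s) ** 2

def relax(x, w1, w2, target, beta, h=0.0, y=0.0, steps=500, dt=0.5):
    """Settle (h, y) to a fixed point of F = E + beta * C, where
         E = (h^2 + y^2) / 2 - w1 * x * rho(h) - w2 * rho(h) * rho(y)
         C = (y - target)^2 / 2
    The same leaky-integrator update is used in both phases."""
    for _ in range(steps):
        dh = -h + rho_prime(h) * (w1 * x + w2 * rho(y))             # -dF/dh
        dy = -y + rho_prime(y) * w2 * rho(h) - beta * (y - target)  # -dF/dy
        h += dt * dh
        y += dt * dy
    return h, y

def ep_gradient(x, w1, w2, target, beta=1e-4):
    """Estimate dC/dw1 and dC/dw2 from the two fixed points."""
    # First phase: free relaxation (beta = 0) yields the prediction.
    h0, y0 = relax(x, w1, w2, target, beta=0.0)
    # Second phase: start from the free fixed point, nudge the output.
    hb, yb = relax(x, w1, w2, target, beta=beta, h=h0, y=y0)
    # dE/dw1 = -x * rho(h) and dE/dw2 = -rho(h) * rho(y).
    g1 = (-x * rho(hb) + x * rho(h0)) / beta
    g2 = (-rho(hb) * rho(yb) + rho(h0) * rho(y0)) / beta
    return g1, g2

def loss(x, w1, w2, target):
    """Cost evaluated at the free fixed point (the prediction)."""
    _, y0 = relax(x, w1, w2, target, beta=0.0)
    return 0.5 * (y0 - target) ** 2
```

For small beta, the values returned by `ep_gradient` should approximately match a finite-difference gradient of `loss` with respect to `w1` and `w2`; a learning step would then move each weight against its estimate.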
Event-Driven Contrastive Divergence for Spiking Neuromorphic Systems
Restricted Boltzmann Machines (RBMs) and Deep Belief Networks have been
demonstrated to perform efficiently in a variety of applications, such as
dimensionality reduction, feature learning, and classification. Their
implementation on neuromorphic hardware platforms emulating large-scale
networks of spiking neurons can have significant advantages from the
perspectives of scalability, power dissipation and real-time interfacing with
the environment. However, the traditional RBM architecture and the commonly used
training algorithm known as Contrastive Divergence (CD) are based on discrete
updates and exact arithmetic, which do not directly map onto a dynamical neural
substrate. Here, we present an event-driven variation of CD to train an RBM
constructed with Integrate & Fire (I&F) neurons, that is constrained by the
limitations of existing and near future neuromorphic hardware platforms. Our
strategy is based on neural sampling, which allows us to synthesize a spiking
neural network that samples from a target Boltzmann distribution. The recurrent
activity of the network replaces the discrete steps of the CD algorithm, while
Spike Time Dependent Plasticity (STDP) carries out the weight updates in an
online, asynchronous fashion. We demonstrate our approach by training an RBM
composed of leaky I&F neurons with STDP synapses to learn a generative model of
the MNIST hand-written digit dataset, and by testing it in recognition,
generation and cue integration tasks. Our results contribute to a machine
learning-driven approach for synthesizing networks of spiking neurons capable
of carrying out practical, high-level functionality.
Comment: (Under review
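
For context, the discrete-update form of CD that this abstract contrasts with can be sketched as a single CD-1 weight update for a tiny binary RBM. This is the textbook one-step variant, not the event-driven method of the paper; the function names, list-of-lists weight layout, and the use of hidden probabilities in the correlation terms are choices made for this sketch.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def sample(probs, rng):
    # Draw independent binary states from Bernoulli probabilities.
    return [1.0 if rng.random() < p else 0.0 for p in probs]

def hidden_probs(v, W, b_h):
    return [sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b_h))]

def visible_probs(h, W, b_v):
    return [sigmoid(b_v[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b_v))]

def cd1_weight_update(v0, W, b_v, b_h, rng):
    """Return the CD-1 update direction for W given one data vector v0."""
    ph0 = hidden_probs(v0, W, b_h)               # data-driven phase
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, b_v), rng)  # one discrete Gibbs step
    ph1 = hidden_probs(v1, W, b_h)               # reconstruction phase
    # Difference of data and reconstruction correlations.
    return [[v0[i] * ph0[j] - v1[i] * ph1[j] for j in range(len(b_h))]
            for i in range(len(v0))]
```

A training loop would add a learning-rate multiple of this matrix to `W` for each data vector. Each call performs exactly one synchronous sampling step, which is the kind of discrete update the abstract says does not map directly onto continuously running spiking hardware.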
Bidirectional Learning in Recurrent Neural Networks Using Equilibrium Propagation
Neurobiologically plausible learning algorithms for recurrent neural networks that can perform supervised learning are a neglected area of study. Equilibrium propagation is a recent synthesis of several ideas in biological and artificial neural network research that uses a continuous-time, energy-based neural model with a local learning rule. However, despite dealing with recurrent networks, equilibrium propagation has only been applied to discriminative categorization tasks. This thesis generalizes equilibrium propagation to bidirectional learning with asymmetric weights. Simultaneously learning the discriminative as well as generative transformations for a set of data points and their corresponding category labels, bidirectional equilibrium propagation utilizes recurrence and weight asymmetry to share related but non-identical representations within the network. Experiments on an artificial dataset demonstrate the ability to learn both transformations, as well as the ability of asymmetric-weight networks to generalize their discriminative training to the untrained generative task.
Learning Spiking Neural Systems with the Event-Driven Forward-Forward Process
We develop a novel credit assignment algorithm for information processing
with spiking neurons without requiring feedback synapses. Specifically, we
propose an event-driven generalization of the forward-forward and the
predictive forward-forward learning processes for a spiking neural system that
iteratively processes sensory input over a stimulus window. As a result, the
recurrent circuit computes the membrane potential of each neuron in each layer
as a function of local bottom-up, top-down, and lateral signals, facilitating a
dynamic, layer-wise parallel form of neural computation. Unlike spiking neural
coding, which relies on feedback synapses to adjust neural electrical activity,
our model operates purely online and forward in time, offering a promising way
to learn distributed representations of sensory data patterns with temporal
spike signals. Notably, our experimental results on several pattern datasets
demonstrate that the event-driven forward-forward (ED-FF) framework works well
for training a dynamic recurrent spiking system capable of both classification
and reconstruction.
SuperSpike: Supervised learning in multi-layer spiking neural networks
A vast majority of computation in the brain is performed by spiking neural
networks. Despite the ubiquity of such spiking, we currently lack an
understanding of how biological spiking neural circuits learn and compute
in-vivo, as well as how we can instantiate such capabilities in artificial
spiking circuits in-silico. Here we revisit the problem of supervised learning
in temporally coding multi-layer spiking neural networks. First, by using a
surrogate gradient approach, we derive SuperSpike, a nonlinear voltage-based
three factor learning rule capable of training multi-layer networks of
deterministic integrate-and-fire neurons to perform nonlinear computations on
spatiotemporal spike patterns. Second, inspired by recent results on feedback
alignment, we compare the performance of our learning rule under different
credit assignment strategies for propagating output errors to hidden units.
Specifically, we test uniform, symmetric and random feedback, finding that
simpler tasks can be solved with any type of feedback, while more complex tasks
require symmetric feedback. In summary, our results open the door to obtaining
a better scientific understanding of learning and computation in spiking neural
networks by advancing our ability to train them to solve nonlinear problems
involving transformations between different spatiotemporal spike-time patterns.
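
The surrogate gradient idea named in this abstract can be sketched in a few lines: the spike nonlinearity is a hard threshold, so a smooth function of the membrane voltage stands in for its derivative. The specific surrogate used here (the derivative of a fast sigmoid), the steepness `beta`, the threshold value, and the function names are assumptions for illustration; the abstract does not specify them. The update function shows only the product structure of a three-factor rule (error signal, postsynaptic voltage term, presynaptic trace) and omits the temporal filtering a full rule would need.

```python
def spike(u, threshold=1.0):
    # Forward pass: hard threshold, whose true derivative is zero
    # almost everywhere and therefore useless for learning.
    return 1.0 if u >= threshold else 0.0

def surrogate_grad(u, threshold=1.0, beta=10.0):
    # Stand-in derivative: peaks at the threshold and decays smoothly
    # as the membrane voltage moves away from it.
    return 1.0 / (1.0 + beta * abs(u - threshold)) ** 2

def three_factor_update(error, u_post, pre_trace, lr=0.01,
                        threshold=1.0, beta=10.0):
    # Weight change = learning rate * error signal
    #               * voltage-based postsynaptic term * presynaptic trace.
    return lr * error * surrogate_grad(u_post, threshold, beta) * pre_trace
```

Under these assumptions a synapse changes most when the postsynaptic voltage is near threshold, the presynaptic neuron was recently active, and the error signal delivered to the neuron is large.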
Biologically plausible deep learning -- but how far can we go with shallow networks?
Training deep neural networks with the error backpropagation algorithm is
considered implausible from a biological perspective. Numerous recent
publications suggest elaborate models for biologically plausible variants of
deep learning, typically defining success as reaching around 98% test accuracy
on the MNIST data set. Here, we investigate how far we can go on digit (MNIST)
and object (CIFAR10) classification with biologically plausible, local learning
rules in a network with one hidden layer and a single readout layer. The hidden
layer weights are either fixed (random or random Gabor filters) or trained with
unsupervised methods (PCA, ICA or Sparse Coding) that can be implemented by
local learning rules. The readout layer is trained with a supervised, local
learning rule. We first implement these models with rate neurons. This
comparison reveals, first, that unsupervised learning does not lead to better
performance than fixed random projections or Gabor filters for large hidden
layers. Second, networks with localized receptive fields perform significantly
better than networks with all-to-all connectivity and can reach backpropagation
performance on MNIST. We then implement two of the networks - fixed, localized,
random & random Gabor filters in the hidden layer - with spiking leaky
integrate-and-fire neurons and spike timing dependent plasticity to train the
readout layer. These spiking models achieve > 98.2% test accuracy on MNIST,
which is close to the performance of rate networks with one hidden layer
trained with backpropagation. The performance of our shallow network models is
comparable to most current biologically plausible models of deep learning.
Furthermore, our results with a shallow spiking network provide an important
reference and suggest the use of datasets other than MNIST for testing the
performance of future models of biologically plausible deep learning.
Comment: 14 pages, 4 figure
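
The architecture described in this abstract, a fixed random hidden layer followed by a readout trained with a local supervised rule, can be sketched on a toy problem. The details are assumptions for illustration only: rate units with a tanh nonlinearity, Gaussian random hidden weights, a single linear readout unit, and a batch-averaged delta rule as the local rule. None of these choices or function names come from the paper.

```python
import math
import random

def make_hidden(n_in, n_hidden, rng):
    # Fixed random projection; these weights are never trained.
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)]
            for _ in range(n_hidden)]

def hidden_features(x, W_hidden):
    feats = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
             for row in W_hidden]
    return feats + [1.0]  # constant feature acting as a readout bias

def readout(x, W_hidden, w_out):
    return sum(w * f for w, f in zip(w_out, hidden_features(x, W_hidden)))

def mean_loss(X, T, W_hidden, w_out):
    return sum(0.5 * (readout(x, W_hidden, w_out) - t) ** 2
               for x, t in zip(X, T)) / len(X)

def train_readout(X, T, W_hidden, epochs=50, lr=0.05):
    """Delta rule: each readout weight changes in proportion to its own
    input feature times the output error, averaged over the batch."""
    n_feat = len(W_hidden) + 1
    w_out = [0.0] * n_feat
    for _ in range(epochs):
        grad = [0.0] * n_feat
        for x, t in zip(X, T):
            f = hidden_features(x, W_hidden)
            err = t - sum(w * fi for w, fi in zip(w_out, f))
            for k in range(n_feat):
                grad[k] += err * f[k]
        w_out = [w + lr * g / len(X) for w, g in zip(w_out, grad)]
    return w_out
```

Calling `train_readout(X, T, make_hidden(2, 8, random.Random(0)))` on a small dataset adjusts only the readout weights; the hidden projection stays as sampled, which mirrors the fixed-random-feature condition in the abstract.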