SuperSpike: Supervised learning in multi-layer spiking neural networks
A vast majority of computation in the brain is performed by spiking neural
networks. Despite the ubiquity of such spiking, we currently lack an
understanding of how biological spiking neural circuits learn and compute
in-vivo, as well as how we can instantiate such capabilities in artificial
spiking circuits in-silico. Here we revisit the problem of supervised learning
in temporally coding multi-layer spiking neural networks. First, by using a
surrogate gradient approach, we derive SuperSpike, a nonlinear voltage-based
three factor learning rule capable of training multi-layer networks of
deterministic integrate-and-fire neurons to perform nonlinear computations on
spatiotemporal spike patterns. Second, inspired by recent results on feedback
alignment, we compare the performance of our learning rule under different
credit assignment strategies for propagating output errors to hidden units.
Specifically, we test uniform, symmetric and random feedback, finding that
simpler tasks can be solved with any type of feedback, while more complex tasks
require symmetric feedback. In summary, our results open the door to obtaining
a better scientific understanding of learning and computation in spiking neural
networks by advancing our ability to train them to solve nonlinear problems
involving transformations between different spatiotemporal spike-time patterns.
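The surrogate gradient idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the fast-sigmoid shape of the surrogate, the steepness `beta`, the learning rate, and all function names are assumptions introduced here. The forward pass keeps the hard spike threshold, while the backward pass substitutes a smooth function of the membrane potential, so the update is a product of three factors (error signal, voltage nonlinearity, presynaptic trace).

```python
import math

def spike(u, threshold=1.0):
    """Forward pass: emit a spike (1.0) once the membrane potential reaches threshold."""
    return 1.0 if u >= threshold else 0.0

def surrogate_grad(u, threshold=1.0, beta=10.0):
    """Backward pass: shaped like the derivative of a fast sigmoid (up to a
    constant factor), centred on the threshold.

    It stands in for the derivative of the step function, which is zero
    everywhere except at the threshold. `beta` is an assumed steepness.
    """
    return 1.0 / (1.0 + beta * abs(u - threshold)) ** 2

def weight_update(error, u, presyn_trace, lr=0.01):
    """Three-factor update: error signal x voltage nonlinearity x presynaptic trace."""
    return lr * error * surrogate_grad(u) * presyn_trace

if __name__ == "__main__":
    for u in (0.0, 0.9, 1.0, 1.5):
        print(u, spike(u), round(surrogate_grad(u), 4))
```

The surrogate peaks at the threshold and decays on both sides, so neurons whose voltage is close to firing receive the largest weight changes.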
Signal Propagation in Feedforward Neuronal Networks with Unreliable Synapses
In this paper, we systematically investigate both the synfire propagation and
firing rate propagation in feedforward neuronal networks coupled in an
all-to-all fashion. In contrast to most earlier work, where only reliable
synaptic connections are considered, we mainly examine the effects of
unreliable synapses on both types of neural activity propagation in this work.
We first study networks composed of purely excitatory neurons. Our results show
that both the successful transmission probability and excitatory synaptic
strength largely influence the propagation of these two types of neural
activities, and better tuning of these synaptic parameters makes the considered
network support stable signal propagation. It is also found that noise has
significant but different impacts on these two types of propagation. The
additive Gaussian white noise has the tendency to reduce the precision of the
synfire activity, whereas noise with appropriate intensity can enhance the
performance of firing rate propagation. Further simulations indicate that the
propagation dynamics of the considered neuronal network is not simply
determined by the average amount of received neurotransmitter for each neuron
in a time instant, but also largely influenced by the stochastic effect of
neurotransmitter release. Second, we compare our results with those obtained in
corresponding feedforward neuronal networks connected with reliable synapses
but in a random coupling fashion. We confirm that some differences can be
observed in these two different feedforward neuronal network models. Finally,
we study the signal propagation in feedforward neuronal networks consisting of
both excitatory and inhibitory neurons, and demonstrate that inhibition also
plays an important role in signal propagation in the considered networks.
Comment: 33 pages, 16 figures; Journal of Computational Neuroscience (published)
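The abstract's point that propagation depends on more than the average received input can be illustrated with a small sketch. It assumes a simple Bernoulli release model in which each presynaptic spike is transmitted independently with probability `p_success`; the function names and parameter values are hypothetical and not taken from the paper. An unreliable synapse of weight `w` and a reliable synapse of weight `w * p` deliver the same mean input, but only the unreliable one adds trial-to-trial variability.

```python
import random

def synaptic_input(n_spikes, weight, p_success, rng):
    """Summed input from n_spikes presynaptic spikes, each transmitted
    independently with probability p_success (assumed Bernoulli release)."""
    return sum(weight for _ in range(n_spikes) if rng.random() < p_success)

def mean_and_variance(samples):
    """Population mean and variance of a list of numbers."""
    m = sum(samples) / len(samples)
    v = sum((s - m) ** 2 for s in samples) / len(samples)
    return m, v

if __name__ == "__main__":
    rng = random.Random(0)
    n, w, p = 100, 1.0, 0.5
    # Unreliable synapses with full weight versus reliable synapses with scaled weight.
    unreliable = [synaptic_input(n, w, p, rng) for _ in range(5000)]
    reliable = [synaptic_input(n, w * p, 1.0, rng) for _ in range(5000)]
    print("unreliable:", mean_and_variance(unreliable))
    print("reliable:  ", mean_and_variance(reliable))
```

Under these assumptions the two cases match in mean, while the reliable case has no variance at all, which is one simple way the stochastic effect of release can matter beyond the average.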
Inherent Weight Normalization in Stochastic Neural Networks
Multiplicative stochasticity such as Dropout improves the robustness and
generalizability of deep neural networks. Here, we further demonstrate that
always-on multiplicative stochasticity and simple threshold neurons are
together sufficient operations for deep neural networks. We call such models Neural
Sampling Machines (NSM). We find that the probability of activation of the NSM
exhibits a self-normalizing property that mirrors Weight Normalization, a
previously studied mechanism that fulfills many of the features of Batch
Normalization in an online fashion. The normalization of activities during
training speeds up convergence by preventing internal covariate shift caused by
changes in the input distribution. The always-on stochasticity of the NSM
confers the following advantages: the network is identical in the inference and
learning phases, making the NSM suitable for online learning; it can exploit
stochasticity inherent to a physical substrate such as analog non-volatile
memories for in-memory computing; and it is suitable for Monte Carlo sampling,
while requiring almost exclusively addition and comparison operations. We
demonstrate NSMs on standard classification benchmarks (MNIST and CIFAR) and
event-based classification benchmarks (N-MNIST and DVS Gestures). Our results
show that NSMs perform comparably to or better than conventional artificial neural
networks with the same architecture.
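One way to see a self-normalizing property of the kind described above is a Monte Carlo sketch of a single threshold neuron with always-on multiplicative noise. The Bernoulli mask, the zero bias, the `keep_prob` value, and the function name are all assumptions made for illustration, not the paper's model. Because the neuron only compares a noisy weighted sum against zero, rescaling every weight by the same positive factor leaves the estimated firing probability unchanged.

```python
import random

def activation_probability(weights, inputs, keep_prob=0.5, n_samples=20000, seed=0):
    """Estimate P(neuron fires) when each synapse is independently kept
    with probability keep_prob on every sample (assumed Bernoulli mask)."""
    rng = random.Random(seed)
    fires = 0
    for _ in range(n_samples):
        # Only additions and one comparison are needed per sample.
        u = sum(w * x for w, x in zip(weights, inputs) if rng.random() < keep_prob)
        if u > 0:
            fires += 1
    return fires / n_samples

if __name__ == "__main__":
    w = [0.5, -0.25, 1.0]
    x = [1.0, 1.0, 1.0]
    print(activation_probability(w, x))
    print(activation_probability([4.0 * wi for wi in w], x))
```

With the same seed, both calls draw the same masks, so any difference between them would have to come from the weight scale alone.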
Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation
We introduce Equilibrium Propagation, a learning framework for energy-based
models. It involves only one kind of neural computation, performed in both the
first phase (when the prediction is made) and the second phase of training
(after the target or prediction error is revealed). Although this algorithm
computes the gradient of an objective function just like Backpropagation, it
does not need a special computation or circuit for the second phase, where
errors are implicitly propagated. Equilibrium Propagation shares similarities
with Contrastive Hebbian Learning and Contrastive Divergence while solving the
theoretical issues of both algorithms: our algorithm computes the gradient of a
well defined objective function. Because the objective function is defined in
terms of local perturbations, the second phase of Equilibrium Propagation
corresponds to only nudging the prediction (fixed point, or stationary
distribution) towards a configuration that reduces prediction error. In the
case of a recurrent multi-layer supervised network, the output units are
slightly nudged towards their target in the second phase, and the perturbation
introduced at the output layer propagates backward in the hidden layers. We
show that the signal 'back-propagated' during this second phase corresponds to
the propagation of error derivatives and encodes the gradient of the objective
function, when the synaptic update corresponds to a standard form of
spike-timing dependent plasticity. This work makes it more plausible that a
mechanism similar to Backpropagation could be implemented by brains, since
leaky integrator neural computation performs both inference and error
back-propagation in our model. The only local difference between the two phases
is whether synaptic changes are allowed or not.
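The two-phase procedure described above can be summarized in a short worked form. The notation is introduced here for illustration and does not appear in the abstract: $E$ is the network energy, $C$ the prediction cost, $\beta$ a small nudging strength, $s^{0}$ the fixed point of the first (free) phase, and $s^{\beta}$ the fixed point of the second (nudged) phase. Read it as a sketch of the contrastive update under these assumptions.

```latex
% Total energy: internal energy plus a small nudge towards the target
F(\theta, \beta, s) = E(\theta, s) + \beta\, C(s)

% Contrastive update between the nudged and the free fixed points
\Delta\theta \;\propto\; -\frac{1}{\beta}
  \left[ \frac{\partial F}{\partial \theta}\big(\theta, \beta, s^{\beta}\big)
       - \frac{\partial F}{\partial \theta}\big(\theta, 0, s^{0}\big) \right]

% In the limit of vanishing nudging, the bracket yields the gradient of
% the objective J(\theta) = C(s^{0}) evaluated at the free fixed point
\lim_{\beta \to 0} \frac{1}{\beta}
  \left[ \frac{\partial F}{\partial \theta}\big(\theta, \beta, s^{\beta}\big)
       - \frac{\partial F}{\partial \theta}\big(\theta, 0, s^{0}\big) \right]
  = \frac{dJ}{d\theta}
```

Both fixed points are reached by the same relaxation dynamics, which is why no separate circuit is needed for the second phase.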
Nearly extensive sequential memory lifetime achieved by coupled nonlinear neurons
Many cognitive processes rely on the ability of the brain to hold sequences
of events in short-term memory. Recent studies have revealed that such memory
can be read out from the transient dynamics of a network of neurons. However,
the memory performance of such a network in buffering past information has only
been rigorously estimated in networks of linear neurons. When signal gain is
kept low, so that neurons operate primarily in the linear part of their
response nonlinearity, the memory lifetime is bounded by the square root of the
network size. In this work, I demonstrate that it is possible to achieve a
memory lifetime almost proportional to the network size, "an extensive memory
lifetime", when the nonlinearity of neurons is appropriately utilized. The
analysis of neural activity revealed that nonlinear dynamics prevented the
accumulation of noise by partially removing noise in each time step. With this
error-correcting mechanism, I demonstrate that a memory lifetime of order
N/log N can be achieved.
Comment: 21 pages, 5 figures; the manuscript has been accepted for publication
in Neural Computation
Incremental construction of LSTM recurrent neural network
Long Short-Term Memory (LSTM) is a recurrent neural network that
uses structures called memory blocks to allow the network to remember
significant events distant in the past input sequence in order to
solve long time lag tasks, where other RNN approaches fail.
Throughout this work we have performed experiments using LSTM
networks extended with growing abilities, which we call GLSTM.
Four methods of training growing LSTM have been compared. These
methods include cascade and fully connected hidden layers as well
as two different levels of freezing previous weights in the
cascade case. GLSTM has been applied to a forecasting problem in a
biomedical domain, where the input/output behavior of five
controllers of the Central Nervous System control has to be
modelled. We have compared growing LSTM results against other
neural network approaches and against our work applying conventional
LSTM to the task at hand.
Postprint (published version)