2,540 research outputs found
On-chip Few-shot Learning with Surrogate Gradient Descent on a Neuromorphic Processor
Recent work suggests that synaptic plasticity dynamics in biological models
of neurons and neuromorphic hardware are compatible with gradient-based
learning (Neftci et al., 2019). Gradient-based learning requires iterating
several times over a dataset, which is both time-consuming and constrains the
training samples to be independently and identically distributed. This is
incompatible with learning systems that do not have boundaries between training
and inference, such as in neuromorphic hardware. One approach to overcome these
constraints is transfer learning, where a portion of the network is pre-trained
and mapped into hardware and the remaining portion is trained online. Transfer
learning has the advantage that pre-training can be accelerated offline if the
task domain is known, and few samples of each class are sufficient for learning
the target task at reasonable accuracies. Here, we demonstrate on-line
surrogate gradient few-shot learning on Intel's Loihi neuromorphic research
processor using features pre-trained with spike-based gradient
backpropagation-through-time. Our experimental results show that the Loihi chip
can learn gestures online using a small number of shots and achieve results
that are comparable to the models simulated on a conventional processor
Error-triggered Three-Factor Learning Dynamics for Crossbar Arrays
Recent breakthroughs suggest that local, approximate gradient descent
learning is compatible with Spiking Neural Networks (SNNs). Although SNNs can
be scalably implemented using neuromorphic VLSI, an architecture that can learn
in-situ as accurately as conventional processors is still missing. Here, we
propose a subthreshold circuit architecture designed through insights obtained
from machine learning and computational neuroscience that could achieve such
accuracy. Using a surrogate gradient learning framework, we derive local,
error-triggered learning dynamics compatible with crossbar arrays and the
temporal dynamics of SNNs. The derivation reveals that circuits used for
inference and training dynamics can be shared, which simplifies the circuit and
suppresses the effects of fabrication mismatch. We present SPICE simulations on
XFAB 180nm process, as well as large-scale simulations of the spiking neural
networks on event-based benchmarks, including a gesture recognition task. Our
results show that the number of updates can be reduced hundred-fold compared to
the standard rule while achieving performances that are on par with the
state-of-the-art
- …