Biologically plausible deep learning -- but how far can we go with shallow networks?
Training deep neural networks with the error backpropagation algorithm is
considered implausible from a biological perspective. Numerous recent
publications suggest elaborate models for biologically plausible variants of
deep learning, typically defining success as reaching around 98% test accuracy
on the MNIST data set. Here, we investigate how far we can go on digit (MNIST)
and object (CIFAR10) classification with biologically plausible, local learning
rules in a network with one hidden layer and a single readout layer. The hidden
layer weights are either fixed (random or random Gabor filters) or trained with
unsupervised methods (PCA, ICA or Sparse Coding) that can be implemented by
local learning rules. The readout layer is trained with a supervised, local
learning rule. We first implement these models with rate neurons. This
comparison reveals two findings. First, unsupervised learning does not lead to
better performance than fixed random projections or Gabor filters for large
hidden layers. Second, networks with localized receptive fields perform
significantly better than networks with all-to-all connectivity and can reach
backpropagation performance on MNIST. We then implement two of the networks,
those with fixed, localized random weights or random Gabor filters in the
hidden layer, using spiking leaky integrate-and-fire neurons and
spike-timing-dependent plasticity to train the
readout layer. These spiking models achieve > 98.2% test accuracy on MNIST,
which is close to the performance of rate networks with one hidden layer
trained with backpropagation. The performance of our shallow network models is
comparable to most current biologically plausible models of deep learning.
Furthermore, our results with a shallow spiking network provide an important
reference and suggest the use of datasets other than MNIST for testing the
performance of future models of biologically plausible deep learning.
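
To make the recipe concrete, here is a minimal sketch (ours, not the authors' code) of the rate-based variant with a fixed random hidden layer and a readout trained by a local delta rule; the dimensions, learning rate, and toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 784, 1000, 10          # MNIST-like dimensions (assumed)

# Hidden layer: fixed random projection, never trained (one of the paper's variants)
W_hid = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_hid, n_in))
W_out = np.zeros((n_out, n_hid))            # readout, trained with a local rule

def forward(x):
    h = np.maximum(0.0, W_hid @ x)          # rate neurons with a rectifying nonlinearity
    return h, W_out @ h

def local_readout_update(x, target, lr=1e-3):
    # Delta rule: each weight change depends only on the presynaptic rate h
    # and the postsynaptic error, i.e. quantities local to the synapse.
    global W_out
    h, y = forward(x)
    W_out += lr * np.outer(target - y, h)

# Toy usage with random data standing in for a digit image
x = rng.random(n_in)
target = np.eye(n_out)[3]                   # one-hot label for class 3
for _ in range(100):
    local_readout_update(x, target)
```

The point of the sketch is the locality: the readout update touches only the presynaptic rate and the postsynaptic error, with no gradient backpropagated through the hidden layer.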
Dimensions of Timescales in Neuromorphic Computing Systems
This article is a public deliverable of the EU project "Memory technologies
with multi-scale time constants for neuromorphic architectures" (MeMScales,
https://memscales.eu, Call ICT-06-2019 Unconventional Nanoelectronics, project
number 871371). This arXiv version is a verbatim copy of the deliverable
report, with administrative information stripped. It collects a wide and varied
assortment of phenomena, models, research themes and algorithmic techniques
that are connected with timescale phenomena in the fields of computational
neuroscience, mathematics, machine learning and computer science, with a bias
toward aspects that are relevant for neuromorphic engineering. It turns out
that this theme is very rich indeed and spreads out in many directions which
defy a unified treatment. We collected several dozen sub-themes, each of
which has been investigated in specialized settings (in the neurosciences,
mathematics, computer science and machine learning) and has been documented in
its own body of literature. The more we dived into this diversity, the more it
became clear that our first effort to compose a survey must remain sketchy and
partial. We conclude with a list of insights distilled from this survey which
give general guidelines for the design of future neuromorphic systems.
An optimised deep spiking neural network architecture without gradients
We present an end-to-end trainable modular event-driven neural architecture
that uses local synaptic and threshold adaptation rules to perform
transformations between arbitrary spatio-temporal spike patterns. The
architecture represents a highly abstracted model of existing Spiking Neural
Network (SNN) architectures. The proposed Optimized Deep Event-driven Spiking
neural network Architecture (ODESA) can simultaneously learn hierarchical
spatio-temporal features at multiple arbitrary time scales. ODESA performs
online learning without the use of error back-propagation or the calculation of
gradients. Through the use of simple local adaptive selection thresholds at
each node, the network rapidly learns to appropriately allocate its neuronal
resources at each layer for any given problem without using a real-valued error
measure. These adaptive selection thresholds are the central feature of ODESA,
ensuring network stability and remarkable robustness to noise as well as to the
selection of initial system parameters. Network activations are inherently
sparse due to a hard Winner-Take-All (WTA) constraint at each layer. We
evaluate the architecture on existing spatio-temporal datasets, including the
spike-encoded IRIS and TIDIGITS datasets, as well as a novel set of tasks based
on International Morse Code that we created. These tests demonstrate the
hierarchical spatio-temporal learning capabilities of ODESA. Through these
tests, we demonstrate that ODESA can solve practical and highly challenging
hierarchical spatio-temporal learning tasks with the minimum possible number
of computing nodes.
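
As we read the abstract, the core mechanism is a hard winner-take-all layer whose per-neuron selection thresholds adapt with purely local rules. The sketch below is our hedged reconstruction of that idea, not the ODESA reference implementation; the update rules, constants, and normalization are illustrative assumptions.

```python
import numpy as np

class WTALayer:
    """Hard winner-take-all layer with per-neuron adaptive selection thresholds."""
    def __init__(self, n_in, n_neurons, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.random((n_neurons, n_in))
        self.w /= np.linalg.norm(self.w, axis=1, keepdims=True)
        self.theta = np.zeros(n_neurons)        # adaptive selection thresholds

    def step(self, x, eta_w=0.1, eta_t=0.05):
        act = self.w @ x                        # similarity of input to each prototype
        winner = int(np.argmax(act))
        if act[winner] >= self.theta[winner]:   # winner fires only above its threshold
            # Local updates: move the winner's weights toward the input and
            # pull its threshold toward its current activation.
            self.w[winner] += eta_w * (x - self.w[winner])
            self.w[winner] /= np.linalg.norm(self.w[winner])
            self.theta[winner] += eta_t * (act[winner] - self.theta[winner])
            return winner                       # sparse (one-hot) layer output
        self.theta *= 1.0 - eta_t               # no winner fired: relax all thresholds
        return None

# Toy usage: feed random event vectors and let the neurons specialize
rng = np.random.default_rng(1)
layer = WTALayer(n_in=8, n_neurons=4)
for _ in range(50):
    layer.step(rng.random(8))
```

Note that no real-valued error measure appears anywhere: learning is driven entirely by which neuron wins and whether its activation clears its own threshold.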
ReStoCNet: Residual Stochastic Binary Convolutional Spiking Neural Network for Memory-Efficient Neuromorphic Computing
In this work, we propose ReStoCNet, a residual stochastic multilayer convolutional Spiking Neural Network (SNN) composed of binary kernels, to reduce the synaptic memory footprint and enhance the computational efficiency of SNNs for complex pattern recognition tasks. ReStoCNet consists of an input layer followed by stacked convolutional layers for hierarchical input feature extraction, pooling layers for dimensionality reduction, and a fully-connected layer for inference. In addition, we introduce residual connections between the stacked convolutional layers to improve the hierarchical feature learning capability of deep SNNs. We propose a Spike Timing Dependent Plasticity (STDP)-based probabilistic learning algorithm, referred to as Hybrid-STDP (HB-STDP), incorporating Hebbian and anti-Hebbian learning mechanisms, to train the binary kernels forming ReStoCNet in a layer-wise unsupervised manner. We demonstrate the efficacy of ReStoCNet and the presented HB-STDP-based unsupervised training methodology on the MNIST and CIFAR-10 datasets. We show that residual connections enable the deeper convolutional layers to self-learn useful high-level input features and mitigate the accuracy loss observed in deep SNNs devoid of residual connections. The proposed ReStoCNet offers >20× kernel memory compression compared to a full-precision (32-bit) SNN while yielding competitive classification accuracy on the chosen pattern recognition tasks.
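
One plausible reading of a probabilistic, binary-kernel STDP rule of this kind is sketched below; it is not the paper's exact HB-STDP rule. Binary weights are flipped stochastically, with a potentiation probability that decays with the pre/post spike-time difference for causal timing (Hebbian) and a depression probability for acausal timing (anti-Hebbian). The exponential form and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def hb_stdp_update(w, t_pre, t_post, tau=20.0, p_max=0.5):
    """w: binary weights in {-1, +1}; t_pre: presynaptic spike times (same
    shape as w, in ms); t_post: scalar postsynaptic spike time (ms)."""
    dt = t_post - t_pre
    # Flip probabilities decay exponentially with |dt| (assumed form);
    # the (dt > 0) / (dt <= 0) masks select causal vs. acausal pairings.
    p_pot = p_max * np.exp(-np.abs(dt) / tau) * (dt > 0)    # pre before post
    p_dep = p_max * np.exp(-np.abs(dt) / tau) * (dt <= 0)   # post before pre
    r = rng.random(w.shape)
    w = np.where(r < p_pot, 1, w)       # stochastic potentiation to +1
    w = np.where(r < p_dep, -1, w)      # stochastic depression to -1
    return w

# Toy usage: a 3x3 binary kernel and random spike timings
w = rng.choice([-1, 1], size=(3, 3))
t_pre = rng.uniform(0, 50, size=(3, 3))
w = hb_stdp_update(w, t_pre, t_post=25.0)
```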