Composing Recurrent Spiking Neural Networks using Locally-Recurrent Motifs and Risk-Mitigating Architectural Optimization
In neural circuits, recurrent connectivity plays a crucial role in network
function and stability. However, existing recurrent spiking neural networks
(RSNNs) are often constructed by random connections without optimization. While
RSNNs can produce rich dynamics that are critical for memory formation and
learning, systematic architectural optimization of RSNNs is still an open
challenge. We aim to enable systematic design of large RSNNs via a new scalable
RSNN architecture and automated architectural optimization. We compose RSNNs
based on a layer architecture called Sparsely-Connected Recurrent Motif Layer
(SC-ML) that consists of multiple small recurrent motifs wired together by
sparse lateral connections. The small size of the motifs and sparse inter-motif
connectivity leads to an RSNN architecture scalable to large network sizes. We
further propose a method called Hybrid Risk-Mitigating Architectural Search
(HRMAS) to systematically optimize the topology of the proposed recurrent
motifs and SC-ML layer architecture. HRMAS is an alternating two-step
optimization process by which we mitigate the risk of network instability and
performance degradation caused by architectural change by introducing a novel
biologically-inspired "self-repairing" mechanism through intrinsic plasticity.
The intrinsic plasticity is introduced to the second step of each HRMAS
iteration and acts as unsupervised fast self-adaptation to structural and
synaptic weight modifications introduced by the first step during the RSNN
architectural "evolution". To the best of the authors' knowledge, this is the
first work that performs systematic architectural optimization of RSNNs. Using
one speech and three neuromorphic datasets, we demonstrate the significant
performance improvement brought by the proposed automated architecture
optimization over existing manually-designed RSNNs. Comment: 20 pages, 7 figures
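As a rough illustration of the SC-ML idea, the sketch below builds a block-structured recurrent connectivity mask with dense wiring inside small motifs and sparse lateral connections between them. The function name, sizes, and densities are our own assumptions for illustration, not values from the paper:

```python
import numpy as np

def scml_mask(n_motifs=8, motif_size=4, inter_p=0.05, rng=None):
    """Build a recurrent connectivity mask in the spirit of SC-ML:
    dense connectivity inside small motifs, sparse lateral wiring
    between motifs. Illustrative sketch; parameters are assumed."""
    rng = np.random.default_rng(rng)
    n = n_motifs * motif_size
    mask = np.zeros((n, n), dtype=bool)
    for m in range(n_motifs):                 # dense intra-motif blocks
        s = m * motif_size
        mask[s:s + motif_size, s:s + motif_size] = True
    mask |= rng.random((n, n)) < inter_p      # sparse inter-motif links
    np.fill_diagonal(mask, False)             # no self-connections
    return mask

mask = scml_mask(rng=0)  # far sparser than a fully random recurrent net
```

Because total connection count grows roughly linearly in the number of motifs (plus a small sparse lateral term) rather than quadratically in network size, a mask like this scales to large networks.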
The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks
Brains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. In comparison, the functional capabilities of models of spiking networks are still rudimentary. This shortcoming is mainly due to the lack of insight and practical algorithms to construct the necessary connectivity. Any such algorithm typically attempts to build networks by iteratively reducing the error compared to a desired output. But assigning credit to hidden units in multi-layered spiking networks has remained challenging due to the non-differentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity in spiking network models. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients impact learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative’s scale can substantially affect learning performance. When we combine surrogate gradients with a suitable activity regularization technique, robust information processing can be achieved in spiking networks even at the sparse activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to model functional spiking neural networks.
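The core pairing the abstract describes, a non-differentiable spike in the forward pass and a smooth surrogate derivative whose scale matters, can be sketched as follows. This is a minimal illustration using a fast-sigmoid-shaped surrogate; `beta` stands in for the derivative's scale, and none of this is the authors' exact implementation:

```python
import numpy as np

def spike(v, threshold=1.0):
    """Forward pass: a hard threshold (Heaviside step), whose true
    derivative is zero almost everywhere and undefined at threshold."""
    return (v >= threshold).astype(float)

def surrogate_grad(v, threshold=1.0, beta=10.0):
    """Backward pass: replace the undefined spike derivative with a
    smooth surrogate (here the derivative of a fast sigmoid).
    `beta` sets the surrogate's scale, the design parameter the
    abstract identifies as performance-critical."""
    x = beta * (v - threshold)
    return beta / (1.0 + np.abs(x)) ** 2

v = np.array([0.2, 0.9, 1.1])
s = spike(v)            # only the last membrane potential fires
g = surrogate_grad(v)   # gradient mass concentrates near threshold
```

In a full training loop the surrogate is used only when backpropagating; the forward spikes stay binary, which is what keeps the network event-driven.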
An optimised deep spiking neural network architecture without gradients
We present an end-to-end trainable modular event-driven neural architecture
that uses local synaptic and threshold adaptation rules to perform
transformations between arbitrary spatio-temporal spike patterns. The
architecture represents a highly abstracted model of existing Spiking Neural
Network (SNN) architectures. The proposed Optimized Deep Event-driven Spiking
neural network Architecture (ODESA) can simultaneously learn hierarchical
spatio-temporal features at multiple arbitrary time scales. ODESA performs
online learning without the use of error back-propagation or the calculation of
gradients. Through the use of simple local adaptive selection thresholds at
each node, the network rapidly learns to appropriately allocate its neuronal
resources at each layer for any given problem without using a real-valued error
measure. These adaptive selection thresholds are the central feature of ODESA,
ensuring network stability and remarkable robustness to noise as well as to the
selection of initial system parameters. Network activations are inherently
sparse due to a hard Winner-Take-All (WTA) constraint at each layer. We
evaluate the architecture on existing spatio-temporal datasets, including the
spike-encoded IRIS and TIDIGITS datasets, as well as a novel set of tasks based
on International Morse Code that we created. These tests demonstrate the
hierarchical spatio-temporal learning capabilities of ODESA. Through these
tests, we demonstrate ODESA can optimally solve practical and highly
challenging hierarchical spatio-temporal learning tasks with the minimum
possible number of computing nodes. Comment: 18 pages, 6 figures
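A minimal sketch of the hard-WTA-with-adaptive-thresholds idea follows. The local update rules and constants here are our own simple assumptions; ODESA's actual mechanisms differ in detail:

```python
import numpy as np

def wta_step(weights, thresholds, x, lr=0.1, thr_up=0.05, thr_down=0.01):
    """One event-driven step of a hard winner-take-all layer with
    local adaptive selection thresholds, in the spirit of ODESA
    (illustrative sketch). Only the most activated node may fire,
    and only if it clears its own threshold; all adaptation is
    local, with no real-valued error signal or gradients."""
    acts = weights @ x
    winner = int(np.argmax(acts))
    if acts[winner] >= thresholds[winner]:
        # Hebbian-style pull of the winner toward the input, plus a
        # raised threshold so the node specializes rather than dominates
        weights[winner] += lr * (x - weights[winner])
        thresholds[winner] += thr_up
        return winner
    # nothing fired: relax all thresholds so the layer can reallocate
    # its neuronal resources to inputs it does not yet cover
    thresholds -= thr_down
    return None

W = np.eye(2)
th = np.array([0.5, 0.5])
fired = wta_step(W, th, np.array([1.0, 0.0]))  # node 0 wins and fires
```

The two threshold directions (up on firing, down on silence) give the stability property the abstract emphasizes: overactive nodes become harder to trigger, and unclaimed inputs eventually recruit a node.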
Network Plasticity as Bayesian Inference
General results from statistical learning theory suggest that not only brain
computations, but also brain plasticity, can be understood as probabilistic
inference. However, a model for this has been missing. We propose that inherently stochastic
features of synaptic plasticity and spine motility enable cortical networks of
neurons to carry out probabilistic inference by sampling from a posterior
distribution of network configurations. This model provides a viable
alternative to existing models that propose convergence of parameters to
maximum likelihood values. It explains how priors on weight distributions and
connection probabilities can be merged optimally with learned experience, how
cortical networks can generalize learned information so well to novel
experiences, and how they can compensate continuously for unforeseen
disturbances of the network. The resulting new theory of network plasticity
explains from a functional perspective a number of experimental data on
stochastic aspects of synaptic plasticity that previously appeared to be quite
puzzling. Comment: 33 pages, 5 figures; the supplement is available on the
author's web page http://www.igi.tugraz.at/kappe
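The sampling view of plasticity can be illustrated with Langevin dynamics on a single synaptic weight: drift up the log-posterior gradient plus Gaussian noise, whose stationary distribution is the posterior itself rather than a maximum-likelihood point. This is a toy sketch of that general idea, not the paper's detailed model of synaptic plasticity and spine motility:

```python
import numpy as np

def plasticity_step(w, grad_log_post, eta, rng):
    """One stochastic synaptic update: deterministic drift along the
    log-posterior gradient plus injected noise (Langevin dynamics).
    Repeated application samples weights from the posterior instead
    of converging to a single maximum-likelihood value."""
    noise = rng.normal(0.0, np.sqrt(2.0 * eta))
    return w + eta * grad_log_post(w) + noise

# toy posterior: standard normal (a Gaussian prior with no data term)
rng = np.random.default_rng(0)
eta, w = 0.01, 0.0
samples = []
for _ in range(50000):
    w = plasticity_step(w, lambda u: -u, eta, rng)
    samples.append(w)
# the empirical weight trajectory approximates samples from N(0, 1)
```

The noise term is essential: without it the same update rule collapses to gradient ascent on the posterior mode, i.e. exactly the maximum-likelihood-style convergence the paper's theory is an alternative to.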
Algorithm Hardware Codesign for High Performance Neuromorphic Computing
Driven by the massive adoption of the Internet of Things (IoT), embedded systems, Cyber-Physical Systems (CPS), and similar applications, there is an increasing demand to apply machine intelligence in these power-limited scenarios. Though deep learning has achieved impressive performance on various realistic and practical tasks such as anomaly detection, pattern recognition, and machine vision, the ever-increasing computational complexity and model size of Deep Neural Networks (DNNs) make it challenging to deploy them in the aforementioned scenarios, where computation, memory, and energy resources are all limited. Early studies show that the energy efficiency of biological systems can be orders of magnitude higher than that of digital systems. Hence, taking inspiration from biological systems, neuromorphic computing and Spiking Neural Networks (SNNs) have drawn attention as alternative solutions for energy-efficient machine intelligence.
Though believed promising, neuromorphic computing is hardly used for real-world applications. A major problem is that the performance of SNNs is limited compared with that of DNNs due to the lack of efficient training algorithms. In an SNN, a neuron's output is a spike, which is represented mathematically by the Dirac delta function. Because of the non-differentiable nature of spikes, gradient descent cannot be used directly to train SNNs. Hence, algorithm-level innovation is desirable. Next, as an emerging computing paradigm, hardware- and architecture-level innovation is also required to support new algorithms and to explore the potential of neuromorphic computing.
In this work, we present a comprehensive algorithm-hardware codesign for neuromorphic computing. On the algorithm side, we address the training difficulty. We first derive a flexible SNN model that retains critical neural dynamics, and then develop an algorithm to train SNNs to learn temporal patterns. Next, we apply the proposed algorithm to multivariate time series classification tasks to demonstrate its advantages. On the hardware side, we develop a systematic solution on FPGA that is optimized for the proposed SNN model to enable high-performance inference. In addition, we also explore emerging devices: a memristor-based neuromorphic design is proposed, in which we design neuron and synapse circuits that can replicate important neural dynamics such as the filter effect and adaptive thresholds.
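The two neural dynamics the thesis highlights, a membrane that low-pass filters its input and a threshold that adapts after each spike, can be sketched in a few lines of discrete-time simulation. The constants and update rules here are illustrative assumptions, not the thesis's model:

```python
def lif_adaptive(inputs, tau_m=0.9, tau_a=0.95, v_th0=1.0, beta=0.5):
    """Discrete-time leaky integrate-and-fire neuron with an adaptive
    threshold (illustrative sketch). The membrane low-pass filters
    its input (filter effect); each spike raises the effective
    threshold, which then decays back (adaptive threshold)."""
    v, a, spikes = 0.0, 0.0, []
    for x in inputs:
        v = tau_m * v + x              # leaky integration of input
        a = tau_a * a                  # adaptation variable decays
        if v >= v_th0 + beta * a:      # effective threshold incl. adaptation
            spikes.append(1.0)
            v = 0.0                    # reset membrane after the spike
            a += 1.0                   # raise the adaptive threshold
        else:
            spikes.append(0.0)
    return spikes

s = lif_adaptive([0.6] * 10)  # inter-spike intervals lengthen over time
```

Under constant drive the neuron's firing slows down as `a` accumulates, which is the spike-frequency-adaptation behavior an adaptive threshold exists to produce.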
T-NGA: Temporal Network Grafting Algorithm for Learning to Process Spiking Audio Sensor Events
Spiking silicon cochlea sensors encode sound as an asynchronous stream of
spikes from different frequency channels. The lack of labeled training datasets
for spiking cochleas makes it difficult to train deep neural networks on the
outputs of these sensors. This work proposes a self-supervised method called
Temporal Network Grafting Algorithm (T-NGA), which grafts a recurrent network
pretrained on spectrogram features so that the network works with the cochlea
event features. T-NGA training requires only temporally aligned audio
spectrograms and event features. Our experiments show that the accuracy of the
grafted network was similar to the accuracy of a supervised network trained
from scratch on a speech recognition task using events from a software spiking
cochlea model. Despite the circuit non-idealities of the spiking silicon
cochlea, the grafted network accuracy on the silicon cochlea spike recordings
was only about 5% lower than the supervised network accuracy using the
N-TIDIGITS18 dataset. T-NGA can train networks to process spiking audio sensor
events in the absence of large labeled spike datasets. Comment: 5 pages, 4 figures; accepted at IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), Singapore, 202
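The grafting idea, fitting a new front-end so that event features reproduce the hidden representations a pretrained network computes from temporally aligned spectrograms, can be illustrated with a linear stand-in. The real T-NGA grafts onto a recurrent network; this sketch only shows the self-supervised matching objective, with all sizes and names invented for the example:

```python
import numpy as np

def graft_frontend(W, event_feats, target_hidden, lr=0.05, epochs=1000):
    """Fit a linear front-end so that event features reproduce the
    hidden representations a pretrained 'teacher' network computes
    from temporally aligned spectrograms. A deliberately simplified
    stand-in for the grafting step: plain gradient descent on the
    mean-squared mismatch, requiring no labels, only alignment."""
    for _ in range(epochs):
        pred = event_feats @ W
        grad = event_feats.T @ (pred - target_hidden) / len(event_feats)
        W -= lr * grad
    return W

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))   # aligned event features (synthetic)
W_true = rng.normal(size=(8, 4))
H = X @ W_true                 # "teacher" hidden states from spectrograms
W = graft_frontend(np.zeros((8, 4)), X, H)  # recovers the mapping
```

Because the objective only needs temporally aligned feature pairs, no spike-level labels are required, which is the property that lets T-NGA sidestep the missing labeled datasets.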