Temporal Pattern Coding in Deep Spiking Neural Networks
Deep Artificial Neural Networks (ANNs) employ a simplified analog neuron model that mimics the rate transfer function of integrate-and-fire neurons. In Spiking Neural Networks (SNNs), the predominant information transmission method is based on rate codes. This code is inefficient from a hardware perspective because the number of transmitted spikes is proportional to the encoded analog value. Alternative codes such as temporal codes that are based on single spikes are difficult to scale up for large networks due to their sensitivity to spike timing noise. Here we present a study of an encoding scheme based on temporal spike patterns. This scheme inherits the efficiency of temporal codes but retains the robustness of rate codes. The pattern code is evaluated on MNIST, CIFAR-10, and ImageNet image classification tasks. We compare the network performance of ANNs, rate-coded SNNs, and temporal-coded SNNs, using the classification error and operation count as performance metrics. We also estimate the power consumption of the digital logic needed for the operations associated with each encoding type, and the impact of the bit precision of the weights and activations. On ImageNet, the temporal pattern code achieves up to 35× reduction in the estimated power consumption compared to the rate-coded SNN, and 42× compared to the ANN. The classification error of the pattern-coded SNN is increased by <1% compared to the ANN, and decreased by 2% compared to the rate-coded SNN.
Efficient Deep Reinforcement Learning with Predictive Processing Proximal Policy Optimization
Advances in reinforcement learning (RL) often rely on massive compute
resources and remain notoriously sample inefficient. In contrast, the human
brain is able to efficiently learn effective control strategies using limited
resources. This raises the question whether insights from neuroscience can be
used to improve current RL methods. Predictive processing is a popular
theoretical framework which maintains that the human brain is actively seeking
to minimize surprise. We show that recurrent neural networks which predict
their own sensory states can be leveraged to minimize surprise, yielding
substantial gains in cumulative reward. Specifically, we present the Predictive
Processing Proximal Policy Optimization (P4O) agent, an actor-critic
reinforcement learning agent that applies predictive processing to a recurrent
variant of the PPO algorithm by integrating a world model in its hidden state.
Even without hyperparameter tuning, P4O significantly outperforms a baseline
recurrent variant of the PPO algorithm on multiple Atari games using a single
GPU. It also outperforms other state-of-the-art agents given the same
wall-clock time and exceeds human gamer performance on multiple games including
Seaquest, which is a particularly challenging environment in the Atari domain.
Altogether, our work underscores how insights from the field of neuroscience
may support the development of more capable and efficient artificial agents.
Comment: 24 pages, 8 figures
Fast temporal decoding from large-scale neural recordings in monkey visual cortex
With new developments in electrode and nanoscale technology, a large-scale multi-electrode cortical neural prosthesis with thousands of stimulation and recording electrodes is becoming viable. Such a system will be useful as both a neuroscience tool and a neuroprosthesis.
In the context of a visual neuroprosthesis, a rudimentary form of vision can be presented to the visually impaired by stimulating the electrodes to induce phosphene patterns. Additional feedback in a closed-loop system can be provided by rapid decoding of recorded responses from relevant brain areas. This work looks at temporal decoding results from a dataset of 1024 electrode recordings collected from the V1 and V4 areas of a primate performing a visual discrimination task. By applying deep learning models, the peak decoding accuracy from the V1 data can be obtained by a moving time window of 150 ms across the 800 ms phase of stimulus presentation. The peak accuracy from the V4 data is achieved at a larger latency and by using a larger moving time window of 300 ms. Decoding using a running window of 30 ms on the V1 data showed only a 4% drop in peak accuracy. We also determined the robustness of the decoder to electrode failure by choosing a subset of important electrodes using a previously reported algorithm for scaling the importance of inputs to a network. Results show that the accuracy of 91.1% from a network trained on the selected subset of 256 electrodes is close to the accuracy of 91.7% from using all 1024 electrodes.
Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification
Spiking neural networks (SNNs) can potentially offer an efficient way of doing inference because the neurons in the networks are sparsely activated and computations are event-driven. Previous work showed that simple continuous-valued deep Convolutional Neural Networks (CNNs) can be converted into accurate spiking equivalents. These networks did not include certain common operations such as max-pooling, softmax, batch-normalization and Inception-modules. This paper presents spiking equivalents of these operations, thereby allowing conversion of nearly arbitrary CNN architectures. We show conversion of popular CNN architectures, including VGG-16 and Inception-v3, into SNNs that produce the best results reported to date on MNIST, CIFAR-10 and the challenging ImageNet dataset. SNNs can trade off classification error rate against the number of available operations, whereas deep continuous-valued neural networks require a fixed number of operations to achieve their classification error rate. From the examples of LeNet for MNIST and BinaryNet for CIFAR-10, we show that with an increase in error rate of a few percentage points, the SNNs can achieve more than 2× reductions in operations compared to the original CNNs. This highlights the potential of SNNs, in particular when deployed on power-efficient neuromorphic spiking neuron chips, for use in embedded applications.
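The trade-off between error and operation count described in this abstract can be illustrated with a toy rate-coded neuron. The sketch below is not the paper's implementation: it assumes a single integrate-and-fire neuron driven by a constant input, a threshold of 1, and reset by subtraction, and the function name `if_neuron_rate` is hypothetical. The output firing rate approximates a ReLU of the input (for inputs up to the threshold), and the approximation improves as more time steps (and hence more spike operations) are spent.

```python
def if_neuron_rate(input_current, n_steps, threshold=1.0):
    """Simulate a toy integrate-and-fire neuron and return its firing rate.

    Assumptions (illustrative only): constant input per time step,
    at most one spike per step, reset by subtracting the threshold.
    """
    membrane = 0.0
    spikes = 0
    for _ in range(n_steps):
        membrane += input_current       # integrate the input
        if membrane >= threshold:       # fire when the threshold is crossed
            membrane -= threshold       # reset by subtraction keeps the residue
            spikes += 1
    return spikes / n_steps


# Longer simulations cost more operations but approximate the input more closely.
for steps in (10, 100, 1000):
    rate = if_neuron_rate(0.3, steps)
    print(steps, rate, abs(rate - 0.3))

# Negative inputs never reach the threshold, mirroring ReLU(x) = 0 for x < 0.
print(if_neuron_rate(-0.5, 100))
```

With this toy model the approximation error is bounded by roughly one spike over the simulated duration, so stopping the simulation early saves operations at the cost of a coarser estimate.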
A 128-channel real-time VPDNN stimulation system for a visual cortical neuroprosthesis
With the recent progress in developing large-scale micro-electrodes, cortical neuroprostheses supporting hundreds of electrodes will be viable in the near future. We describe work in building a visual stimulation system that receives camera input images and produces stimulation patterns for driving a large set of electrodes. The system consists of a convolutional neural network FPGA accelerator and a recording and stimulation Application-Specific Integrated Circuit (ASIC) that produces the stimulation patterns. It is aimed at restoring visual perception in visually impaired subjects. The FPGA accelerator, VPDNN, runs a visual prosthesis network that generates an output used to create stimulation patterns, which are then converted by the ASIC into current pulses to drive a multi-electrode array. The accelerator exploits spatial sparsity and the use of reduced bit precision parameters for reduced computation, memory and power for portability. Experimental results from the VPDNN show that the 94.5K parameter 14-layer CNN receiving an input of 128 × 128 has an inference frame rate of 83 frames per second (FPS) and uses only an incremental power of 0.1 W, which is at least 10× lower than that measured from a Jetson Nano. The ASIC adds a maximum delay of 2 ms; however, it does not impact the FPS thanks to double-buffered memory.
Index Terms: Visual prosthesis, convolutional neural network, FPGA accelerator, stimulation and recording ASIC
Conversion of analog to spiking neural networks using sparse temporal coding
The activations of an analog neural network (ANN) are usually treated as representing an analog firing rate. When mapping the ANN onto an equivalent spiking neural network (SNN), this rate-based conversion can lead to undesired increases in computation cost and memory access, if firing rates are high. This work presents an efficient temporal encoding scheme, where the analog activation of a neuron in the ANN is treated as the instantaneous firing rate given by the time-to-first-spike (TTFS) in the converted SNN. By making use of temporal information carried by a single spike, we show a new spiking network model that uses 7-10× fewer operations than the original rate-based analog model on the MNIST handwritten dataset, with an accuracy loss of <1%.
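The idea of reading an activation as an instantaneous rate given by the time-to-first-spike can be sketched as follows. This is a minimal illustration rather than the paper's conversion method: it assumes the simplest possible mapping, spike time = 1 / activation, and the names `ttfs_encode`, `ttfs_decode`, `rate_spike_count` and the parameter `t_max` are hypothetical.

```python
def ttfs_encode(activation, t_max=100.0):
    """Map a non-negative activation to a single spike time.

    Assumed mapping (illustrative only): spike time = 1 / activation.
    Returns None when the neuron would not spike within t_max.
    """
    if activation <= 0:
        return None
    spike_time = 1.0 / activation
    return spike_time if spike_time <= t_max else None


def ttfs_decode(spike_time):
    """Recover the activation as the instantaneous rate 1 / spike time."""
    return 0.0 if spike_time is None else 1.0 / spike_time


def rate_spike_count(activation, duration):
    """Number of spikes a rate code would emit for the same activation."""
    return int(max(activation, 0.0) * duration)


activation = 0.5
print(ttfs_encode(activation))                 # one spike, emitted at t = 2.0
print(ttfs_decode(ttfs_encode(activation)))    # decodes back to 0.5
print(rate_spike_count(activation, 100.0))     # a rate code emits many spikes
```

Under these assumptions a strong activation spikes early and a weak one late or not at all, and each neuron emits at most one spike, whereas the rate code emits a number of spikes proportional to the activation.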
Contraction of Dynamically Masked Deep Neural Networks for Efficient Video Processing
Sequential data such as video are characterized by spatio-temporal redundancies. As of yet, few deep learning algorithms exploit them to decrease the often massive cost during inference. This work leverages correlations in video data to reduce the size and run-time cost of deep neural networks. Drawing upon the simplicity of the typically used ReLU activation function, we replace this function by dynamically updating masks. The resulting network is a simple chain of matrix multiplications and bias additions, which can be contracted into a single weight matrix and bias vector. Inference then reduces to an affine transformation of the input sample with these contracted parameters. We show that the method is akin to approximating the neural network with a first-order Taylor expansion around a dynamically updating reference point. For triggering these updates, one static and three data-driven mechanisms are analyzed. We evaluate the proposed algorithm on a range of tasks, including pose estimation on surveillance data, road detection on KITTI driving scenes, object detection on ImageNet videos, as well as denoising MNIST digits, and obtain compression rates up to 3.6×.
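The contraction step described in this abstract can be sketched for a small fully connected network. The code below is an illustrative reconstruction under stated assumptions, not the authors' implementation: it uses plain Python lists, a two-layer toy network with made-up weights, and hypothetical helper names (`matvec`, `matmul`, `relu_forward`, `contract`). The ReLU masks are recorded at a reference input and then folded, together with the weights and biases, into a single weight matrix and bias vector.

```python
def matvec(W, x):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Matrix-matrix product for list-of-lists matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def relu_forward(layers, x):
    """Reference pass with ReLU after every layer except the last.

    Returns the output and the binary ReLU masks seen at this input.
    """
    masks = []
    for i, (W, b) in enumerate(layers):
        x = [z + bi for z, bi in zip(matvec(W, x), b)]
        if i < len(layers) - 1:
            mask = [1.0 if z > 0 else 0.0 for z in x]
            x = [z * m for z, m in zip(x, mask)]
            masks.append(mask)
    return x, masks


def contract(layers, masks):
    """Fold the masked layers into one affine map (W_total, b_total)."""
    W_total = [row[:] for row in layers[0][0]]
    b_total = layers[0][1][:]
    for mask, (W, b) in zip(masks, layers[1:]):
        # Applying a mask zeroes the rows of the accumulated affine map.
        W_masked = [[m * w for w in row] for m, row in zip(mask, W_total)]
        b_masked = [m * bi for m, bi in zip(mask, b_total)]
        W_total = matmul(W, W_masked)
        b_total = [z + bi for z, bi in zip(matvec(W, b_masked), b)]
    return W_total, b_total


# Toy network with made-up parameters: 2 inputs, 3 hidden units, 2 outputs.
layers = [
    ([[1.0, 2.0], [-1.0, 0.5], [0.5, -1.0]], [0.0, 0.5, -0.25]),
    ([[1.0, -1.0, 2.0], [0.5, 0.5, 0.5]], [0.1, -0.2]),
]
x_ref = [1.0, 0.5]
y_ref, masks = relu_forward(layers, x_ref)
W_c, b_c = contract(layers, masks)
y_contracted = [z + bi for z, bi in zip(matvec(W_c, x_ref), b_c)]
print(y_ref, y_contracted)
```

The contracted map reproduces the network exactly at the reference input and at any input that triggers the same ReLU pattern; for other inputs it is only an approximation, which is why the masks need to be refreshed as the input drifts.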
Brain-inspired Learning Drives Advances in Neuromorphic Computing
The success of deep learning is founded on learning rules with biologically implausible properties, entailing high memory and energy costs. At the Donders Institute in Nijmegen, NL, we have developed GAIT-Prop, a learning method for large-scale neural networks that alleviates some of the biologically unrealistic attributes of conventional deep learning. By localising weight updates in space and time, our method reduces computational complexity and illustrates how powerful learning rules can be implemented within the constraints on connectivity and communication present in the brain.
ISSN:0926-498
Theory and tools for the conversion of analog to spiking convolutional neural networks
Deep convolutional neural networks (CNNs) have shown great potential for numerous real-world machine learning applications, but performing inference in large CNNs in real-time remains a challenge. We have previously demonstrated that traditional CNNs can be converted into deep spiking neural networks (SNNs), which exhibit similar accuracy while reducing both latency and computational load as a consequence of their data-driven, event-based style of computing. Here we provide a novel theory that explains why this conversion is successful, and derive from it several new tools to convert a larger and more powerful class of deep networks into SNNs. We identify the main sources of approximation errors in previous conversion methods, and propose simple mechanisms to fix these issues. Furthermore, we develop spiking implementations of common CNN operations such as max-pooling, softmax, and batch-normalization, which allow almost lossless conversion of arbitrary CNN architectures into the spiking domain. Empirical evaluation of different network architectures on the MNIST and CIFAR-10 benchmarks leads to the best SNN results reported to date.
LiteEdge: Lightweight Semantic Edge Detection Network
Scene parsing is a critical component for understanding complex scenes in applications such as autonomous driving. Semantic segmentation networks are typically reported for scene parsing, but semantic edge networks have also become of interest because of the sparseness of the segmented maps. This work presents an end-to-end trained lightweight deep semantic edge detection architecture called LiteEdge, suitable for edge deployment. By utilizing hierarchical supervision and a new weighted multi-label loss function to balance different edge classes during training, LiteEdge predicts category-wise binary edges with high accuracy. Our LiteEdge network, with only ≈ 3M parameters, has a semantic edge prediction accuracy of 52.9% mean maximum F (MF) score on the Cityscapes dataset. This accuracy was evaluated on the network trained to produce a low-resolution edge map. The network can be quantized to 6-bit weights and 8-bit activations and shows only a 2% drop in the mean MF score. This quantization leads to a memory footprint savings of 6× for an edge device.