FPGA Implementation of Simplified Spiking Neural Network by Gupta, Shikhar et al.
FPGA Implementation of Simplified Spiking Neural Network
Shikhar Gupta, Indian Institute of Technology, Guwahati
Arpan Vyas, Indian Institute of Technology, Guwahati
Gaurav Trivedi, Indian Institute of Technology, Guwahati
Email: guptashikhar20@gmail.com, arpanvyas@gmail.com, trivedi@iitg.ac.in
Abstract—Spiking Neural Networks (SNN) are third-
generation Artificial Neural Networks (ANN) which are close
to the biological neural system. In recent years, SNNs have become
popular in robotics and embedded applications; it has
therefore become imperative to explore their real-time and
energy-efficient implementations. SNNs are more powerful than
their predecessors because they encode temporal information
and use biologically plausible plasticity rules. In this paper, a
simplified and computationally efficient SNN model and its FPGA
architecture are described. The proposed model is validated on a
Xilinx Virtex 6 FPGA, where a fully connected network of
800 neurons and 12,544 synapses runs in real time.
I. INTRODUCTION
With Moore’s law giving way to "More than Moore",
the scientific community is exploring alternative avenues for
faster, cheaper, and more efficient computing, and neuromorphic
computing has emerged as one of the viable replacements
for the present computing paradigm. Neuromorphic
computing aims at realizing the architecture and performance
of a brain in silicon, and it can be realized efficiently by
utilizing Spiking Neural Networks (SNN). The brain, being the
most efficient system present, processes complex information
much faster than any existing computer. In recent years, the
popularity and application of spiking neural networks have
increased considerably. SNN is prominently used in many
applications such as event detection, classification, speech
recognition, spatial navigation, and autonomous motor control.
It has demonstrated its effectiveness in detecting analog signals
from sensors [1]; designing controllers for autonomous robots
[2]; performing detection and recognition tasks [3]; processing
cortical data [4] and tactile form-based recognition [5].
With the advent of autonomous robots and self-driving
vehicles, and with the rise of real-time embedded
applications, it has become imperative to realize machine
learning models on compact and energy-efficient platforms.
Existing neural network models [6] are computationally
intensive and require large amounts of memory, which
makes them unsuitable for real-time and energy-efficient
applications. Although SNNs have been realized in Application
Specific Integrated Circuits (ASIC) such as SpiNNaker [7],
BrainScaleS [8], SyNAPSE [9], and Neuropipe-chip [10], their
objective is mainly to provide a solution for large-scale
simulations rather than for low-power embedded applications.
The SNN model proposed in this paper is aimed at an
energy-efficient, portable, and real-time implementation that
improves the performance of electronic systems.
Conventionally, Field Programmable Gate Arrays (FPGA)
are utilized for the validation of electronic systems as well
as in the implementation of time-critical systems. FPGAs
are also well-suited for providing a low power solution for
massively parallel and computationally less complex models
[11]. Although parallelism can be achieved using Graphics
Processing Units (GPU) as well, FPGA implementations are
advantageous where power consumption is an issue [12].
It is becoming essential for intelligent
systems to learn efficiently in real time from their surroundings.
Existing FPGA models of SNN [13], [14] do not support
training of the network; if the weights must be changed,
the hardware has to be reprogrammed, which renders them unsuitable
for real-time applications. The proposed solution combines the
neuron membrane model with on-line spike-time dependent
plasticity (STDP) learning.
II. RELATED WORK
Multiple hardware accelerators for SNN have been designed
and implemented. The central aim of these architectures is to
overcome the limitations of software simulators such as BRIAN
[15], NEST [16], and NEURON [17], which are widely
accepted in the research community but hit scaling
issues and become too slow for large-scale networks due to a
lack of parallel computation [18]. In this section, we present
an account of previously implemented architectures. Table
I summarizes similar works with their time resolution,
number of neurons and synapses simulated, and resources
consumed per neuron. Comparing the resources used per neuron
takes the size of the network into account, so that bigger
networks, which consume more resources in total, are compared fairly.
The FPGA architecture in [19] simulates a fully connected
network of 1,440 Izhikevich (IZ) neurons. The design updates
all neurons at every time step regardless of the spiking activity.
The time resolution recorded is 0.1 ms. Our proposed design
is event-driven and updates a neuron only if there is
spiking activity. This makes the implementation more energy-efficient
and gives a 40 times finer time resolution (2.5 us against
0.1 ms). In terms of resource usage, our architecture (800 neurons,
12,544 synapses) is far more efficient than theirs (1,440
neurons, 16 synapses). Also, their fixed-point representation
of weights provides less flexibility than our floating-point
representation. Although floating-point arithmetic has higher
latency, the simplified equations compensate for it.
Another implementation is NeuroFlow [20], which can
simulate an impressive number of neuronal units (600,000),
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media,
including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers
or lists, or reuse of any copyrighted component of this work in other works.
arXiv:2010.01200v1 [cs.NE] 2 Oct 2020
2020 27th IEEE International Conference on Electronics, Circuits & Systems
TABLE I
RESULTS OF RELEVANT STATE-OF-THE-ART SNN HARDWARE ACCELERATORS
| Work | Platform | Model | Time Resolution | Neurons | Synapses/Neuron | Resources/Neuron (FF, LUT) |
|---|---|---|---|---|---|---|
| Upegui et al., 2005 | FPGA | Custom | 1 ms | 30 | 30 | 100, 100 |
| Pearson et al., 2007 | FPGA | LIF | 0.5 ms | 1120 | 912/112 | – |
| Cassidy et al., 2007 | FPGA | LIF | 320 ns | 51 | 128 | 146, 230 |
| Jin et al., 2008 | Multiprocessor (ARM) | IZ | 1 ms | 1000 | 100 | – |
| Thomas and Luk, 2009 | FPGA | IZ | 10 us | 1024 | 1024 | 39, 19 |
| Ambroise et al., 2013 | FPGA | IZ | 1 ms | 117 | 117 | 8, 17 |
| Cheung et al., 2016 | FPGA | IZ | 1 ms | > 98,000 | 1,000 - 10,000 | – |
| Pani et al., 2017 | FPGA | IZ | 0.1 ms | 1440 | 1440 | 37, 39 |
| This work | FPGA | Simplified LIF | 1 ms | 800 | 12,544 | 29, 70 |
both LIF and IZ, on a 6-FPGA system. When compared to
this setup, our design provides a sampling rate of 400 kHz
against their 1 kHz.
Large-scale hardware implementations
like SpiNNaker [7] are similarly ill suited to low-power,
compact embedded applications and are expensive. Our
implementation aims at providing a comparatively small-scale
and energy-efficient solution. On the other hand, several neural
implementations based on Application Specific Integrated
Circuits (ASIC) have also been proposed [21]. Although they
exhibit better performance and are more energy-efficient, there
is a growing interest in the use of FPGAs for SNN implementation.
An FPGA gives the user the freedom to reconfigure the device fully
or partially through its configuration bitstream. FPGAs also
facilitate the creation of multiprocessor systems (as on ASIC) due
to the presence of IP cores [19].
III. SIMPLIFIED NEURAL NETWORK MODEL
The classic leaky integrate-and-fire (LIF) neuron model
[6] is computationally complex and has
large memory requirements. A simplified version of the LIF
model, with computationally less complex membrane potential
update equations, is described in [22]. Let $P_t$ be
the membrane potential at time step $t$. Each incoming spike
$S_{it}$, $i = 1, \dots, n$, alters it by the synapse weight $W_i$,
and a voltage leakage factor $D$ is subtracted:
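$$
P_t =
\begin{cases}
P_{t-1} + \sum_{i=1}^{n} S_{it} W_i - D & \text{if } P_{min} < P_{t-1} < P_{threshold} \\
P_{refract} & \text{if } P_{t-1} \geq P_{threshold} \\
R_p & \text{if } P_{t-1} \leq P_{min}
\end{cases}
\tag{1}
$$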
At every time step $t$, the membrane potential decreases
by a fixed amount, $P_t = P_{t-1} - D$, provided that $P_{t-1}$ is greater
than $R_p$, the resting potential. This simplified equation can be
easily implemented in hardware, unlike classic models, which
require a large number of lookup tables. When $P_t \geq P_{threshold}$,
a spike is fired and the neuron enters the refractory phase,
where it blocks any input for a duration of $t_{refract}$.
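As an illustration, the piecewise update of Eq. (1) can be sketched in software as follows. This is a minimal Python sketch and not the hardware implementation: the class name, the method name, and every default parameter value are placeholders chosen for the example, and the handling of the refractory counter is one possible reading of the text.

```python
from dataclasses import dataclass


@dataclass
class SimplifiedLIF:
    """One neuron following the piecewise update of Eq. (1).

    All default values are illustrative; the paper does not state them.
    """
    p_threshold: float = 1.0    # firing threshold
    p_min: float = -0.5         # lower bound on the potential
    p_rest: float = 0.0         # resting potential R_p
    p_refract: float = -0.25    # potential forced after a spike
    leak: float = 0.0625        # constant leakage D per time step
    t_refract: int = 3          # refractory duration in time steps
    potential: float = 0.0
    refractory_left: int = 0

    def step(self, spikes, weights):
        """Advance one time step; return True if the neuron fires."""
        if self.refractory_left > 0:
            # Inputs are blocked during the refractory phase.
            self.refractory_left -= 1
            return False
        prev = self.potential
        if prev >= self.p_threshold:
            # Second case of Eq. (1): fire and enter the refractory phase.
            self.potential = self.p_refract
            self.refractory_left = self.t_refract
            return True
        if prev <= self.p_min:
            # Third case of Eq. (1): reset to the resting potential.
            self.potential = self.p_rest
            return False
        # First case of Eq. (1): integrate weighted spikes, then leak.
        drive = sum(w for s, w in zip(spikes, weights) if s)
        leak = self.leak if prev > self.p_rest else 0.0
        self.potential = prev + drive - leak
        return False
```

Because the cases of Eq. (1) are conditioned on $P_{t-1}$, the sketch reports a spike on the step after the potential crosses the threshold.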
IV. SPIKE TIME DEPENDENT PLASTICITY
Spike Time Dependent Plasticity (STDP) is a biological
process, first discovered by Bi and Poo [23], which changes the
connection strengths between neurons (synapses) in the brain.
It is an unsupervised learning algorithm that operates upon
the time difference between post-synaptic and pre-synaptic
spikes. In a given synapse, if the post-synaptic spike occurs
in a specific time window (STDP window) after pre-synaptic
spike, the synaptic strength is increased, and if it occurs before
pre-synaptic spike, the strength is decreased.
Equation 2 describes the simplified weight change rule. The
STDP window is defined as t ∈ [2, 20] in both directions.
Weights are kept within the range $w_{min} < w < w_{max}$.
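$$
w_{new} =
\begin{cases}
w_{old} + \sigma \Delta w \, (w_{max} - w_{old}) & \text{if } \Delta w > 0 \\
w_{old} + \sigma \Delta w \, (w_{old} - w_{min}) & \text{if } \Delta w \leq 0
\end{cases}
\tag{2}
$$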
$\Delta w$ is calculated using the exponential reinforcement curve
employed in classic LIF. The curve has only 19 entries on either
side and, hence, requires very few memory resources (lookup
tables) for implementation. Equation 3 describes how the
change in weight is determined; $\Delta t$ is the time difference between
pre-synaptic and post-synaptic spikes, $A_+$ and $A_-$ are the
constants for positive and negative $\Delta t$ values, respectively, and
$\tau_+$ and $\tau_-$ are the steepness time constants in the two directions.
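$$
\Delta w =
\begin{cases}
A_- \exp\left(\frac{\Delta t}{\tau_-}\right) & \text{if } \Delta t \leq -2 \\
0 & \text{if } -2 < \Delta t < 2 \\
A_+ \exp\left(\frac{\Delta t}{\tau_+}\right) & \text{if } \Delta t \geq 2
\end{cases}
\tag{3}
$$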
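The lookup-table form of Eqs. (2) and (3) can be sketched as below. This is a hedged Python illustration, not the authors' implementation. The values of A_PLUS, A_MINUS, TAU_PLUS, TAU_MINUS, SIGMA, W_MIN and W_MAX are invented for the example. The sketch further assumes that $\Delta t$ is an integer number of time steps, that $A_-$ is negative, and that the magnitude of $\Delta w$ decays with $|\Delta t|$ on both sides, which corresponds to a negative exponent on the positive side.

```python
import math

# Illustrative constants; none of these values are given in the text.
A_PLUS, A_MINUS = 0.8, -0.3       # amplitudes for positive / negative dt
TAU_PLUS, TAU_MINUS = 8.0, 5.0    # steepness time constants (taken positive)
SIGMA = 0.1                       # scaling factor sigma of Eq. (2)
W_MIN, W_MAX = 0.0, 1.0           # weight bounds

# 19 entries per side, one for each |dt| in 2..20 (the STDP window).
# Assumption: the magnitude decays with |dt| on both sides.
LUT_POS = [A_PLUS * math.exp(-dt / TAU_PLUS) for dt in range(2, 21)]
LUT_NEG = [A_MINUS * math.exp(-dt / TAU_MINUS) for dt in range(2, 21)]


def delta_w(dt):
    """Look up the weight change of Eq. (3) for an integer time difference."""
    if 2 <= dt <= 20:
        return LUT_POS[dt - 2]
    if -20 <= dt <= -2:
        return LUT_NEG[-dt - 2]
    return 0.0  # dead zone |dt| < 2, and anything outside the window


def update_weight(w_old, dt):
    """Apply the soft-bounded update of Eq. (2)."""
    dw = delta_w(dt)
    if dw > 0:
        return w_old + SIGMA * dw * (W_MAX - w_old)
    return w_old + SIGMA * dw * (w_old - W_MIN)
```

Since the update is scaled by the remaining distance to the bound, a weight that starts inside the range stays inside it as long as the product of sigma and the magnitude of $\Delta w$ does not exceed one.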
V. KEY DESIGN ELEMENTS
Spike Time Dependent Plasticity is the backbone of the
spiking neural networks as it enables the learning process.
A complete system aiming to replicate the biological neural
model requires feature extraction, relativity of neural activity
based on input strength, and competition among neurons for
a particular class. These are described as follows.
A. Visual Receptive Field
The receptive field is the area of an image which produces the
input for a visual neuron. As the neural layers go deeper, the
receptive fields get bigger, convolving inputs from the previous
layers. It can be expressed as a convolution filter for a variety
of operations like edge detection, sharpening, and blurring. In
our system, we have implemented the receptive field, RF, as a
low-pass blurring filter on the image.
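A receptive field of this kind can be illustrated with a small sliding-window filter. The sketch below is written in plain Python for clarity; the 3x3 kernel, the zero padding at the borders, and the function name are assumptions made for the example, since the text does not give the filter coefficients or size.

```python
def apply_receptive_field(image, kernel):
    """Slide a square, odd-sized kernel over the image with zero padding.

    `image` and `kernel` are lists of rows. The kernel used below is
    symmetric, so correlation and convolution give the same result.
    """
    rows, cols = len(image), len(image[0])
    k = len(kernel) // 2  # kernel radius
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    # Pixels outside the image contribute zero.
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += image[rr][cc] * kernel[dr + k][dc + k]
            out[r][c] = acc
    return out


# Hypothetical low-pass (blurring) kernel; weights sum to one.
BLUR_3X3 = [[1 / 16, 2 / 16, 1 / 16],
            [2 / 16, 4 / 16, 2 / 16],
            [1 / 16, 2 / 16, 1 / 16]]
```

Applied to a single bright pixel, the filter spreads its intensity over the neighbouring pixels, which is the blurring behaviour described above.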
Fig. 1. Full Mesh Spiking Neural Network
B. Spike Generation
In a biological neuron, the input excitation is transmitted
as time-domain impulses. The frequency of the impulses is
proportional to the excitation. We have replicated this by
taking the output of the receptive field $RF$, the maximum
membrane potential $R_{max}$, and the minimum refractory period
$RP_{min}$ to compute the firing rate of the neuron, $FR$.
$$
FR =
\begin{cases}
\frac{1}{RP_{min}} \cdot \frac{RF}{R_{max}} & \text{if } RF > 0 \\
0 & \text{if } RF \leq 0
\end{cases}
\tag{4}
$$
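The rate coding of Eq. (4) can be sketched as follows, reading the equation as a rate proportional to the receptive-field output with a maximum of one spike per minimum refractory period. The text does not say how a spike train is produced from the rate, so the evenly spaced scheme in spike_train is purely an assumption for illustration, as are the function names.

```python
def firing_rate(rf, r_max, rp_min):
    """Eq. (4): rate proportional to the receptive-field output."""
    if rf <= 0:
        return 0.0
    return (1.0 / rp_min) * (rf / r_max)


def spike_train(rate, n_steps):
    """Emit evenly spaced spikes at `rate` spikes per time step.

    Assumed scheme: a phase accumulator that fires whenever it
    reaches one. The paper does not specify the generator.
    """
    train, phase = [], 0.0
    for _ in range(n_steps):
        phase += rate
        if phase >= 1.0:
            train.append(1)
            phase -= 1.0
        else:
            train.append(0)
    return train
```

For example, with rp_min = 2 time steps, an input at the maximum potential gives a rate of 0.5, that is, one spike every second time step.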
C. Variable Threshold and Lateral Inhibition
The voltage threshold at which a neuron fires is kept variable
for each image. This ensures that images, irrespective of
their relative brightness, produce uniform output for subsequent
layers. The voltage threshold is set to one third of the maximum
input spike frequency of the layer.
Lateral inhibition is used to induce competition among
output layer neurons for a particular class. When the first
output neuron fires for a particular image, the potential of all
the other neurons is reduced by half the threshold potential.
This 'winner takes all' strategy ensures selectivity of the
winner neuron for the class, as the firing activity of other
contending neurons is suppressed. Lateral inhibition is borrowed
from biological neural networks, where strengthening of one neural
pathway weakens the neighbouring ones.
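Both rules are simple enough to state in a few lines of code. The Python sketch below follows the description above directly; the function names and the list-based representation of the layer are assumptions made for the example.

```python
def variable_threshold(input_rates):
    """Threshold for one image: a third of the maximum input spike frequency."""
    return max(input_rates) / 3.0


def lateral_inhibition(potentials, winner, threshold):
    """Reduce every non-winning potential by half the threshold."""
    return [p if i == winner else p - threshold / 2.0
            for i, p in enumerate(potentials)]
```

With a threshold of 1.0 and output neuron 0 firing first, potentials of [2.0, 1.0, 0.5] become [2.0, 0.5, 0.0], so the contenders need more input before they can fire.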
D. Hardware Algorithm
Spiking neural networks rely on the temporal information
carried by spike trains and hence require a notion of time. A
Time Unit block has been used in our system to keep track
of time steps during training and classification. Although the
number of time units is fixed for every data sample in the training
and classification phases, the number of clock cycles needed to
process one time unit is highly adaptive, as shown in Fig. 2.
Fig. 2. Breakdown of a Time Unit into processes under different scenarios. The x
axis denotes time and the y axis lists the scenarios. Clock cycles consumed
by a Time Unit differ depending upon input and output, being the least when
there is neither an input nor an output spike and the most when both input
and output spikes are present. $t_{sc}$, $t_{pa}$, $t_{pd}$ and $t_{wc}$ are the actual times taken
by each process. The figure is drawn to scale, i.e., $t_{sc} = t_{pd} < t_{pa} < t_{wc}$.
Fig. 3. Block Level view of the system
VI. RESULTS
A. Resource Usage
The network, with 784 input (I) and 16 output (O) neurons
and 12,544 synapses, using 24 bits to represent each
membrane potential and weight (W), was implemented on a
Xilinx XC6VLX240T device. A single Block RAM of size
36 Kb was used as a FIFO for each output neuron to store
synapse weights. Additionally, four DSP48E1 slices capable of
addition and multiplication were used for each output neuron
in its Potential Adder and Weight Change blocks. Table
II provides the resource usage for the complete system.
TABLE II
RESOURCE USAGE
| Resource | Used | Available | % Used | Resource Complexity |
|---|---|---|---|---|
| Flip-flops | 23,238 | 301,440 | 8% | I.O.W |
| Slice LUTs | 56,230 | 150,720 | 37% | I.O.W |
| BRAMs | 16 | 416 | 4% | O |
| DSP48E1 | 64 | 768 | 8% | O |
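The figures in Table II can be cross-checked against the network dimensions with a few lines of arithmetic. The Python sketch below only recomputes numbers stated in this section; the variable names are chosen for the example, and a 36 Kb Block RAM is taken to hold 36 x 1024 bits.

```python
INPUTS, OUTPUTS, WEIGHT_BITS = 784, 16, 24

synapses = INPUTS * OUTPUTS              # fully connected input-to-output layer
bits_per_output = INPUTS * WEIGHT_BITS   # weight storage needed per output neuron
bram_bits = 36 * 1024                    # capacity of one 36 Kb Block RAM
dsp_total = 4 * OUTPUTS                  # four DSP48E1 slices per output neuron

usage = {  # resource: (used, available), copied from Table II
    "Flip-flops": (23_238, 301_440),
    "Slice LUTs": (56_230, 150_720),
    "BRAMs": (16, 416),
    "DSP48E1": (64, 768),
}
percent = {name: round(100 * used / avail)
           for name, (used, avail) in usage.items()}

# Flip-flops and LUTs divided over all 800 neurons (inputs plus outputs).
per_neuron = {name: round(usage[name][0] / (INPUTS + OUTPUTS))
              for name in ("Flip-flops", "Slice LUTs")}
```

The 784 weights of one output neuron occupy 18,816 bits and therefore fit in a single 36 Kb Block RAM, and the per-neuron figures of 29 flip-flops and 70 LUTs match the last row of Table I.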
B. Timing Analysis
The proposed architecture exploits the sparsity of
spiking neural networks to improve speed by adapting the length
of each Time Unit dynamically to the spiking activity in the input
and output neurons (Fig. 2). Table III and Table IV provide the
timing analysis for classification and training of N images at
an operating frequency of 100 MHz.
TABLE III
TIMING ANALYSIS FOR CLASSIFICATION
| Operation | Time | Time Complexity |
|---|---|---|
| Time Unit - Maximum | 8.5 us | I.O.W |
| Time Unit - Average | 2.5 us | I.O.W |
| Single Image | 0.5 ms | I.O.W |
| Classifying 10,000 images | 5 s | N.(I.O.W) |
TABLE IV
TIMING ANALYSIS FOR TRAINING
| Operation | Time | Time Complexity |
|---|---|---|
| Time Unit - Maximum | 17 us | I.O.W |
| Time Unit - Average | 5 us | I.O.W |
| Single Image | 1.1 ms | I.O.W |
| Training 60,000 images | 65 s | N.(I.O.W) |
Table V compares the training and classification
timings of a single image from the MNIST dataset on the
FPGA and on a CPU. Our FPGA architecture achieves a speedup
of 256x for classification and 187x for training. The results
reflect the massive parallelism provided by the FPGA at a
low energy cost.
TABLE V
COMPARISON WITH SERIAL COMPUTATION
| Operation | FPGA | CPU |
|---|---|---|
| Classification | 0.5 ms | 128 ms |
| Training | 1.08 ms | 202 ms |
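The quoted speedups and per-image times follow from the table entries by simple division, as the short Python check below shows. It uses only numbers from Tables III to V; the dictionary names are chosen for the example.

```python
fpga_ms = {"classification": 0.5, "training": 1.08}    # Table V, FPGA column
cpu_ms = {"classification": 128.0, "training": 202.0}  # Table V, CPU column

# Speedup is the ratio of CPU time to FPGA time per image.
speedup = {op: cpu_ms[op] / fpga_ms[op] for op in fpga_ms}

# Per-image time derived from the batch totals in Tables III and IV.
per_image_ms = {
    "classification": 5.0 / 10_000 * 1e3,   # 5 s for 10,000 images
    "training": 65.0 / 60_000 * 1e3,        # 65 s for 60,000 images
}
```

Rounded to whole numbers, the ratios are 256 and 187, and the batch totals give 0.5 ms and about 1.08 ms per image, consistent with the single-image entries.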
VII. CONCLUSION
In this paper, we described an architecture for a Simplified
Spiking Neural Network which is implemented on FPGA and
optimized for low-power embedded applications with real-time
learning. The simplification of the STDP algorithm does not
compromise the classification and learning capabilities
of the network; rather, it reduces the computational complexity,
which in turn helps in developing a hardware accelerator with
minimum resource usage. The system is designed to take
advantage of the sparsity of the network and to adapt each
time unit to the activity in the network. Also, an
account of parameter analysis is presented which showcases
our methodology of choosing efficient parameter values.
REFERENCES
[1] J. J. Lovelace, J. T. Rickard, and K. J. Cios, “A spiking neural network
alternative for the analog to digital converter,” The 2010 International
Joint Conference on Neural Networks (IJCNN), 2010.
[2] F. Alnajjar and K. Murase, “A Simple Aplysia-Like Spiking Neural
Network to Generate Adaptive Behavior in Autonomous Robots ,”
Adaptive Behavior, 2008.
[3] J. A. Perez-Carrasco, B. Acha, C. Serrano, L. Camunas-Mesa,
T. Serrano-Gotarredona, and B. Linares-Barranco, “Fast vision through
frameless event-based sensing and convolutional processing: Application
to texture recognition,” Neural Networks IEEE Transactions, 2010.
[4] H. Fang, Y. Wang, and J. He, “Spiking neural networks for cortical
neuronal spike train decoding,” Neural Comput. 2010.
[5] S. Ratnasingam and T. McGinnity, “A spiking neural network for
tactile form based object recognition,” International Joint Conference
on Neural Networks (IJCNN), 2011.
[6] W. Gerstner and W. Kistler, “Spiking Neuron Models: Single Neurons,
Populations, Plasticity,” Cambridge University Press, 2002.
[7] Painkras et al., “SpiNNaker, A 1-w 18-core system-on-chip for
massively-parallel neural network simulation,” IEEE J. Solid-State Cir-
cuits, 2013.
[8] Schemmel et al., “ A scaled-down version of the brainscales wafer-scale
neuromorphic system,” IEEE International Symposium on Circuits and
Systems (ISCAS). New Jersey, USA, 2012.
[9] Hylton T., “Systems of neuromorphic adaptive plastic scalable electron-
ics,” http://www.scribd.com/doc/76634068/Darpa-Baa-Synapse, 2008.
[10] Schoenauer et al., “Neuropipe-chip: a digital neuro-processor for spiking
neural networks,” Neural Networks, IEEE Transactions, 2002.
[11] Omondi, A. R, Rajapakse, and J. Chandana, FPGA implementations of
neural networks. Springer, 2006, vol. 365.
[12] Betkaoui et al., “Comparing performance and energy efficiency of fpgas
and gpus for high productivity computing,” International Conference on
Field-Programmable Technology, 2010.
[13] A. Rosado-Muñoz, M. Bataller-Mompeán, and J. Guerrero-Martínez,
“FPGA implementation of Spiking Neural Networks,” IFAC Proceedings
Volumes, Vol. 45, 2012.
[14] A. Rosado-Muñoz, A. F. kowski, M. Bataller-Mompeán, and J. Guerrero-Martínez,
“FPGA implementation of Spiking Neural Networks supported
by a Software Design Environment,” 18th IFAC World Congress
Milano (Italy), 2011.
[15] Goodman DF and B. R, “The Brian simulator,” Front Neurosci, 2009.
[16] Linssen and Charl et al, “NEST,” 2018.
[17] Carnevale N. and H. M., “The NEURON Book,” Cambridge: Cambridge
University Press, 2006.
[18] Rast et al, “Scalable event-driven native parallel processing: the spin-
naker neuromimetic system,” Proceedings of the 7th ACM International
Conference on Computing Frontiers, 2010.
[19] Pani et al, “An FPGA Platform for Real-Time Simulation of Spiking
Neuronal Networks,” Front Neurosci., 2017.
[20] Cheung K, S. SR, and L. W, “NeuroFlow: A General Purpose Spiking
Neural Network Simulation Platform using Customizable Processors.”
Front Neurosci., 2015.
[21] Hofstoetter et al., “The cerebellum chip: an analog VLSI implementation
of a cerebellar model of classical conditioning,” Advances in Neural
Information Processing Systems, 2005.
[22] T. Iakymchuk, A. Rosado-Muñoz, J. F. Guerrero-Martínez, M. Bataller-Mompeán,
and J. V. Francés-Víllora, “Simplified spiking neural network
architecture and STDP learning algorithm applied to image classification,”
EURASIP Journal on Image and Video Processing, 2018.
[23] Bi G-Q and P. M-M, “Synaptic modifications in cultured hippocampal
neurons: dependence on spike timing, synaptic strength, and postsynap-
tic cell type,” J. Neurosci., 1998.
