FSpiNN: An Optimization Framework for Memory- and Energy-Efficient
  Spiking Neural Networks by Putra, Rachmad Vidya Wicaksana & Shafique, Muhammad
TO APPEAR AT THE IEEE TRANS. ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 2020 (ESWEEK-TCAD SPECIAL ISSUE)
Rachmad Vidya Wicaksana Putra and Muhammad Shafique, Senior Member, IEEE
Abstract—Spiking Neural Networks (SNNs) are gaining
interest due to their event-driven processing, which potentially
enables low power/energy computation on hardware platforms,
while offering unsupervised learning capability due to
the spike-timing-dependent plasticity (STDP) rule. However,
state-of-the-art SNNs require a large memory footprint to
achieve high accuracy, which makes them difficult to
deploy on embedded systems, for instance on battery-powered
mobile devices and IoT Edge nodes. Towards this, we propose
FSpiNN, an optimization framework for obtaining memory- and
energy-efficient SNNs for training and inference processing, with
unsupervised learning capability while maintaining accuracy.
It is achieved by (1) reducing the computational requirements
of neuronal and STDP operations, (2) improving the accuracy
of STDP-based learning, (3) compressing the SNN through a
fixed-point quantization, and (4) incorporating the memory and
energy requirements in the optimization process. FSpiNN reduces
the computational requirements by reducing the number of
neuronal operations, the STDP-based synaptic weight updates,
and the STDP complexity. To improve the accuracy of learning,
FSpiNN employs timestep-based synaptic weight updates, and
adaptively determines the STDP potentiation factor and the
effective inhibition strength. The experimental results show that,
as compared to the state-of-the-art work, FSpiNN achieves 7.5x
memory saving, and improves the energy-efficiency by 3.5x on
average for training and by 1.8x on average for inference, across
MNIST and Fashion MNIST datasets, with no accuracy loss
for a network with 4900 excitatory neurons, thereby enabling
energy-efficient SNNs for edge devices/embedded systems.
Index Terms—Framework, optimization, spiking neural
networks, SNNs, spike-timing-dependent plasticity, STDP,
unsupervised learning, adaptivity, memory, energy-efficiency,
edge devices, embedded systems.
I. INTRODUCTION
THE spiking neural networks (SNNs) are rapidly gaining research interest since they have shown great
potential in completing various machine learning tasks,
while exhibiting high biological plausibility [1]–[5]. That
is, the SNNs mimic the behavior of biological spiking
networks through (i) event-driven processing, and (ii)
spike-timing-dependent plasticity (STDP)-based unsupervised
learning. The event-driven processing potentially enables low
power/energy computation in the neuromorphic hardware,
such as [6] [7], due to its sparse spiking-based computation.
The STDP-based learning enables SNNs to learn information
from the unlabeled data, which is desired for real-world
applications, as gathering unlabeled data is easier and cheaper
than labeled data [8]. Thus, SNNs bear the potential to obtain
better algorithmic performance (e.g., classification accuracy)
with lower power/energy consumption than other neural
network algorithms in the unsupervised-learning settings [1].
Manuscript received April 18, 2020; revised June 12, 2020; accepted July 6,
2020. This article was presented in the International Conference on Compilers,
Architecture, and Synthesis for Embedded Systems 2020 and appears as part
of the ESWEEK-TCAD special issue.
R. V. W. Putra is with Technische Universität Wien (TU Wien), Vienna,
Austria. E-mail: rachmad.putra@tuwien.ac.at
M. Shafique is with Division of Engineering, New York University
Abu Dhabi (NYU AD), Abu Dhabi, United Arab Emirates, and
Institute of Computer Engineering, Technische Universität Wien (TU
Wien), Vienna, Austria. E-mail: muhammad.shafique@nyu.edu and
muhammad.shafique@tuwien.ac.at
A. Targeted Research Problem
In an SNN architecture with STDP [9] (depicted in
Fig. 1(a)), each excitatory neuron is expected to recognize a
class in the dataset, hence the connecting synapses from the
same excitatory neuron have to learn the input features of a
specific class (a detailed architecture discussion is provided in
Section II-A). Previous works [10]–[14] focus on improving
the classification accuracy, but at the cost of a huge amount of
additional computations, which leads to high energy consumption and a large
memory footprint. For instance, the state-of-the-art work in
[11] improves the effectiveness of the STDP-based learning by
updating the weights every two or three postsynaptic spikes, to
ensure that the update is essential. This reduces the number of
weight updates, but requires a 2-bit counter for each excitatory
neuron to keep track of the number of postsynaptic spikes.
Moreover, it needs a total of 200 neurons (100 excitatory and
100 inhibitory neurons) to achieve ∼74% accuracy in MNIST
digit classification¹. Although all these techniques result in
an improvement in the classification accuracy, they incur high
computational, energy, and memory costs. This is not desirable
for embedded applications with stringent constraints (for
instance, in terms of computations, energy, and memory).
[Figure 1: (a) SNN architecture: the input layer connects to the excitatory layer with all-to-all connectivity (weights learned using STDP); the excitatory layer connects to the inhibitory layer with one-to-one connectivity; each inhibitory neuron inhibits all excitatory neurons except the one it receives its connection from. (b) Accuracy [%] (axis from 60 to 100) versus number of training epochs (1st, 2nd, 3rd) on the MNIST training set, for networks with 100 excitatory neurons (1MB) and 4900 excitatory neurons (211MB); data labels in the plot: 74.0, 75.2, 75.2, 85.2, 89.2, 92.0.]
Fig. 1. (a) The SNN architecture considered in this work, adopted from [9] [11].
(b) A large-sized SNN typically achieves higher classification accuracy, e.g.,
the accuracy of a 1MB-sized SNN with a total of 200 neurons (100 excitatory
and 100 inhibitory neurons) is lower than that of a 211MB-sized SNN with a total of 9800
neurons (4900 excitatory and 4900 inhibitory neurons) on MNIST [16].
¹Note: unlike deep neural networks (DNNs), the research on
unsupervised learning-based SNNs is still at an early stage and mostly uses small
datasets like MNIST and Fashion MNIST. We adopt the same test conditions
as widely used by the SNN research community [9]–[15].
arXiv:2007.08860v1 [cs.NE] 17 Jul 2020
In the following, we present a motivational case study
to illustrate the compute, memory, and communication
requirements for an SNN executing on different hardware
platforms, and highlight the associated research challenges.
B. Motivational Analysis and Associated Research Challenges
In Fig. 1(b), we observe that a large-sized SNN typically
achieves higher classification accuracy but consumes a
larger memory footprint. It shows that to achieve 92%
accuracy in MNIST digit classification, the SNN requires
a total of 9800 neurons (4900 excitatory and 4900 inhibitory
neurons) with 3 epochs of training, and consumes more than
200MB of memory. On the other hand, most SNN hardware
platforms have a limited on-chip memory (e.g., less
than 100MB) [17]–[20], which makes running a large-sized
network (whose size is larger than the on-chip memory)
energy-consuming. The reason is that this condition requires
a high number of memory accesses, whose energy is typically
higher than that of the compute operations [21]–[23]. Previous work
in [24] observed that the memory accesses are dominant,
consuming about 50%-75% of the energy of SNN processing on
different hardware platforms [17]–[19] (see Fig. 2).
[Figure 2: Stacked bar chart of the energy breakdown (0% to 100%) into memory access, communication, and computation for TrueNorth, PEASE, and SNNAP.]
Fig. 2. Energy breakdown of processing SNN in several hardware platforms
(adapted from the studies presented in [24]).
We also observed that inefficient computations, which
come from complex neuronal and STDP operations, hinder
SNNs from achieving higher energy-efficiency. These operations
require exponential calculations: the neuronal operations for
computing the membrane and threshold potential decay, and the
STDP operations for computing the synaptic trace and
weight dependence (see details in Section II).
Furthermore, there are ineffective STDP operations that come
from spurious weight updates, which occur when the synapses
of a neuron learn the overlapped features from different
classes, thereby degrading the recognition capability of the
neuron and also consuming energy. This happens because the
general STDP rule updates the synaptic weight at every pre- and
post-synaptic spike (see details in Section II).
Required: An optimization technique is required to
reduce SNNs’ memory and energy requirements for both
training and inference processing, while maintaining the
classification accuracy, thereby enabling their deployment
on memory/energy-constrained embedded systems. However,
developing such an optimization technique poses different
design challenges as discussed below.
Associated Research Challenges: The high memory
requirements mainly come from a large number of parameters,
such as synaptic weights and neuron parameters. Reducing
these parameters may degrade the classification accuracy.
Hence, the parameter reduction should be done by identifying
and eliminating the non-significant parameters. Furthermore,
bit-width quantization may also be employed, but it can also
lead to accuracy degradation. To overcome the limitations
of the above optimization methods, the targeted research
question is whether and how we can refine the STDP-based learning
technique such that the classification accuracy is improved at
minimal overhead.
C. Our Novel Contributions
To address the above challenges, we propose FSpiNN,
a novel optimization framework for memory- and
energy-efficient spiking neural networks for both training and
inference, that employs the following key techniques (see an
overview in Fig. 3) to overcome the above-discussed research
challenges.
1) Optimization of the neuronal and STDP operations by
reducing (i) the inhibitory neurons through direct lateral
inhibitory connections, (ii) the presynaptic spike-based
weight updates, and (iii) the STDP complexity through
elimination of the exponential calculation in the weight
dependence part.
2) An algorithm for improving the accuracy of
STDP-based learning by (i) minimizing the spurious
weight updates through timestep-based operations, (ii)
effectively potentiating the weight in each update through
an adaptive potentiation factor, and (iii) providing an
effective competition among neurons through an adaptive
inhibition.
3) SNN quantization to compress the bit-width of network
parameters: It employs a fixed-point format by rounding
to the nearest value, thereby providing a trade-off between
the classification accuracy and the memory requirements.
4) An algorithm to find the memory- and energy-aware
SNN model: It incorporates the memory and energy
requirements in the optimization process, and employs a
search algorithm to find the desired model.
[Figure 3: The FSpiNN framework (Section III) takes an input SNN and produces an output SNN. Its blocks are: optimizing the computational requirements of neuronal and STDP operations (Section III-A; general STDP to simplified STDP), improving the accuracy of STDP-based learning (Section III-B), SNN quantization (Section III-C), and an algorithm to find the memory- and energy-aware SNN model (Section III-D).]
Fig. 3. The overview of our novel contributions, which are highlighted in the
green boxes.
Key Results: We evaluated our FSpiNN framework using
a Python-based SNN simulator on an Nvidia GeForce GTX
1080 Ti GPU. The experimental results show that, compared
to the state-of-the-art [9], our FSpiNN achieves 7.5x memory
saving, and improves the energy-efficiency by 3.5x on average
for training and by 1.8x on average for inference, across
different datasets (MNIST and Fashion MNIST), with no
accuracy loss for a network with 4900 excitatory neurons.
II. PRELIMINARIES
A. Spiking Neural Networks
Spiking Neural Networks (SNNs) are considered as the
third generation of neural network computation models [25],
since they exhibit high biological plausibility. They mimic
the behavior of biological spiking networks, i.e., action
potentials or spikes are used to convey information. The SNN
computational model is composed of spike/neural coding,
network architecture, neuron model, synaptic model, and
learning rule [26].
Spike Coding: It converts the information into a sequence
of spikes (spike train). Various spike coding methods have
been studied in the literature, such as rate, temporal,
rank-order, and phase coding schemes [27]–[30]. Here, we
consider rate coding, since it has demonstrated high accuracy
when employed in unsupervised learning-based SNNs. Rate
coding converts the intensity of a pixel to a spike train.
Typically, a higher intensity pixel is converted into a higher
number of spikes than a lower intensity pixel.
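As an illustration of rate coding, the following minimal sketch draws a Poisson-like spike train per pixel, with the spike probability per time step proportional to the pixel intensity. The function name `rate_encode`, the maximum rate, the step size, and the number of steps are illustrative assumptions and are not taken from this work.

```python
import numpy as np

def rate_encode(image, n_steps=350, max_rate_hz=60.0, dt_ms=1.0, seed=0):
    """Rate coding sketch: brighter pixels spike with higher probability.

    max_rate_hz, dt_ms and n_steps are illustrative assumptions.
    Returns a boolean array of shape (n_steps, n_pixels).
    """
    rng = np.random.default_rng(seed)
    pixels = np.asarray(image, dtype=float).ravel() / 255.0   # normalise to [0, 1]
    p_spike = pixels * max_rate_hz * dt_ms / 1000.0           # per-step spike probability
    return rng.random((n_steps, pixels.size)) < p_spike

# Four pixels with increasing intensity; a long run makes the ordering visible.
image = np.array([[0, 64], [128, 255]], dtype=np.uint8)
spikes = rate_encode(image, n_steps=10000)
counts = spikes.sum(axis=0)
print(counts)
```

A pixel with zero intensity never spikes in this sketch, and the spike counts grow with the intensity, which is the property the rate coding description relies on.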
SNN Architecture: It consists of spiking neurons and
interconnecting synapses. Here, we consider the architecture
in Fig. 1(a), since it has demonstrated robustness when
performing different variants of STDP rules for unsupervised
learning [9]. It consists of input, excitatory, and inhibitory
layers. The input layer contains an input image, where every
pixel is connected to all excitatory neurons. In this manner,
each excitatory neuron has to recognize a class in the dataset,
and the connecting synapses from the same neuron have to
learn the features of the corresponding class. The excitatory
neurons are connected to inhibitory neurons in a one-to-one
connection. Each spike from an excitatory neuron triggers
the corresponding inhibitory neuron to generate a spike that
will be delivered to all excitatory neurons, except for the one
from which the inhibitory neuron receives a connection. This
inhibition provides competition among excitatory neurons.
Here, a winner-takes-all (WTA) mechanism is employed.
Neuron Model: It represents the neuron dynamics and
defines neuronal operations. Here, we consider the Leaky
Integrate-and-Fire (LIF) neuron model presented in [9] since
it has the lowest computational complexity compared to other
biologically plausible models [31], which is in line with our
objective. Note that LIF has also been widely adopted by the
neuromorphic hardware community. The model describes the
dynamics of the membrane potential (V) as stated in Eq. 1.
\[ \tau_v \frac{dV}{dt} = (E_{rest} - V) + g_e (E_{exc} - V) + g_i (E_{inh} - V) \tag{1} \]
Term τv is the membrane time constant. Eexc, Einh are
the equilibrium potentials of the excitatory and inhibitory
synapses, respectively. Erest is the resting membrane
potential. Terms ge and gi are the conductances of the excitatory
and inhibitory synapses, respectively. The membrane potential
V is increased at the occurrence of the incoming spike,
otherwise it decays exponentially. When the membrane
potential reaches the threshold potential (Vth), it generates a
spike and goes back to the reset potential (Vreset). Afterwards,
the neuron is in the refractory period in which it cannot
generate spike(s). Fig. 4 shows the illustration of the neuronal
dynamics of LIF neuron model.
At the system level, we consider adding an adaptive
thresholding mechanism to ensure that a neuron does not
dominate the spiking activity, and to enable different neurons
to recognize different input features, as has been demonstrated
in [9]. Therefore, the membrane threshold is not determined
by Vth only, rather by the sum of Vth + θ, where θ is
increased each time the neuron generates a spike, otherwise
the membrane threshold decays exponentially.
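The neuron dynamics described above can be sketched with a forward-Euler step of Eq. 1, combined with the refractory period and the adaptive threshold Vth + θ. This is only a sketch under stated assumptions: all parameter values, the names `PARAMS`, `lif_step`, and `count_spikes`, and the choice of the Euler method are illustrative and are not taken from this work.

```python
import numpy as np

# Illustrative parameter values (assumptions, not taken from the paper).
PARAMS = dict(E_rest=-65.0, E_exc=0.0, E_inh=-100.0, V_th=-52.0, V_reset=-65.0,
              tau_v=100.0, t_refrac=5.0, theta_plus=0.05, tau_theta=1e7)

def lif_step(v, theta, refrac_left, g_e, g_i, p=PARAMS, dt=0.5):
    """One forward-Euler step of Eq. 1 with the adaptive threshold V_th + theta."""
    active = refrac_left <= 0.0                       # neurons outside the refractory period
    dv = ((p["E_rest"] - v) + g_e * (p["E_exc"] - v) + g_i * (p["E_inh"] - v)) / p["tau_v"]
    v = np.where(active, v + dt * dv, v)
    spiked = active & (v >= p["V_th"] + theta)        # compare against V_th + theta
    v = np.where(spiked, p["V_reset"], v)             # reset after a spike
    refrac_left = np.where(spiked, p["t_refrac"], refrac_left - dt)
    # theta grows on a spike, otherwise it decays exponentially
    theta = np.where(spiked, theta + p["theta_plus"],
                     theta * np.exp(-dt / p["tau_theta"]))
    return v, theta, refrac_left, spiked

def count_spikes(g_e, n_steps=2000):
    """Drive a single neuron with a constant excitatory conductance."""
    v, theta, refrac = np.array([-65.0]), np.zeros(1), np.zeros(1)
    total = 0
    for _ in range(n_steps):
        v, theta, refrac, spiked = lif_step(v, theta, refrac, g_e, 0.0)
        total += int(spiked.sum())
    return total, float(theta[0])

print(count_spikes(0.0), count_spikes(0.5))
```

With no excitatory input the neuron stays at rest and never fires; with a constant excitatory conductance it fires repeatedly and θ accumulates, which raises the effective threshold.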
[Figure 4: Presynaptic spike train, membrane potential (between Vreset and Vth, with the refractory period marked), and postsynaptic spike train, each plotted over time.]
Fig. 4. Illustration of the neuronal dynamics of LIF model.
Synaptic Model and Learning Rule: A synapse is modeled
by the conductance change and synaptic weight (w), i.e., when
a presynaptic spike arrives at a synapse, the conductance
is increased by the synaptic weight w, otherwise it decays
exponentially. The synaptic weight w is defined by the
spike-timing-dependent plasticity (STDP) learning rule, which
will be discussed in Section II-B. The synaptic model is stated
as Eq. 2.
\[ \tau_{g_e} \frac{dg_e}{dt} = -g_e \quad \text{and} \quad \tau_{g_i} \frac{dg_i}{dt} = -g_i \tag{2} \]
The term τge denotes the time constant of an excitatory
postsynaptic potential, and τgi denotes the time constant of
an inhibitory postsynaptic potential.
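The synaptic model can be sketched as follows: the conductance decays according to the exact solution of Eq. 2 over one step, and it is increased by the weight w of every synapse that receives a presynaptic spike. The function name `conductance_step`, the weight layout (presynaptic by postsynaptic), and the numerical values are assumptions for illustration.

```python
import numpy as np

def conductance_step(g, weights, pre_spikes, tau_g, dt=0.5):
    """Decay the conductances (Eq. 2), then add w for each arriving spike.

    g: (n_post,) conductances; weights: (n_pre, n_post); pre_spikes: (n_pre,) bool.
    """
    g = g * np.exp(-dt / tau_g)                      # exact decay of tau * dg/dt = -g
    return g + pre_spikes.astype(float) @ weights    # sum of weights of spiking inputs

weights = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
g1 = conductance_step(np.zeros(2), weights, np.array([True, False, True]), tau_g=1.0)
g2 = conductance_step(g1, weights, np.array([False, False, False]), tau_g=1.0)
print(g1, g2)
```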
B. Spike-Timing-Dependent Plasticity (STDP)
The synaptic weight change is dependent on the timing
correlation between presynaptic and postsynaptic spikes,
known as spike-timing-dependent plasticity (STDP) rule [32].
Although there are several variants of STDP [9], we consider
the general STDP (i.e., pair-wise weight-dependent STDP) as
the baseline, since it has been extensively used by previous
works [9] [10] [14] [33]. It updates the synaptic weight at every
presynaptic and postsynaptic spike, based on its temporal
correlation with the most recent postsynaptic and presynaptic
spike, respectively. To improve the simulation speed, the
weight changes in STDP are computed using synaptic traces
[34]. Eq. 3 is the most common and general form of STDP
operation used in literature.
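\[ \Delta w = \begin{cases} -\eta_{pre}\, x_{post}\, w^{\mu} & \text{on presynaptic spike} \\ \eta_{post}\, x_{pre}\, (w_m - w)^{\mu} & \text{on postsynaptic spike} \end{cases} \tag{3} \]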
∆w is the synaptic weight change. ηpre and ηpost are
the learning rates for a presynaptic and postsynaptic spike,
respectively. xpre and xpost are the presynaptic and
postsynaptic traces/history, respectively. wm is the maximum
weight allowed, w is the previous weight value, and µ is
the weight dependence factor. Every time a presynaptic spike
occurs, xpre is set to 1, otherwise xpre decays exponentially.
Similar processing is done for the postsynaptic spike using
xpost (see Fig. 5).
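A minimal sketch of the general STDP rule in Eq. 3 with synaptic traces is given below. The learning rates, time constants, the value of µ, the function name `stdp_general`, and the order of operations (traces are updated before the weight change is computed) are assumptions made for illustration.

```python
import numpy as np

def stdp_general(w, x_pre, x_post, pre_spike, post_spike, eta_pre=1e-4,
                 eta_post=1e-2, w_max=1.0, mu=0.9, tau_pre=20.0, tau_post=20.0, dt=0.5):
    """One step of pair-wise weight-dependent STDP (Eq. 3).

    w: (n_pre, n_post); x_pre, pre_spike: (n_pre,); x_post, post_spike: (n_post,).
    """
    # Traces are set to 1 on a spike and decay exponentially otherwise.
    x_pre = np.where(pre_spike, 1.0, x_pre * np.exp(-dt / tau_pre))
    x_post = np.where(post_spike, 1.0, x_post * np.exp(-dt / tau_post))
    dw = np.zeros_like(w)
    dw -= eta_pre * np.outer(pre_spike, x_post) * w ** mu             # on presynaptic spike
    dw += eta_post * np.outer(x_pre, post_spike) * (w_max - w) ** mu  # on postsynaptic spike
    return np.clip(w + dw, 0.0, w_max), x_pre, x_post

w = np.full((2, 1), 0.5)
x_pre, x_post = np.zeros(2), np.zeros(1)
# Input 0 spikes, then the output neuron spikes, then input 1 spikes.
w, x_pre, x_post = stdp_general(w, x_pre, x_post, np.array([True, False]), np.array([False]))
w, x_pre, x_post = stdp_general(w, x_pre, x_post, np.array([False, False]), np.array([True]))
w, x_pre, x_post = stdp_general(w, x_pre, x_post, np.array([False, True]), np.array([False]))
print(w.ravel())
```

In this example the synapse whose input fired shortly before the postsynaptic spike is potentiated, while the synapse whose input fired after it is depressed.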
[Figure 5: A presynaptic neuron (neuron-i) connected through a synapse with weight wij to a postsynaptic neuron (neuron-j), and the presynaptic and postsynaptic spike times together with their traces xpre and xpost (between 0 and 1) over time.]
Fig. 5. (a) A single synaptic SNN connection. (b) Relation between spike
time and spike traces (xpre and xpost).
III. OUR FSPINN FRAMEWORK FOR EMBEDDED SNNS
We propose FSpiNN, an optimization framework for
obtaining memory- and energy-efficient SNNs for both the
training and inference while maintaining accuracy. FSpiNN
employs the following key steps. The detailed flow of different
steps is shown in Fig. 6.
[Figure 6: Flow of the FSpiNN framework. Inputs: dataset, network architecture, neuron model, synapse model and learning rule, spike coding, hyper-parameters, and the defined timestep. Step 1, optimize the neuronal and STDP operations (Section III-A): reduce the number of neuronal operations (replace the inhibitory layer with lateral inhibitory connections, giving the direct lateral inhibitory architecture), reduce the number of synaptic weight updates (eliminate the presynaptic spike-based weight updates), and reduce the complexity of STDP operations (fix the weight dependence, µ=1), giving the simplified STDP operations. Step 2, improve the accuracy of STDP-based learning (Section III-B): train the network and, for each input image, perform spike encoding, run the model, obtain the postsynaptic spike train, count the number of postsynaptic spikes, define the potentiation factor (k), define the timestep for a synaptic weight update, calculate the weight update (∆w), and apply the adaptive inhibition strength. Step 3: fixed-point SNN quantization (Section III-C). Step 4: DSE to find the memory- and energy-aware SNN model (Section III-D), which considers the memory requirement and the energy requirement of the optimized SNN model. Output: memory- and energy-efficient SNN.]
Fig. 6. Detailed flow of different steps of our FSpiNN framework. The novel
steps are highlighted in green boxes.
1) Optimize the processing of neuronal and STDP
operations through the following means (details in
Section III-A):
• Reduce the number of neuronal operations by replacing
the inhibitory layer with the direct lateral inhibitory
connections. It removes the inhibitory neurons and
substitutes the function of spikes from the inhibitory
neurons with spikes from the excitatory neurons.
• Reduce the number of STDP-based synaptic weight
updates by eliminating the presynaptic spike-based
weight updates. The updates happen only when the
postsynaptic spikes occur, which indicates that the
synapses learn the input features effectively.
• Reduce the STDP complexity by fixing the weight
dependence factor µ to 1, hence eliminating the complex
exponential calculation.
2) Improve the accuracy of STDP-based learning through
the following means (details in Section III-B):
• Timestep-based synaptic weight updates aim to minimize
the spurious weight updates that are induced by
postsynaptic spikes, thereby ensuring that each update
is essential.
• Adaptive STDP potentiation factor makes use of the
number of postsynaptic spikes to determine how strongly the
potentiation should be applied in each weight update.
It compensates for the loss of accuracy induced by the
STDP simplification.
• Adaptive inhibition strength aims to proportionally
provide competition among the excitatory neurons by
applying a proper inhibition strength to other neurons. It
is derived from an experimental analysis that investigates
the accuracy of different inhibition strength values.
3) Fixed-Point SNN Quantization (details in Section III-C)
to further compress the bit-width of SNN parameters. It
employs rounding to the nearest value, and
explores the trade-off between the accuracy and memory
requirements for different quantization levels.
4) A design space exploration algorithm to find the SNN
model that fulfills the memory and energy budgets
(details in Section III-D). It integrates a search algorithm
with the proposed optimization to obtain a model that offers
a good trade-off in memory, energy, and accuracy.
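The fixed-point quantization in step 3 can be illustrated with a short sketch that rounds each value to the nearest representable fixed-point level. The unsigned format, the split into integer and fractional bits, and the function name `quantize_fixed_point` are assumptions for illustration; the format actually used by the framework is discussed in Section III-C.

```python
import numpy as np

def quantize_fixed_point(x, int_bits=1, frac_bits=7):
    """Round to the nearest unsigned fixed-point value (hypothetical format).

    Note: np.round resolves exact ties by rounding to the even neighbour.
    """
    scale = 2 ** frac_bits
    max_val = 2 ** int_bits - 1.0 / scale            # largest representable value
    q = np.round(np.asarray(x, dtype=float) * scale) / scale
    return np.clip(q, 0.0, max_val)

coarse = quantize_fixed_point([0.3, 0.9, 5.0], int_bits=1, frac_bits=2)
weights = np.linspace(0.0, 1.0, 11)
fine = quantize_fixed_point(weights, int_bits=1, frac_bits=7)
print(coarse, np.abs(fine - weights).max())
```

Fewer fractional bits reduce the memory per parameter but increase the rounding error, which is the accuracy versus memory trade-off mentioned in step 3.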
A. Optimizing the Computational Requirements of Neuronal
and STDP Operations
Reducing the number of neuronal operations: Our
experiments in Fig. 7 illustrate that the number of postsynaptic
spikes generated from excitatory neurons is lower than the number of
presynaptic spikes. Therefore, the number of incoming spikes
required to trigger an inhibitory neuron to spike is less than
the excitatory ones, and the inhibitory neuron typically has
a smaller range of active membrane potential (between reset
potential Vreset and threshold potential Vth) compared to the
excitatory ones. This indicates that the inhibitory neurons
have different parameters from excitatory ones to be saved
in memory. Hence, a large number of neurons utilized in
the inhibitory layer will consume a considerable amount of
memory and energy. Moreover, each inhibitory neuron needs
to process only a small number of incoming spikes to generate
the inhibition spike. Therefore, the use of inhibitory neurons
could be optimized further to reduce the memory and energy
requirements.
[Figure 7: Spike raster plots over the simulation time (0 to 350 ms) for the input layer (neuron index 0 to 783; high number of presynaptic spikes) and the excitatory layer (neuron index 0 to 99; low number of postsynaptic spikes).]
Fig. 7. Illustration of the spike trains from the input and excitatory layers. It
shows a significant difference between the number of spikes from the input
layer (presynaptic spikes) and the excitatory layer (postsynaptic spikes).
Proposed Optimization: We propose to replace the inhibitory
layer with direct lateral inhibitory connections to reduce
the number of neurons in the network, thereby curtailing
the neuronal operations (as illustrated in Fig. 8(a)). In this
manner, half of the total number of neurons are removed,
and the function of spikes from the inhibitory neurons (to
provide competition among excitatory neurons through a
winner-takes-all mechanism) is substituted by the spikes from
the excitatory neurons. Our experimental results in Fig. 8(b)
show that the lateral inhibitory connections have the potential
to maintain accuracy, while using fewer resources than
the inhibitory layer. For instance, label 1 in Fig. 8(b) indicates
that the SNN with a lateral inhibition can achieve a high
accuracy faster than the SNN with an inhibitory layer, and then
they converge to a comparable accuracy after more samples
are presented in the training phase. The reason is that the lateral
inhibition directly conveys spikes from an excitatory neuron
to other neurons, hence the number of spikes is typically
higher than the ones from the inhibitory layer. Therefore, the
inhibition is stronger and it results in more diverse feature
learning across neurons, thereby achieving high accuracy
with a small number of training samples. This behavior is
beneficial, especially when the SNN-based systems have only
a small number of training samples.
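The direct lateral inhibition can be sketched as a connectivity matrix in which every excitatory neuron inhibits all other excitatory neurons but not itself, so that no separate inhibitory neurons are needed. The function names and the inhibition strength value below are hypothetical.

```python
import numpy as np

def lateral_inhibition_matrix(n_exc, strength):
    """Inhibitory weights from each excitatory neuron to all others (no self-inhibition)."""
    return strength * (1.0 - np.eye(n_exc))

def inhibitory_input(exc_spikes, w_inh):
    """Inhibition received by each excitatory neuron for the given spikes."""
    return exc_spikes.astype(float) @ w_inh

w_inh = lateral_inhibition_matrix(4, strength=2.0)
inhibition = inhibitory_input(np.array([False, True, False, False]), w_inh)
print(inhibition)

# Neuron count for the same number of excitatory neurons.
n_exc = 100
neurons_with_inh_layer, neurons_lateral = 2 * n_exc, n_exc
```

When neuron 1 spikes, every other neuron receives the inhibition and neuron 1 receives none, which matches the winner-takes-all behavior of the original inhibitory layer with half of the neurons.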
[Figure 8: (a) Excitatory and inhibitory layers (excitatory neurons and inhibitory neurons) versus lateral inhibitory connections (excitatory neurons only). (b) Accuracy [%] (0 to 100) versus number of samples over a training phase (0 to 60000) for network sizes 100, 400, 900, and 1600, each with lateral inhibition and with excitatory and inhibitory layers; a marker labeled 1 is referenced in the text.]
Fig. 8. (a) Replacing the inhibitory layer with the direct lateral inhibitory
connections (red dashed-lines), through which each excitatory neuron is
connected to other excitatory neurons. (b) Impact of employing the direct
lateral inhibitory connections on accuracy. This architecture offers comparable
accuracy across different sizes of networks (i.e., 100, 400, 900, and 1600
excitatory neurons) as compared to employing the inhibitory layer.
Reducing the number of STDP-based synaptic weight
updates: In the unsupervised SNNs, each neuron has to
recognize features that belong to a specific class, so that
each neuron can generate the highest number of spikes to
represent its recognition category. To achieve this, the general
STDP rule presented in Eq. 3 updates the synaptic weight
in every event of a pre- and post-synaptic spike. However,
previous work [11] observed that there are spurious weight
updates which may decrease the accuracy of learning. The
spurious updates are observed in two conditions: (i) when the
neurons spike unpredictably in the early phase of learning, due
to the random weight initialization, and (ii) when a neuron
generates spikes for patterns that belong to different classes,
but share common features, thereby causing the synapses to
learn the overlapped features from different classes. Therefore,
the STDP-based weight updates that are induced by these pre-
and post-synaptic spikes might not learn the input features
effectively, hence decreasing the recognition capability of
the neuron and consuming energy. We exploit this observation
in a new way to optimize the SNN computations, while
preserving the classification accuracy.
Proposed Optimization: We propose to eliminate the
presynaptic spike-based weight updates to reduce the
spurious weight updates that are induced by the presynaptic
spikes. Therefore, the learning will focus on the condition
when postsynaptic spikes happen, which indicates that the
connecting synapses effectively learn the input features. It also
reduces the computational energy as the number of presynaptic
spikes is higher than the postsynaptic ones, as shown in Fig. 7.
Reducing the STDP Complexity: The change in each
synaptic weight (∆w) is computed using an STDP operation
that requires complex exponential calculations for the synaptic
trace and weight dependence parts (see Eq. 3). We observed
that the value of the weight dependence factor (µ) is typically
less than 1 [11], which makes the weight dependence term expensive to compute.
Therefore, the use of weight dependence factor could be
optimized to achieve further energy-efficiency.
Proposed Optimization: We propose to fix the weight
dependence factor µ to 1, thereby simplifying the computation
of STDP operations. However, we observed that only fixing the
weight dependence factor value may degrade the classification
accuracy across different sizes of the network, as shown in
Fig. 9. Therefore, we propose a technique for improving
the STDP-based learning (discussed in Section III-B) to
compensate for the loss of this µ simplification, and to
maintain the accuracy.
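As a brief worked step (a sketch that combines the two simplifications of this subsection, using the symbols of Eq. 3): a fractional µ requires a power evaluation, which is commonly computed through an exponential and a logarithm, whereas dropping the presynaptic branch and setting µ = 1 leaves a rule with only multiplications and one subtraction.

```latex
\begin{align*}
\mu < 1:&\quad (w_m - w)^{\mu} = \exp\bigl(\mu \ln(w_m - w)\bigr), \qquad w < w_m \\
\mu = 1:&\quad \Delta w = \eta_{post}\, x_{pre}\, (w_m - w)^{1}
          = \eta_{post}\, x_{pre}\, (w_m - w) \quad \text{on postsynaptic spike}
\end{align*}
```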
[Figure 9: Accuracy [%] (0 to 100) versus number of excitatory neurons (100, 400, 900, 1600) for µ = 1, µ = 0.1, µ = 0.01, µ = 0.001, and µ = 0.0001. Annotation: different values of the weight dependence factor (µ) have different accuracy.]
Fig. 9. Impact of different values of weight dependence factor µ on accuracy,
across different sizes of networks.
B. Improving the Accuracy of STDP-based Learning
We observed that for each input image, at least a single
excitatory neuron is expected to recognize the input features
and generate the highest number of spikes to represent the
recognition of the corresponding class. Therefore, information
regarding the number of postsynaptic spikes should be
leveraged and used to improve the accuracy.
Proposed Solution: We propose an algorithm to improve
the accuracy of STDP-based learning by employing
timestep-based synaptic weight updates, and adaptively
determining the STDP potentiation factor (k) and the
inhibition strength. Timestep-based synaptic weight updates
aim to reduce the spurious weight updates that are induced by
the postsynaptic spikes. Therefore, our technique updates the
weight once within a timestep, as long as there is at least one
postsynaptic spike (see Fig. 10).
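The timestep-based update can be sketched by grouping the postsynaptic spike times into timestep windows: a window triggers at most one weight update, and only if it contains at least one postsynaptic spike. The function name `update_windows`, the spike times, and the timestep length are hypothetical, and the sketch assumes that the update is evaluated at the end of each window.

```python
import numpy as np

def update_windows(post_spike_times_ms, timestep_ms, t_total_ms):
    """Return (fired, accumulated) per timestep window.

    fired[i] is True if window i triggers one weight update;
    accumulated[i] is the number of postsynaptic spikes seen up to the end of window i.
    """
    n_windows = int(np.ceil(t_total_ms / timestep_ms))
    idx = (np.asarray(post_spike_times_ms, dtype=float) // timestep_ms).astype(int)
    per_window = np.bincount(idx, minlength=n_windows)
    return per_window > 0, np.cumsum(per_window)

fired, accumulated = update_windows([40.0, 130.0, 170.0], timestep_ms=100.0, t_total_ms=350.0)
n_updates = int(fired.sum())          # 2 updates instead of 3 spike-triggered updates
n_spikes_at_updates = accumulated[fired]
print(n_updates, n_spikes_at_updates)
```

In this example three postsynaptic spikes lead to two weight updates, with one and three accumulated postsynaptic spikes at the first and second update, respectively.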
[Figure 10: Presynaptic spikes, postsynaptic spikes, and presynaptic traces (xpre, between 0 and 1) over time across timestep-1 and timestep-2. Synaptic weight update-1 occurs with Nspikes = 1 and synaptic weight update-2 with Nspikes = 3, where Nspikes is the number of accumulated postsynaptic spikes.]
Fig. 10. Overview of the timestep, synaptic weight updates, and number of
accumulated postsynaptic spikes (Nspikes) in our proposed technique.
We also propose an adaptive STDP potentiation factor
k, which aims at determining how strong the potentiation
should be in each weight update, by leveraging the number
of postsynaptic spikes. To do this, our technique accumulates
the number of postsynaptic spikes observed from the first
time when the spike trains of an input image are presented
to the network, until the time when a weight update is
performed (denoted as Nspikes in Fig. 10). The number
of postsynaptic spikes is used to determine the potentiation
factor k, as formulated in Eq. 4. Term maxNspikes denotes
the maximum number of accumulated spikes, and Nspikes_th
denotes the threshold number of spikes, which normalizes
the value of maxNspikes. Afterwards, the potentiation factor
k is used to compute the synaptic weight change ∆w, as
formulated in Eq. 5. The synaptic weight update is conducted
for the excitatory neuron that generates the highest number of
postsynaptic spikes (i.e., the winning neuron). In this manner,
the confidence level of learning is expected to increase over
time when presenting the spike trains of an input image.
\[ k = \left\lceil \frac{maxN_{spikes}}{N_{spikes\_th}} \right\rceil \tag{4} \]
\[ \Delta w = k\, \eta_{post}\, x_{pre}\, (w_m - w) \quad \text{on update time} \tag{5} \]
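Eq. 4 and Eq. 5 can be evaluated directly, as in the following minimal sketch. The function names, the learning rate, the maximum weight, and the example values are illustrative assumptions.

```python
import math
import numpy as np

def potentiation_factor(max_n_spikes, n_spikes_th):
    """Eq. 4: k = ceil(maxNspikes / Nspikes_th)."""
    return math.ceil(max_n_spikes / n_spikes_th)

def weight_change(w, x_pre, k, eta_post=0.01, w_max=1.0):
    """Eq. 5: dw = k * eta_post * x_pre * (w_max - w), applied at the update time."""
    return k * eta_post * x_pre * (w_max - w)

k_small = potentiation_factor(1, 2)
k_large = potentiation_factor(3, 2)
dw = weight_change(np.array([0.5]), np.array([1.0]), k_large)
print(k_small, k_large, dw)
```

A larger number of accumulated postsynaptic spikes gives a larger k and therefore a stronger potentiation for the same trace and weight.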
Furthermore, balancing the strength of the excitatory and
inhibitory synaptic conductances is important, so that
the inhibition is neither too strong nor too weak. Overly strong
inhibition means that once the winning neuron is selected, it
strongly prevents the other excitatory neurons from firing, thereby
dominating the recognition of input features (ineffective
competition). Meanwhile, overly weak inhibition does not
provide enough competition among the excitatory
neurons, and thus has no influence on the overall learning
process (no competition). Previous work [9] observed that
the ratio between the excitatory and inhibitory strengths plays
an important role in balancing the learning process. Towards
this, we performed an experimental analysis of
the accuracy under different inhibition strengths and
different datasets, to check whether the conclusion on the
effective ratio generalizes. The results are presented in Fig. 11. Our
analysis shows that when the inhibitory strength is too weak
or too strong, the accuracy is sub-optimal. We observed
that comparable accuracy is obtained at the two
ratios of 2x and 4x. Therefore, we propose to use an adaptive
inhibition strength that provides proper competition among
the excitatory neurons, by applying an inhibition strength equal
to 2x-4x of the excitatory strength.
[Figure 11: accuracy [%] (0 to 100) versus the ratio of inhibitory strength to excitatory strength (0.0625x, 0.125x, 0.25x, 0.5x, 1x, 2x, 4x, 8x, 16x) for MNIST and Fashion MNIST, with regions marked "Too weak", "Just right", and "Too strong".]
Fig. 11. Impact of different ratio values between inhibitory and excitatory
strengths when running the MNIST and Fashion MNIST datasets. When the
ratio is too weak or too strong, the accuracy is sub-optimal.
Algorithm 1 synergistically employs the above-discussed
techniques. For each excitatory neuron, the algorithm monitors
whether a postsynaptic spike is generated. If so, the number of
the postsynaptic spikes is accumulated, and the corresponding
presynaptic traces are recorded. Otherwise, no action is
required. When the timestep is reached, the algorithm
identifies which excitatory neuron generates the highest
number of spikes (the winning neuron). Once a winning
neuron is identified, the connecting synapses to the winning
neuron are updated with the synaptic weight change ∆w.
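The procedure described above can be sketched in Python for a single input image, as a minimal illustration rather than the authors' implementation. The function name, the array layout, and the toy inputs are hypothetical. The sketch assumes that the spike counts are accumulated per image, that all excitatory neurons see the same presynaptic traces (fully connected input), and that an update is skipped when no postsynaptic spike has been observed yet.

```python
import math

import numpy as np


def stdp_updates_for_image(post_spikes, pre_traces, w, t_step=4,
                           n_spikes_th=10, eta_post=0.01, w_max=1.0):
    """Timestep-based STDP updates for one image (illustrative sketch).

    post_spikes: bool array (t_sim, n_exc), True where neuron i fired at t.
    pre_traces:  float array (t_sim, n_syn), presynaptic trace per input.
    w:           float array (n_exc, n_syn), updated in place.
    """
    t_sim, n_exc = post_spikes.shape
    # Assumption: the spike counts start from zero for every image.
    n_spikes = np.zeros(n_exc, dtype=int)
    for t in range(t_sim):
        n_spikes += post_spikes[t]
        # One update per timestep, only if some neuron has already spiked.
        if t % t_step == 0 and n_spikes.max() > 0:
            j = int(np.argmax(n_spikes))              # winning neuron
            k = math.ceil(n_spikes[j] / n_spikes_th)  # Eq. 4
            w[j] += k * eta_post * pre_traces[t] * (w_max - w[j])  # Eq. 5
    return w


# Toy input: 8 time indices, 2 excitatory neurons, 3 synapses per neuron.
post_demo = np.zeros((8, 2), dtype=bool)
post_demo[1:4, 1] = True                  # neuron 1 fires three times
traces_demo = np.full((8, 3), 0.5)
w_demo = stdp_updates_for_image(post_demo, traces_demo, np.zeros((2, 3)))
```

In this toy run, only the update at t = 4 takes effect: neuron 1 wins with three accumulated spikes, so k = 1 and its weights increase by 0.01 x 0.5 x 1 = 0.005, while the weights of neuron 0 stay unchanged.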
Algorithm 1 Pseudo-code for improving the accuracy of STDP-based learning
INPUT: (1) Number of training images (Dtrain);
  (2) Simulation time for an input image (tsim = 350);
  (3) Timestep (tstep = 4);
  (4) SNN parameters: number of excitatory neurons (nexc), number of synapses-per-neuron (nsyn), number of accumulated postsynaptic spikes (Nspikes);
  (5) STDP parameters: learning rate (ηpost = 0.01), max. weight value (wm = 1), previous weight value (w), number of threshold spikes (Nspikes_th = 10), potentiation factor (k);
  (6) Postsynaptic spike event (spikepost);
OUTPUT: Synaptic weight update (∆w);
BEGIN
Initialization:
  ∆w[nexc, nsyn] = zeros[nexc, nsyn];
  Nspikes[nexc] = zeros[nexc];
  xpre[nexc, nsyn] = zeros[nexc, nsyn];
Process:
  for (d = 0 to (Dtrain − 1)) do
    for (t = 0 to (tsim − 1)) do
      for (i = 0 to (nexc − 1)) do
        if spikepost of neuron i then
          Nspikes[i] += 1;
          monitor xpre[i, :];
        end if
      end for
      if ((t mod tstep) == 0) then
        maxNspikes = max(Nspikes);
        j ← index(max(Nspikes));
        k = ⌈maxNspikes / Nspikes_th⌉;
        ∆w[j, :] = k ηpost xpre[j, :] (wm − w);
      end if
    end for
  end for
  return ∆w;
END
C. Fixed-Point Quantization for SNNs
It is a common practice to perform SNN processing using
single-precision floating-point operations to achieve a high
classification accuracy. However, floating-point operations
typically consume high memory and energy. To achieve
memory- and energy-efficient SNN processing, it is more
convenient to use a fixed-point format for neuronal and STDP
operations. However, quantizing a value implies a reduction
of its representation capability, thereby decreasing the
accuracy of the networks. Therefore, the quantization process
should consider the trade-off between accuracy and memory
requirement, to find the acceptable quantization levels. In
this manner, the users can select the acceptable accuracy and
memory to comply with the design specifications.
Towards this, our FSpiNN framework performs an exploration
to investigate the impact of different quantization levels of
the SNN parameters (i.e., synaptic weights) on the accuracy, using
round-to-nearest with the round-half-up rule, which rounds
values that lie half-way between two representable numbers
upward. A fixed-point number can be written in the ⟨Qi.Qf⟩
format, where Qi and Qf are the integer and fractional parts,
respectively. The total number of bits (wordlength N) in the
fixed-point format consists of the number of bits for the integer
part Ni and for the fractional part Nf (i.e., N = Ni + Nf). The
precision of the fixed-point format ε is defined as ε = 2^(−Nf),
and it is used to define the quantized number xq.
$$ x_q = \left\lfloor x + \frac{\epsilon}{2} \right\rfloor \tag{6} $$
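The rounding in Eq. 6 can be illustrated with a short NumPy sketch. It is written under two assumptions that the text does not state explicitly: the floor is taken onto the grid of multiples of the precision ε, and the result is saturated to an unsigned range (the weights lie in [0, wm]). The function name and the example values are hypothetical.

```python
import numpy as np


def quantize_fixed_point(x, n_int, n_frac):
    """Round x to an unsigned <n_int.n_frac> fixed-point grid (round half up)."""
    eps = 2.0 ** -n_frac                    # precision, eps = 2^(-Nf)
    # Assumption: the floor in Eq. 6 acts on multiples of eps.
    xq = np.floor(np.asarray(x, dtype=float) / eps + 0.5) * eps
    # Assumption: saturate to the representable range [0, 2^Ni - eps].
    return np.clip(xq, 0.0, 2.0 ** n_int - eps)


weights = np.array([0.0, 0.1, 0.3125, 0.9999])
q = quantize_fixed_point(weights, n_int=1, n_frac=3)   # eps = 0.125
```

Here, 0.3125 lies exactly half-way between 0.25 and 0.375 and is rounded up to 0.375, which is the round-half-up behavior described above; the quantized values are 0, 0.125, 0.375, and 1.0.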
D. Design Space Exploration (DSE) Algorithm for the
Memory- and Energy-Aware SNN Model
To provide better applicability in many application
scenarios, the proposed optimizations need to fulfill the given
memory and energy requirements. Towards this, we also
propose a DSE algorithm to find an SNN model whose
memory and energy (for both training and inference)
are within the given memory and energy budgets, while
maintaining the accuracy. The main idea is to incrementally
increase the size of the SNN model (i.e., the number of excitatory
neurons) and evaluate whether the currently investigated model
satisfies the memory and energy budgets. If so, the DSE
evaluates whether the accuracy is better than that of the saved
model. If the accuracy is the same, the DSE keeps the smaller
model to keep the memory and energy consumption low. In this
manner, our FSpiNN framework can support many applications
where the memory and energy are constrained. The pseudo-code of the
algorithm is presented in Algorithm 2.
Algorithm 2 Pseudo-code for the DSE algorithm
INPUT: (1) Memory requirement (mem);
  (2) Energy requirement for training (Etrain);
  (3) Energy requirement for inference (Einf);
  (4) SNN model (model): number of excitatory neurons (model.nexc), model size (model.mem), energy of model for training (model.Etrain), energy of model for inference (model.Einf), accuracy of model (model.acc);
  (5) Number of additional excitatory neurons (nadd);
OUTPUT: SNN model (model);
BEGIN
Initialization:
  model.nexc = 0;
  model.mem = 0;
  accsaved = 0;
Process:
  while (model.mem ≤ mem) do
    if (model.nexc > 0) then
      perform training using Algorithm 1;
      monitor model.Etrain;
      if (model.Etrain ≤ Etrain) then
        perform inference;
        monitor model.Einf and model.acc;
        if (model.Einf ≤ Einf) and (model.acc > accsaved) then
          accsaved = model.acc;
          save model;
        end if
      end if
    end if
    model.nexc += nadd;
  end while
  return model;
END
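The search loop can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the `Candidate` record, the `evaluate` callback (standing in for training and inference of a model with a given number of excitatory neurons), and the toy cost model are all hypothetical, and the sketch assumes that the model size grows with the number of neurons so that the loop terminates.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    n_exc: int       # number of excitatory neurons
    mem: float       # model size
    e_train: float   # training energy
    e_inf: float     # inference energy
    acc: float       # accuracy


def explore(evaluate: Callable[[int], Candidate], mem_budget: float,
            e_train_budget: float, e_inf_budget: float,
            n_add: int) -> Optional[Candidate]:
    """Grow the model by n_add neurons while it fits the memory budget."""
    best, acc_saved = None, 0.0
    n_exc = n_add
    while True:
        cand = evaluate(n_exc)
        if cand.mem > mem_budget:
            break
        within_energy = (cand.e_train <= e_train_budget
                         and cand.e_inf <= e_inf_budget)
        # Strict comparison: on equal accuracy the smaller model is kept.
        if within_energy and cand.acc > acc_saved:
            acc_saved, best = cand.acc, cand
        n_exc += n_add
    return best


def toy_evaluate(n_exc: int) -> Candidate:
    # Hypothetical cost model for illustration only; not measured data.
    acc = {100: 0.80, 200: 0.85, 300: 0.85}.get(n_exc, 0.90)
    return Candidate(n_exc, mem=0.01 * n_exc, e_train=2.0 * n_exc,
                     e_inf=0.5 * n_exc, acc=acc)


best = explore(toy_evaluate, mem_budget=3.5, e_train_budget=1000.0,
               e_inf_budget=1000.0, n_add=100)
```

In the toy run, the models with 200 and 300 neurons reach the same accuracy, so the 200-neuron model is returned, and the 400-neuron model is rejected because it exceeds the memory budget.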
IV. EVALUATION METHODOLOGY
Fig. 12 illustrates the experimental setup and its
steps for evaluating our proposed framework. We used a
Python-based SNN simulator [35] to evaluate the accuracy
of the SNN. We ran the SNN simulations on three different
types of GPUs, namely Nvidia GeForce GTX 1060 [36],
GTX 1080 Ti [37], and RTX 2080 Ti [38] (the detailed
specifications are presented in Table I), providing a wide
range of compute and memory capabilities to show the
scalability of our proposed framework. We selected the GTX
1060 and the GTX 1080 Ti as representatives of the Pascal
architecture, which is also used in embedded GPUs
such as the Nvidia Jetson TX2 [39]. We also selected the
RTX 2080 Ti as a representative of the Turing architecture,
to provide variation in compute and memory capabilities
for the evaluation. The same GPU architecture means the same
technology and memory hierarchy. Therefore, the results
obtained from the experiments can be used to estimate
the relative energy-efficiency improvement obtained by our
FSpiNN framework, as compared to the state-of-the-art works.
[Figure 12: block diagram with the blocks dataset, spike coding, spike trains, neuron model, synapse model & STDP rule, SNN architecture, hyper-parameters, SNN simulator (running the network), and GPU; the outputs are classification accuracy, memory footprint, processing time, and processing power.]
Fig. 12. Experimental setup and tools flow.
From the simulations, we extracted the size of SNN model
which represents the memory footprint. This information is
used to evaluate the memory savings. To estimate the energy,
we adopted the approach of [40]. We recorded the start- and
end-time of simulation to obtain the processing time, and
utilized the nvidia-smi utility to report the processing power,
which are then used to estimate the energy consumption.
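The energy estimation can be sketched as follows, assuming (this is not detailed in the text) that the energy is the average reported power multiplied by the processing time, and that the energy-efficiency improvement is the ratio of baseline energy to design energy. The function names and numbers are hypothetical; power samples could, for instance, be collected by polling `nvidia-smi --query-gpu=power.draw --format=csv,noheader,nounits` during the simulation.

```python
def estimate_energy_joules(power_samples_w, start_s, end_s):
    """Energy [J] = mean sampled power [W] x processing time [s]."""
    if not power_samples_w:
        raise ValueError("at least one power sample is required")
    mean_power_w = sum(power_samples_w) / len(power_samples_w)
    return mean_power_w * (end_s - start_s)


def normalized_efficiency(e_baseline_j, e_design_j):
    """Assumed definition: improvement factor relative to the baseline."""
    return e_baseline_j / e_design_j


# Hypothetical readings: three power samples over a 20 s run.
e_design = estimate_energy_joules([100.0, 110.0, 120.0], 0.0, 20.0)
gain = normalized_efficiency(4400.0, e_design)
```

With these illustrative numbers, the design consumes 110 W x 20 s = 2200 J, which corresponds to a 2x improvement over a baseline that consumes 4400 J.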
TABLE I
GPU SPECIFICATIONS.
Specifications    GTX 1060     GTX 1080 Ti    RTX 2080 Ti
Architecture      Pascal       Pascal         Turing
CUDA cores        1280         3584           4352
Memory            6GB GDDR5    11GB GDDR5X    11GB GDDR6
Interface width   192-bit      352-bit        352-bit
Bandwidth         8Gbps        11Gbps         14Gbps
Power             120W         250W           250W
Datasets: We used the MNIST [16] and Fashion MNIST
[41] datasets, as they are widely used for evaluating the
accuracy of SNNs [2]. MNIST represents a simple dataset,
while Fashion MNIST represents a more complex dataset [15].
Each dataset has 60,000 images for training and 10,000 images
for testing, each with a dimension of 28x28 pixels.
Input Encoding: Every pixel of an image from the dataset
is converted into a Poisson-distributed spike train whose firing
rate is proportional to the intensity of the pixel. A higher
intensity pixel is converted into a higher number of spikes
than a lower intensity pixel. The spike train from each pixel
is presented to the network for 350 ms duration.
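The input encoding can be sketched with NumPy as follows. The sketch approximates a Poisson process by an independent spike draw in every time bin. The maximum firing rate (63.75 Hz), the 1 ms bin width, and the function name are assumptions for illustration, since the text only states that the rate is proportional to the pixel intensity and that the presentation lasts 350 ms.

```python
import numpy as np


def poisson_encode(image, duration_ms=350, max_rate_hz=63.75,
                   dt_ms=1.0, seed=0):
    """Return a bool array (steps, n_pixels): True where a spike occurs."""
    rng = np.random.default_rng(seed)
    pixels = np.asarray(image, dtype=float).ravel() / 255.0
    # Spike probability per time bin, proportional to pixel intensity.
    p_spike = pixels * max_rate_hz * dt_ms / 1000.0
    steps = int(round(duration_ms / dt_ms))
    return rng.random((steps, pixels.size)) < p_spike


# Two-pixel toy image: one black pixel and one pixel at full intensity.
spikes = poisson_encode(np.array([[0, 255]]))
counts = spikes.sum(axis=0)
```

The black pixel never spikes, while the bright pixel is expected to emit roughly 63.75 Hz x 0.35 s ≈ 22 spikes over the presentation time.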
Classification: In the training, the synaptic weight updates
are performed without label information as it is unsupervised
learning. Therefore, an additional mechanism is required
to categorize the excitatory neurons for classification. The
neurons are categorized based on their highest response to
different classes over one presentation of the training set (1x
epoch of training). Here, the labels are used to assign each
neuron with a specific class. Afterwards, the response of the
class-assigned neurons is used to measure the accuracy.
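The neuron labeling and the subsequent classification can be sketched as follows. The text does not specify how the responses are aggregated, so the sketch assumes that each neuron is assigned the class with its highest average spike count, and that an input is classified by the class whose assigned neurons spike the most on average. The function names and the toy data are hypothetical.

```python
import numpy as np


def assign_labels(spike_counts, labels, n_classes=10):
    """Assign each neuron the class for which its mean response is highest.

    spike_counts: array (n_samples, n_exc); labels: array (n_samples,).
    """
    spike_counts = np.asarray(spike_counts, dtype=float)
    labels = np.asarray(labels)
    mean_resp = np.zeros((n_classes, spike_counts.shape[1]))
    for c in range(n_classes):
        mask = labels == c
        if mask.any():
            mean_resp[c] = spike_counts[mask].mean(axis=0)
    return mean_resp.argmax(axis=0)


def predict(spike_counts, neuron_labels, n_classes=10):
    """Pick the class whose assigned neurons respond most on average."""
    spike_counts = np.asarray(spike_counts, dtype=float)
    scores = np.full((spike_counts.shape[0], n_classes), -np.inf)
    for c in range(n_classes):
        mask = neuron_labels == c
        if mask.any():
            scores[:, c] = spike_counts[:, mask].mean(axis=1)
    return scores.argmax(axis=1)


# Toy data: 4 samples, 2 excitatory neurons, 2 classes.
counts_demo = np.array([[5, 0], [4, 1], [0, 6], [1, 7]])
labels_demo = np.array([0, 0, 1, 1])
neuron_labels = assign_labels(counts_demo, labels_demo, n_classes=2)
accuracy = (predict(counts_demo, neuron_labels, 2) == labels_demo).mean()
```

In the toy data, neuron 0 responds mostly to class 0 and neuron 1 to class 1, so the neurons are labeled accordingly and all four samples are classified correctly.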
Comparisons: We compared our proposed framework with
two state-of-the-art designs, i.e., the general pair-wise weight
dependence STDP-based SNN (baseline) [9], and the enhanced
self-learning STDP-based SNN (SL-STDP) [11]. The evaluation
considers networks with different numbers of excitatory
neurons: 100, 400, 900, 1600, 2500, 3600, and 4900. For
conciseness, we refer to them as Net100, Net400, Net900,
Net1600, Net2500, Net3600, and Net4900, respectively. To
provide fair comparisons, we
recreated the baseline [9] and the SL-STDP [11], and then
simulated them using the same SNN simulator [35]. We also
used the same approach for obtaining the memory footprint
and the energy. That is, we extracted the size of the SNN model
from the simulation to evaluate the memory footprint, and we used
the nvidia-smi utility to report the power and recorded the
simulation time, which are then used to estimate the energy.
We also kept the hyper-parameter values the same for different
sizes of networks. In particular, we used 1x epoch of training,
so that the network is trained with the full training set
once. Moreover, an SNN model trained with 1x epoch of
training is widely adopted in the SNN community and
considered a completely trained network [10] [11] [13] [33].
V. RESULTS AND DISCUSSIONS
A. Maintaining the Classification Accuracy
Results for the MNIST Dataset: Fig. 13(a) shows the
accuracy after 1x epoch of training for MNIST. It shows
that our FSpiNN maintains (and even improves in certain
cases) the accuracy across different sizes of networks as
compared to other designs. The detailed accuracy
improvements achieved by FSpiNN are as follows:
• Label-1: In Net100, FSpiNN achieves 13.2% improvement with 89.2% accuracy.
• Label-2: In Net400, FSpiNN achieves 7.2% improvement with 95.6% accuracy.
• Label-3: In Net900, FSpiNN achieves 2.4% improvement with 94.4% accuracy.
• Label-4: In Net1600, FSpiNN achieves 2.2% improvement with 95.2% accuracy.
• Label-5: In Net2500, FSpiNN achieves 0.8% improvement with 90% accuracy.
• Label-6: In Net3600, FSpiNN achieves 4.8% improvement with 92.8% accuracy.
• Label-7: In Net4900, FSpiNN achieves 2.4% improvement with 92.4% accuracy.
These results indicate that a larger network is harder to
train. For instance, the accuracies achieved in Net100 and
Net900 are 89.2% and 94.4% respectively, but the accuracy
improvements in Net100 and Net900 are 13.2% and 2.4%
respectively. The reason is that a larger network has more
synapses to train for effectively learning the input features,
thereby requiring more careful training (e.g., hyper-parameter
tuning). This condition may cause the accuracy of the larger
networks to be lower than that of the smaller ones in certain
cases. For instance, the accuracy achieved in Net4900 is 92.4%,
which is lower than the accuracy in Net900 (i.e., 94.4%). Furthermore,
Fig. 13(b) shows the synaptic weights and their classification
matrix, and Fig. 13(c) shows the confusion matrix for Net400.
These results show the common confusions, such as
between digits 3 and 8, 4 and 9, etc. The reason is
that the synapses connected to the same neuron learn the
common features (shape) of these classes. Hence, the same
neuron generates the highest number of spikes for different
classes, thereby resulting in more frequent false classifications.
[Figure 13: (a) bar chart of accuracy [%] versus size of the network (100 to 4900) for the baseline, SL-STDP, and FSpiNN, with labels 1-7 marking the FSpiNN results; (b) synaptic weights learned by FSpiNN (color scale 0.0 to 1.0) and the classification matrix (classes 0-9 and "none"); (c) confusion matrix of FSpiNN, predicted labels versus target labels for digits 0-9 (color scale 0 to 1000).]
Fig. 13. (a) Comparisons of accuracy for MNIST dataset in different sizes
of networks: Net100, Net400, Net900, Net1600, Net2500, Net3600, and
Net4900. (b) Synaptic weights learned by FSpiNN and its classification
matrix. (c) Confusion matrix in inference phase for Net400.
Results for the Fashion MNIST Dataset: Fig. 14(a) shows
the accuracy after 1x epoch of training for Fashion MNIST. It
shows that our FSpiNN still maintains (and even improves in
certain cases) the accuracy across different sizes of networks as
compared to other designs. The detailed accuracy
improvements achieved by FSpiNN are as follows:
• Label-1: In Net100, FSpiNN achieves 14.2% improvement with 60.2% accuracy.
• Label-2: In Net400, FSpiNN achieves 5.2% improvement with 64.8% accuracy.
• Label-3: In Net900, FSpiNN achieves 3.6% improvement with 66% accuracy.
• Label-4: In Net1600, FSpiNN achieves 3.5% improvement with 68.8% accuracy.
• Label-5: In Net2500, FSpiNN achieves 3% improvement with 60.6% accuracy.
• Label-6: In Net3600, FSpiNN achieves 27% improvement with 64.4% accuracy.
• Label-7: In Net4900, FSpiNN achieves 11% improvement with 61.6% accuracy.
[Figure 14: (a) bar chart of accuracy [%] versus size of the network (100 to 4900) for the baseline, SL-STDP, and FSpiNN, with labels 1-7 marking the FSpiNN results; (b) synaptic weights learned by FSpiNN (color scale 0.0 to 1.0) and the classification matrix (classes 0-9 and "none"); (c) confusion matrix of FSpiNN, predicted labels versus target labels for the classes T-Shirt/Top, Trouser, Pullover, Dress, Coat, Sandals, Shirt, Sneaker, Bag, and Ankle boots (color scale 0 to 800).]
Fig. 14. (a) Comparisons of accuracy for Fashion MNIST dataset in different
sizes of networks: Net100, Net400, Net900, Net1600, Net2500, Net3600,
and Net4900. (b) Synaptic weights learned by FSpiNN and its classification
matrix. (c) Confusion matrix in inference phase for Net400.
Here, we observed the same trend as observed in MNIST.
A larger network has the potential to achieve higher accuracy
because more neurons are available for recognizing more
feature variations. This trend is shown in Fig. 14(a) for
Net100-Net1600 and Net3600-Net4900. At the same time,
a larger network is harder to train because more synapses
have to effectively learn input features. Therefore, a larger
network may achieve lower accuracy than the smaller ones
in cases where the synapses are not effectively trained.
This trend is shown in Fig. 14(a) for Net1600-Net3600.
The reason is that, in our experiments, we kept the same
hyper-parameter tuning across different sizes of networks,
and only performed 1x epoch of training. Therefore, the
accuracy of a larger network could still be improved through
more effective hyper-parameter tuning (e.g., more training
epochs), as suggested by Fig. 15. The results in Fig. 15
indicate that employing multi-epoch training can increase
the accuracy, since the same features in the training set
are learned multiple times by the network. The accuracy
improvement in the earlier epochs is typically higher than
in the later ones, so relying only on multi-epoch
training may incur high energy consumption without gaining
significant accuracy improvement in the end. To address
this, our FSpiNN employs the adaptive potentiation factor
and inhibition strength, which increase the confidence of
learning over time in the training. The results also show that
our FSpiNN achieves the highest accuracy across different
epochs as compared to state-of-the-art designs. Moreover,
FSpiNN with 1x training epoch achieves higher accuracy than
state-of-the-art designs with 3x training epochs. These results
show the effectiveness of the learning algorithm in FSpiNN.
[Figure 15: bar chart of accuracy [%] (0 to 100) for the baseline [9], SL-STDP [11], and FSpiNN after Epoch-1, Epoch-2, and Epoch-3.]
Fig. 15. Results of accuracy after 3x epochs of training for Net3600 when
running the Fashion MNIST.
Furthermore, Fig. 14(b) shows the synaptic weights
and their classification matrix, and Fig. 14(c) shows the
confusion matrix for Net400. These results show the common
confusions, such as among pullover, coat,
and shirt, as well as among sandals, sneaker, and ankle boots. The
reason is that the synapses connected to the same neuron
learn the common features (shape) of these classes. Hence,
the same neuron generates the highest number of spikes for
different classes, thereby resulting in more frequent false
classifications.
These experimental results also indicate that the input
images with more overlapping features are harder to classify.
Therefore, in general, the classification accuracy achieved in
MNIST is higher than Fashion MNIST, since MNIST has
relatively simpler features than Fashion MNIST. However,
our FSpiNN can still achieve better accuracy in Fashion
MNIST across different sizes of networks, outperforming
state-of-the-art designs. The maintained accuracy achieved by
our FSpiNN comes from the improved STDP-based learning,
which reduces the spurious weight updates, and employs the
effective STDP potentiation and inhibition strength.
B. Impact of Employing the Fixed-Point Quantization on the
Classification Accuracy
Our framework converts a floating-point (FP32) format to
a fixed-point format, and conducts exploration to study the
impact of different quantization levels on the accuracy.
Results for the MNIST Dataset: Label-1 in Fig. 16 shows
that FSpiNN achieves better accuracy than the baseline and
the SL-STDP when the bit-width of quantization is at least
8 bits. The reason is that the 8-bit (or wider) format in
FSpiNN provides sufficient levels of weight values to modulate
the input spikes from MNIST images, and induce each neuron
to recognize a specific digit class. In 8-bit precision, our
FSpiNN achieves 91.6% accuracy, while the baseline and the
SL-STDP achieve 87.6% and 82%, respectively. The accuracy
achieved by the FSpiNN 8-bit is slightly
lower than that of the FSpiNN FP32 (pointed by label-2), but
still higher than that of the baseline and the SL-STDP with FP32
precision (pointed by label-3 and label-4, respectively).
Therefore, the FSpiNN 8-bit incurs no accuracy loss relative to
the FP32 state-of-the-art designs, while using a reduced bit-width
for MNIST.
[Figure 16: accuracy [%] (0 to 100) versus fixed-point quantization (4-bit, 6-bit, 8-bit, 10-bit, 12-bit, 14-bit, 16-bit) for the baseline, SL-STDP, and FSpiNN; label 1 marks the selected precision (8-bit), and labels 2-4 mark the FP32 accuracies referenced in the text.]
Fig. 16. Accuracy vs. quantization for MNIST dataset for Net400.
Results for the Fashion MNIST Dataset: Label-1 in
Fig. 17 shows that FSpiNN achieves better accuracy than the
baseline and the SL-STDP when the bit-width of
quantization is at least 8 bits. The reason is that the 8-bit (or wider)
format in FSpiNN provides sufficient levels of weight
values to modulate the input spikes from Fashion MNIST
images, and induce each neuron to recognize a specific fashion
class. In 8-bit precision, our FSpiNN achieves 64.8%
accuracy, while the baseline and the SL-STDP achieve 59.2%
and 58%, respectively. The accuracy achieved
by the FSpiNN 8-bit is comparable to that of the FSpiNN FP32
(pointed by label-2), and it is higher than that of the baseline and
the SL-STDP with FP32 precision (pointed by label-3 and
label-4, respectively). Therefore, the FSpiNN 8-bit incurs no
accuracy loss with a reduced bit-width for Fashion MNIST.
[Figure 17: accuracy [%] (20 to 80) versus fixed-point quantization (4-bit, 6-bit, 8-bit, 10-bit, 12-bit, 14-bit, 16-bit) for the baseline, SL-STDP, and FSpiNN; label 1 marks the selected precision (8-bit), and labels 2-4 mark the FP32 accuracies referenced in the text.]
Fig. 17. Accuracy vs. quantization for Fashion MNIST dataset for Net400.
These experimental results also show that, for both MNIST
and Fashion MNIST datasets, the quantization levels with
less than 8-bit precision do not provide sufficient unique
information for distinguishing features of different classes
in the input images. This condition reduces the efficacy of
STDP learning of the synapses and recognition capability of
the neurons, thereby leading to low classification accuracy.
Furthermore, a reduced bit-width is beneficial since it leads
to a reduced memory requirement and energy consumption,
which will be discussed in Section V-C and Section V-D.
Note, the users can select the quantization level based on
the trade-off consideration in the design specifications (e.g.,
accuracy, memory, and power/energy budget).
C. Reducing the Memory Requirements
Fig. 18 shows the memory requirements of different designs
across different sizes of networks for both the training
and inference phases. Label-1 shows that Net3600 and
Net4900, when employing the baseline or the SL-STDP techniques,
consume more than 100MB, which makes them difficult to
deploy on embedded systems. On the other hand, our
FSpiNN without quantization (FP32) achieves 1.8x and 1.9x
memory savings as compared to the baseline, for Net3600
and Net4900, respectively. The reason is that the FSpiNN
FP32 removes the inhibitory neurons completely, so
their parameters do not need to be stored in memory. After
applying quantization, the memory requirement is reduced
even more. The FSpiNN 16-bit achieves about 3.6x and 3.7x
memory savings, while the FSpiNN 8-bit achieves about 7.3x
and 7.5x memory savings, when compared to the baseline
for Net3600 and Net4900, respectively. Fig. 18 also shows
that the FSpiNN 8-bit consumes about 0.16MB-28MB for
Net100-Net4900, which makes the networks easier to
deploy on embedded systems. Furthermore, if we consider
the accuracy that the quantized designs can achieve, we can
select the FSpiNN design that offers a good trade-off between
high accuracy and acceptable memory footprint.
[Figure 18: memory footprint [MB] (0 to 300) versus size of the network for the baseline, SL-STDP, FSpiNN (FP32), FSpiNN (16-bit), and FSpiNN (8-bit). Annotated memory savings of FSpiNN FP32 / 16-bit / 8-bit: Net100: 1.1x / 2.2x / 4.5x; Net400: 1.4x / 2.7x / 5.4x; Net900: 1.5x / 3.1x / 6.2x; Net1600: 1.7x / 3.4x / 6.7x; Net2500: 1.8x / 3.5x / 7.0x; Net3600: 1.8x / 3.6x / 7.3x; Net4900: 1.9x / 3.7x / 7.5x. Label 1 marks the designs that exceed 100MB.]
Fig. 18. Memory requirements for different sizes of networks (i.e., Net100,
Net400, Net900, Net1600, Net2500, Net3600, and Net4900) and different
quantization levels (i.e., FP32/without quantization, 16-bit, and 8-bit).
D. Energy-Efficiency Improvements
Fig. 19 and Fig. 20 illustrate the energy-efficiency across
different sizes of networks and different GPUs for MNIST
and Fashion MNIST datasets, respectively. These figures show
that the SL-STDP achieves higher energy-efficiency than the
baseline in training phase, and our FSpiNN achieves the
highest energy-efficiency among all designs in both training
and inference phases.
Training: The SL-STDP improves the energy-efficiency by
1.1x-1.2x compared to the baseline, across different sizes of
networks and GPUs, for both MNIST and Fashion MNIST.
The reason is that the SL-STDP only employs postsynaptic
spike-based weight updates, whose number of updates is smaller
than in the baseline, which employs both pre- and postsynaptic
spike-based weight updates. The FSpiNN FP32 improves the
energy-efficiency more than the SL-STDP, i.e., by 1.1x-2.8x
(MNIST) and by 1.1x-1.9x (Fashion MNIST) compared to
the baseline. The reason is that, apart from eliminating
the presynaptic spike-based weight updates, the FSpiNN FP32
also eliminates the inhibitory neurons and reduces the STDP
complexity. After applying quantization, the FSpiNN 16-bit
and FSpiNN 8-bit improve the energy-efficiency even more
than FSpiNN FP32. That is, the FSpiNN 16-bit achieves
1.7x-3.9x (MNIST) and 1.2x-2.4x (Fashion MNIST) improvements, while
FSpiNN 8-bit achieves 1.8x-4.3x (MNIST) and 1.5x-2.7x
(Fashion MNIST) improvements, compared to the baseline.
Inference: The SL-STDP has comparable energy-efficiency
to the baseline, across different sizes of networks
and GPUs, for both MNIST and Fashion MNIST. The reason is
that the SL-STDP and the baseline have similar computational
complexity in the inference phase. Meanwhile, the FSpiNN
FP32 improves the energy-efficiency by 1.3x-1.9x (MNIST)
and by 1.1x-1.4x (Fashion MNIST) compared to the baseline.
The improvements mainly come from the elimination of the
inhibitory neurons. After applying quantization, the FSpiNN
16-bit and FSpiNN 8-bit improve the energy-efficiency even
more than FSpiNN FP32. That is, the FSpiNN 16-bit achieves
1.4x-2.6x (MNIST) and 1.2x-2.1x (Fashion MNIST) improvements, while
FSpiNN 8-bit achieves 1.4x-2.9x (MNIST) and 1.3x-2.3x
(Fashion MNIST) improvements, compared to the baseline.
Furthermore, if we consider the classification accuracy and
memory footprint that the quantized designs can achieve, we
can select the FSpiNN design that offers a good trade-off
among accuracy, memory, and energy-efficiency. For instance,
the FSpiNN 8-bit improves the energy-efficiency by
4.3x (MNIST) and by 2.7x (Fashion MNIST) in training (see
label-1 in Fig. 19 and Fig. 20), and by 2x (MNIST) and by
1.6x (Fashion MNIST) in inference (see label-2 in Fig. 19
and Fig. 20), compared to the baseline in Net4900, while
obtaining 7.5x memory saving with an accuracy of ∼92%
for MNIST and ∼61% for Fashion MNIST. The experimental
results in Fig. 19 and Fig. 20 also suggest that our FSpiNN
framework is scalable for different sizes of networks and can
be used for other systems where different types of GPUs are
deployed, such as embedded systems with embedded GPUs.
Note that this work is not about justifying SNNs over
deep neural networks (DNNs). Rather, we consider which
optimizations are necessary if SNNs are to make
it into real-world systems, following the increasing trend
of neuromorphic computing, due to their benefits in
energy-efficient spike-based computations and unsupervised
learning. Moreover, there is a substantial difference in the
underlying learning mechanism between the SNNs (with
unsupervised learning) and the DNNs (with supervised
learning), thus we cannot directly compare the accuracy of
the unsupervised SNNs with that of the supervised DNNs. Previous
work [42] has observed that the accuracy of the DNNs (with
the supervised back-propagation algorithm) is generally higher
than that of the SNNs (with the unsupervised STDP algorithm),
because the unsupervised STDP algorithm does not have labels
when updating the weights, hence it is less effective than the
supervised ones. Furthermore, in the SNN community, many
different optimization aspects are explored, and they have
the potential to be incorporated into our FSpiNN framework.
For instance, the works in [43] and [44] focus on generating
precise spike sequences that resemble real-world observations. They
target a different optimization purpose compared to the one
targeted by our FSpiNN framework. However, they can still be
incorporated into the FSpiNN optimization flow for generating
precise spike sequences. This illustrates the flexibility of our
FSpiNN for integration with other optimization techniques.
[Figure 19: energy-efficiency (normalized to baseline) versus size of the network on MNIST for the baseline, SL-STDP, FSpiNN (FP32), FSpiNN (16-bit), and FSpiNN (8-bit), with a training row and an inference row of panels (a.1)-(c.2) for (a) Nvidia GeForce GTX 1060, (b) Nvidia GeForce GTX 1080 Ti, and (c) Nvidia GeForce RTX 2080 Ti; labels 1 and 2 mark the results referenced in the text.]
Fig. 19. Results of energy-efficiency (normalized to the baseline) for training and inference on MNIST, while considering different sizes of networks (i.e.,
Net100, Net400, Net900, Net1600, Net2500, Net3600, and Net4900), different quantization levels (i.e., FP32, 16-bit, and 8-bit), and different types of GPUs:
Nvidia GeForce (a) GTX 1060, (b) GTX 1080 Ti, and (c) RTX 2080 Ti. Due to the limited memory, the GTX 1060 can only run Net100-Net3600.
[Figure 20: energy-efficiency (normalized to baseline) versus size of the network on Fashion MNIST for the baseline, SL-STDP, FSpiNN (FP32), FSpiNN (16-bit), and FSpiNN (8-bit), with a training row and an inference row of panels (a.1)-(c.2) for (a) Nvidia GeForce GTX 1060, (b) Nvidia GeForce GTX 1080 Ti, and (c) Nvidia GeForce RTX 2080 Ti; labels 1 and 2 mark the results referenced in the text.]
Fig. 20. Results of energy-efficiency (normalized to the baseline) for training and inference on Fashion MNIST, while considering different sizes of networks
(i.e., Net100, Net400, Net900, Net1600, Net2500, Net3600, and Net4900), different quantization levels (i.e., FP32, 16-bit, and 8-bit), and different types of
GPUs: Nvidia GeForce (a) GTX 1060, (b) GTX 1080 Ti, and (c) RTX 2080 Ti. Due to the limited memory, the GTX 1060 can only run Net100-Net3600.
VI. CONCLUSION
In this paper, we proposed a novel FSpiNN framework
that synergistically employs different techniques to reduce
the memory footprint and to improve the energy-efficiency
of SNNs, while maintaining their accuracy. Experimental
results illustrate the benefits and efficiency of the proposed
framework, compared to the state-of-the-art designs, across
different sizes of networks and different datasets (MNIST
and Fashion MNIST). For instance, in a network with
4900 excitatory neurons, FSpiNN achieves 7.5x memory
saving and improves the energy-efficiency by 3.5x on average
for training and by 1.8x on average for inference, with no
accuracy loss. In short, our proposed framework enables efficient
embedded SNN implementations for the next-generation smart
embedded systems.
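To give a feel for how the weight bit-width alone affects the memory footprint, the following minimal sketch estimates the storage of the input-to-excitatory weights at FP32, 16-bit, and 8-bit precision. It is an illustration under stated assumptions, not the memory model used in this work: the fully connected layout, the 784 inputs (one per pixel of a 28x28 image), and the helper names `weight_memory_bytes` and `saving_factor` are all hypothetical.

```python
# Rough weight-memory estimate for a fully connected input-to-excitatory layer.
# Assumptions (not taken from the paper): 784 inputs (28x28 image), one weight
# per input-neuron pair, and no storage overhead beyond the weights themselves.

N_INPUTS = 28 * 28  # pixels in one MNIST / Fashion MNIST image


def weight_memory_bytes(n_excitatory: int, bits_per_weight: int) -> int:
    """Bytes needed to store all input-to-excitatory weights."""
    total_bits = N_INPUTS * n_excitatory * bits_per_weight
    return (total_bits + 7) // 8  # round up to whole bytes


def saving_factor(n_excitatory: int, bits_from: int, bits_to: int) -> float:
    """Ratio of weight memory before and after changing the bit-width."""
    before = weight_memory_bytes(n_excitatory, bits_from)
    after = weight_memory_bytes(n_excitatory, bits_to)
    return before / after


if __name__ == "__main__":
    for bits in (32, 16, 8):
        mib = weight_memory_bytes(4900, bits) / 2**20
        print(f"Net4900, {bits}-bit weights: {mib:.2f} MiB")
    print(f"FP32 -> 8-bit saving: {saving_factor(4900, 32, 8):.1f}x")
```

In this simplified model, moving from 32-bit to 8-bit weights yields a 4x reduction, so the weight bit-width by itself does not reproduce the 7.5x figure reported above.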
ACKNOWLEDGMENT
The authors acknowledge the scholarship granted by the
Indonesia Endowment Fund for Education (IEFE/LPDP),
Ministry of Finance, Republic of Indonesia.
REFERENCES
[1] M. Pfeiffer and T. Pfeil, “Deep learning with spiking neurons:
Opportunities and challenges,” Frontiers in Neuroscience, vol. 12, p.
774, 2018.
[2] A. Tavanaei et al., “Deep learning in spiking neural networks,” Neural
Networks, vol. 111, pp. 47–63, 2019.
[3] M. Shafique et al., “Robust machine learning systems: Challenges,
current trends, perspectives, and the road ahead,” IEEE Design and Test,
vol. 37, no. 2, pp. 30–57, 2020.
[4] A. Marchisio et al., “Is spiking secure? a comparative study on the
security vulnerabilities of spiking and deep neural networks,” arXiv
preprint:1902.01147, 2019.
[5] V. Venceslai et al., “Neuroattack: Undermining spiking neural
networks security through externally triggered bit-flips,” arXiv
preprint:2005.08041, 2020.
[6] M. Davies et al., “Loihi: A neuromorphic manycore processor with
on-chip learning,” IEEE Micro, vol. 38, no. 1, pp. 82–99, January 2018.
[7] R. Massa et al., “An efficient spiking neural network for recognizing
gestures with a DVS camera on the Loihi neuromorphic processor,” arXiv
preprint:2006.09985, 2020.
[8] N. Rathi et al., “STDP-based pruning of connections and weight
quantization in spiking neural networks for energy-efficient recognition,”
IEEE Trans. on Computer-Aided Design of Integrated Circuits and
Systems (TCAD), vol. 38, no. 4, pp. 668–677, April 2019.
[9] P. Diehl and M. Cook, “Unsupervised learning of digit recognition
using spike-timing-dependent plasticity,” Frontiers in Computational
Neuroscience, vol. 9, p. 99, 2015.
[10] H. Hazan et al., “Unsupervised learning with self-organizing spiking
neural networks,” in Proc. of the Int. Joint Conf. on Neural Networks
(IJCNN), July 2018, pp. 1–6.
[11] G. Srinivasan et al., “Spike timing dependent plasticity based enhanced
self-learning for efficient pattern recognition in spiking neural networks,”
in Proc. of the Int. Joint Conf. on Neural Networks (IJCNN), May 2017,
pp. 1847–1854.
[12] P. Panda et al., “ASP: Learning to forget with adaptive synaptic plasticity
in spiking neural networks,” IEEE Journal on Emerging and Selected
Topics in Circuits and Systems (JETCAS), vol. 8, no. 1, pp. 51–64, 2018.
[13] D. J. Saunders et al., “Locally connected spiking neural networks for
unsupervised feature learning,” Neural Networks, vol. 119, 2019.
[14] H. Hazan et al., “Lattice map spiking neural networks (LM-SNNs) for
clustering and classifying image data,” Annals of Mathematics and
Artificial Intelligence, September 2019.
[15] X. She et al., “Fast and low-precision learning in GPU-accelerated spiking
neural network,” in Proc. of the Design, Automation & Test in Europe Conf. &
Exhibition (DATE), 2019, pp. 450–455.
[16] Y. LeCun et al., “Gradient-based learning applied to document
recognition,” Proc. of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[17] F. Akopyan et al., “TrueNorth: Design and tool flow of a 65 mW
1 million neuron programmable neurosynaptic chip,” IEEE Trans. on
Computer-Aided Design of Integrated Circuits and Systems (TCAD),
vol. 34, no. 10, pp. 1537–1557, October 2015.
[18] S. Sen et al., “Approximate computing for spiking neural networks,” in
Proc. of the Design, Automation & Test in Europe Conf. & Exhibition (DATE),
March 2017, pp. 193–198.
[19] A. Roy et al., “A programmable event-driven architecture for evaluating
spiking neural networks,” in Proc. of the IEEE/ACM Int. Symp. on Low
Power Electronics and Design (ISLPED), July 2017, pp. 1–6.
[20] C. Frenkel et al., “A 0.086-mm² 12.7-pJ/SOP 64k-synapse 256-neuron
online-learning digital spiking neuromorphic processor in 28-nm CMOS,”
IEEE Trans. on Biomedical Circuits and Systems (TBCAS), vol. 13,
no. 1, pp. 145–158, Feb 2019.
[21] M. Horowitz, “1.1 computing’s energy problem (and what we can do
about it),” in Proc. of the IEEE Int. Solid-State Circuits Conf. Digest of
Technical Papers (ISSCC), Feb. 2014, pp. 10–14.
[22] M. Capra et al., “An updated survey of efficient hardware architectures
for accelerating deep convolutional neural networks,” Future Internet,
vol. 12, no. 7, p. 113, 2020.
[23] R. V. W. Putra et al., “DRMap: A generic DRAM data mapping policy
for energy-efficient processing of convolutional neural networks,” arXiv
preprint:2004.10341, 2020.
[24] S. Krithivasan et al., “Dynamic spike bundling for energy-efficient
spiking neural networks,” in Proc. of the IEEE/ACM Int. Symp. on Low
Power Electronics and Design (ISLPED), July 2019, pp. 1–6.
[25] W. Maass, “Networks of spiking neurons: The third generation of neural
network models,” Neural Networks, vol. 10, no. 9, pp. 1659–1671, 1997.
[26] M. Mozafari et al., “SpykeTorch: Efficient simulation of convolutional
spiking neural networks with at most one spike per neuron,” Frontiers
in Neuroscience, vol. 13, p. 625, 2019.
[27] J. Gautrais and S. Thorpe, “Rate coding versus temporal order coding:
a theoretical approach,” Biosystems, vol. 48, no. 1, pp. 57–65, 1998.
[28] S. Thorpe and J. Gautrais, “Rank order coding,” in Computational
neuroscience. Springer, 1998, pp. 113–118.
[29] C. Kayser et al., “Spike-phase coding boosts and stabilizes information
carried by spatial and temporal spike patterns,” Neuron, vol. 61, no. 4,
pp. 597–608, 2009.
[30] S. Park et al., “Fast and efficient information transmission with burst
spikes in deep spiking neural networks,” in Proc. of the 56th Annual
Design Automation Conf. (DAC), 2019, p. 53.
[31] E. M. Izhikevich, “Which model to use for cortical spiking neurons?”
IEEE Trans. on Neural Networks (TNN), vol. 15, no. 5, pp. 1063–1070,
September 2004.
[32] G. Bi and M. Poo, “Synaptic modifications in cultured hippocampal
neurons: Dependence on spike timing, synaptic strength, and
postsynaptic cell type,” J. of Neuroscience, vol. 18, no. 24, 1998.
[33] D. J. Saunders et al., “Stdp learning of image patches with convolutional
spiking neural networks,” in Proc. of the Int. Joint Conf. on Neural
Networks (IJCNN), July 2018, pp. 1–7.
[34] A. Morrison et al., “Spike-timing-dependent plasticity in balanced
random networks,” Neural Computation, vol. 19, no. 6, 2007.
[35] H. Hazan et al., “BindsNET: A machine learning-oriented spiking neural
networks library in Python,” Frontiers in Neuroinformatics, vol. 12,
p. 89, 2018.
[36] Nvidia. Nvidia GeForce GTX 1060. [Online]. Available: https://www.nvidia.com/en-in/geforce/products/10series/geforce-gtx-1060
[37] ——. Nvidia GeForce GTX 1080 Ti. [Online]. Available: https://www.nvidia.com/en-sg/geforce/products/10series/geforce-gtx-1080-ti
[38] ——. Nvidia GeForce RTX 2080 Ti. [Online]. Available: https://www.nvidia.com/de-at/geforce/graphics-cards/rtx-2080-ti
[39] ——. Nvidia Jetson TX2. [Online]. Available: https://developer.nvidia.com/embedded/jetson-tx2
[40] S. Han et al., “Deep compression: Compressing deep neural networks
with pruning, trained quantization and Huffman coding,” arXiv preprint
arXiv:1510.00149, 2015.
[41] H. Xiao et al., “Fashion-MNIST: a novel image dataset for benchmarking
machine learning algorithms,” CoRR, vol. abs/1708.07747, 2017.
[42] Z. Du et al., “Neuromorphic accelerators: A comparison between
neuroscience and machine-learning approaches,” in Proc. of the 48th
Annual IEEE/ACM Int. Symp. on Microarchitecture (MICRO), 2015, pp.
494–507.
[43] A. Mohemmed et al., “Optimization of spiking neural networks with
dynamic synapses for spike sequence generation using PSO,” in Proc. of
the 2011 Int. Joint Conf. on Neural Networks (IJCNN), 2011.
[44] A. Russell et al., “Optimization methods for spiking neurons and
networks,” IEEE Trans. on Neural Networks (TNN), vol. 21, no. 12,
pp. 1950–1962, 2010.
Rachmad Vidya Wicaksana Putra is a research
assistant and Ph.D. student at Computer Architecture
and Robust Energy-Efficient Technologies
(CARE-Tech), Institute of Computer Engineering,
Technische Universität Wien (TU Wien), Austria. He
received the B.Sc. in Electrical Engineering in 2012
and the M.Sc. in Electronics in 2015 with distinction,
both from Bandung Institute of Technology (ITB),
Indonesia. He was a teaching assistant at the Electrical
Engineering Department, School of Electrical
Engineering and Informatics ITB from 2012 to 2017 and
also a research assistant at the Microelectronics Center ITB from 2014 to 2017. He is
a recipient of the Indonesia Endowment Fund for Education (IEFE/LPDP)
Scholarship from the Ministry of Finance, Republic of Indonesia. His research
interests mainly include computer architecture, VLSI design, system-on-chip,
brain-inspired and neuromorphic computing, energy-efficient computing, and
electronic design automation.
Muhammad Shafique (M’11 - SM’16) received
the Ph.D. degree in computer science from the
Karlsruhe Institute of Technology (KIT), Germany,
in 2011. Afterwards, he established and led a highly
recognized research group at KIT for several years
as well as conducted impactful R&D activities in
Pakistan. In Oct. 2016, he joined the Institute of
Computer Engineering at the Faculty of Informatics,
Technische Universität Wien (TU Wien), Vienna,
Austria as a Full Professor of Computer Architecture
and Robust, Energy-Efficient Technologies. Since
Sep. 2020, he has been with the Division of Engineering, New York University Abu
Dhabi (NYU AD), United Arab Emirates.
His research interests are in brain-inspired computing, AI & machine
learning hardware and system-level design, energy-efficient systems, robust
computing, hardware security, emerging technologies, FPGAs, MPSoCs, and
embedded systems. His research has a special focus on cross-layer analysis,
modeling, design, and optimization of computing and memory systems. The
researched technologies and tools are deployed in application use cases from
Internet-of-Things (IoT), smart Cyber-Physical Systems (CPS), and ICT for
Development (ICT4D) domains.
Dr. Shafique has given several Keynotes, Invited Talks, and Tutorials, as
well as organized many special sessions at premier venues. He has served as
the PC Chair, Track Chair, and PC member for several prestigious IEEE/ACM
conferences. Dr. Shafique holds one U.S. patent and has (co-)authored 6 Books,
10+ Book Chapters, and over 200 papers in premier journals and conferences.
He received the 2015 ACM/SIGDA Outstanding New Faculty Award, AI 2000
Chip Technology Most Influential Scholar Award in 2020, six gold medals,
and several best paper awards and nominations at prestigious conferences.
