Towards Accurate and High-Speed Spiking Neuromorphic Systems with Data
  Quantization-Aware Deep Networks by Liu, Fuqiang & Liu, C.
Towards Accurate and High-Speed Spiking Neuromorphic
Systems with Data Quantization-Aware Deep Networks
Fuqiang Liu and Chenchen Liu
fqliu92@gmail.com, chliu@clarkson.edu
ABSTRACT
Deep Neural Networks (DNNs) have gained immense success in cognitive
applications and greatly pushed today’s artificial intelligence forward.
The biggest challenge in executing DNNs is their extremely data-intensive
computations. Computing efficiency in speed and energy is constrained
when traditional computing platforms are employed for such
computation-hungry executions. Spiking neuromorphic computing (SNC) has
been widely investigated for deep network implementation owing to its
high efficiency in computation and communication. However, the weights
and signals of DNNs must be quantized when deploying the DNNs on the SNC,
which results in unacceptable accuracy loss. Previous works mainly focus
on weight discretization, while inter-layer signals are mainly neglected.
In this work, we propose to represent DNNs with fixed integer inter-layer
signals and fixed-point weights while maintaining good accuracy. We
implement the proposed DNNs on a memristor-based SNC system as a
deployment example. With 4-bit data representation, our results show that
the accuracy loss can be controlled within 0.02% (2.3%) on MNIST
(CIFAR-10). Compared with the 8-bit dynamic fixed-point DNNs, our system
can achieve more than 9.8× speedup, 89.1% energy saving, and 30% area
saving.
1. INTRODUCTION
Deep Neural Networks (DNNs) have achieved great success in cognitive
applications such as image classification [1, 2, 3], object detection
[4, 5], and natural language processing [6]. However, the computations
are extremely data-intensive and expensive in terms of speed and energy.
The computing power of current von Neumann machines, with their limited
data bandwidth and energy efficiency, is insufficient to support these
computations. This issue becomes more severe with the rapid growth of the
depth of deep network models [7]. Consequently, novel non-von Neumann
computing architectures and other hardware-software co-designs based on
CPU, GPU and FPGA have been extensively investigated to improve the
computational efficiency [8, 9, 10].
DAC ’18, June 24–29, 2018, San Francisco, CA, USA
© 2018 ACM. ISBN 978-1-4503-5700-5/18/06
DOI: https://doi.org/10.1145/3195970.3196131
Among these innovative works, brain-like neuromorphic computing appears
as a promising solution: deep networks are implemented by VLSI designs,
and high computing efficiency in speed and energy is obtained inherently
by performing data processing and communication on a single chip [11].
Neuromorphic designs with digital or analog computations have been
reported not only in traditional CMOS technology but also in post-silicon
devices such as spin devices and memristors [12, 11, 13, 14]. Thanks to
its event-driven computation and digitized data communication, spiking
neuromorphic computing (SNC) has been shown to be ultra-low-cost in
design and energy and is highly attractive for deploying and executing
deep networks.
The current system-level spiking designs mainly employ an off-line
training methodology, and the well-trained deep networks are deployed on
the hardware system. Nonetheless, one big challenge exists when
performing this straightforward deployment: obvious system accuracy loss
induced by the constrained precision of synapses (or synaptic weights)
and neurons (or inter-layer signals). For example, IBM’s TrueNorth chip
has five synaptic states (i.e. 0, ±1, ±2), and acceptable precision can
only be achieved by assembling multiple synaptic layers at the expense of
design and energy cost [15]. Similarly, the synaptic weights in
memristor-based designs are usually represented by three- or four-bit
data. Although memristor devices can afford continuous conductance states
or 6-bit (64 levels) precision as reported by HP Labs [16], the heavy
programming cost in speed and circuit design is not acceptable. In these
SNC designs, the neuron signals are rate coded, and the signal strength
is represented in discrete values by the number of spikes in a time
window. To get sufficient accuracy, the computation speed must be
compromised, e.g. an 8-bit precision corresponds to 256 spikes and
requires a large time window for spike generation.
Figure 1: (a) Computation speed in different precision of neurons, (b)
Accuracy loss caused by low-precision neurons and weights, respectively
(evaluated on LeNet for MNIST dataset).
arXiv:1805.03054v1 [cs.CV] 8 May 2018
While solutions have been explored in previous works, the issue is still
not completely solved. In [17], Wang et al. proposed a one-level
precision synapse and applied it to the memristor-based neuromorphic
design, but the constrained precision of the neuron signals was not
considered. However, as shown in Figure 1, the computing speed of the
spiking system is mainly constrained by data communication (i.e. the time
required by spike generation to guarantee good accuracy). Moreover,
compared to weight quantization, discretizing the neurons results in
larger accuracy loss. In addition, a realistic neuromorphic design of the
proposed one-level synapse is challenging due to the varying distribution
of synaptic states in different layers. Recently, researchers tried
deploying binary synapses and neurons on the TrueNorth chip, targeting
high speed and low energy while retaining accuracy [9]. The training rule
is similar to BinaryNet [18] and usually leads to obvious accuracy loss
[19].
In this work, we focus on tackling the unacceptable accuracy loss caused
by low-precision spike neurons and synapses during deep network
deployment. The 32-bit floating-point deep networks are transformed into
data quantization-aware networks with fixed integer neurons and
fixed-point synapses. The proposed networks can be applied to emerging
SNC platforms in general. The memristor-based platform [8, 20, 12] is
selected as our deployment example in this work. Our target is to retain
the high accuracy of deep networks while building a dedicated hardware
framework with high computing efficiency. Our major contributions are
summarized as follows:
• We transform the inter-layer signals of deep networks to M-bit fixed
integers during neural network training to mimic the discrete spike
neurons in the SNC. These integer data are constrained to the same
range in all layers and are hence friendly to hardware implementation;
• We propose a weight clustering methodology to represent the synapses
with N-bit fixed-point data in a linear distribution. The best
affordable states are obtained to improve system accuracy at low
design cost;
• We deploy the proposed quantization-aware deep networks on the
memristor-based SNC for performance evaluation. The system accuracy is
measured on benchmark datasets, namely MNIST and CIFAR-10. The speed,
area, and energy are evaluated and compared with the previous 8-bit
fixed-point design.
Our experimental results show that, when utilizing 4-bit integer neurons
and fixed-point synapses, the accuracy loss relative to the ideal 32-bit
floating-point DNNs can be controlled within 0.02% and 2.3% on MNIST and
CIFAR10, respectively. Compared with 8-bit fixed-point precision, our
system can achieve more than 9.8× speedup, 89.1% energy saving, and 30%
area saving.
2. PRELIMINARY
2.1 Related Works in DNNs Quantization
Normally, state-of-the-art deep networks are represented by 32-bit
floating-point values. Quantized DNNs have been explored in previous
works to reduce the computation burden and hardware complexity while
retaining comparable accuracy. Some earlier works focus on training DNNs
with quantized weights without considering inter-layer signals [21, 22].
For example, Lin et al. trained the deep network efficiently with binary
weights and quantized back propagation [21].
Figure 2: Deploying a convolutional layer of DNN on crossbars
In recent works, implementations of DNNs with fixed-point synaptic
weights and inter-layer signals have been proposed. Gysel et al. [23]
compressed DNNs into 8-bit dynamic fixed-point values. Fine-tuning was
employed to recover the accuracy loss incurred by the weight
quantization; however, the loss caused by the inter-layer signal
quantization cannot be recovered. Adopting Gysel’s quantization process,
Tann et al. [24] proposed DNNs with inter-layer signals in 8-bit dynamic
fixed-point precision and weights in integer power-of-two values. Lin et
al. [21] proposed to tune DNNs with fixed-point weights and inter-layer
signals. These works achieve the target of improving computation
efficiency in speed, hardware cost, and energy. Unfortunately, they are
not suited to spiking neuromorphic systems for two reasons. First,
using 8-bit data for inter-layer signals in spiking systems is extremely
expensive in speed and hardware complexity. Second, the range of the
dynamic fixed-point values varies greatly between layers, which leads to
large design complexity.
Different from the above works, we implement quantized deep networks
that are particularly suited to spiking neuromorphic systems: the
proposed networks have fixed integer neurons and fixed-point synapses,
and different layers share a uniform value range.
2.2 DNNs Deployment on SNC
The memristor-based SNC platform is chosen to deploy DNNs in this work.
A memristor is a two-terminal device in a MIM (metal-insulator-metal)
structure that stores information by resistance states [25, 26, 27]. Its
high-density crossbar structure and multiple states enable a natural
implementation of vector-matrix computation in a neural network, and it
is thus extremely attractive for emerging neuromorphic approaches
[8, 20, 12].
In the SNC implementation, weights in a neural network layer correspond
to the memristor devices in a crossbar array. Outputs of each layer are
transformed to spikes and fed into the next layer as inputs. Fully
connected layers in a DNN can be mapped on a crossbar directly [12],
while it is more complicated to implement a convolutional layer.
Figure 2 depicts how to deploy a convolutional layer on the
memristor-based SNC. Filters in a convolutional layer are deployed to
the crossbar column by column: $K^i_j$, which represents the jth
convolution filter in the ith layer, is mapped to the jth column of the
crossbar, i.e. $BL_j$; the convolutional results of the filter $K^i_j$
are obtained at the end of $BL_j$. Obviously, the ith layer with $J^i$
filters requires a crossbar with $J^i$ columns. Consider the scenario in
which each 4-D filter in the ith layer has a size of $s^i \times s^i$
($s^i$ is the scale of the filter) with a depth of $d^i$; the number of
required rows will be $s^i \times s^i \times d^i$ [1]. Here, $d^i$
equals $J^{i-1}$, which is the filter number of the (i−1)th layer.
Constrained by the realistic size of the memristor-based crossbar [28],
multiple crossbars are utilized in parallel to compose a large layer.
The number of crossbars is calculated by Equation 1.
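$$L^i = \left\lceil \frac{J^i}{t} \right\rceil \cdot \left\lceil \frac{s^i \times s^i \times J^{i-1}}{t} \right\rceil, \qquad (1)$$
where $\lceil x \rceil = k$ with $k \geq x$, $x > (k - 1)$, and
$k \in \mathbb{Z}$, and $t$ is the row or column size of a square
crossbar.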
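Equation 1 can be sketched in a few lines of Python. This is a minimal illustration only: the function name, its parameter names, and the layer dimensions in the example are hypothetical and are not taken from the networks evaluated in this work.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def crossbar_count(num_filters, filter_size, prev_filters, crossbar_dim):
    """Number of t-by-t crossbars needed for one convolutional layer (Equation 1).

    num_filters  -- J^i, filters in the current layer (crossbar columns)
    filter_size  -- s^i, spatial size of each filter
    prev_filters -- J^(i-1), filters in the previous layer (filter depth)
    crossbar_dim -- t, row or column size of a square crossbar
    """
    column_blocks = ceil_div(num_filters, crossbar_dim)
    row_blocks = ceil_div(filter_size * filter_size * prev_filters, crossbar_dim)
    return column_blocks * row_blocks


# Hypothetical layer: 64 filters of size 3x3 over 32 input channels,
# mapped onto 32x32 crossbars.
print(crossbar_count(64, 3, 32, 32))  # 2 column blocks * 9 row blocks = 18
```

Integer ceiling division is used instead of floating-point division so that the result stays exact for large layers.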
3. DESIGN METHODOLOGY
In this work, we aim to construct quantized DNNs with high accuracy, and
hence obtain an SNC with optimal accuracy, computation efficiency, and
design cost. Two major approaches are proposed: Neuron Convergence for
fixed integer inter-layer signals and Weight Clustering for fixed-point
weights. All network layers share a uniform value range to minimize
hardware design complexity.
3.1 Neuron Convergence
In this work, inter-layer signals are constrained to fixed integers in a
dedicated range that is decided by the target bit width. The ranges are
the same in all layers to achieve uniform values across the network and
alleviate design complexity. The fixed integers are adopted to mimic the
discrete output spikes, and the dedicated bit width is designed to
decrease the required number of spikes for high speed and low design
cost. Notably, quantizing inter-layer signals causes significant
accuracy loss. We propose a novel neuron regularization in neural
network training to recover the loss.
Figure 3: The forms of different regularization. Here, the bit width is
set to be 2.
Figure 4: Inter-layer signal distribution with applied regularization in
four scenarios: (a) none, (b) l1-norm, (c) truncated l1-norm, and (d) the
proposed regularization. The outputs of the 1st hidden layer of LeNet on
MNIST are shown as an example, and M is set to be 4.
As indicated in Figure 3, during neural network training, l1-norm
regularization and truncated l1-norm regularization are usually utilized
for weight sparsity and range restriction, respectively. In contrast, we
propose a regularization term that trains neural networks whose
inter-layer signals are not only sparse but also range-fixed. In
particular, the range is uniform across all layers. The loss function in
neural network training is formulated as Equation 2.
$$E(W) = E_D(W) + \lambda \cdot R(W) + \sum_{i=1}^{L} \left( \lambda_i \cdot R_g(O^i) \right). \qquad (2)$$
Here, $W$ represents the weights in the DNN; $E_D(\cdot)$ is the loss
term; $R(\cdot)$ is the normal regularization on weights. $R_g(\cdot)$
is the proposed regularization on each inter-layer, and its calculation
in the ith layer can be represented as
$$R_g(O^i) = \sum_{r=1}^{R_i} \sum_{c=1}^{C_i} \sum_{d=1}^{D_i} r_g(o^i_{r,c,d}),$$
where $r$, $c$, and $d$ represent row, column, and depth, respectively.
The regularization of each inter-layer signal, $r_g(\cdot)$, is
calculated by Equation 3 and is described by the blue curve in Figure 3.
$$r_g(o) = \begin{cases} |o| - 2^{M-1} + \alpha \cdot |o|, & |o| \geq 2^{M-1} \\ \alpha \cdot |o|, & \text{otherwise} \end{cases} \qquad (3)$$
Here, $\alpha$ is set to be 0.1 empirically and $M$ is the target
quantized bit width.
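The per-signal penalty of Equation 3 and its sum over one layer can be written as a short Python sketch. The function names and the nested-list layout of the layer output (indexed as row, column, depth) are assumptions made for illustration; a real training setup would express the same terms with the tensor operations of its framework.

```python
def rg(o, M, alpha=0.1):
    """Per-signal regularization of Equation 3 for a target bit width M."""
    bound = 2 ** (M - 1)
    magnitude = abs(o)
    if magnitude >= bound:
        # Outside the target range: extra unit-slope penalty on the excess.
        return magnitude - bound + alpha * magnitude
    # Inside the target range: small l1-like penalty that encourages sparsity.
    return alpha * magnitude


def Rg(layer_output, M, alpha=0.1):
    """Sum of rg over all signals of one layer, given as [row][column][depth]."""
    return sum(rg(o, M, alpha)
               for row in layer_output
               for column in row
               for o in column)


# With M = 2 the range bound is 2: a signal of 1.0 is inside, 3.0 is outside.
print(rg(1.0, M=2))
print(rg(3.0, M=2))
```

The two branches agree at $|o| = 2^{M-1}$, so the penalty is continuous; only its slope changes from $\alpha$ to $1 + \alpha$ once a signal leaves the target range.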
As demonstrated by Figure 4, our proposed regularization successfully
constrains the inter-layer signals to the objective range with sparse
values. To obtain the target bit width, signals in the constrained range
are then quantized to integer values. Note that sparse signals in a
uniform, fixed range inherently reduce the quantization loss by
minimizing the error transmitted from one layer to the next layer.
More specifically, Equation 4 illustrates the error between
$o^i_{r,c,d}$ and its quantized value $\hat{o}^i_{r,c,d}$.
$$\Delta o^i_{r,c,d} = \sum_{d=1}^{d^{i-1}} \sum_{\theta=-\lfloor s^i/2 \rfloor}^{\lfloor s^i/2 \rfloor} \sum_{\theta=-\lfloor s^i/2 \rfloor}^{\lfloor s^i/2 \rfloor} \Delta o^{i-1}_{r+\theta,c+\theta,d} \cdot w^i_{j,\theta,\theta,d}, \qquad (4)$$
where $w^i_{j,\theta,\theta,d}$ represents the weight of the jth
convolution filter in the ith layer, and $s^i$ is the size of the
filter. After the proposed training, the absolute value of the weights
in the DNN should be extremely small, as all the inter-layer signals are
distributed in the same range. Following Equation 4, the error
$\Delta o^i_{r,c,d}$ is small and leaves
$\lceil o^i_{x,y,z} \rceil$ mainly unchanged. As a result, the error
propagation is prevented and the quantization error is minimized.
Following the proposed DNNs training, the inter-layer sig-
nals are quantized to M-bit integer values with sparse prop-
erty while retaining good accuracy.
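The final quantization step can be illustrated with a small Python sketch. The rounding mode and the exact integer range are not specified in the text, so both are assumptions here: the sketch rounds half up and clips to the signed M-bit range $[-2^{M-1}, 2^{M-1} - 1]$. The function name is hypothetical.

```python
import math


def quantize_signal(o, M):
    """Quantize one inter-layer signal to an M-bit integer.

    Assumptions (not stated in the text): round-half-up rounding and a
    signed M-bit range of [-2^(M-1), 2^(M-1) - 1].
    """
    low = -(2 ** (M - 1))
    high = 2 ** (M - 1) - 1
    rounded = math.floor(o + 0.5)  # round half up
    return max(low, min(high, rounded))


# Signals already inside the trained range only lose their fractional part;
# out-of-range signals are clipped.
print([quantize_signal(o, 4) for o in (0.0, 2.4, 2.5, 9.7, -20.0)])
# [0, 2, 3, 7, -8]
```

Because the proposed training keeps most signals inside the target range, clipping is expected to be rare, and the remaining error per signal is bounded by the rounding step.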
3.2 Weight Clustering
In implementing the memristor-based SNC, floating-point synaptic weights
in a DNN are quantized to the available resistance states of the
devices, which results in an accuracy drop. We further propose a weight
clustering method to achieve fixed-point synaptic values in a linear
distribution, which is friendly to hardware implementation and also
reduces the accuracy loss.
Based on the inter-layer signals obtained in Sec. 3.1, the accuracy loss
generated by weight quantization can be represented by Equation 5.
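$$\Delta o^i_{r,c,d} = \sum_{d=1}^{d^{i-1}} \sum_{\theta=-\lfloor s^i/2 \rfloor}^{\lfloor s^i/2 \rfloor} \sum_{\theta=-\lfloor s^i/2 \rfloor}^{\lfloor s^i/2 \rfloor} o^{i-1}_{r+\theta,c+\theta,d} \cdot \Delta w^i_{j,\theta,\theta,d}. \qquad (5)$$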
Because of the sparse inter-layer signals indicated in Figure 4, the
majority of $o^{i-1}_{r+\theta,c+\theta,d}$ are zero or close to zero.
Similar to the explanation in Sec. 3.1, the accuracy loss in the weight
quantization is extremely small. To further lower the loss, we train a
cluster to minimize the error between the original weights and the
quantized weights, as depicted in Equation 6.
$$D^* = \arg\min_{D} \left\{ \left\| \frac{D}{2^N} - W \right\|^2 \right\} \qquad (6)$$
where the elements in $D$ belong to
$\{0, \pm 1, \pm 2, \ldots, \pm 2^{N-2}, 2^{N-1}\}$,
$N \geq \log_2\left(\frac{\max(|D|)}{\max(|W|)}\right)$, $W$ represents
the weight matrix of a DNN, $\frac{D}{2^N}$ is the quantized fixed-point
matrix, and $N$ is the target bit width of the weights.
Equation 6 is designed to find a matrix $\frac{D}{2^N}$ whose elements
are N-bit fixed-point values in a linear distribution and that is
nearest to the ideal floating-point matrix in the DNN. Here, we
transform the weight quantization into an optimization problem that can
be solved by the k-nearest neighbors algorithm.
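Because the objective in Equation 6 is a sum of squared element-wise differences, each weight can be assigned independently to its nearest allowed level. The Python sketch below shows that nearest-level assignment. It is not the authors' clustering code: the function name is hypothetical, and the level set is passed in explicitly; the set used in the example is an illustrative assumption rather than the exact set of Equation 6.

```python
def cluster_weights(weights, levels, N):
    """Map each weight to the nearest value d / 2^N with d taken from `levels`.

    weights -- flat list of floating-point weights
    levels  -- allowed integer values for the elements of D
    N       -- target bit width, so the fixed-point step is 1 / 2^N
    """
    scale = 2 ** N
    quantized = []
    for w in weights:
        # The squared error is separable per element, so the nearest level
        # for each weight minimizes the total error.
        nearest = min(levels, key=lambda d: abs(d / scale - w))
        quantized.append(nearest / scale)
    return quantized


# Hypothetical level set for illustration only.
levels = list(range(-4, 5))
print(cluster_weights([0.3, -0.45, 0.9], levels, N=3))  # [0.25, -0.5, 0.5]
```

Weights beyond the largest level, such as 0.9 in the example, saturate at that level, which is why the bit width and level set need to cover the weight range of the network.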
4. EXPERIMENTS
4.1 Experimental Setup
Three different DNNs (Lenet, Alexnet, and Resnet) are developed on
Torch. The neural network model details and their ideal accuracy (Ideal
Acc.) on MNIST and CIFAR10 without quantization are listed in Table 1.
The networks quantized by our proposed method are implemented on the
memristor-based SNC, and the hardware design methodology follows [12].
The resistance range of the memristor device is set to [50 kΩ, 1 MΩ]
[12]. The crossbar size is set to 32 × 32, and the required number of
crossbars for each network layer is calculated by Equation 1 in
Sec. 2.2.
Table 1: Neural Network Models and Ideal Accuracy on MNIST and CIFAR10
Model        Lenet        Alexnet               Resnet
Dataset      MNIST        CIFAR10               CIFAR10
Input Size   28 × 28 × 1  32 × 32 × 3           32 × 32 × 3
Conv Layers  2 (5 × 5)    1 (5 × 5), 4 (3 × 3)  17 (3 × 3)
FC Layers    2            3                     1
Weights      7 × 10^3     3.4 × 10^5            1.2 × 10^7
Ideal Acc.   98.16%       85.35%                93.05%
4.2 Neuron Convergence on Inter-layer Signals Quantization
The capability of Neuron Convergence in recovering quantization accuracy
loss is evaluated, and the results are listed in Table 2. In this
experiment, the weights are ideal floating-point values without
quantization. Inter-layer signals of Lenet, Alexnet, and Resnet are
quantized to 5-bit, 4-bit, and 3-bit integer values, both with the
proposed training and with traditional training without Neuron
Convergence. As an example, the accuracy values of the two scenarios are
denoted by “Lenet (w/)” and “Lenet (w/o)” in Table 2. The accuracy
recovered from traditional quantization by utilizing our proposed method
is shown as “Recovered Acc.”. The accuracy of our proposed design is
also compared with the ideal accuracy, and the accuracy loss is reported
in Table 2 as “Acc. Drop”.
Table 2: The Accuracy Measurement after Neuron Quantization with and
without Neuron Convergence
Model           5-bit    4-bit    3-bit
Lenet (w/o)     97.74%   97%      92.9%
Lenet (w/)      98.16%   98.15%   98.13%
Recovered Acc.  0.42%    1.15%    5.24%
Acc. Drop       -0%      -0.01%   -0.03%
Alexnet (w/o)   82.51%   77.8%    67.83%
Alexnet (w/)    85.2%    83.15%   82.1%
Recovered Acc.  2.69%    4.95%    14.27%
Acc. Drop       -0.15%   -2.2%    -3.25%
Resnet (w/o)    91.37%   75.72%   26.57%
Resnet (w/)     92.5%    91.33%   88.95%
Recovered Acc.  1.13%    15.61%   62.38%
Acc. Drop       -0.55%   -1.72%   -4.1%
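The two derived rows of Table 2 follow directly from the measured accuracies. A minimal Python sketch, with a hypothetical function name, reproduces them for the 4-bit Resnet entries of Table 1 and Table 2:

```python
def accuracy_summary(ideal, without_method, with_method):
    """Return (recovered accuracy, accuracy drop) in percentage points.

    recovered = accuracy with the method - accuracy without it
    drop      = ideal accuracy - accuracy with the method
    """
    recovered = round(with_method - without_method, 2)
    drop = round(ideal - with_method, 2)
    return recovered, drop


# Resnet with 4-bit inter-layer signals: ideal 93.05%, w/o 75.72%, w/ 91.33%.
print(accuracy_summary(93.05, 75.72, 91.33))  # (15.61, 1.72)
```

The table lists the drop with a negative sign; the sketch returns its magnitude.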
The results indicate that quantizing inter-layer signals directly,
without the proposed Neuron Convergence, induces heavy accuracy loss,
which is unacceptable. For example, the accuracy of Alexnet and Resnet
with 3-bit inter-layer signals on CIFAR10 drops to 67.83% and 26.57%
from the ideal accuracy of 85.35% and 93.05%, respectively. By utilizing
our proposed method, the accuracy can be recovered to 82.1% and 88.95%.
The Lenet network on MNIST is robust: the 4-bit and 3-bit networks have
only 0.01% and 0.03% accuracy loss with our proposed training and
discretization. Our method can quantize Alexnet and Resnet to 4-bit
signals with 83.15% and 91.33% accuracy on CIFAR10. Compared with the
ideal accuracy, the accuracy loss caused by the proposed 4-bit precision
is only 2.2% and 1.72%. The accuracy drop of the three networks with
5-bit signals is fully recovered (0% on MNIST) or extremely small (0.15%
and 0.55% on CIFAR10) after using our proposed method.
The above results demonstrate that our proposed Neu-
ron Convergence can recover the accuracy loss during signal
quantization successfully. Neural networks with fixed inte-
ger signal and good accuracy are obtained. The best accu-
racy with 4-bit inter-layer signals on CIFAR10 can achieve
91.33%, and the accuracy is 98.15% on MNIST.
4.3 Weight Clustering on Weights Quantization
We also evaluate the performance of the proposed Weight Clustering in
recovering the accuracy loss caused by weight quantization. Table 3
shows the experimental results with and without the proposed method.
Similarly, networks with 5-bit, 4-bit, and 3-bit fixed-point weights are
evaluated, and the inter-layer signals are kept as ideal floating-point
values without quantization. The results indicate that our proposed
4-bit Lenet, Alexnet, and Resnet can achieve 98.1%, 83.59%, and 91% on
MNIST and CIFAR10 with only 0.06%, 1.76%, and 2.05% accuracy drop
compared to the ideal accuracy.
4.4 Neuron Convergence and Weights Clustering on Data Quantization
In this experiment, the proposed Neuron Convergence and Weights
Clustering are applied together to the three neural networks for overall
performance evaluation. Through the proposed method, the inter-layer
signals and the weights are quantized to fixed integer values and
fixed-point values, respectively, at 5-bit, 4-bit, and 3-bit precision.
The accuracy of the networks with and without the proposed method is
compared in Table 4. Similar to Sec. 4.2 and Sec. 4.3, “Recovered Acc.”
indicates the accuracy recovery ability of our proposed method and
“Acc. Drop” is the accuracy loss compared with the ideal accuracy. In
addition to the ideal accuracy in Table 1, we also include the accuracy
of the 8-bit dynamic fixed-point neural networks in [23] for comparison.
Compared with the 8-bit dynamic fixed-point networks in [23], our
proposed networks with 5-bit integer inter-layer signals and fixed-point
weights achieve almost the same accuracy: the same accuracy for Lenet on
MNIST, only a 0.03% drop for Alexnet on CIFAR10, and a 0.27% drop for
Resnet on CIFAR10. Our proposed 4-bit Lenet, Alexnet, and Resnet can
achieve accuracy of 98.14%, 83.05%, and 90.33% on MNIST and CIFAR10 with
only 0.02%, 2.3%, and 2.72% accuracy loss compared with the ideal
accuracy. Even with 3-bit data representation, our method can achieve
97.46% on MNIST and 87.71% on CIFAR10.
Table 3: The Accuracy Measurement after Weights Quantization with and
without Weight Clustering
Model           5-bit    4-bit    3-bit
Lenet (w/o)     98.16%   97.86%   94.52%
Lenet (w/)      98.16%   98.1%    97.79%
Recovered Acc.  0%       0.24%    3.27%
Acc. Drop       -0%      -0.06%   -0.37%
Alexnet (w/o)   83.02%   79.19%   75.33%
Alexnet (w/)    85.26%   83.59%   82.92%
Recovered Acc.  2.28%    4.4%     7.59%
Acc. Drop       -0.05%   -1.76%   -2.43%
Resnet (w/o)    91%      77.12%   29%
Resnet (w/)     92.8%    91%      88.1%
Recovered Acc.  1.8%     12.88%   59.1%
Acc. Drop       -0.25%   -2.05%   -4.95%
Table 4: The Accuracy Measurement after Signals and Weights Quantization
with and without our proposed method
Lenet 8-bit [23]: 98.16%
Model           5-bit    4-bit    3-bit
Lenet (w/o)     97.74%   96.38%   93.43%
Lenet (w/)      98.16%   98.14%   97.46%
Recovered Acc.  0.42%    1.76%    4.03%
Acc. Drop       -0%      -0.02%   -0.7%
Alexnet 8-bit [23]: 84.5%
Model           5-bit    4-bit    3-bit
Alexnet (w/o)   81.8%    76.16%   69.7%
Alexnet (w/)    84.47%   83.05%   81.53%
Recovered Acc.  2.67%    6.89%    11.83%
Acc. Drop       -0.88%   -2.3%    -3.82%
Resnet 8-bit [23]: 91.75%
Model           5-bit    4-bit    3-bit
Resnet (w/o)    91.03%   75.16%   22.18%
Resnet (w/)     91.48%   90.33%   87.71%
Recovered Acc.  0.45%    15.17%   65.53%
Acc. Drop       -1.57%   -2.72%   -5.34%
Based on the above discussions, the results show that our proposed
method can represent DNNs using 4-bit or even 3-bit data representation
in the inter-layer signals and weights while keeping good accuracy.
4.5 Improvement on Computation Efficiency
In the SNC system implementation, the computation result of one DNN
layer is transformed to spikes by integrate-and-fire circuits (IFCs),
and digitized outputs are generated through counters [12, 11].
Therefore, a reduced bit width of signals between layers corresponds to
fewer required spikes and thus improved speed, design cost, and energy
efficiency. Weight quantization also helps to improve computing
efficiency and reduce hardware design complexity by decreasing the
utilization of synaptic crossbars and the programming cost.
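The link between bit width and spike window can be illustrated with a purely behavioral Python model of an integrate-and-fire circuit followed by a counter. This is a hypothetical sketch, not the circuit of [12]: the function name, the unit-free current and threshold values, and the assumption that an M-bit signal uses a window of $2^M$ time steps are all illustrative.

```python
def ifc_spike_count(input_current, threshold, bits):
    """Count the spikes fired within a window of 2^bits time steps.

    The membrane potential integrates the input each step and fires
    (subtracting the threshold) whenever the threshold is reached.
    """
    window = 2 ** bits  # assumed window length for a `bits`-bit signal
    potential = 0.0
    spikes = 0
    for _ in range(window):
        potential += input_current
        if potential >= threshold:
            potential -= threshold
            spikes += 1
    return spikes


# Window lengths for 8-bit and 4-bit signals under the assumption above.
print(2 ** 8, 2 ** 4)  # 256 16
# An input at half the threshold fires every second step of a 16-step window.
print(ifc_spike_count(0.5, 1.0, bits=4))  # 8
```

Under this assumption, moving from 8-bit to 4-bit signals shortens the window by a factor of 16. The measured speedups in Table 5 are lower than this ratio, since the simulated system also includes the drivers, crossbars, and counters.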
In this work, the benefit of our proposed DNNs with M-bit fixed integer
inter-layer signals and N-bit fixed-point weights on computation
efficiency is evaluated on the memristor-based SNC. Based on the results
in Sec. 4.4, two scenarios, with (M, N) set to (4, 4) and (3, 3), are
implemented and analyzed. The 8-bit dynamic fixed-point design in [23]
is also implemented for comparison. In the memristor-based SNC, each
computation unit (i.e. neural network layer) includes four components:
wordline (WL) drivers to generate robust input signals, memristor-based
crossbars to complete the matrix computation, IFCs to convert the
current results from the crossbar to spikes, and counters to generate
the digitized output of each layer. The speed, energy, and area are
obtained from circuit simulation in IBM 130nm technology, and the
simulation parameter configuration is based on [12].
The results in Table 5 show that our proposed method achieves
significant computation efficiency improvement compared with the
previous 8-bit dynamic fixed-point DNNs. Our systems have more than 9.8×
speedup, 89.1% energy saving, and 29.7% area saving.
Table 5: Memristor-based SNC System Evaluation and Comparison
Model                       Layer Num.  Speed (MHz)  Speedup  Energy (uJ)  Energy Saving  Area (mm2)  Area Saving
Lenet 8-bit [23]            4           0.64         -        4.7          -              1.48        -
Lenet 4-bit in this work    4           8.93         13.9x    0.57         87.9%          1.04        29.7%
Lenet 3-bit in this work    4           15.63        24.4x    0.27         94.3%          0.93        37.2%
Alexnet 8-bit [23]          8           0.27         -        337.0        -              34.3        -
Alexnet 4-bit in this work  8           2.66         9.8x     36.9         89.1%          24.0        30%
Alexnet 3-bit in this work  8           3.79         11.8x    26.3         92.2%          21.4        37.6%
Resnet 8-bit [23]           18          0.11         -        19200        -              937.3       -
Resnet 4-bit in this work   18          1.38         12.5x    1500         92.2%          656.2       30%
Resnet 3-bit in this work   18          2.20         20x      935          95%            585.9       37.5%
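The derived columns of Table 5 can be recomputed from the raw speed, energy, and area values. The Python sketch below does this for the Lenet 4-bit row; the function and dictionary key names are hypothetical, and the definitions (speed ratio, and one minus the energy or area ratio) are the natural reading of the column headers rather than formulas stated in the text.

```python
def derived_columns(baseline, design):
    """Return (speedup, energy saving, area saving) relative to a baseline."""
    speedup = design["speed_mhz"] / baseline["speed_mhz"]
    energy_saving = 1 - design["energy_uj"] / baseline["energy_uj"]
    area_saving = 1 - design["area_mm2"] / baseline["area_mm2"]
    return speedup, energy_saving, area_saving


# Raw values of the Lenet 8-bit [23] and Lenet 4-bit rows of Table 5.
lenet_8bit = {"speed_mhz": 0.64, "energy_uj": 4.7, "area_mm2": 1.48}
lenet_4bit = {"speed_mhz": 8.93, "energy_uj": 0.57, "area_mm2": 1.04}

speedup, energy, area = derived_columns(lenet_8bit, lenet_4bit)
print(f"{speedup:.2f}x, {energy:.1%}, {area:.1%}")  # 13.95x, 87.9%, 29.7%
```

The recomputed speedup of about 13.95 matches the tabulated 13.9x up to rounding.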
5. CONCLUSIONS
DNN quantization is important for achieving acceptable design complexity
and computational efficiency when implementing spiking neuromorphic
computing (SNC). However, directly quantizing the weights and
inter-layer signals causes heavy accuracy loss. In this work, we propose
data quantization-aware DNNs with a neuron convergence method and a
weight clustering method to recover the accuracy loss in neural network
quantization. The obtained fixed integer signals and fixed-point weights
particularly benefit the SNC in design cost and computation efficiency.
We deploy the quantized DNNs on the memristor-based SNC to study the
system efficiency improvement that can be achieved by the proposed
method. The system accuracy and performance are evaluated on three
networks (Lenet, Alexnet, and Resnet) on MNIST and CIFAR10 and compared
with the ideal DNNs and the previous 8-bit dynamic fixed-point DNNs. The
results indicate that the design can achieve 98.14% and 90.33% accuracy
on MNIST and CIFAR10 with 4-bit data representation, which is only 0.02%
and 2.72% lower than the ideal DNNs. Compared with the 8-bit dynamic
fixed-point framework, the proposed design demonstrates more than 9.8×
speedup, 89.1% energy saving, and 30% area saving.
6. ACKNOWLEDGMENTS
This work is supported in part by AFRL ICA2017-UP-
017. We would like to thank NVIDIA Corporation for their
generous GPU donation. Any opinions, findings, and con-
clusions or recommendations expressed in this material are
those of authors and do not necessarily reflect the views of
AFRL or its contractors.
7. REFERENCES
[1] A. Krizhevsky et al., “Imagenet classification with deep
convolutional neural networks,” in NIPS, pp. 1097–1105,
2012.
[2] K. Simonyan et al., “Very deep convolutional networks for
large-scale image recognition,” in ICLR, 2015.
[3] K. He et al., “Deep residual learning for image recognition,”
in CVPR, pp. 770–778, 2016.
[4] Girshick et al., “Rich feature hierarchies for accurate object
detection and semantic segmentation,” in CVPR,
pp. 580–587, 2014.
[5] S. Ren et al., “Faster R-CNN: Towards real-time object
detection with region proposal networks,” in NIPS,
pp. 1137–1149, 2015.
[6] D. Li et al., “Recent advances in deep learning for speech
research at microsoft,” in ICASSP, pp. 8604–8608, 2013.
[7] M. Rastegari et al., “Xnor-net: Imagenet classification using
binary convolutional neural networks,” in ECCV, 2016.
[8] P. Chi et al., “Prime: A novel processing-in-memory
architecture for neural network computation in reram-based
main memory,” in ISCA, pp. 27–39, 2016.
[9] E. S. K. et al., “Convolutional networks for fast, energy
efficient neuromorphic computing,” in PNAS, 2016.
[10] S. Han et al., “EIE: efficient inference engine on compressed
deep neural network,” in ISCA, pp. 243–254, 2016.
[11] P. Merolla et al., “A digital neurosynaptic core using
embedded crossbar memory with 45pj per spike in 45nm,”
in CICC, pp. 1–4, 2011.
[12] C. Liu et al., “A spiking neuromorphic design with resistive
crossbar,” in DAC, pp. 1–6, 2015.
[13] S. A. G. et al., “Effective calculations on neuromorphic
hardware based on spiking neural network approaches,” in
Lobachevskii Journal of Mathematics, pp. 964–996, 2017.
[14] A. Aayush et al., “Resparc: A reconfigurable and
energy-efficient architecture with memristive crossbars for
deep spiking neural networks,” in DAC, pp. 1–6, 2017.
[15] Merolla et al., “A million spiking-neuron integrated circuit
with a scalable communication network and interface,”
Science, pp. 668–673, 2014.
[16] C. Liu et al., “Rescuing memristor-based neuromorphic
design with high defects,” in DAC, pp. 1–6, 2017.
[17] Y. Wang et al., “Classification accuracy improvement for
neuromorphic computing systems with one-level precision
synapses,” in ASP-DAC, pp. 776–781, 2017.
[18] Hubara et al., “Binarized neural networks,” in NIPS,
pp. 4107–4115, 2016.
[19] O. Russakovsky et al., “Imagenet large scale visual
recognition challenge,” in IJCV, pp. 211–252, 2015.
[20] L. S. et al., “Pipelayer: A pipelined reram-based accelerator
for deep learning,” in HPCA, pp. 541–552, 2017.
[21] D. D. Lin et al., “Fixed point quantization of deep
convolutional networks,” in ICML, pp. 2849–2858, 2016.
[22] H. Song et al., “Deep compression: Compressing deep
neural networks with pruning, trained quantization and
huffman coding,” in ICLR, 2016.
[23] P. Gysel et al., “Hardware-oriented approximation of
convolutional neural networks,” in ICLR, 2016.
[24] T. Hokchhay et al., “Hardware-software codesign of
accurate, multiplier-free deep neural networks,” in DAC,
pp. 1–6, 2017.
[25] T. W. Lee et al., “Memristor resistance modulation for
analog applications,” EDL, pp. 1456–1458, 2012.
[26] H. Jo et al., “Nanoscale memristor device as synapse in
neuromorphic systems,” in Nano letters, pp. 1297–1301,
2010.
[27] G. S. S., “Spike-timing-dependent learning in memristive
nanodevices,” in NANOARCH, pp. 85–92, 2008.
[28] Y. Wang et al., “Group scissor: Scaling neuromorphic
computing design to large neural networks,” in DAC,
pp. 1–6, 2017.
