SpRRAM: A Predefined Sparsity Based Memristive Neuromorphic Circuit for
  Low Power Application by Fayyazi, Arash et al.
Arash Fayyazi*   Souvik Kundu*   Shahin Nazarian    Peter A. Beerel   Massoud Pedram 
University of Southern California, Los Angeles, USA 
{fayyazi, souvikku, s.nazarian, pabeerel, pedram}@usc.edu  
 
 
 
Abstract— In this paper, we propose an efficient predefined 
structured sparsity-based ex-situ training framework for a hybrid 
CMOS-memristive neuromorphic hardware for deep neural 
networks to significantly lower the power consumption and 
computational complexity and improve scalability. The structure is 
verified on a wide range of datasets including MNIST handwritten 
digit recognition, breast cancer prediction, and mobile health 
monitoring. The results of this study show that compared to its fully 
connected version, the proposed structure provides significant 
power reduction while maintaining high classification accuracy. 
Keywords— Deep Neural Network, Sparsity, Neuromorphic 
circuit, Low Power Circuit. 
I. INTRODUCTION 
In today's data-driven world, Deep Neural Networks 
(DNNs) play a key role in driving state-of-the-art 
technologies such as image processing, pattern recognition, and 
speech recognition. Modern neural networks are formally 
built as graphs with a large number of trainable parameters 
[1], which makes them memory intensive. As a result, both 
training and inference of such networks are difficult 
to perform on-chip in a power-efficient manner.  
Efforts to imitate the computational behavior of the human 
brain and to reverse engineer its powerful cognitive capacities 
have opened the door to brain-inspired neuromorphic computing 
[2], which has two major variants: the Artificial Neural 
Network (ANN) and the Spiking Neural Network (SNN) [3]. In 
this paper, we focus on the analog implementation of ANNs 
because of its higher accuracy and power efficiency [3]. The 
memristor [4], also known as the fourth fundamental passive 
two-terminal component, is one of the most popular choices 
for analog ANN design because it stores weights 
area-efficiently as resistance values.  
Introduced in 1971 and first physically realized in 2008 [5], 
the memristor can be fabricated densely and has plasticity 
in resistance [6]. The conductance of a memristor can be 
altered by applying a voltage larger than a threshold voltage. 
The amplitude and duration of the voltage pulse determine the 
amount of change in the resistance. We use a crossbar array of 
memristors for the hardware realization of the ANN synapses, 
which performs a weighted summation of the inputs. In 
addition to the synapses, an activation function, which is 
usually non-linear, is essential for the implementation 
of an ANN. The activation function can be implemented using 
circuits such as CMOS inverters [7] and op-amps [8]. In this 
work, we focus on the inverter-based implementation, 
which results in lower area and power consumption [7]. 
* Equal contribution 
The large number of parameters and the connectivity of a 
neural network give it a complex non-linear nature that 
enables it to draw precise decision boundaries between 
classes; however, too many parameters make the network 
likely to overfit the training samples. This tradeoff makes it 
challenging to predict a proper network structure for a 
dataset. For many overfitted networks, separate 
hyperparameters are used to reduce overfitting and avoid 
memorizing undesirable noise patterns. Earlier parameter 
reduction methods [9],[10] perform well in software but are 
hard to adapt to a hardware implementation of network 
training, because they require the network to be fully 
connected at some point during training. In addition, the 
computation overhead they spend on reducing the network 
structure causes extra power consumption, which limits their 
applicability in a power-sensitive hardware environment. 
Recently, [11] has shown that many of these parameters can 
be omitted without any significant loss in network fidelity. 
In this paper, we adopt the idea of hardware-implementable 
predefined sparsity [12], in which only a fraction of the total 
weights in one or more junctions is kept and this fraction is 
fixed before training. In particular, we propose a 
power-efficient, structured, predefined sparsity-based hybrid 
memristive ex-situ training framework. Here, by 'structured' 
we mean equal fan-out for each neuron in the preceding layer 
and equal fan-in for each neuron in the succeeding layer. 
This framework is unique because the training algorithm can 
also be implemented on-chip without any additional 
computation or hardware cost to reduce the number of 
weights. In our proposed framework, the Memristor Crossbar 
Array (MCA) is no longer densely connected. In addition to 
being power efficient, these sparsely connected MCAs 
inherently mitigate the overfitting issue of a DNN.   
The major contributions of this paper are summarized 
below: 
1. To the best of our knowledge, we are the first to propose 
and validate a structured predefined sparse network in an 
MCA-based ex-situ training framework. This approach 
significantly reduces the power consumption of large 
neural networks for on-chip inference, making it a 
strong hardware foundation for ultra-low-power 
machine learning domains such as IoT edge devices.  
2. We present an enhancement to the Physical 
Characteristic Aware Ex-situ training framework 
(PHAX) [7] to support sparse ANN structures. Using 
this framework, we show that predefined sparsity 
causes no significant loss of test accuracy, and we 
cross-validate these results with an HSPICE simulation model.  
3. We further present experimental results that explore the 
effect of different levels of sparsity and of memristor 
process variation on the classification accuracy of the 
network. 
The rest of the paper is organized as follows. Section II 
presents the proposed predefined sparsity-based ex-situ 
memristive training framework. Section III describes the 
experimental setup and the results on benchmark datasets 
(IRIS [13], BCW [14], MNIST [15] and MHEALTH [16]). 
Finally, Section IV concludes the paper.  
II. PROPOSED FRAMEWORK 
We first review the MCA used in PHAX [7] in Section II.A 
before describing our enhancements to support predefined 
sparsity in inference and training in Sections II.B and II.C, 
respectively. Note that our proposed training 
algorithm is applicable to a variety of underlying memristive 
circuits [7], [8]. 
A.  Memristor-based ANN Structure 
The fully connected (FC) junction memristive circuit, 
shown in Fig. 1, has better performance, lower power 
consumption, higher energy efficiency, and smaller area than 
other memristive neuromorphic circuits such as the op-amp 
based circuit proposed in [8]. In this circuit, the dot-product 
operation is performed in the memristive crossbar, while the 
inverters implement the neuron's nonlinear operation, i.e., 
the activation function. The circuit, which has differential 
inputs, uses two memristors per weight, implementing both 
negative (𝑛) and positive (𝑝) weights. 
Fig. 1. Circuit structure of the FC memristive neuromorphic circuit [7] used 
in this work. Differential input pairs (non-inverted V1p, V2p and inverted 
V1n, V2n) and the bias inputs (Vdd/2 and -Vdd/2) drive the memristor 
crossbar; each column node (net1 to net3) carries the weighted sum of inputs 
and biases and feeds an inverter that produces the outputs o1 to o3. 
The inverter voltage transfer characteristic (VTC) has the 
form of a scaled sigmoidal function, which acts as the neuron 
activation function. To provide differential inputs for the next 
layer, two inverters are used at the output of each layer 
(except the last layer). Note that for classifier 
ANNs, each output of the circuit corresponds to one of the 
output classes and has a digital value of either logical “0” or 
logical “1”. In function approximation applications, 
however, the outputs of the ANN are analog, and an 
analog-to-digital converter is needed to digitize them [17]. 
B.  Structured Predefined Sparsity Characterization  
A DNN with all FC junctions has every neuron of layer 
$(l-1)$ connected to every neuron of layer $l$, and thus 
demands a large memory to store the weights of each 
junction, all of which must be updated during training. Most 
parameter reduction methods, such as [9],[10], start with an 
FC network and reduce the size of these weight matrices 
through iterative algorithms. Hence, it is not clear how to 
apply these methods in a hardware training environment. To 
address this issue, we propose a structured predefined sparse 
neural network. Consider a network with $L$ layers of 
neurons, and therefore $J = L - 1$ junctions. The junction 
between layer $j$ and layer $j+1$ has $N_j \times N_{j+1}$ 
weights, giving a total of $\sum_j N_j \times N_{j+1}$ weights 
for an FC network, where $N_j$ is the number of neurons in 
layer $j$. We take the connection density of a junction as 
defined in [12], 
$$D_j = \frac{W_j}{N_j \times N_{j+1}} \qquad (1)$$ 
where $W_j = N_j \times FO_j = N_{j+1} \times FI_j$ is the 
number of weights kept in junction $j$, and $FO_j$ and $FI_j$ 
are the fan-out of each neuron in the preceding layer and the 
fan-in of each neuron in the succeeding layer, respectively. 
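As a worked illustration of (1), the following minimal sketch computes the kept weight count, fan-out, and fan-in of a junction, and the total weight count of a network. The function names `junction_stats` and `total_weights` are hypothetical, bias weights are not counted, and the example network is the 196-100-10 MNIST structure of Table I with 25% connectivity assumed at its first junction.

```python
def junction_stats(n_prev, n_next, density):
    """Return (W_j, FO_j, FI_j) for one junction with connection density D_j."""
    fan_out = round(density * n_next)   # FO_j = D_j * N_{j+1}
    fan_in = round(density * n_prev)    # FI_j = D_j * N_j
    # Structured sparsity needs both counts to be integers.
    if abs(fan_out - density * n_next) > 1e-9 or abs(fan_in - density * n_prev) > 1e-9:
        raise ValueError("density must give integer fan-out and fan-in")
    weights = n_prev * fan_out          # W_j = N_j * FO_j = N_{j+1} * FI_j
    return weights, fan_out, fan_in


def total_weights(layers, densities):
    """Sum W_j over all junctions of a network given as a list of layer sizes."""
    return sum(junction_stats(a, b, d)[0]
               for a, b, d in zip(layers, layers[1:], densities))


layers = [196, 100, 10]
fc = total_weights(layers, [1.0, 1.0])       # 19600 + 1000 = 20600
sparse = total_weights(layers, [0.25, 1.0])  # 4900 + 1000 = 5900
print(fc, sparse, round(sparse / fc, 3))
```

Under these assumptions the sparse network keeps 5900 of 20600 weights, i.e. a little over one quarter of them.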
 
Input: Connection density $D_j$ for each of the $J = L - 1$ junctions 
Output: Mask matrices corresponding to the weight matrices 
Initialize $J$ mask matrices with all zeros, the $j$th of size $N_{j+1} \times N_j$ 
(rows index neurons of layer $j+1$, columns index neurons of layer $j$), 
for $j = 1, \dots, J$. 
Compute the integers $FO_j$ and $FI_j$: 
      $FO_j = D_j \times N_{j+1}$ 
      $FI_j = D_j \times N_j$ 
For the $J$th junction, define the sparsity to be 0, i.e., FC. 
For each mask matrix $j = 1, \dots, J-1$: 
    For each column $C_c$: 
        Assign 1 to randomly chosen elements of the column s.t. 
            $\sum_i C_c[i] = FO_j$, where $i = 1, 2, \dots, N_{j+1}$ 
    End For 
    Select the rows $R_{not\_satisfied}$ for which 
        $\sum_k R_r[k] \neq FI_j$, where $k = 1, 2, \dots, N_j$ 
    While $R_{not\_satisfied}$ is not empty: 
        Select $R_{max}$ and $R_{min}$ from $R_{not\_satisfied}$ as the rows 
            with the largest and the smallest $\sum_k R_r[k]$ 
        Find an index $idx$ s.t. $R_{min}[idx] = 0$ and $R_{max}[idx] = 1$ 
        Swap $R_{min}[idx]$ and $R_{max}[idx]$ 
        Update $R_{not\_satisfied}$ 
    End While 
End For 
Fig. 2. Pseudocode for generating the mask matrices. 
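The mask generation procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' MATLAB script: the name `structured_mask` is hypothetical, and the mask is assumed to be stored with one row per neuron of the succeeding layer and one column per neuron of the preceding layer, so that every column sums to the fan-out and every row sums to the fan-in.

```python
import numpy as np


def structured_mask(n_prev, n_next, density, rng):
    """Return an (n_next, n_prev) 0/1 mask with equal fan-out and equal fan-in."""
    fan_out = round(density * n_next)  # ones per column (preceding-layer neuron)
    fan_in = round(density * n_prev)   # ones per row (succeeding-layer neuron)
    if fan_out < 1 or fan_out * n_prev != fan_in * n_next:
        raise ValueError("density must give integer fan-out and fan-in")

    mask = np.zeros((n_next, n_prev), dtype=int)
    # Place fan_out ones at random rows of every column.
    for col in range(n_prev):
        mask[rng.choice(n_next, size=fan_out, replace=False), col] = 1

    # Move ones from the fullest row to the emptiest row, within one column,
    # until every row holds exactly fan_in ones. Column sums never change.
    while True:
        row_sums = mask.sum(axis=1)
        if np.all(row_sums == fan_in):
            return mask
        r_max, r_min = int(np.argmax(row_sums)), int(np.argmin(row_sums))
        candidates = np.flatnonzero((mask[r_max] == 1) & (mask[r_min] == 0))
        col = rng.choice(candidates)
        mask[r_max, col], mask[r_min, col] = 0, 1


rng = np.random.default_rng(0)
mask = structured_mask(n_prev=4, n_next=2, density=0.5, rng=rng)
print(mask)
```

The loop terminates because the fullest row always has more ones than the emptiest row, so a swappable column exists, and each swap brings both rows closer to the target fan-in.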
To obtain predefined sparsity, we randomly remove edges 
between two successive layers by following the algorithm 
shown in Fig. 2, which generates the mask matrices 
(matrices that record which weights to keep). We then train 
the network with the remaining connections (weights) using 
the generated mask matrices. With this kind of network 
reduction, the removed weights never appear during training 
or inference, which makes the technique a preferable choice 
for hardware design.  
Fig. 3 illustrates the difference between an FC junction, an 
unstructured sparse junction, and a structured sparse junction 
with 50% connectivity (i.e., 50% sparsity). Note that we keep 
the bias connections out of the scope of sparsity, i.e., the 
biases of all neurons are always present.  
 
Fig. 3. Generation of a predefined sparse structure for a network with four 
neurons in layer $l-1$ and two in layer $l$: (a) a fully connected (FC) 
junction between layers $l-1$ and $l$, with weights W11 to W24 and biases 
W10 and W20; (b) its unstructured sparse version with a predefined 
connectivity of 50%; (c) the corresponding structured sparse version.  
The structured sparse version can be used as a mapping 
scheme that maps a given DNN to any MCA size 
permitted by the memristive technology for reliable 
operation. As shown in Fig. 4, a network with a 4:2 
junction (4 neurons in layer 𝑙 − 1 and 2 neurons in 
layer 𝑙) can, with the proposed structured sparse version, 
be easily mapped to two 2:1 MCAs. Thus, this mapping 
technique has the potential advantage of saving area.     
To implement the predefined sparse connections, we 
remove both memristors of each masked weight according to 
the mask matrices. Fig. 5 shows a structured sparse version of 
the memristive circuit with a predefined connectivity of 25%. 
Each of the $N$ inputs of this circuit (e.g., $x_i$) is differential, 
containing inverted ($V_{in}$) and non-inverted ($V_{ip}$) signals. 
Kirchhoff's current law (KCL) at the input of the $j$th inverter 
may be written as (see Fig. 5)  
$$\sum_{i=1}^{N} a_{ji}\left((V_{ip} - V_{net_j})\,\sigma_{jip} + (V_{in} - V_{net_j})\,\sigma_{jin}\right) = 0 \qquad (2)$$ 
where $a_{ji}$ is either 0 or 1 and indicates whether 
there is a connection between the $j$th neuron of the current 
layer and the $i$th neuron of the previous layer. Also, 
$V_{net_j}$ is the voltage of the node $net_j$ (the input of the 
inverter of column $j$), and $\sigma_{jip}$ ($\sigma_{jin}$) is the 
conductance of the memristor located in the non-inverted 
(inverted) row $i$ and column $j$. Therefore, the input voltage 
of the $j$th inverter ($V_{net_j}$) can be obtained from 
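$$V_{net_j} = \frac{\sum_{i=1}^{N} a_{ji}\left(V_{ip}\,\sigma_{jip} + V_{in}\,\sigma_{jin}\right)}{\sum_{i=1}^{N} a_{ji}\left(\sigma_{jip} + \sigma_{jin}\right)} \qquad (3)$$ 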
Also, for a neural network layer with the inputs $V_{ip}$ 
and $V_{in}$ ($i = 1, \dots, N$) and the weights $w_{jip}$ and 
$w_{jin}$, the input of the $j$th neuron ($net_j$) of this layer 
may be obtained from 
$$net_j = \sum_{i=1}^{N} a_{ji}\left(V_{ip}\,w_{jip} + V_{in}\,w_{jin}\right) \qquad (4)$$ 
Note that in (2)-(4), the positive (negative) bias is 
considered as an input with a constant value of 𝑉𝑑𝑑/2  
(−𝑉𝑑𝑑/2).  
For the crossbar circuit with the inverter-based neurons to 
mimic the neural network, the values of 𝑛𝑒𝑡𝑗  and 𝑉𝑛𝑒𝑡𝑗 
should be the same for all the combinations of the inputs. 
Therefore, the corresponding coefficients (i.e., 𝑤𝑗𝑖𝑝   and 
𝑤𝑗𝑖𝑛) are expressed as 
$$w_{jip} = \begin{cases} \dfrac{\sigma_{jip}}{\sum_{m=1}^{N} a_{jm}\left(\sigma_{jmp} + \sigma_{jmn}\right)}, & a_{ji} = 1 \\ 0, & a_{ji} = 0 \end{cases} \qquad (5)$$ 
Thus,  
$$a_{ji}\left[w_{jip}\left(\sum_{m=1}^{N} a_{jm}\left(\sigma_{jmp} + \sigma_{jmn}\right)\right) - \sigma_{jip}\right] = 0 \qquad (6)$$ 
We may find the relations between the weights and the 
memristor conductances by solving the equations in (6), which 
can be shown to have a non-trivial solution only if 
$$\sum_{i=1}^{N} a_{ji}\left(w_{jin} + w_{jip}\right) = 1 \qquad (7)$$ 
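One way to see where the normalization in (7) comes from is to sum the mapped weights over the existing connections. This short derivation assumes that the expression analogous to (5) also holds for $w_{jin}$, with $\sigma_{jin}$ in the numerator:

```latex
\begin{align*}
\sum_{i=1}^{N} a_{ji}\,(w_{jip} + w_{jin})
  &= \sum_{i=1}^{N} a_{ji}\,
     \frac{\sigma_{jip} + \sigma_{jin}}
          {\sum_{m=1}^{N} a_{jm}\,(\sigma_{jmp} + \sigma_{jmn})} \\
  &= \frac{\sum_{i=1}^{N} a_{ji}\,(\sigma_{jip} + \sigma_{jin})}
          {\sum_{m=1}^{N} a_{jm}\,(\sigma_{jmp} + \sigma_{jmn})}
   = 1 .
\end{align*}
```

The removed connections contribute nothing to either sum, so the weights of the remaining connections of each neuron are normalized among themselves.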
Hence, the learning algorithm should train the neural 
network under these constraints: 
1. Weights must be positive, and the conductance of every 
memristor must be in the range $[\sigma_{min}, \sigma_{max}]$, 
where $\sigma_{min}$ and $\sigma_{max}$ are the minimum and 
maximum possible conductance values. In this work, we use 
the memristor parameters of [18], where the maximum and 
minimum values are 7.9 μ℧ and 0.12 μ℧, respectively. 
2. The sum of all the existing weights of a neuron must be 
equal to 1 (based on (7)). 
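The relations (3) to (5) and the constraint (7) can be checked numerically with a small sketch. The helper names `mapped_weights` and `v_net`, the 2-by-4 mask, the random conductances, and the choice of the inverted input as the negative of the non-inverted one are all assumptions made for illustration; the bias inputs are left out for brevity.

```python
import numpy as np

SIGMA_MIN, SIGMA_MAX = 0.12e-6, 7.9e-6  # siemens, range quoted from [18]


def mapped_weights(sigma_p, sigma_n, mask):
    """Eq. (5): conductance divided by the sum of connected conductances of neuron j."""
    denom = (mask * (sigma_p + sigma_n)).sum(axis=1, keepdims=True)
    return mask * sigma_p / denom, mask * sigma_n / denom


def v_net(v_p, v_n, sigma_p, sigma_n, mask):
    """Eq. (3): inverter input voltage of every neuron j."""
    num = (mask * (sigma_p * v_p + sigma_n * v_n)).sum(axis=1)
    den = (mask * (sigma_p + sigma_n)).sum(axis=1)
    return num / den


rng = np.random.default_rng(0)
mask = np.array([[1, 0, 1, 0],
                 [0, 1, 0, 1]])  # rows: neurons j, columns: inputs i
sigma_p = rng.uniform(SIGMA_MIN, SIGMA_MAX, mask.shape)
sigma_n = rng.uniform(SIGMA_MIN, SIGMA_MAX, mask.shape)
v_p = rng.uniform(-0.25, 0.25, 4)  # non-inverted inputs (assumed range)
v_n = -v_p                         # inverted inputs (assumption)

w_p, w_n = mapped_weights(sigma_p, sigma_n, mask)
net = (w_p * v_p + w_n * v_n).sum(axis=1)  # Eq. (4); the mask is already in w

print(np.allclose(net, v_net(v_p, v_n, sigma_p, sigma_n, mask)))
print((w_p + w_n).sum(axis=1))  # constraint (7): one per neuron
```

With the weights mapped as in (5), the software sum (4) and the circuit voltage (3) agree for any input, and the existing weights of each neuron sum to 1.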
 
C.  Training of Network with Predefined Sparsity 
To train the neural network, we propose an ex-situ training 
algorithm based on PHAX [7]. PHAX is a circuit-aware 
training framework in which the backpropagation algorithm of 
[19] is modified to consider the physical characteristics of 
the neuromorphic circuit. Our proposed algorithm benefits 
from the integrity of the gradient descent search in the 
backpropagation algorithm while effectively dealing with 
the physical constraints of the memristive circuits (described 
in Section II.B). The pseudocode of the proposed 
algorithm is given in Fig. 6. In our proposed algorithm, we 
check the existence of each connection in both the forward 
and the backward pass. Since the weight mapping function of 
a weight has components from other weights in the same layer 
(in other words, from all the neurons in the previous layer), 
the weights of the removed connections must remain zero 
before and after applying the weight mapping functions 
(lines 3 and 13 of the pseudocode). 
In the proposed algorithm, $\theta_{kj}$ is the weight between 
the $j$th neuron of the previous layer ($l-1$) and the $k$th 
neuron of the current layer ($l$). $g_1$ and $g_2$ are two 
mapping functions that account for the physical constraints 
[7]; they map the unconstrained ANN weights to 
implementation weights (memristor conductances in this 
work) and are incorporated in the update rules of the 
backpropagation algorithm. Also, $\delta_k$ is the portion of the 
error of the $k$th neuron of the output layer, $\eta$ is the 
learning rate, and $O_m$ and $t_m$ are the actual and target 
outputs of the $m$th neuron. Finally, $net_m$ is the weighted 
sum of the inputs of the $m$th neuron. Although the update 
rules are expressed for the P weights, very similar equations 
may be written for the N weights. Note that we use the same 
notation as [7], where the PHAX algorithm is discussed in 
depth. In this work, we set the target accuracy to 
98% (classification error of 2%) and the epoch limit to 100,000. 
 
Fig. 4. Structured sparse version with 50% of the connections, and potential 
clustering by utilizing the structured sparse version and smaller MCAs (two 
2:1 MCAs). The figure annotates the area of a 10 × 2 crossbar and the area 
of two 6 × 1 crossbars. 
Fig. 5. Circuit structure of the memristive neuromorphic circuit with a 
structured sparsity of 25%. Mask matrix for this layer is [1 0 0 1 0 1 0 1].   
III. EXPERIMENTAL RESULTS 
In this section, we evaluate our proposed framework in 
terms of inference accuracy on benchmark applications, 
complexity, and power consumption. We also investigate 
how accuracy varies with the percentage of sparsity and with 
the junction at which sparsity is applied, and we analyze the 
network performance under variation in memristor 
conductance as well as bit precision limits. We generated the 
conductance values in MATLAB and used them in the 
corresponding HSPICE simulation models, assuming TSMC 
90 nm technology and the memristor model proposed in [20] 
for the memristor devices of [18]. We used SPICE simulation 
to measure the inference accuracy and power consumption. 
1: Initialize all weights with small random numbers 
2: For each layer in the network 
3:   $\theta_j = \theta_j \odot A_j$ 
4: End For 
5: Do 
6:   For every pattern in the training set 
7:         Present the pattern to the network 
8: // Propagate the input forward through the network: 
9:         For each layer in the network 
10:               If $a_{ji} == 1$ then 
11:                   Calculate the function-mapped weights $w_{jip}$ and $w_{jin}$ 
                   $w_{jip} = g_2(\sigma_{jip}) = g_2(g_1(\theta_{jip}))$ 
12:               Else 
13:                    $w_{jip} = 0$ 
14:               End If  
15:               For every node in the layer 
16:                   Calculate the weighted sum of the inputs to the node   
17:                   Calculate the activation for the node 
18:               End For 
19:         End For 
20:         Calculate the cost function $J$ 
21: // Propagate the errors backward through the network: 
22:         For every node in the output layer 
23:           Calculate the error signal $(t_k - O_k^L)$   
24:           If $a_{ji} == 1$ then 
25:             Calculate the weight updating rule (change value) $\Delta\theta_{kj}$  
26:             Update each node's weight in the output layer 
27:           End If 
28:         End For 
29:         For all hidden layers   
30:           For every node in the layer   
31:             Calculate the node's signal error $\left(\sum_{k \in (l+1)} \delta_k\, a_{ji}\, w_{jip}\right)$ 
32:             If $a_{ji} == 1$ then 
33:               Calculate the weight updating rule $\Delta\theta_{kj}$ 
34:               Update each node's weight in the network 
                    $\theta_{kj}^{new} = \theta_{kj}^{old} - \eta\,\Delta\theta_{kj}$ 
35:             End If   
36:           End For   
37:         End For  
38:   End For                
39: Until (number of iterations > specified maximum) or   
40:           (cost function $J$ < specified threshold) 
Fig. 6. Pseudocode of the proposed training algorithm. 
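The role of the mask in the forward and backward passes can be illustrated with a deliberately simplified sketch. It is plain masked backpropagation, not PHAX: the mapping functions $g_1$ and $g_2$ of [7] are not reproduced, a standard sigmoid stands in for the fitted inverter VTC, biases are omitted, a squared-error cost is assumed, and the names `train_masked`, `sigmoid`, `lr` and `epochs` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_masked(x, t, mask1, mask2, lr=0.5, epochs=50, seed=0):
    """Train a two-junction network whose weights obey fixed 0/1 masks."""
    rng = np.random.default_rng(seed)
    # Weight matrices share the mask shape: (neurons in next layer, neurons in previous).
    theta1 = rng.normal(0.0, 0.1, mask1.shape) * mask1  # mask at initialization
    theta2 = rng.normal(0.0, 0.1, mask2.shape) * mask2
    for _ in range(epochs):
        for xi, ti in zip(x, t):  # one pattern at a time
            # Forward pass: removed connections contribute nothing (weights are 0).
            h = sigmoid(theta1 @ xi)
            o = sigmoid(theta2 @ h)
            # Backward pass for the cost 0.5 * sum((o - t)^2).
            delta_o = (o - ti) * o * (1.0 - o)
            delta_h = (theta2.T @ delta_o) * h * (1.0 - h)
            # Only existing connections are updated.
            theta2 -= lr * np.outer(delta_o, h) * mask2
            theta1 -= lr * np.outer(delta_h, xi) * mask1
    return theta1, theta2


rng = np.random.default_rng(1)
x = rng.uniform(size=(8, 4))                          # 8 toy patterns, 4 inputs
t = rng.integers(0, 2, size=(8, 2)).astype(float)     # 2 toy outputs
mask1 = np.array([[1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 1, 0, 0],
                  [0, 0, 1, 1]])                      # 50% structured junction
mask2 = np.ones((2, 4), dtype=int)                    # last junction kept FC
theta1, theta2 = train_masked(x, t, mask1, mask2)
print(np.count_nonzero(theta1), "of", theta1.size, "weights present")
```

Multiplying both the initial weights and every update by the mask is one simple way to guarantee that removed connections stay at zero throughout training.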
Our MATLAB script takes the level of predefined sparsity 
as user input and generates mask matrices, which are used 
along with the mathematical model of the neurons (i.e., the 
fitted VTC of the inverters, extracted via SPICE simulation). 
The script then models the network training and maps the 
converged trained values to the desired memristor 
conductances. For the memristor device used in this work, the 
write threshold voltage was 4 V and the minimum (maximum) 
resistance of the memristors was about 125 kΩ (8.3 MΩ). In 
all simulations, the supply voltage was 0.5 V. The computer 
system used for the simulations had an Intel Core™ 
i7-7700HQ CPU with a nominal clock frequency of 
2.8 GHz and 16 GB of RAM. Note that we assumed that the 
scheme of [21], which addresses the problems of device 
variation and stochastic write and has a relative accuracy of 
99%, is used for writing the memristors. Also, the utilized 
memristor model has good endurance and retention [18]. 
Since the I-V characteristic (and hence the conductance) of 
the memristor is determined by its state variables, the 
methodology proposed in [7] is used to map the extracted 
conductance values to the corresponding state parameter of 
the memristor for use in the SPICE simulations.   
A.  Classification Accuracy with Predefined Sparsity 
We used the IRIS, BCW, MNIST and MHEALTH 
datasets for the performance measurement. For each dataset, 
80% of the data are randomly chosen for training, and the 
remaining 20% are used to measure test accuracy. Table I 
provides the details of the network structure for each 
classification dataset, and Fig. 7 shows the accuracy under 
different percentages of network connections (in other words, 
under different levels of sparsity). The results in Fig. 7 show 
that with sparse connectivity at the penultimate junction 
(junction 𝐽 − 1 for a network with 𝐽 junctions), the 
network's classification accuracy is hardly degraded even 
when the connectivity is as low as 25%. To train the 
networks, we ran 10,000 epochs and used a mini-batch size 
of 1. 
 
Fig. 7: Relation between test accuracy and the percentage of sparsity in the 
penultimate junction for various datasets in FC network and sparse network 
with sparsity applied at penultimate junction only.   
For the IRIS, BCW and MNIST datasets, we did not 
observe any considerable difference in accuracy when 
exchanging the sparse junction and the FC junction between 
junctions 1 and 2. For MHEALTH, we applied different 
percentages of sparse connections at junction 2. Another 
important trend appears for MNIST, where applying 50% 
connectivity at both junctions performs poorly compared to a 
25% connectivity junction followed by an FC junction (or an 
FC junction followed by a 25% connectivity junction).  
B. Power Efficiency and Computation Complexity Reduction 
Fig. 8 compares the power and the computation 
complexity, in terms of memristor count (memory elements), 
of the sparse memristive neuromorphic circuit with those of 
the FC memristive circuit. Table I lists the power 
consumption, where the sparse version has 25% connectivity 
at the penultimate junction. The comparison confirms that 
the proposed sparse circuit yields considerable power savings 
over the fully connected memristive circuit. For the MNIST 
dataset, the power consumption of the proposed circuit with 
25% connectivity in the penultimate junction is reduced by 
57%. 
TABLE I 
Network Structures Used for Different Datasets 

Dataset    Network structure   Power consumption for   Power consumption for 
                               FC version (μW)         sparse version (μW) 
BCW        10-8-2              13.4                    8.87 
MNIST      196*-100-10         1221                    527 
IRIS       4-4-3               7.67                    7.45 
MHEALTH    23-80-60-13         703.2                   639 
* A compressed version (14 × 14) of the actual 28 × 28 input image is used 
 
  
Fig. 8. (a) Power and (b) complexity reduction through reduction of 
memristor count for proposed sparse neuromorphic circuits over their FC 
counterparts for different applications. Note that 25% and 50% connectivity 
is only at the penultimate junction. 
Additionally, compared to the FC designs, our proposed 
sparse design has very low computational complexity in 
hardware. Note that inverters make up a large portion of the 
total power consumption. 
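The relative power savings implied by Table I can be recomputed with a few lines of arithmetic. The dictionary below copies the power values from Table I; the names `TABLE_I` and `power_reduction_percent` are hypothetical.

```python
# dataset: (FC power in uW, sparse power in uW), copied from Table I
TABLE_I = {
    "BCW": (13.4, 8.87),
    "MNIST": (1221.0, 527.0),
    "IRIS": (7.67, 7.45),
    "MHEALTH": (703.2, 639.0),
}


def power_reduction_percent(fc_uw, sparse_uw):
    """Relative power saving of the sparse circuit with respect to the FC circuit."""
    return 100.0 * (fc_uw - sparse_uw) / fc_uw


for name, (fc_uw, sparse_uw) in TABLE_I.items():
    print(f"{name}: {power_reduction_percent(fc_uw, sparse_uw):.1f}% lower power")
```

For MNIST this gives about 56.8%, which matches the 57% reduction quoted in the text.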
C. Effect of Process Variation and Limited Write Precision 
Fig. 9. IRIS dataset classification accuracy of inverter-based FC and sparse 
memristive neuromorphic circuits under process variations and 4-bit 
quantization. 
To evaluate the performance of the proposed design 
under process variation and limited write precision, we 
added random Gaussian noise with 5%, 10%, and 25% 
variances to the conductance values of the memristors and 
measured the accuracy. For the sake of space, Fig. 9 shows 
the effects on the memristive circuit only for IRIS. With the 
worst-case simulation of 25% process variation, the results 
show that the accuracy of the predefined sparse structure with 
only 25% connections in the penultimate junction is still 
above 84%, 82.2%, 92%, and 95% for MNIST, MHEALTH, 
IRIS and BCW, respectively. The authors of [22], [23] have 
proposed algorithms to deal with this variability of 
memristors and CMOS components in an efficient way. [23] 
uses the neuron 
characteristics (CMOS inverter here) extracted from a 
manufactured chip and adjusts the weights of the chip 
accordingly. 
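The kind of perturbation used in this experiment can be sketched as follows. The text does not specify the exact noise model or quantizer, so this sketch assumes multiplicative Gaussian noise with a given relative standard deviation and uniform quantization in conductance between the device limits of [18]; the names `add_variation` and `quantize` are hypothetical.

```python
import numpy as np

SIGMA_MIN, SIGMA_MAX = 0.12e-6, 7.9e-6  # siemens, range quoted from [18]


def add_variation(sigma, rel_std, rng):
    """Scale each conductance by (1 + Gaussian noise) and clip to the device range."""
    noisy = sigma * (1.0 + rng.normal(0.0, rel_std, sigma.shape))
    return np.clip(noisy, SIGMA_MIN, SIGMA_MAX)


def quantize(sigma, bits=4):
    """Snap conductances to 2**bits evenly spaced levels in the device range."""
    step = (SIGMA_MAX - SIGMA_MIN) / (2 ** bits - 1)
    return SIGMA_MIN + np.round((sigma - SIGMA_MIN) / step) * step


rng = np.random.default_rng(0)
sigma = rng.uniform(SIGMA_MIN, SIGMA_MAX, size=(4, 3))  # toy conductance matrix
for rel_std in (0.05, 0.10, 0.25):
    noisy = add_variation(sigma, rel_std, rng)
    print(rel_std, float(np.abs(noisy - sigma).max()))
quantized = quantize(sigma, bits=4)
```

The perturbed or quantized conductances would then replace the trained values in the circuit simulation before the accuracy is measured again.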
 
IV. SUMMARY AND CONCLUSIONS 
In this work, we analyzed a structured predefined sparse 
memristor crossbar array structure in an ex-situ training 
framework and its impact on classification accuracy, 
power, memristor count, and robustness to memristor 
process variation. We obtained a test accuracy of ~90% on 
the MNIST dataset with only ~1/4 of the total weights 
present. We also obtained considerable power savings with 
this structure compared to FC memristive neuromorphic 
circuits. An efficient area optimization technique for the 
reduced memristor count design is a promising direction 
for future work.   
REFERENCES 
[1] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet 
Classification with Deep Convolutional Neural Networks,” 
NIPS Proc., pp. 1106–1114, 2012. 
[2] C. Mead, “Neuromorphic Electronic Systems,” Proc. IEEE, 
vol. 78, no. 10, pp. 1629–1636, 1990. 
[3] Z. Du et al., “Neuromorphic accelerators: a comparison 
between neuroscience and machine-learning approaches,” 
Proc. 48th Int. Symp. Microarchitecture, pp. 494–507, Dec. 
2015. 
[4] W. Wang, C. Yakopcic, E. Shin, K. Leedy, T. M. Taha, and 
G. Subramanyam, “Fabrication, characterization, and 
modeling of memristor devices,” in National Aerospace 
and Electronics Conference, Proceedings of the IEEE, vol. 
2015–Febru, pp. 259–262, 2015 
[5] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. 
Williams, “The missing memristor found,” Nature, vol. 459, 
no. 7250, pp. 1154–1154, 2009. 
[6] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder, 
and W. Lu, “Nanoscale memristor device as synapse in 
neuromorphic systems,” Nano Lett., vol. 10, no. 4, pp. 
1297–1301, 2010. 
[7] M. Ansari, A. Fayyazi, A. Banagozar, M. A. Maleki, M. 
Kamal, and M. Pedram, “PHAX : Physical Characteristics 
Aware Ex-Situ Training Framework for Inverter-Based 
Memristive Neuromorphic Circuits,” vol. 70, no. c, 2017. 
[8] R. Hasan, C. Yakopcic, and T. M. Taha, “Ex-situ training of 
dense memristor crossbar for neuromorphic applications,” 
in Proc. IEEE/ACM Int. Symp. Nanoscale Architectures 
(NANOARCH), pp. 75–81,2015 
[9] S. Han, J. Pool, J. Tran, and W. J. Dally, “Learning both 
Weights and Connections for Efficient Neural Networks,” 
pp. 1–9, 2015. 
[10] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, 
and R. Salakhutdinov, “Dropout: a simple way to prevent 
neural networks from overfitting,” J. Mach. Learn. Res., vol. 
15, no. 1, pp. 1929–1958, 2014. 
[11] S. Dey, K.-W. Huang, P. A. Beerel, and K. M. Chugg, 
“Characterizing Sparse Connectivity Patterns in Neural 
Networks,” 2017. 
[12] S. Dey et al., “A Highly Parallel FPGA Implementation of 
Sparse Neural Network Training,” arXiv:1806.01087v1, 
2018. 
[13] S. B. Kotsiantis and P. E. Pintelas, “Logitboost of simple 
Bayesian classifier,” Inform., vol. 29, no. 1, pp. 53–59, 
2005. 
[14] O. L. Mangasarian, W. N. Street, and W. H. Wolberg, 
“Breast Cancer Diagnosis and Prognosis Via Linear 
Programming,” Oper. Res., vol. 43, no. 4, pp. 570–577, 
1995. 
[15] Y. LeCun, C. Cortes, and C. J. C. Burges, “The MNIST 
database of handwritten digits.” 1998. 
[16] B. O. et al., “mHealthDroid: A novel framework for agile 
development of mobile health applications,” in Lecture 
Notes in Computer Science (including subseries Lecture 
Notes in Artificial Intelligence and Lecture Notes in 
Bioinformatics), vol. 8868, no. JANUARY, pp. 91–98, 2014. 
[17] A. Fayyazi, M. Ansari, M. Kamal, A. Afzali-Kusha, and M. 
Pedram, “An Ultra Low-Power Memristive Neuromorphic 
Circuit for Internet of Things Smart Sensors,” IEEE 
Internet Things J., 2018. 
[18] W. Lu, K. H. Kim, T. Chang, and S. Gaba, “Two-terminal 
resistive switches (memristors) for memory and logic 
applications,” in Proceedings of the Asia and South Pacific 
Design Automation Conference, ASP-DAC, pp. 217–223, 
2011 
[19] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, 
“Learning internal representations by error propagation,” in 
Parallel Distributed Processing: Explorations in the 
Microstructure of Cognition, D. E. Rumelhart and James 
L.McClelland, Eds. Cambridge, MA: MIT Press, pp. 318–
362, 1986. 
[20] C. Yakopcic, T. M. Taha, G. Subramanyam, and R. E. Pino, 
“Generalized memristive device SPICE model and its 
application in circuit design,” IEEE Trans. Comput. Des. 
Integr. Circuits Syst., vol. 32, no. 8, pp. 1201–1214, Jul. 
2013. 
[21] F. Alibart, L. Gao, B. D. Hoskins, and D. B. Strukov, “High 
precision tuning of state for memristive devices by 
adaptable variation-tolerant algorithm,” Nanotechnology, 
vol. 23, no. 7, p. 75201, 2012. 
[22] B. Liu, H. Li, Y. Chen, X. Li, Q. Wu, and T. Huang, “Vortex: 
variation-aware training for memristor x-bar,” in Proc. 
52nd ACM/EDAC/IEEE Design Automation Conf. (DAC),  
pp. 1–6, 2015 
[23] A. BanaGozar, M. A. Maleki, M. Kamal, A. Afzali-Kusha, 
and M. Pedram, “Robust neuromorphic computing in the 
presence of process variation,” in Design, Automation & 
Test in Europe Conference & Exhibition (DATE), pp. 440–
445, 2017. 
 
