Advancing Memristive Analog Neuromorphic Networks: Increasing
  Complexity, and Coping with Imperfect Hardware Components by Bayat, F. Merrikh et al.
F. Merrikh Bayat¹, M. Prezioso¹, B. Chakrabarti¹, I. Kataeva², and D. B. Strukov¹
¹UCSB, Santa Barbara, CA 93106-9560, U.S.A.
²Research Laboratories, DENSO CORP., 500-1 Minamiyama, Komenoki-cho, Nisshin, Japan 470-0111
 
Abstract - We experimentally demonstrate classification of 
4×4 binary images into 4 classes, using a 3-layer mixed-
signal neuromorphic network (“MLP perceptron”), based on 
two passive 20×20 memristive crossbar arrays, board-
integrated with discrete CMOS components. The network 
features 10 hidden-layer and 4 output-layer analog CMOS 
neurons and 428 metal-oxide memristors, i.e. is almost an 
order of magnitude more complex than any previously 
reported functional memristor circuit. Moreover, the inference 
operation of this classifier is performed entirely in the 
integrated hardware. To deal with larger crossbar arrays, we 
have developed a semi-automatic approach to their forming 
and testing, and compared several memristor training schemes 
for coping with imperfect behavior of these devices, as well 
as with variability of analog CMOS neurons. The 
effectiveness of the proposed schemes for defect and variation 
tolerance was verified experimentally using the implemented 
network and, additionally, by modeling the operation of a 
larger network, with 300 hidden-layer neurons, on the MNIST 
benchmark. Finally, we propose a simple modification of the 
implemented memristor-based vector-by-matrix multiplier to 
allow its operation in a wider temperature range.  
I.    INTRODUCTION 
Several types of emerging nonvolatile memory devices are 
now being actively investigated for their use in fast and 
energy-efficient analog and mixed-signal neuromorphic 
networks [1-8]. For relatively immature technologies, such as 
RRAM devices [9] (also called memristors [10]), and even for 
more mature PCM cells [6], the best previously reported 
results were obtained, to the best of our knowledge, by 
combining experimental devices with external computers (for 
example, by reading from and writing to one device at a time 
[6]) to emulate the functionality of the whole system.  
One key result of our work is an experimental 
demonstration of a fully functional, board-integrated, mixed-
signal memristor-based neural network of a complexity much 
higher than those reported previously. Another important 
result is an experimental verification of several in-situ and ex-
situ approaches to defect- and variation-tolerant memristor 
training. Specifically, we have analyzed training of a pattern 
classifier based on a firing-rate neural network (MLP 
perceptron), which is very efficient for high-performance 
implementation of deep-learning algorithms [11].  
Our focus is on using passive metal-oxide memristor 
crossbar arrays, with crosspoint devices of a very small chip 
footprint (determined only by the overlap area of the wire 
electrodes). Such memristors may be scaled down below 10 
nm without sacrificing their endurance, retention, and tuning 
accuracy, with some of the properties (e.g., the ON/OFF 
conductance ratio) being actually improved [12]. Moreover, 
these devices are naturally suitable for 3D integration [4, 13], 
which may be instrumental for keeping all the data required 
for deep learning locally, and thus cutting dramatically the 
energy and latency overheads of off-chip communications. 
II.   MEMRISTIVE CROSSBAR ARRAYS 
20×20 crossbar arrays, with 200-nm lines separated by 
400-nm gaps (Fig. 1), and a Pt/Al2O3/TiO2-x/Ti/Pt memristor 
at each crosspoint, were fabricated using a technique similar 
to that reported in Refs. 2, 3. To speed up the memristor 
forming procedure, a setup for its automation was developed 
(Fig. 2). The setup was used for early screening of defective 
samples, and has allowed a successful forming and testing of 
numerous crossbar arrays. As Fig. 3 shows, we have explored 
forming using both current and voltage stress pulses, but have 
not detected much difference between the two, likely due to 
larger parasitics of these larger arrays.  
Similarly to our previously reported results [2, 3], the 
crossbar devices have relatively good uniformity, with a 
spread of the set and reset voltages narrow enough (Fig. 3) to 
allow a precise adjustment of each memristor of the whole 
array (Fig. 4). To our knowledge, this is the first report of 
such a precise adjustment on this integration scale; for 
example, Ref. 14 reported a less precise tuning, performed for 
smaller (8×8 device) fragments of a larger array, with all the 
remaining devices always kept in the high-resistive state.  
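
As an illustration of what such device-by-device adjustment involves, the following is a minimal sketch of a generic write-verify loop that stops once the read conductance is within 5% of its target. It is not the tuning algorithm used in this work: the `read` and `pulse` callbacks and the `ToyDevice` model (a random 1-3% conductance change per pulse) are hypothetical stand-ins chosen only to make the sketch runnable.

```python
import random


def write_verify(read, pulse, target_g, tol=0.05, max_pulses=1000):
    """Pulse a device until its conductance is within `tol` of `target_g`.

    `read()` returns the present conductance and `pulse(direction)` applies
    one set (+1) or reset (-1) pulse; both are hypothetical callbacks.
    Returns the number of pulses applied, or None if tuning did not converge.
    """
    for n_applied in range(max_pulses + 1):
        g = read()
        if abs(g - target_g) <= tol * target_g:
            return n_applied
        if n_applied == max_pulses:
            break
        # Conductance too low: set pulse (+1). Too high: reset pulse (-1).
        pulse(+1 if g < target_g else -1)
    return None


class ToyDevice:
    """Illustrative device: each pulse changes G by a random 1-3% step."""

    def __init__(self, g0, seed=0):
        self.g = g0
        self.rng = random.Random(seed)

    def read(self):
        return self.g

    def pulse(self, direction):
        self.g *= 1.0 + direction * self.rng.uniform(0.01, 0.03)


dev = ToyDevice(g0=20e-6)                      # start at 20 µS
n_pulses = write_verify(dev.read, dev.pulse, target_g=60e-6)
print(n_pulses, dev.read())
```

In real arrays the step per pulse depends on the pulse amplitude and on the device's switching threshold, so a practical loop would also adapt the pulse amplitude; that part is omitted here.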
III.   MULTILAYER PERCEPTRON IMPLEMENTATION   
The implemented MLP perceptron is fed with 16 binary 
inputs encoding 4×4 B/W pixels as ±0.2 V voltages, and 
consists of 10 hidden-layer and 4 output-layer neurons, connected 
with two 20×20-memristor arrays (Figs. 5a, b). With 
additional bias inputs, 17×20 and 11×8 portions of the arrays 
were used to implement differential synaptic weights, with 
memristor conductances G± in the range [10 µS, 100 µS] in the 
first and second crossbars. The neurons, as well as the 
circuitry for weight adjustment, were implemented with 
discrete CMOS components. All components of the system 
were integrated on a printed circuit board (Figs. 5c, d).  
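
To make the differential-weight encoding concrete, here is a small sketch of one possible mapping from signed weights to conductance pairs within the stated 10-100 µS range. The symmetric mapping around the middle of the range, the function name `weights_to_conductances`, and the example weights are assumptions for illustration; the text does not specify the mapping actually used. The last lines also check that the 17×20 and 11×8 array portions add up to the 428 memristors quoted in the abstract.

```python
import numpy as np

G_MIN, G_MAX = 10e-6, 100e-6   # conductance range from the text, in siemens


def weights_to_conductances(w, w_max=None):
    """Map signed weights to (G+, G-) pairs placed symmetrically around mid-range.

    Assumed mapping: G+ - G- is proportional to w, and the largest |w| uses
    the full range. Weights beyond w_max are clipped.
    """
    w = np.asarray(w, dtype=float)
    if w_max is None:
        w_max = np.max(np.abs(w))
    if w_max == 0:
        w_max = 1.0                              # all-zero weights: any scale works
    g_mid = 0.5 * (G_MIN + G_MAX)
    delta = 0.5 * (G_MAX - G_MIN) * np.clip(w / w_max, -1.0, 1.0)
    return g_mid + delta, g_mid - delta


w_example = np.array([[0.5, -1.0], [0.0, 0.25]])  # hypothetical weights
g_plus, g_minus = weights_to_conductances(w_example)

# Device count: (16 inputs + bias) x 10 neurons x 2, plus (10 + bias) x 4 x 2.
n_devices = 17 * 20 + 11 * 8
print(g_plus, g_minus, n_devices)
```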
Figure 5e shows the design of a hidden-layer neuron, 
which consists of two opamps computing a pair of 
differential voltages R_F Σ_i G_i^± V_i, where R_F = 2 kΩ is the 
feedback resistance, by enforcing the virtual-ground condition 
on the incoming crossbar lines. This pair of voltages is then 
fed into a third opamp, which computes the difference between 
the inputs and clips the output voltage, keeping it between the 
voltage supply rails, thus effectively implementing a 
piecewise-linear activation function with a low-voltage slope 
of 10 and saturation at ±5 V. This output is scaled down, with 
one more opamp circuit, to stay within ±0.2 V, to avoid 
disturbing the state of memristors in the second layer.  
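
The signal path of the hidden layer can be sketched numerically as follows. This is an idealized model under stated assumptions: opamp sign inversions are ignored, the slope of 10 is taken to act on the difference of the two first-stage voltages, and the final stage is modeled as a plain 0.2/5 scaling. The function name `hidden_layer` and the example conductances are illustrative only.

```python
import numpy as np

R_F = 2e3          # feedback resistance from the text, ohms
SLOPE = 10.0       # low-voltage slope of the activation, from the text
V_SAT = 5.0        # saturation level, volts
V_OUT_MAX = 0.2    # range of the scaled output, volts


def hidden_layer(v_in, g_plus, g_minus):
    """Idealized hidden layer: v_in (n_in,), g_plus/g_minus (n_in, n_neurons)."""
    v_plus = R_F * (v_in @ g_plus)     # first-stage voltage, "+" line
    v_minus = R_F * (v_in @ g_minus)   # first-stage voltage, "-" line
    v_act = np.clip(SLOPE * (v_plus - v_minus), -V_SAT, V_SAT)
    return v_act * (V_OUT_MAX / V_SAT)  # scale down to within +/-0.2 V


v_in = 0.2 * np.ones(17)               # 16 pixels plus bias, all at +0.2 V
out_small = hidden_layer(v_in, np.full((17, 10), 60e-6), np.full((17, 10), 50e-6))
out_sat = hidden_layer(v_in, np.full((17, 10), 100e-6), np.full((17, 10), 10e-6))
print(out_small[0], out_sat[0])
```

With a 10 µS difference per pair the neuron stays in its linear region, while the extreme 100 µS / 10 µS case drives it into saturation, so the scaled output sits at the 0.2 V limit.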
IV.   COPING WITH IMPERFECT HARDWARE 
The perceptron was trained to classify stylized letter 
patterns (Fig. 6), using four alternative approaches (Fig. 7). In 
the simplest ex-situ training (Figs. 7a, 8), the weights are 
computed in a “precursor” external computer, assuming 
perfect on-chip hardware, and then are “imported” into the 
crossbars. This method has the lowest on-chip hardware 
overhead and is suitable for the most popular applications of 
neuromorphic networks. However, though the write-verify 
algorithm circumvents the problem of threshold variations in 
memristors, such an ex-situ approach cannot cope with other 
imperfections in the hardware, such as stuck-on or stuck-off 
defects (Fig. 9b) and the device I-V curve asymmetry (Figs. 
1d, 9a). The ex-situ classification fidelity may be 
significantly improved, e.g., from 95% (Fig. 8a) to 100% 
(Fig. 9c) on the 4-class pattern set (Fig. 6), by detecting 
various defects and then using this information in the 
precursor training (Fig. 7b). A potential drawback of such a 
defect-aware ex-situ scheme is that the chip-specific 
precursor training may not be suitable for some applications, 
e.g., when training takes too much time.  
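
The difference between a plain import and defect-aware precursor training can be illustrated with a deliberately simplified sketch. It assumes a single-layer least-squares classifier rather than the two-layer perceptron, random placeholder data, and a hypothetical defect map in which about 10% of the weights are stuck at ±1. The defect-aware run holds the stuck weights fixed and lets only the remaining ones adapt.

```python
import numpy as np


def train(x, y, w_init, stuck_mask, stuck_values, epochs=500):
    """Least-squares gradient descent that never changes the stuck weights."""
    n = x.shape[0]
    lr = n / np.linalg.norm(x, 2) ** 2       # step 1/L for this quadratic loss
    w = np.where(stuck_mask, stuck_values, w_init)
    for _ in range(epochs):
        grad = x.T @ (x @ w - y) / n
        grad[stuck_mask] = 0.0               # defective devices cannot be tuned
        w -= lr * grad
    return w


def loss(x, y, w):
    return 0.5 * np.mean((x @ w - y) ** 2)


rng = np.random.default_rng(0)
x = rng.choice([-0.2, 0.2], size=(40, 17))   # placeholder +/-0.2 V input patterns
y = rng.choice([-1.0, 1.0], size=(40, 4))    # placeholder targets
no_defects = np.zeros((17, 4), dtype=bool)
w_ideal = train(x, y, np.zeros((17, 4)), no_defects, np.zeros((17, 4)))

stuck_mask = rng.random((17, 4)) < 0.1       # hypothetical defect map
stuck_values = np.where(rng.random((17, 4)) < 0.5, 1.0, -1.0)

w_naive = np.where(stuck_mask, stuck_values, w_ideal)      # plain import
w_aware = train(x, y, w_ideal, stuck_mask, stuck_values)   # defect-aware
loss_naive, loss_aware = loss(x, y, w_naive), loss(x, y, w_aware)
print(loss_naive, loss_aware)
```

Because the defect-aware run starts from the plainly imported weights and uses a step size that cannot increase this quadratic loss, its final loss is never worse than that of the plain import in this toy setting. The sketch says nothing about the fidelity figures measured on the hardware.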
An apparent alternative is in-situ training, performed 
directly in hardware (Figs. 7c, 10). With supporting on-chip 
training circuitry, the in-situ approach might be utilized 
to implement on-line learning, making it suitable for a 
broader range of applications. (In our demonstration, similar 
to that described in Refs. [2, 3], some stages of in-situ training 
were assisted by an external computer.) However, in our 
experiments this approach, in its batch-mode, fixed-amplitude 
version [2], has provided an inferior fidelity of ~70% for a 
3-class pattern set, compared to the 100% fidelity of both 
ex-situ approaches for the same task. The main reason, as 
confirmed by simulations based on a simple dynamic model 
[3] (Figs. 10b, c), is the large variation of the switching 
thresholds, which is handled more effectively by the tuning 
algorithm in ex-situ training. The in-situ training fidelity 
could potentially be improved using a variable-amplitude 
mapping scheme [2], which was not possible in the current 
design due to fixed voltage inputs. A hybrid approach (Fig. 
7d), which first initializes the weights to the values prescribed 
by ex-situ training and then readjusts them with in-situ 
training, leads to a much better classification fidelity. In fact, 
experimental results show that with more artificially injected 
defects, its fidelity may be much higher than that of the 
purely ex-situ approach (Fig. 11a). This fact is also 
confirmed via simulations of a much larger network (Figs. 
12a, b). The simulation results shown in Figs. 12c-e confirm 
that other defect types are manageable in larger networks. 
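
As a rough illustration of a batch-mode, fixed-amplitude update and of the hybrid scheme's starting point, the sketch below moves every weight by the same magnitude against the sign of the batch gradient, beginning from imported weights. Reading "fixed-amplitude" as a sign-based rule is an assumption made here, as are the single-layer least-squares gradient, the step size, and the names `fixed_amplitude_step` and `hybrid_train`.

```python
import numpy as np


def fixed_amplitude_step(w, batch_grad, step):
    """Move each weight by `step` against the sign of its batch gradient."""
    return w - step * np.sign(batch_grad)


def hybrid_train(x, y, w_imported, step=0.01, epochs=50):
    """Start from imported (ex-situ) weights, then refine with sign-based updates."""
    w = w_imported.copy()
    for _ in range(epochs):
        grad = x.T @ (x @ w - y) / x.shape[0]   # batch gradient, least-squares loss
        w = fixed_amplitude_step(w, grad, step)
    return w


# One update on a hand-written gradient: only the sign of each entry matters.
w0 = np.zeros((2, 2))
g0 = np.array([[1.0, -2.0], [0.0, 0.5]])
w1 = fixed_amplitude_step(w0, g0, step=0.01)
print(w1)
```

A weight whose batch gradient is exactly zero is left unchanged, and all other weights change by the same amount regardless of the gradient magnitude.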
Finally, practical memristive hardware should operate 
correctly over a wide temperature range. For the 
considered circuits (Fig. 5b), the change in memristor 
conductance can be compensated by using a memristor as the 
feedback element and mapping the weights to higher 
conductances (Fig. 13). 
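
The compensation idea can be checked with a short numerical sketch. It assumes an idealized case in which every memristor, including the feedback one, drifts by the same relative amount with temperature, and it models the output as a sum of V_IN (G+ − G−) terms scaled either by a fixed resistance or by A/G_M. The linear drift model, the coefficient `alpha`, and all example values are hypothetical.

```python
import numpy as np


def v_out_fixed_feedback(v_in, g_plus, g_minus, r_f):
    """Readout scaled by a temperature-independent feedback resistor."""
    return r_f * np.sum(v_in * (g_plus - g_minus))


def v_out_memristor_feedback(v_in, g_plus, g_minus, g_m, a=1.0):
    """Readout scaled by A / G_M, with G_M drifting like the array devices."""
    return a / g_m * np.sum(v_in * (g_plus - g_minus))


def drift(g_room, temp_c, alpha=0.005, t_room=25.0):
    """Hypothetical drift: the same relative change for every device."""
    return g_room * (1.0 + alpha * (temp_c - t_room))


v_in = np.array([0.2, -0.2, 0.2])
g_plus = np.array([80e-6, 60e-6, 90e-6])
g_minus = np.array([70e-6, 90e-6, 60e-6])
g_m = 100e-6

fixed_25 = v_out_fixed_feedback(v_in, g_plus, g_minus, r_f=2e3)
fixed_85 = v_out_fixed_feedback(v_in, drift(g_plus, 85), drift(g_minus, 85), r_f=2e3)
comp_25 = v_out_memristor_feedback(v_in, g_plus, g_minus, g_m)
comp_85 = v_out_memristor_feedback(
    v_in, drift(g_plus, 85), drift(g_minus, 85), drift(g_m, 85))
print(fixed_85 / fixed_25, comp_85 / comp_25)
```

Under this common-drift assumption the ratio cancels exactly. In practice the temperature dependence differs from device to device and with the conductance level, so the cancellation would only be partial.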
ACKNOWLEDGMENT 
This work was supported by AFOSR under MURI grant 
FA9550-12-1-0038, by DARPA under contract HR0011-13-
C-0051UPSIDE via BAE Systems, Inc., by NSF grant CCF-
1528205, and by the DENSO CORP., Japan. Useful 
discussions with P.-A. Auroux, J. Edwards, and K. K. 
Likharev are gratefully appreciated. 
REFERENCES 
[1]  S. Yu et al., “A neuromorphic visual system using RRAM synaptic 
devices with sub-pJ energy and tolerance to variability: Experimental 
characterization and large-scale modeling”, IEDM’12 Tech. Dig., p. 
10.4.1, 2012.  
[2]  M. Prezioso et al., “Modeling and simulation of firing-rate 
neuromorphic-network classifiers with bilayer Pt/Al2O3/TiO2-x/Pt 
memristors”, IEDM’15 Tech. Dig., pp. 455-458. 
[3]  M. Prezioso et al., “Training and operation of an integrated 
neuromorphic network based on metal-oxide memristors”, Nature, vol. 
521, pp. 61-64, 2015.  
[4] G. Piccolboni et al., “Investigation of the potentialities of vertical 
resistive RAM (VRRAM) for neuromorphic applications”, IEDM’15 
Tech. Dig., pp. 447-450.   
[5]  S. Park et al., “RRAM-based synapse for neuromorphic system with 
pattern recognition function”, IEDM’12 Tech. Dig., p. 10.2.1, 2012.  
[6] S. Kim et al., “NVM neuromorphic core with 64k-cell (256-by-256) 
phase change memory synaptic array with on-chip neuron circuits for 
continuous in-situ learning”, IEDM’15 Tech. Dig., pp. 443-446, 2015. 
[7] M. Suri et al., “CBRAM devices as binary synapses for low-power 
stochastic neuromorphic systems: auditory (cochlea) and visual (retina) 
cognitive processing applications”, IEDM’12 Tech.Dig., p.10.3.1, 2012. 
[8] M. Hu, J.P. Strachan, Z. Li, R. S. Williams, “Dot-product engine as 
computing memory to accelerate machine learning algorithms”, in: 
Proc. ISQED’16, Santa Clara, CA, Mar. 2016, pp. 374-379. 
[9] H.S.P. Wong et al., “Metal-oxide RRAM”, Proc. IEEE, vol. 100, pp. 
1951-1970, 2012. 
[10] J.J. Yang, D.B. Strukov, and D.R. Stewart, “Memristive devices for 
computing”, Nature Nanotechnology, vol. 8, pp. 13-24, 2013. 
[11] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning”, Nature, vol. 
521, pp. 436-444, 2015. 
[12]  B. Govoreanu et al., “10×10 nm2 Hf/HfOx crossbar resistive RAM with 
excellent performance, reliability and low-energy operation”, in: 
IEDM’11 Tech. Dig.,  p. 31.6.1, 2011. 
[13] G.C. Adam et al., “Highly-uniform multi-layer ReRAM crossbar 
circuits”, accepted for presentation to ESSDERC’16, Lausanne, 
Switzerland, Sep. 2016. 
[14]  K.-H. Kim et al., “A functional hybrid memristor crossbar-array/CMOS 
system for data storage and neuromorphic applications”, Nano Lett., 
vol. 12, pp. 389-395, 2012. 
[Fig. 1a scale bar: 1 µm.]
[Fig. 2 flowchart. Box and branch labels: initialize RTH, Istop, Istep, Rmin; read pristine state R; R ≤ RTH; forming needed; current sweep & read R; R ≤ Rmin; reset device; reset to lower leakage; form next device; increase Istop; hard reset; repeat with higher stress; > max # attempts; forming failed.]
Fig. 2. Flow diagram of the automatic memristor
forming procedure. The value of Istop was so far
adjusted manually after the failure to form a
device automatically (in ~10% of all cases).
Fig. 6. 40 patterns, of 4 classes,
used for the classifier training.
640 test patterns were formed by
flipping one pixel in each
training pattern.
[Fig. 6 panels: pattern “T”, pattern “U”, pattern “X”, pattern “A”.]
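
The construction of the test set described in the Fig. 6 caption (one flipped pixel per test pattern, giving 40 × 16 = 640 patterns) can be sketched as follows. The actual training patterns are not reproduced here, so random ±1 placeholders stand in for them, and the function name `one_pixel_flips` is illustrative.

```python
import numpy as np


def one_pixel_flips(patterns):
    """patterns: (n, n_pixels) array of +/-1 -> (n * n_pixels, n_pixels) array.

    Each source pattern yields n_pixels copies, and copy k has pixel k flipped.
    """
    patterns = np.asarray(patterns)
    n, n_pixels = patterns.shape
    flipped = np.repeat(patterns, n_pixels, axis=0)   # n_pixels copies of each row
    rows = np.arange(n * n_pixels)
    cols = np.tile(np.arange(n_pixels), n)            # pixel index to flip per copy
    flipped[rows, cols] *= -1
    return flipped


rng = np.random.default_rng(0)
train_patterns = rng.choice([-1, 1], size=(40, 16))   # placeholders, not the real set
test_patterns = one_pixel_flips(train_patterns)
print(test_patterns.shape)
```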
Fig. 1. 20×20 crossbar circuit with integrated Pt/Al2O3/TiO2-x/Ti/Pt memristors: (a) a top-view SEM and (b)
cross-section TEM images. (c) All forming I-V curves for one crossbar, and (d) a typical switching I-V curve,
with its asymmetry clearly visible.
[Fig. 1c plot: forming I-V curves; axes: Voltage (V), 0 to 3, and Current (mA), 0 to 0.4.]
Fig. 5. Multilayer perceptron classifier: (a) Graph representation of the implemented network and (b) its equivalent electrical circuit. (c, d) Photos of the two printed
circuit boards hosting (PCB #1) wire-bonded memristive crossbar chips and the memristor tuning circuitry, and (PCB #2) discrete CMOS neurons. (e) Equivalent
circuit of a hidden layer neuron based on discrete CMOS opamps. The output layer neurons are implemented without the output scaling (the last opamps) and the 10
kΩ pulldown resistor in the second-stage opamp. (f) Typical output signal dynamics during classification; note the few-microsecond signal time of the operation.
[Fig. 5 labels. Panel (a): 16 inputs, bias, 10-neuron hidden layer, bias, outputs. Panels (c, d): PCB #1 with xbar #1, xbar #2, and the switching matrix; PCB #2 with the neurons. Panel (e): stages labeled virtual ground, transfer function and subtraction, and output scaling; resistor labels 2 kΩ, 500 kΩ, 10 kΩ, 24.62 kΩ, 1.87 kΩ, 930 Ω, and 370 Ω.]
Fig. 4. High precision tuning in 20×20 memristive crossbar: (a) the desired “smiley face” pattern,
quantized to 256 gray levels. (b) The actual resistance values measured after tuning all devices with the
nominal 5% accuracy, using the automated tuning algorithm, and (c) the corresponding statistics of the
tuning errors. On panel (b), the white / black pixels correspond to 84 kΩ / 7kΩ at 0.2 V bias.
[Fig. 4c histogram: number of devices (0 to 50) vs. tuning error (%), 0 to 80. Annotations: mode = 0.5% (minimum most sampled value); mean = 7.4% (excluding 2 ‘unformed’ devices).]
Fig. 3. Set and reset threshold statistics for seven 20×20-
device arrays at memristor switching with current and
voltage pulses. The set/reset thresholds are defined as the
smallest voltages at which the device resistance is
increased/decreased by more than 5% at the application of
a voltage or current pulse of the corresponding polarity.
[Fig. 3 histograms: count (0 to 160) vs. Voltage (V), -2.0 to 2.0, with reset-threshold and set-threshold distributions for current sweeps and for voltage pulses. Annotated statistics: reset thresholds µ = -1.18 V, σ = 0.140 V and µ = -1.28 V, σ = 0.16 V; set thresholds µ = 0.99 V, σ = 0.183 V and µ = 1.0 V, σ = 0.17 V.]
[Fig. 5b circuit labels: first crossbar with differential conductance pairs G±(i,j) for inputs i = 1 to 17 (including the 0.2 V bias) and hidden neurons j = 1 to 10; second crossbar with pairs G±(i,j) for i = 1 to 11 (including the 0.2 V bias) and outputs j = 1 to 4.]
[Fig. 1b TEM labels: Pt, TiO2-x, Ti, Al2O3, Ta; scale bar 50 nm.]
[Fig. 1d plot: switching I-V curve with Set and Reset branches; axes: Voltage (V), -2 to 1, and Current (µA), -600 to 200.]
[Fig. 5f plots: neuron output (mV) vs. time for Neurons 1 to 3, shown on a seconds scale (0 to 6 s) and on a microsecond scale (90 to 120 µs).]
Fig. 7. Training approaches to cope with imperfect hardware: (a)
ex-situ, (b) defect-aware ex-situ, (c) in-situ, and (d) hybrid.
[Fig. 7 panels, with the numbered steps of each approach and its pros and cons:]
(a) Ex-situ. Steps: 1) precursor training, 2) weight import.
    Pros: lowest HW overhead; good fidelity.
    Cons: poor imperfection tolerance; off-line learning.
(b) Defect-aware ex-situ. Steps: 1) HW test, 2) precursor training, 3) weight import.
    Pros: best imperfection tolerance; low HW overhead; best fidelity.
    Cons: poorly scalable step 1 (HW test); off-line learning; chip-specific training.
(c) In-situ. Steps: 1) in-situ training.
    Pros: best imperfection tolerance; suitable for on-line learning.
    Cons: high HW overhead; suboptimal fidelity; long training times; chip-specific training.
(d) Hybrid. Steps: 1) precursor training, 2) weight import, 3) in-situ training.
    Pros: best imperfection tolerance; suitable for on-line learning.
    Cons: high HW overhead.
Fig. 8. Ex-situ training: Measured results for classification for (a) training and (b) test patterns.
[Fig. 8 plots: neuron output (V) vs. pattern index, with curves labeled Pattern 1 to Pattern 4. Annotations: (a) 5% misclassified (patterns 1 to 40); (b) 20.94% misclassified (patterns 1 to 600).]
Fig. 9. Defect-aware ex-situ training:
(a) the tuning error map and (b) the
current (@ ±0.2 V) asymmetry map
for the used 17×20 portion of the
crossbar. Both panels show color-
coded percentages. (c, d)
Classification results for (c) training
and (d) test patterns.
[Fig. 9c, d plots: neuron output (V) vs. pattern index, with curves labeled Pattern 1 to Pattern 4. Annotations: 100% classified (patterns 1 to 40); 18.6% misclassified (patterns 1 to 600).]
Fig. 10. In-situ training for 3-pattern
classification: (a) Measured and simulated
error decay dynamics during training. (b)
Example of switching kinetics obtained from
the model. (c) Comparison of the weight
adjustment distributions computed for all
weights over the entire batch training.
[Plots for Figs. 10 and 11: count (0 to 800) vs. ∆w, -0.10 to 0.10, for Experiment and Simulation; misclassification (%) vs. epoch, 0 to 160, with Ex-situ, Simulation, and Experiment curves; misclassification (%) vs. epoch, 1 to 80, with Experiment, Simulation with variations, and Simulation without variations curves.]
Fig. 11. Hybrid (ex-situ + in-situ) training for 3-pattern
classification: measured and simulated error decay dynamics
(a) assuming stuck-on and stuck-off devices, marked with
red/green “x” and “o” in the 1st/2nd crossbars, and (b)
assuming different output voltages (±0.4 V in the 1st, 7th,
and 8th, ±0.1 V in the 2nd, 4th, and 5th, and ±0.2 V in the
rest) in the hidden-layer neurons.
[Fig. 12 panel labels. (b) ex-situ, in-situ, and hybrid curves vs. ratio of stuck-open devices (%), 0 to 100. (c) curves labeled ideal, RMIN, RMAX, and RMAX+RMIN vs. sigma for resistance spread (%), 0 to 30. (d) bars labeled ideal, import, inference, and import & inference with noise; Gaussian noise at synapses with σ/µ = 0.05 for ex-situ import, inference, or both.]
Fig. 12. Modeling of imperfect hardware effects at different training approaches. Classification rate degradation with respect to (a) finite weight import accuracy, (b)
stuck-on and stuck-off memristors, (c) variations in maximum ROFF and RON during in-situ training, (d) synaptic noise during import, inference, or both, and (e) stuck-
on-high and stuck-on-low neurons. On all panels, the vertical axis is the classification error rate in percent; all data are for a 3-layer perceptron with 300 hidden layer
neurons, and MNIST benchmark.
[Fig. 12 panel labels, continued. (a) ideal, ex-situ, and hybrid curves vs. sigma for import precision (%), 0 to 40. (e) ex-situ ON, ex-situ OFF, in-situ ON, and in-situ OFF curves vs. defective neurons (%), 0 to 30, where ON/OFF = hidden neurons with stuck high/low output.]
Fig. 13. Temperature sensitivity study: (a) The I-V curves of a
single memristor for several temperatures and (b) the extracted
temperature dependence of its conductance. (c) The proposed
vector-by-matrix multiplier circuit for wide-temperature-range
operation. (For clarity, only one memristor pair from an array,
and only one neuron are shown.) The drift in conductance over
temperature is reduced by choosing larger values of GBIAS, at
which the temperature dependence is the smallest, and by using
a memristor in the output opamp feedback.
[Fig. 13a plot: Current (A), 10^-8 to 10^-4, vs. Voltage (V), -0.4 to 0.4; legend: RT, 40, 55, 70, 85, 100.]
[Fig. 13b plot: Conductance (S), 10^-6 to 10^-4, vs. Temperature (C), 20 to 100; curves labeled 1E-4 S and 1E-5 S.]
[Fig. 13c circuit labels: VIN, G+, G-, RF, R1, feedback memristor GM, VOUT; scaling factor A, set by the ratio of the resistors R1 and RF.]
Equations shown with panel (c):
G^{+}(T) = G_{BIAS}(T) + G(T)/2
G^{-}(T) = G_{BIAS}(T) - G(T)/2
V_{OUT} = \frac{A}{G_{M}(T)} \sum V_{IN} \left( G^{+}(T) - G^{-}(T) \right)
 
