Improving Dependability of Neuromorphic Computing With Non-Volatile
  Memory by Song, Shihao et al.
Shihao Song
Drexel University
Philadelphia, PA 19104
Email: shihao.song@drexel.edu
Anup Das
Drexel University
Philadelphia, PA 19104
Email: anup.das@drexel.edu
Nagarajan Kandasamy
Drexel University
Philadelphia, PA 19104
Email: nk78@drexel.edu
Abstract—As process technology continues to scale aggres-
sively, circuit aging in a neuromorphic hardware due to neg-
ative bias temperature instability (NBTI) and time-dependent
dielectric breakdown (TDDB) is becoming a critical reliability
issue and is expected to proliferate when using non-volatile
memory (NVM) for synaptic storage. This is because an NVM
requires high voltage and current to access its synaptic weight,
which further accelerates the circuit aging in a neuromorphic
hardware. Current methods for qualifying reliability are overly
conservative, since they estimate circuit aging considering worst-
case operating conditions and unnecessarily constrain perfor-
mance. This paper proposes RENEU, a reliability-oriented ap-
proach to map machine learning applications to neuromorphic
hardware, with the aim of improving system-wide reliability
without compromising key performance metrics such as execution
time of these applications on the hardware. Fundamental to
RENEU is a novel formulation of the aging of CMOS-based
circuits in a neuromorphic hardware considering different failure
mechanisms. Using this formulation, RENEU develops a system-
wide reliability model which can be used inside a design-space
exploration framework involving the mapping of neurons and
synapses to the hardware. To this end, RENEU uses an instance
of Particle Swarm Optimization (PSO) to generate mappings that
are Pareto-optimal in terms of performance and reliability. We
evaluate RENEU using different machine learning applications on
a state-of-the-art neuromorphic hardware with NVM synapses.
Our results demonstrate an average 38% reduction in circuit
aging, leading to an average 18% improvement in the lifetime
of the hardware compared to current practices. RENEU only
introduces a marginal performance overhead of 5% compared
to a performance-oriented state-of-the-art approach.
I. INTRODUCTION
Machine learning models built with spike-based computations
and bio-inspired learning algorithms, e.g., Spiking Neural
Networks (SNNs) [1], lower the energy consumption of machine
learning tasks when they are executed on event-driven
neuromorphic hardware such as DYNAP-SE [2], TrueNorth [3],
and Loihi [4]. A typical neuromorphic hardware consists of
computation units called neurosynaptic cores, communicating
spikes via a shared interconnect. A neurosynaptic core is
essentially a crossbar, which is an n × n organization of input and
output neurons with synaptic weight storage at each crosspoint.
Recently, Non-Volatile Memory (NVM) technologies such as Phase
Change Memory (PCM), Oxide-based RAM (OxRAM), and
Spin-Transfer Torque Magnetic RAM (STT-MRAM) have been used
for synapses in neuromorphic architectures [5], beyond their
use as memory for conventional von-Neumann computing [6]–[8].
NVMs bring certain advantages such as high integration
density, multi-bit synapse, CMOS compatibility, and above all,
non-volatility, which can further lower the energy consumption
of neuromorphic hardware. However, NVMs also introduce
reliability issues such as endurance, aging, and read distur-
bances (see Table I) [9]–[11]. These issues are triggered when
propagating current through an NVM synapse. In this work,
we focus on one specific reliability issue — that of aging
of circuit components in a neuromorphic hardware leading to
hard failures, and propose an intelligent solution to mitigate
this problem (Section IV).
Although reliability issues of NVMs have been addressed
at the system-level for von-Neumann computing with NVM-
based main memory, e.g., [6], [12], these approaches do
not apply to neuromorphic computing. Recent system-level
works on mapping SNN-based applications to neuromorphic
hardware, such as [13]–[19], target performance improvement
only. They do not consider reliability issues of neuromorphic
computing. Our recent work [20] demonstrates the significant
reliability degradation introduced using these SNN mapping
approaches. In fact, this motivating work has also shown the
change in reliability profile for different neuron and synapse
mapping strategies. This is because with change in mapping,
computation units in the hardware are stressed differently
when processing a machine learning task, resulting in different
reliability behavior of the hardware.
Contributions: This paper addresses reliability issues,
specifically, circuit aging in neuromorphic computing with
NVM using an intelligent neuron and synapse mapping tech-
nique. Following are our key contributions.
• We formulate the detailed aging of CMOS-based circuits
in a neuromorphic hardware.
• We incorporate and integrate different failure mechanisms
such as Time-Dependent Dielectric Breakdown (TDDB),
Negative-Bias Temperature Instability (NBTI), and Hot
Carrier Injection (HCI).
• We use this low-level circuit aging formulation to develop
a system-wide reliability model, which allows us to estimate
the aging of computation units in a neuromorphic
hardware when processing a spike train from a machine
learning application.
• We propose a meta-heuristic approach based on Particle
Swarm Optimization (PSO) [21] to map neurons and
synapses to a neuromorphic hardware. We apply Pareto
Optimization to retain only those mappings that are
optimal in terms of reliability and performance.
We evaluate our approach, which we call RENEU
(REliability-aware NEUromorphic Computing), with state-of-
the-art machine learning applications built using multi-layer
perceptron (MLP), convolutional neural network (CNN), and
recurrent neural network (RNN) models on a state-of-the-
art neuromorphic hardware with OxRAM synapses. Results
demonstrate an average of 38% reduction in circuit aging,
leading to an average of 18% improvement in the lifetime
of the hardware compared to current practices. RENEU only
introduces a marginal performance overhead of 5% compared
to a performance-oriented state-of-the-art mapping approach.
To the best of our knowledge, this is the first work that
formulates the detailed aging of a neuromorphic hardware
when executing a machine learning application and proposes
a novel neuron and synapse mapping technique to reduce
the overall aging of the hardware, improving dependability
of neuromorphic computing.
II. BACKGROUND
A. Spiking Neural Networks
Spiking neural networks (SNNs) are computation models
with spiking neurons and synapses [1]. Neurons are typically
implemented using Leaky Integrate-and-Fire (LIF) model [22].
Information is represented using short impulses of infinitesimally
small duration, called spikes. Spiking LIF neurons
can be organized into feedforward layers, e.g., multi-layer
perceptron (MLP) and convolutional neural network (CNN), or
in recurrent topologies, e.g., recurrent neural network (RNN).
In this work, we evaluate MLP, CNN, and RNN-based machine
learning applications (Section V). SNNs can implement both
supervised and unsupervised learning. For supervised learning,
a model is trained with examples from the field, without being
explicitly programmed with any task-specific rules. A trained
SNN model is then deployed on a neuromorphic hardware
to perform inference from in-field data. For unsupervised
learning, a machine learning model is trained in real-time
using bio-inspired learning algorithms such as spike-timing
dependent plasticity (STDP) [23]. Without loss of generality,
we focus on supervised machine learning approaches.
Recently, machine learning applications using analog com-
putation models have achieved significant breakthroughs in
computer vision and image processing domains [24]. Many
research centers around the world now use analog models
of CNNs for diverse applications. Section IV discusses how
RENEU applies to analog CNNs.
B. Neuromorphic Hardware
Figure 1 shows a representative neuromorphic hardware
similar in structure to the DYNAP-SE, Loihi, and TrueNorth
architectures. DYNAP-SE has four tiles per chip where each
tile consists of a crossbar (C) and communicates with other
tiles using the interconnect. Routing of spikes on the intercon-
nect is facilitated using a switch (S).
Fig. 1: A representative tile-based neuromorphic hardware.
[Diagram: crossbars (C) connected through switches (S); the inset shows a crossbar with NVM cells.]
A crossbar is an n × n organization (in 3D) of n rows (top
electrodes) and n columns (bottom electrodes), with storage
elements, e.g., NVM cells, at their crosspoints. In DYNAP-SE,
n = 128, while in TrueNorth, n = 256. When mapping an SNN
to a neuromorphic hardware, synaptic weights are programmed
as the conductance of these NVM cells. The figure also illustrates a
small example of mapping an SNN to the crossbar. Synaptic
weights w1 and w2 are programmed into NVM cells P1 and
P2, respectively. The output spike voltages, v1 from N1 and
v2 from N2, inject current into the crossbar; each current is obtained
by multiplying a pre-synaptic neuron's output spike voltage
with the conductance of the NVM cell at the crosspoint of the
pre- and post-synaptic neurons (following Ohm's law). Current
summations along columns are performed in parallel using
Kirchhoff's current law, and implement the sums $\sum_i w_i v_i$
needed for forward propagation of neuron excitation.
A crossbar introduces the following constraints when map-
ping SNNs to a neuromorphic hardware.
• Each crossbar has n input ports and n output ports, i.e.,
n input neurons, one at each row, inject current into the
crossbar, and n output neurons, one at each column, act
as current sink to propagate neuron computations.
• Each crossbar can accommodate a maximum of n pre-
synaptic connections per output neuron.
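As a rough illustration of the crossbar computation and constraints described above, the sketch below computes column currents as conductance-weighted sums of the input spike voltages and checks whether a group of output neurons fits on an n × n crossbar. The function names, the list-of-lists representation, and the toy values are illustrative assumptions, not part of any hardware toolchain.

```python
def column_currents(voltages, conductances):
    """Return the current collected on each column of a crossbar.

    voltages[i]        : spike voltage driven on row i (pre-synaptic neuron i)
    conductances[i][j] : conductance of the NVM cell at row i, column j
    Ohm's law gives the per-cell current v_i * g_ij; Kirchhoff's current law
    sums these contributions along each column.
    """
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def fits_crossbar(fan_in_per_output, n):
    """Check that there are at most n output neurons, each with fan-in <= n."""
    return len(fan_in_per_output) <= n and all(f <= n for f in fan_in_per_output)


# Toy example with made-up values: two input neurons, two output neurons.
v = [1.0, 0.5]        # spike voltages on the two rows
g = [[2.0, 0.0],      # conductances, arbitrary scale
     [4.0, 1.0]]
currents = column_currents(v, g)   # column 0: 1.0*2.0 + 0.5*4.0, column 1: 0.5*1.0
```

The same check with n = 128 would correspond to the DYNAP-SE crossbar size quoted above.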
C. Non-volatile Memory
Emerging NVM technologies such as phase-change memory
(PCM), oxide-based memory (OxRAM), spin-based magnetic
memory (STT-MRAM), and Flash have recently been used as
synaptic storage elements within crossbars. NVMs are non-volatile,
have high CMOS compatibility, and can achieve high
integration density. Each NVM device can implement either a
single-bit or a multi-bit synapse. Because of these properties,
an NVM-based neuromorphic hardware typically consumes
energy that is orders of magnitude lower than one using SRAMs [5], [25]–[27].
However, NVMs also introduce reliability issues, and
Table I summarizes the sources of reliability concerns.
In this work, we focus on PCM-based neuromorphic computing [5].
Figure 2 ① illustrates how a chalcogenide semiconductor
alloy is used to build a PCM cell. The amorphous
phase (RESET) in this alloy has higher resistance than the
crystalline phase (SET). Ge2Sb2Te5 (GST) is the most commonly
used alloy for PCM. To compute (xi · wi), a current is
injected into the resistor-chalcogenide junction via the heater
element. The current is controlled to ensure that the phase
of the PCM cell is not disturbed. This is the fundamental
operation of forward propagation of neuron excitation during
inference. For online learning (e.g., using STDP), the injected
current induces (Joule) heating in the chalcogenide alloy,
changing its conductivity, thereby achieving synaptic weight
updates. Figure 2 ② illustrates the current profiles necessary
for inference (using the read pulse) and online learning (using
the SET and RESET pulses) in PCM. These current profiles
are generated using an on-chip charge pump (CP). Figure 2 ③
illustrates the PCM cell's operation when idle, i.e., when
a neuron is not activated. We illustrate a 1D-1R structure,
where a single PCM cell is connected to a row and column
using a diode as an access device. Diode-based PCM cells
allow very high integration density in scaled technology nodes
compared to transistor-based PCM. The CP is operated at 1.8V
to maintain the required biasing. Finally, Figure 2 ④ illustrates
the PCM operation during inference. The CP is operated at
3V to generate the read current profile of Figure 2 ② using
the sense amplifier (SA). The write driver (WD) is used for
generating the currents for online learning.
TABLE I: Reliability issues in NVMs.
Reliability Issue                    | NVMs
High-voltage related circuit aging   | PCM, Flash
High-current related circuit aging   | OxRAM, STT-MRAM
Read disturbance                     | All
Limited endurance                    | All
Fig. 2: Operation of PCM in neuromorphic computing.
[Panels: ① PCM cell (resistive heating element, chalcogenide alloy (GST), metal contacts to the top electrode and, via the diode, to the bottom electrode); ② current profiles for read, SET, and RESET; ③ idle (CP at 1.8V, WL = 1.8V, BL floating); ④ spike propagation (CP at 3V, BL = 3V, WL = 0V).]
These high-voltage operations of the charge pump (and
the peripheral circuit of a crossbar) accelerate circuit aging,
lowering the dependability of neuromorphic computing.
Section III formulates aging in neuromorphic computing and
Section IV proposes our solution to this problem.
Dependability issues in transistor-based PCM cells:
When using transistor-based PCM cells, the CP is operated
at lower voltages: 1.2V during idle and 1.8V during spike
propagation. Though aging issues are less severe in such
designs, they are still dependability concerns for neuromorphic
computing. Our solution, which we describe in Section IV,
applies to both diode and transistor-based PCM cells, and
improves dependability for both (see Section V-D).
III. RELIABILITY FORMULATION
There are many sources of reliability issues in a neuro-
morphic hardware with PCM synapses, as listed in Table I.
We focus on the aging of CMOS-based circuits due to high
voltage PCM operations. Figure 3 shows the internal circuitry
of a neuron which injects current into a crossbar [28]. We
observe that the CMOS transistors are operated at elevated
voltages (1.8V and 3V for 1D-1R PCM, and 1.2V and 1.8V
for 1T-1R PCM) during the execution of a machine learning
application. These elevated voltages accelerate CMOS aging,
leading to hard or soft faults in the neuromorphic hardware. It
is important to note that continuous device scaling and elevated
operating temperatures can make these errors manifest sooner
than endurance-related failures, making CMOS aging a critical
dependability issue of NVM-based neuromorphic computing.
Fig. 3: Internal architecture of a neuron.
[Circuit diagram of the integrate-and-fire neuron: transistors M1 to M20, membrane capacitor Cmem, injected current Iinj, membrane voltage Vmem, and spike output Vspk.]
A. High-Voltage Related Circuit Aging
In this section, we formulate CMOS aging considering
Time-Dependent Dielectric Breakdown (TDDB), Negative-Bias
Temperature Instability (NBTI), and Hot-Carrier Injection
(HCI) failure mechanisms. These are the dominant ones in
scaled technology nodes (45nm and below). In older nodes,
Electromigration (EM) plays a key role [29]–[32]. Nevertheless,
our aging formulation can be easily extended to also
consider EM and any other failure mechanisms.
CMOS aging is accelerated when the device is stressed, i.e.,
exposed to high overdrive voltages¹. With this understanding,
we provide a brief background of these failure mechanisms.
¹ Overdrive voltage is defined as the voltage between transistor gate and
source (VGS) in excess of the threshold voltage (Vth), where Vth is the
minimum voltage required between gate and source to turn the transistor on.
• TDDB: This is a failure mechanism in a CMOS device,
when the gate oxide breaks down as a result of long-time
application of relatively low electric field (as opposed to
immediate breakdown, which is caused by strong electric
field) [33]. The lifetime of a CMOS device is measured
in terms of its mean time to failure (MTTF) as
$$\mathrm{MTTF}_{TDDB} = A \, e^{-\gamma \sqrt{V}}, \tag{1}$$
where A and γ are material-related constants, and V is
the overdrive gate voltage of the CMOS device.
• NBTI: This is a failure mechanism in a CMOS device
in which positive charges are trapped at the oxide-semiconductor
boundary underneath the gate [34]. NBTI
manifests as 1) decrease in drain current and transconductance,
and 2) increase in off current and threshold
voltage. The NBTI lifetime of a CMOS device is
$$\mathrm{MTTF}_{NBTI} = \frac{A}{V^{\gamma}} \, e^{\frac{E_a}{KT}}, \tag{2}$$
where A and γ are material-related constants, Ea is the
activation energy, K is the Boltzmann constant, T is the
temperature, and V is the overdrive gate voltage of the
CMOS device.
• HCI: This is a failure mechanism in a CMOS device,
when a carrier (electron or hole) gains sufficient kinetic
energy to overcome the potential barrier of the con-
ducting channel and gets trapped in the gate dielectric,
permanently changing the switching characteristics of the
device [35].
Unlike the TDDB and NBTI failure mechanisms, for which
silicon-characterized reliability models are available from
foundries, characterized models for HCI failure mechanism
are still in development for scaled nodes.
1) TDDB Aging of a Single Neuron in a Tile: We illus-
trate our aging formulation for TDDB failure mechanism first,
and then show how to extend this formulation to consider other
failure mechanisms such as NBTI and HCI.
TDDB failures can also be modeled using the Weibull distribution
[36] with a scale parameter α and a slope parameter
β. Reliability at time t can be written as
$$R(t) = e^{-\left(\frac{t}{\alpha(V)}\right)^{\beta}}, \tag{3}$$
with the corresponding MTTF computed as
$$\mathrm{MTTF} = \int_{0}^{\infty} R(t)\,dt = \alpha(V)\,\Gamma\left(1 + \frac{1}{\beta}\right), \tag{4}$$
where Γ is the Gamma function. Using the expressions for
MTTF from Equations 1 and 4, and rearranging, we obtain
the expression for the scale parameter α as
$$\alpha(V) = \frac{A \, e^{-\gamma \sqrt{V}}}{\Gamma\left(1 + \frac{1}{\beta}\right)}. \tag{5}$$
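The relation between Equations 1, 3, 4, and 5 can be sketched numerically as follows. The constants A, GAMMA, and BETA below are placeholders chosen only for illustration (the paper does not list its fitted values), and the two voltages simply stand in for a lower and a higher overdrive voltage.

```python
import math


def mttf_tddb(voltage, A, gamma):
    """Eq. 1: MTTF_TDDB = A * exp(-gamma * sqrt(V))."""
    return A * math.exp(-gamma * math.sqrt(voltage))


def weibull_scale(voltage, A, gamma, beta):
    """Eq. 5: alpha(V) = MTTF_TDDB(V) / Gamma(1 + 1/beta)."""
    return mttf_tddb(voltage, A, gamma) / math.gamma(1.0 + 1.0 / beta)


def reliability(t, alpha, beta):
    """Eq. 3: R(t) = exp(-(t / alpha)^beta)."""
    return math.exp(-((t / alpha) ** beta))


# Placeholder constants (assumptions, not values from the paper).
A, GAMMA, BETA = 1.0e6, 2.0, 2.0
alpha_low = weibull_scale(1.8, A, GAMMA, BETA)    # lower stress voltage
alpha_high = weibull_scale(3.0, A, GAMMA, BETA)   # higher stress voltage
```

Under these assumptions a higher voltage yields a smaller scale parameter, i.e., the reliability curve of Eq. 3 decays faster.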
Figure 4 illustrates a spike train and the change in operating
voltage of a neuron circuit to inject current into the crossbar
for each spike in the spike train. To estimate the aging in this
time duration, we let $[t_i, t_{i+1})$ be the (i+1)th time interval with
$\Delta t_i = t_{i+1} - t_i$ and $V_i$ as the neuron's voltage.
Fig. 4: Operating voltage of the neuron circuit to propagate
spikes through the crossbar. [Horizontal axis: time.]
Reliability of a CMOS device at the start of execution is
$$R(t)\big|_{t=t_0} = 1. \tag{6}$$
At the end of the first interval, the reliability is
$$R(t_1^{-}) = e^{-\left(\frac{t_1}{\alpha(V_0)}\right)^{\beta}}. \tag{7}$$
Using the term θ to represent reliability degradation during this
interval $[t_0, t_1)$, the reliability at the beginning of the second
interval (i.e., right after the start of the first idle period) is
$$R(t_1^{+}) = e^{-\left(\frac{t_1 + \theta}{\alpha(V_1)}\right)^{\beta}}. \tag{8}$$
Due to the continuity of the reliability function, we can equate
Equations 7 & 8 to compute θ as
$$\theta = \left(\frac{\alpha(V_1)}{\alpha(V_0)} - 1\right) t_1. \tag{9}$$
Substituting Eq. 9 in Eq. 8 and evaluating at $t_2$ (with $t_0 = 0$, so that
$\Delta t_0 = t_1$), reliability at time $t_2$ is
$$R(t_2) = e^{-\left(\frac{\Delta t_1}{\alpha(V_1)} + \frac{\Delta t_0}{\alpha(V_0)}\right)^{\beta}}. \tag{10}$$
We can extend this equation to compute the reliability of a
CMOS device in the neuron circuit at the end of a spike
train with n intervals as
$$R(t_s) = e^{-\left(\sum_{k=0}^{n-1} \frac{\Delta t_k}{\alpha(V_k)}\right)^{\beta}}. \tag{11}$$
We define the TDDB aging of the neuron i, $A_{TDDB}(i)$, as
$$A_{TDDB}(i) = \sum_{k=0}^{n-1} \frac{\Delta t_k}{\alpha(V_k)}, \quad \text{such that } R(t_s) = e^{-\left(A_{TDDB}(i)\right)^{\beta}}, \tag{12}$$
where the scale parameter $\alpha(V_k)$ can be calculated using Eq. 5.
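A minimal sketch of the accumulation in Eq. 12 follows. The spike train is represented as a list of (duration, voltage) intervals, and the scale parameters in ALPHA are made-up numbers standing in for the output of Eq. 5; both the representation and the values are assumptions for illustration.

```python
import math


def tddb_aging(intervals, alpha_of):
    """Eq. 12: sum of dt / alpha(V) over the (dt, V) intervals of a spike train."""
    return sum(dt / alpha_of(v) for dt, v in intervals)


def reliability_from_aging(aging, beta):
    """Reliability at the end of the spike train, R = exp(-(aging)^beta)."""
    return math.exp(-(aging ** beta))


# Hypothetical scale parameters for the idle (1.8V) and read (3V) voltages.
ALPHA = {1.8: 100.0, 3.0: 20.0}

# Alternating idle and spike intervals: (duration, voltage).
spike_train = [(4.0, 1.8), (1.0, 3.0), (4.0, 1.8), (1.0, 3.0)]
aging = tddb_aging(spike_train, ALPHA.__getitem__)
r_end = reliability_from_aging(aging, beta=2.0)
```

Because each interval contributes its duration divided by the scale parameter, time spent at the higher voltage (smaller α) ages the neuron faster than the same time spent idle.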
2) TDDB Aging of a Tile: The aging of Eq. 12 can be
extended to incorporate the aging of other neurons in the tile
(i.e., all input neurons of the crossbar in the tile). To this end, we consider
a tile to be faulty if any input neuron fails due to TDDB.²
Using this series failure model, the TDDB aging of tile j is
$$A^{j}_{TDDB} = \max\{A^{j}_{TDDB}(i)\}, \quad \forall i \in 1, \cdots, n, \tag{13}$$
where n is the number of input ports of the tile, i.e., the number
of input neurons, and $A^{j}_{TDDB}(i)$ is the TDDB aging of the ith
neuron in the jth tile, computed using Eq. 12.
3) TDDB Aging of a Neuromorphic Hardware: To compute
the TDDB aging of the neuromorphic hardware with N
tiles, we similarly use a series failure model, where a single
faulty tile leads to the failure of the neuromorphic hardware.
The TDDB aging of the neuromorphic hardware is
$$A_{TDDB} = \max\{A^{j}_{TDDB}\}, \quad \forall j \in 1, \cdots, N. \tag{14}$$
Formulating TDDB as a series failure model allows our
mapping framework to minimize the maximum aging, i.e.,
to solve a minmax problem (see Section IV).
² In future work, we will consider a single faulty neuron as reducing the
capacity of the crossbar in a tile, rather than treating the entire tile as
faulty. Nonetheless, considering this capacity reduction is a reactive solution.
The proposed work, on the other hand, is an orthogonal proactive approach.
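The series failure model of Eqs. 13 and 14 reduces to taking maxima, as the short sketch below shows. The nested-list layout (one list of per-neuron agings per tile) and the numbers are illustrative assumptions.

```python
def tile_aging(neuron_agings):
    """Eq. 13: a tile ages as fast as its most-aged input neuron."""
    return max(neuron_agings)


def hardware_aging(tiles):
    """Eq. 14: the hardware ages as fast as its most-aged tile."""
    return max(tile_aging(agings) for agings in tiles)


# Made-up per-neuron TDDB agings for three tiles.
tiles = [[0.10, 0.18, 0.05], [0.07, 0.09], [0.02]]
worst = hardware_aging(tiles)
# Index of the tile that limits the lifetime under the series model.
bottleneck = max(range(len(tiles)), key=lambda j: tile_aging(tiles[j]))
```

A mapping that moves activity away from the bottleneck tile lowers the hardware aging only until another tile becomes the maximum, which is why the mapping problem is a minmax problem.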
B. Combining Aging due to other Failure Mechanisms
Next, we consider NBTI, which is manifested as threshold
voltage shift in a CMOS device. Recent works such as [34]
suggest that NBTI is the collective response of two inde-
pendent mechanisms: the as-grown hole traps (AHTs) and
generated defects (GDs). AHTs and a small proportion of
GDs can be recovered by annealing at high temperatures if
the NBTI stress voltage is removed.
Fig. 5: Demonstration of degradation due to NBTI.
Figure 5 illustrates the stress and recovery of the threshold
voltage of a CMOS device due to the NBTI failure mechanism
on application of a high (Vread) and a low voltage (Vidle). We
observe that both stress and recovery depend on the time of
exposure to the corresponding voltage level. This implies that
when a neuron is idle, the NBTI aging of the neuron recovers
from stress. This is different from the TDDB mechanism, where
the neuron continues to undergo TDDB aging even when idle.
Using the spike train of Fig. 4, NBTI aging is given by
$$A_{NBTI} = \sum_{i=0}^{m-1} g_0 \cdot (V_i - V_{NBTI})^{m} \cdot (t_{i+1} - t_i)^{n}, \quad \text{such that } R(T) = e^{-A_{NBTI}^{\beta}}, \tag{15}$$
where β, g0, m, n are material-dependent constants [34].
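One possible reading of Eq. 15 is sketched below. It makes two assumptions that the text does not spell out: V_NBTI is treated as a reference voltage separating stress from recovery, and the voltage term is applied as a signed power so that intervals below V_NBTI subtract from the accumulated aging. The constants and the names m_exp and n_exp (used to keep the two exponents apart from the interval count) are placeholders.

```python
import math


def nbti_aging(intervals, g0, v_nbti, m_exp, n_exp):
    """Accumulate NBTI aging over (duration, voltage) intervals, after Eq. 15.

    Assumption: the power is signed, so intervals with voltage below v_nbti
    contribute negatively (recovery) and intervals above it add stress.
    """
    total = 0.0
    for dt, v in intervals:
        dv = v - v_nbti
        signed_power = math.copysign(abs(dv) ** m_exp, dv)
        total += g0 * signed_power * dt ** n_exp
    return total


# Placeholder constants; one stress interval followed by one idle interval.
G0, V_NBTI, M_EXP, N_EXP = 1.0, 2.0, 2.0, 0.5
stress_only = nbti_aging([(4.0, 3.0)], G0, V_NBTI, M_EXP, N_EXP)
with_idle = nbti_aging([(4.0, 3.0), (1.0, 1.8)], G0, V_NBTI, M_EXP, N_EXP)
```

Under this reading, inserting idle time after a stress interval lowers the NBTI aging, in line with the recovery behavior described for Fig. 5.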
To combine the aging from different failure mechanisms
such as TDDB, NBTI and HCI, we use the Sum-of-Failure-
Rates (SOFR) model, which is used extensively in the indus-
try [37]. SOFR assumes an exponential lifetime distribution
for each failure mechanism, allowing us to compute the overall
aging of the neuron as the combined effect of aging due to
each failure mechanism individually.
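Using Equations 4 & 12, the failure rate for TDDB is
$$\lambda_{TDDB} = \frac{1}{\mathrm{MTTF}_{TDDB}} = \int_{0}^{\infty} e^{(A_{TDDB})^{\beta}}. \tag{16}$$
Using SOFR, the overall failure rate is computed as
$$\lambda_{overall} = \frac{1}{\mathrm{MTTF}_{overall}} = \lambda_{TDDB} + \lambda_{NBTI} + \lambda_{HCI}. \tag{17}$$
The overall aging of the neuromorphic hardware is therefore
$$A_{overall} = \ln\left(e^{(A_{TDDB})^{\beta+}} + e^{(A_{NBTI})^{\beta+}} + e^{(A_{HCI})^{\beta+}}\right)^{\frac{1}{\beta}}. \tag{18}$$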
Equation 18 can be extended to consider other failure
mechanisms, as well as other models to compute the aging.
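The Sum-of-Failure-Rates combination of Eq. 17 can be sketched as below: each mechanism contributes a failure rate equal to the reciprocal of its MTTF, and the rates add. The per-mechanism MTTF values are invented for illustration only.

```python
def sofr_failure_rate(mttfs):
    """Eq. 17: the overall failure rate is the sum of 1/MTTF over mechanisms."""
    return sum(1.0 / m for m in mttfs)


def sofr_mttf(mttfs):
    """Overall MTTF is the reciprocal of the overall failure rate."""
    return 1.0 / sofr_failure_rate(mttfs)


# Hypothetical per-mechanism MTTFs in years: TDDB, NBTI, HCI.
per_mechanism = {"TDDB": 4.0, "NBTI": 6.0, "HCI": 12.0}
overall = sofr_mttf(per_mechanism.values())
```

The combined MTTF is always shorter than the shortest individual MTTF, so the mechanism with the highest failure rate dominates the lifetime.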
C. Lifetime Computation
The lifetime of a neuromorphic hardware is usually much
longer than the execution time of a machine learning work-
load. To estimate how much a neuron circuit’s lifetime changes
due to the mapping of neurons and synapses to the hardware,
we estimate the aging over the time duration of a workload,
and then use it to extrapolate the lifetime, considering the same
workload being executed repeatedly until one of the hardware
components fails due to one of the failure mechanisms. The
lifetime measured as MTTF is
$$\mathrm{MTTF} = \int_{0}^{\infty} e^{(A_{overall})^{-\beta}} \tag{19}$$
To compute MTTF, the slope parameter of Weibull distri-
bution is set to β = 2, and the operating temperature is set
to 300K. Other fitting parameters are adjusted to achieve an
MTTF of 2 years in the baseline system. This is the typical
lifetime of neuromorphic products.
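The extrapolation idea can be illustrated with a simplified closed form. The sketch below is not Eq. 19 itself; it assumes that aging accumulates linearly when the same workload is repeated (as the TDDB sum of Eq. 12 does), so that after time t the aging is A_w · t / T_w, where A_w is the aging of one workload run of duration T_w. Integrating the Weibull reliability of Eq. 3 then gives MTTF = (T_w / A_w) · Γ(1 + 1/β). The function names and the workload duration are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600


def extrapolated_mttf(workload_time, workload_aging, beta=2.0):
    """MTTF when one run of length workload_time adds workload_aging each time.

    Assumes linear accumulation: R(t) = exp(-(workload_aging * t / workload_time)^beta).
    """
    return (workload_time / workload_aging) * math.gamma(1.0 + 1.0 / beta)


def calibrate_aging(workload_time, target_mttf, beta=2.0):
    """Per-run aging that yields target_mttf; mimics fitting the baseline."""
    return workload_time * math.gamma(1.0 + 1.0 / beta) / target_mttf


# Hypothetical 10-second workload calibrated to a 2-year baseline MTTF.
T_W = 10.0
baseline_aging = calibrate_aging(T_W, 2 * SECONDS_PER_YEAR)
baseline_mttf_years = extrapolated_mttf(T_W, baseline_aging) / SECONDS_PER_YEAR
```

With β = 2 as stated above, Γ(1 + 1/β) = Γ(1.5), which is about 0.886.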
IV. PROPOSED SOLUTION
A. High-Level Overview
Figure 6 shows a high-level overview of the workflow
of RENEU, which takes as input either an SNN-based
machine learning application directly or an artificial neural network
(ANN)-based one after converting its ANN operations to SNN
using the N2D2 tool [38], [39]. The SNN model is then
partitioned into clusters, where each cluster accommodates
a fixed number of neurons and synapses, and fits in its
entirety on a crossbar in the hardware. We use the clustering
technique of the DFSynthesizer tool [14]. The clustered SNN
model is then used by RENEU to find the mapping of the
clusters to the crossbars using an instance of PSO. We
now describe the PSO step of RENEU in detail.
Fig. 6: Workflow of RENEU.
[Blocks: ANN, ANN to SNN, SNN, Clustering, PSO-based Mapping (using Aging Models), Neuromorphic Hardware.]
B. PSO-based Cluster Mapping
We consider the mapping of a clustered SNN G(C,E) with
a set C of clusters and a set E of edges, to the neuromorphic
hardware H(T, I), where T is the set of tiles in the hardware
and I is the set of connections of these tiles.
Mapping M : G(C,E) → H(T, I) is specified by a logical
matrix $(m_{ij}) \in \{0, 1\}^{|C| \times |T|}$, where $m_{ij}$ is defined as
$$m_{ij} = \begin{cases} 1 & \text{if cluster } c_i \in C \text{ is mapped to tile } T_j \in T \\ 0 & \text{otherwise} \end{cases} \tag{20}$$
The mapping constraints are the following:
1. A cluster can be mapped to only one crossbar, i.e.,
$$\sum_{j} m_{ij} = 1 \quad \forall i \tag{21}$$
2. A crossbar can accommodate at most one cluster, i.e.,
$$\sum_{i} m_{ij} \le 1 \quad \forall j \tag{22}$$
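The logical matrix of Eq. 20 and the two constraints can be checked with a few lines of code. The helper names and the example assignments below are illustrative assumptions; an assignment lists, for each cluster, the index of the tile it is placed on.

```python
def assignment_to_matrix(assignment, num_tiles):
    """Build the |C| x |T| logical matrix of Eq. 20 from a cluster-to-tile list."""
    return [[1 if tile == j else 0 for j in range(num_tiles)]
            for tile in assignment]


def satisfies_constraints(m):
    """Check Eq. 21 (one tile per cluster) and Eq. 22 (at most one cluster per tile)."""
    one_tile_per_cluster = all(sum(row) == 1 for row in m)
    one_cluster_per_tile = all(sum(col) <= 1 for col in zip(*m))
    return one_tile_per_cluster and one_cluster_per_tile


# Three clusters on a four-tile chip.
good = assignment_to_matrix([2, 0, 3], num_tiles=4)   # all tiles distinct
bad = assignment_to_matrix([1, 1, 3], num_tiles=4)    # tile 1 used twice
```

Here `good` passes both checks, while `bad` violates Eq. 22 because two clusters share tile 1.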
We use an instance of Particle Swarm Optimization
(PSO) [21] to obtain this mapping. The fitness function of
the PSO is a joint metric λ defined as the product of aging
and execution time. This metric λ is computed as follows.
• τM = Execution time of mapping M , computed using the
DFSynthesizer tool [14].
• AM = Aging of mapping M , computed using (18) with
spike trains obtained from the SNN model.
• λM = τM · AM . This is the fitness function.
The PSO finds the optimum mapping with the minimum
value of the fitness function, i.e.,
$$\lambda_{M_{opt}} = \min\{\lambda_{M_i} \mid i \in 1, 2, \cdots\}. \tag{23}$$
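For a very small instance, the minimum in Eq. 23 can be found by enumerating every valid mapping, which gives a reference point for what the PSO is searching for. In the sketch below, the execution-time and aging functions are toy stand-ins (the paper obtains them from the DFSynthesizer tool and from Eq. 18), and all numbers are invented.

```python
from itertools import permutations


def best_mapping(num_clusters, num_tiles, exec_time, aging):
    """Enumerate all one-to-one cluster-to-tile assignments and minimize tau * A."""
    best, best_fit = None, float("inf")
    # permutations(range(T), C) yields every assignment with distinct tiles,
    # so both mapping constraints hold by construction.
    for assignment in permutations(range(num_tiles), num_clusters):
        fit = exec_time(assignment) * aging(assignment)
        if fit < best_fit:
            best, best_fit = assignment, fit
    return best, best_fit


# Toy models (assumptions): cluster activity and per-tile stress factors.
SPIKES = [5, 1]
STRESS = [1.0, 2.0, 4.0]


def toy_aging(a):
    return max(SPIKES[c] * STRESS[t] for c, t in enumerate(a))


def toy_exec_time(a):
    return 1 + abs(a[0] - a[1])   # grows with the distance between the two tiles


mapping, fitness = best_mapping(2, 3, toy_exec_time, toy_aging)
```

Exhaustive enumeration grows factorially with the number of tiles, which is why a meta-heuristic such as PSO is used for realistic problem sizes.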
We instantiate $n_p$ swarm particles. The positions of these
particles are candidate solutions evaluated by the fitness function, and they
represent cluster mappings, i.e., the M's in Equation 23. Each particle
also has a velocity with which it moves in the search space to
find the optimum solution. During the movement, a particle
updates its position and velocity according to its own experience
(closeness to the optimum) and also the experience of its
neighbors. We introduce the following notations.
$$D = |C| \times |T| = \text{dimensions of the search space} \tag{24}$$
$$\Theta = \{\theta_l \in \mathbb{R}^{D}\}_{l=0}^{n_p-1} = \text{positions of particles in the swarm}$$
$$\mathbf{V} = \{v_l \in \mathbb{R}^{D}\}_{l=0}^{n_p-1} = \text{velocity of particles in the swarm}$$
Position and velocity of swarm particles are updated, and the
fitness function is computed as
$$\Theta(t+1) = \Theta(t) + \mathbf{V}(t+1) \tag{25}$$
$$\mathbf{V}(t+1) = \mathbf{V}(t) + \varphi_1 \cdot \left(P_{best} - \Theta(t)\right) + \varphi_2 \cdot \left(G_{best} - \Theta(t)\right)$$
$$F(\theta_l) = \lambda_{\theta_l} = \tau_{\theta_l} \cdot A_{\theta_l}$$
where t is the iteration number, ϕ1, ϕ2 are constants, and Pbest
(and Gbest) is the particle's own (and its neighbors') experience.
Finally, local and global bests are updated as
$$P^{l}_{best} = F(\theta_l) \quad \text{if } F(\theta_l) < F(P^{l}_{best})$$
$$G_{best} = \min_{l=0,\ldots,n_p-1} P^{l}_{best} \tag{26}$$
Due to the binary formulation of the mapping problem (see
Equation 20), we need to binarize the velocity and position of
Equation 24, which we illustrate below.
$$\hat{\mathbf{V}} = \mathrm{sigmoid}(\mathbf{V}) = \frac{1}{1 + e^{-\mathbf{V}}}$$
$$\hat{\Theta} = \begin{cases} 0 & \text{if } \mathrm{rand}() < \hat{\mathbf{V}} \\ 1 & \text{otherwise} \end{cases} \tag{27}$$
In finding a new position of a PSO particle, we use the two
constraints (21) and (22).
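The following is a minimal sketch of a binary PSO for the cluster-to-tile mapping, written under several assumptions rather than as the RENEU implementation. Velocities follow the update of Eq. 25 with constant ϕ1 and ϕ2 and are clamped to keep the sigmoid well behaved. The binarization step is a variation on Eq. 27: instead of thresholding each bit independently, each cluster samples one free tile with weights sigmoid(v), where a larger velocity makes a tile more likely to be chosen (the conventional binary-PSO polarity), so constraints (21) and (22) hold by construction. The fitness function, swarm size, and iteration count are toy choices.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode(velocity, num_tiles, rng):
    """Sample a valid mapping: each cluster picks one not-yet-used tile."""
    free = list(range(num_tiles))
    assignment = []
    for row in velocity:                       # one velocity row per cluster
        weights = [sigmoid(row[t]) for t in free]
        tile = rng.choices(free, weights=weights, k=1)[0]
        assignment.append(tile)
        free.remove(tile)
    return assignment


def binary_pso(num_clusters, num_tiles, fitness, particles=20, iters=50,
               phi1=1.0, phi2=1.0, v_max=6.0, seed=0):
    rng = random.Random(seed)
    vel = [[[rng.uniform(-1.0, 1.0) for _ in range(num_tiles)]
            for _ in range(num_clusters)] for _ in range(particles)]
    pos = [decode(v, num_tiles, rng) for v in vel]
    pbest = list(pos)
    pbest_fit = [fitness(p) for p in pos]
    start = min(range(particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[start], pbest_fit[start]
    for _ in range(iters):
        for k in range(particles):
            for c in range(num_clusters):
                for t in range(num_tiles):
                    x = float(pos[k][c] == t)  # current bit m_ct of particle k
                    v = (vel[k][c][t]
                         + phi1 * (float(pbest[k][c] == t) - x)
                         + phi2 * (float(gbest[c] == t) - x))
                    vel[k][c][t] = max(-v_max, min(v_max, v))
            pos[k] = decode(vel[k], num_tiles, rng)
            f = fitness(pos[k])
            if f < pbest_fit[k]:               # update the particle's own best
                pbest[k], pbest_fit[k] = pos[k], f
            if f < gbest_fit:                  # update the swarm's best
                gbest, gbest_fit = pos[k], f
    return gbest, gbest_fit


# Toy fitness (assumption): execution time times worst-case aging.
SPIKES = [5, 1]
STRESS = [1.0, 2.0, 4.0]


def toy_fitness(a):
    aging = max(SPIKES[c] * STRESS[t] for c, t in enumerate(a))
    return (1 + abs(a[0] - a[1])) * aging


mapping, fit = binary_pso(2, 3, toy_fitness)
```

Sampling without replacement is only one way to enforce the constraints; an alternative would be to threshold bits independently and repair invalid matrices afterwards.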
C. Pareto-Optimization
We record all mappings generated using the PSO. In the
final step, we perform Pareto-optimization to select a mapping
that minimizes aging without compromising performance.
Figure 7 shows the Pareto front of the LeNet-CIFAR application
and the selection of the final mapping.
Fig. 7: Pareto optimization.
[Scatter plot of 1/Aging (arbitrary unit) against 1/Execution Time, annotated with the lowest-aging point, the highest-performance point, and the selected mapping.]
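A Pareto filter over the recorded mappings can be sketched as follows. Each mapping is summarized as an (execution time, aging) pair, both to be minimized. The final selection rule shown here (lowest aging among front points within a tolerance of the best execution time) is an assumed rule for illustration, since the text does not state the exact criterion; the data points are invented.

```python
def pareto_front(points):
    """Keep the points that no other point dominates (both objectives minimized)."""
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1] for q in points
        )
        if not dominated:
            front.append(p)
    return front


def select_mapping(front, tolerance=0.10):
    """Assumed rule: lowest aging among points within tolerance of the best time."""
    best_time = min(t for t, _ in front)
    eligible = [p for p in front if p[0] <= best_time * (1.0 + tolerance)]
    return min(eligible, key=lambda p: p[1])


# Made-up (execution time, aging) pairs recorded during the search.
recorded = [(10, 5.0), (11, 3.0), (12, 2.9), (15, 2.0), (11, 4.0), (16, 2.5)]
front = pareto_front(recorded)
chosen = select_mapping(front)
```

Tightening the tolerance moves the choice toward the highest-performance end of the front, while loosening it moves the choice toward the lowest-aging end.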
V. RESULTS AND DISCUSSIONS
We conduct all simulations on a system with 8 CPUs, 32GB
RAM, and NVIDIA Tesla GPU, running Ubuntu 16.04. We
model the DYNAP-SE neuromorphic hardware [2] with four
tiles. Each tile has one 128×128 crossbar with PCM synapses.
We evaluate 10 standard machine learning applications ob-
tained from [13] that are summarized in Table II.
TABLE II: Applications used to evaluate our approach.
Class | Application     | Synapses  | Neurons | Topology
MLP   | EdgeDet         | 272,628   | 1,372   | FeedForward (4096, 1024, 1024, 1024)
MLP   | ImgSmooth       | 136,314   | 980     | FeedForward (4096, 1024)
MLP   | MLP-MNIST       | 79,400    | 984     | FeedForward (784, 100, 10)
CNN   | CNN-MNIST       | 159,553   | 5,576   | CNN (a)
CNN   | LeNet-MNIST     | 1,029,286 | 4,634   | CNN (b)
CNN   | LeNet-CIFAR     | 2,136,560 | 18,472  | CNN (c)
CNN   | HeartClass [40] | 2,396,521 | 24,732  | CNN (d)
RNN   | HeartEstm [41]  | 636,578   | 6,952   | Recurrent Reservoir
RNN   | SpeechRecog     | 636,578   | 6,952   | Recurrent Reservoir
RNN   | VisualPursuit   | 636,578   | 6,952   | Recurrent Reservoir
(a) Input(24x24) - [Conv, Pool]*16 - FC*150 - FC*10
(b) Input(32x32) - [Conv, Pool]*6 - [Conv, Pool]*16 - Conv*120 - FC*84 - FC*10
(c) Input(32x32x3) - [Conv, Pool]*6 - [Conv, Pool]*6 - FC*84 - FC*10
(d) Input(82x82) - [Conv, Pool]*16 - [Conv, Pool]*16 - FC*256 - FC*6
We evaluate the following state-of-the-art approaches.
• PYCARL: A performance-oriented approach to map neu-
rons and synapses to a neuromorphic hardware [13].
• Reliability Qualification: A conservative reliability qual-
ification technique, which estimates aging assuming
worst-case operating conditions [20].
• RENEU (proposed): Our approach, which formulates a
detailed circuit aging model and uses it to map neurons
and synapses to a neuromorphic hardware, improving
reliability without compromising performance.
A. Lifetime Improvement
Figure 8 reports the MTTF of RENEU normalized to
PYCARL for each of our machine learning applications. We
observe that the MTTF of RENEU is higher than that of PYCARL
by an average of 18%. This improvement is because RENEU
allocates clusters to tiles so as to minimize the circuit aging of
their crossbars. Lower aging leads to higher MTTF. We observe no
noticeable improvement of MTTF for MLP-MNIST because
this is a very small application to begin with.
[Figure 8: bar chart of MTTF normalized to PYCARL for each application and the average; series: PYCARL, RENEU.]
Fig. 8: MTTF normalized to PYCARL (higher is better).
B. Aging Reduction
Figure 9 reports the circuit aging of RENEU
normalized to PYCARL for each of our machine learning
applications. We observe that the aging of RENEU is lower
than that of PYCARL by an average of 38%. This improvement is
because RENEU formulates the detailed circuit aging of a
neuromorphic hardware and allocates the neurons and synapses of
a machine learning application to minimize it.
[Figure 9: bar chart of aging normalized to PYCARL for each application and the average; series: PYCARL, RENEU.]
Fig. 9: Circuit aging in the neuromorphic hardware normalized
to PYCARL (lower is better).
C. Temperature Dependency
Figure 10 illustrates the temperature dependency of circuit
aging in a neuromorphic hardware. We report the aging results
of RENEU at two elevated temperatures, 325K and 350K,
for each of our machine learning applications. Aging results
are normalized to RENEU at 300K. We observe that aging
increases with an increase in temperature. Aging observed at
325K and 350K is higher than that observed at 300K by an av-
erage of 7% and 26%, respectively. These results follow from
our aging formulation, which incorporates temperature using
the scaling parameter α in (5) for TDDB and the parameter g0
in (15) for NBTI. These parameters grow exponentially with
temperature, resulting in a corresponding exponential increase
in the aging. Higher aging leads to lower lifetime.
[Figure 10: bar chart of aging normalized to 300K for each application and the average; series: 300K, 325K, 350K.]
Fig. 10: Circuit aging of RENEU at 325K and 350K normalized
to RENEU at 300K.
D. Diode vs. Transistor-based PCM
Figure 11 reports the circuit aging in a neuromorphic
hardware with transistor-based PCM, normalized to RENEU
with diode-based PCM, for each of our machine learning
applications. We observe that the aging of a neuromorphic
hardware with transistor-based PCM is on average 10% lower
than with diode-based PCM. This is because the operating
voltages of a neuromorphic hardware are comparatively lower
for transistor-based PCM, which reduces the circuit aging.
However, a diode-based PCM cell is 33% smaller than a
transistor-based PCM cell, so diode-based PCM can implement
neuromorphic hardware with higher integration density. In
either case, RENEU can be applied to improve the reliability of
neuromorphic hardware with both diode- and transistor-based PCM.
[Figure 11: bar chart of aging normalized to diode-based PCM for each application and the average; series: RENEU with diode-based PCM, RENEU with transistor-based PCM.]
Fig. 11: Circuit aging of RENEU with transistor-based PCM
normalized to RENEU with diode-based PCM.
E. Performance Impact
Figure 12 reports the performance of RENEU, measured as
the execution time normalized to PYCARL, for each of our
machine learning applications. We also report the results of a
conservative reliability qualification technique, which
periodically de-stresses a neuron circuit to achieve an MTTF
similar to that of RENEU [20]. We make the following two
observations. First, the execution time of a machine learning
application using RENEU is within 5% of its execution time
using PYCARL. This is because RENEU incorporates both
performance and reliability when finding a suitable mapping of
neurons and synapses to the neuromorphic hardware. Second, to
achieve an MTTF similar to that of RENEU, PYCARL with
conservative reliability qualification periodically de-stresses
the neuron circuit, which introduces an average performance
overhead of 35%.
[Figure 12: bar chart of execution time normalized to PYCARL for each application and the average; series: PYCARL, PYCARL+Reliability Qualification, RENEU.]
Fig. 12: Execution time of RENEU normalized to PYCARL
(lower is better).
F. Circuit Aging with More Crossbars
Figure 13 reports the circuit aging as the amount of hardware
resources is increased. We report the aging results of
RENEU with 9 and 16 crossbars for each of our machine
learning applications. Aging results are normalized to RENEU
with 4 crossbars. We make the following two observations.
First, aging reduces as the number of crossbars increases. This
is because with more crossbars in the system, the average load
on each crossbar reduces, which in turn reduces the stress on
its CMOS devices. Lower stress reduces the circuit aging. With
9 and 16 crossbars, the average circuit aging is 24% and 29%
lower, respectively, than RENEU with 4 crossbars. Second,
for MLP-MNIST, there is no noticeable improvement. This is
because MLP-MNIST is a small application to begin with.
[Figure 13: bar chart of aging normalized to 4 crossbars for each application and the average; series: 4-crossbars, 9-crossbars, 16-crossbars.]
Fig. 13: Circuit aging of RENEU with 9 and 16 crossbars
normalized to RENEU with 4-crossbars (lower is better).
G. Optimization Time
Table III reports the optimization time (measured as wall
clock time) of RENEU in finding a mapping using the
proposed PSO. The optimization time depends on the size of the
application. MLP-MNIST, which is a small application,
requires 4.6s, while LeNet-CIFAR requires 98.2s.
TABLE III: Optimization time of RENEU.
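| Application | Time (s) | Application | Time (s) |
|-------------|----------|-------------|----------|
| EdgeDet | 53.1 | ImgSmooth | 47.2 |
| MLP-MNIST | 4.6 | CNN-MNIST | 84.4 |
| LeNet-MNIST | 70.12 | LeNet-CIFAR | 98.2 |
| HeartClass | 14.6 | HeartEstm | 59.93 |
| SpeechRecog | 82.0 | VisualPursuit | 90.8 |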
VI. CONCLUSION
We present RENEU, a reliability-oriented approach for
mapping neurons and synapses to the hardware resources of a
non-volatile memory (NVM)-based neuromorphic hardware.
Prior efforts in mapping neurons and synapses have mostly
considered performance. RENEU is built on two key
contributions. First, it formulates the detailed circuit aging in a
neuromorphic hardware considering different failure mechanisms
such as time-dependent dielectric breakdown (TDDB),
negative bias temperature instability (NBTI), and hot carrier
injection (HCI). Second, using this formulation, it maps the
neurons and synapses to the hardware using an instance of
Particle Swarm Optimization (PSO), exploring the trade-offs
between performance and reliability. We evaluate RENEU using
machine learning applications on a state-of-the-art neuromorphic
hardware with PCM synapses. Compared to PYCARL, RENEU
lowers circuit aging by an average of 38% and raises MTTF by an
average of 18%, while keeping execution time within 5%.
ACKNOWLEDGMENT
This work is supported by the National Science Foundation
Faculty Early Career Development Award CCF-1942697
(CAREER: Facilitating Dependable Neuromorphic Computing:
Vision, Architecture, and Impact on Programmability).
REFERENCES
[1] W. Maass, “Networks of spiking neurons: the third generation of neural
network models,” Neural Networks, 1997.
[2] S. Moradi, N. Qiao, F. Stefanini, and G. Indiveri, “A scalable multicore
architecture with heterogeneous memory structures for dynamic neuro-
morphic asynchronous processors (DYNAPs),” TBCAS, 2018.
[3] M. V. DeBole et al., “Truenorth: Accelerating from zero to 64 million
neurons in 10 years,” Computer, 2019.
[4] M. Davies et al., “Loihi: A neuromorphic manycore processor with on-
chip learning,” IEEE Micro, 2018.
[5] G. W. Burr et al., “Neuromorphic computing using non-volatile mem-
ory,” Advances in Physics: X, 2017.
[6] S. Song et al., “Exploiting inter- and intra-memory asymmetries for data
mapping in hybrid tiered-memories,” in ISMM, 2020.
[7] S. Song et al., “Improving phase change memory performance with data
content aware access,” in ISMM, 2020.
[8] S. Song et al., “Enabling and exploiting partition-level parallelism
(PALP) in phase change memories,” TECS, 2019.
[9] P.-Y. Chen et al., “Reliability perspective of resistive synaptic devices
on the neuromorphic system performance,” in IRPS, 2018.
[10] B. Gleixner et al., “Reliability characterization of phase change mem-
ory,” in NVMTS, 2009.
[11] A. Pirovano et al., “Reliability study of phase-change nonvolatile
memories,” TDMR, 2004.
[12] L. Jiang et al., “A low power and reliable charge pump design for phase
change memories,” in ISCA, 2014.
[13] A. Balaji et al., “PyCARL: A pynn interface for hardware-software co-
simulation of spiking neural network,” in IJCNN, 2020.
[14] S. Song et al., “Compiling spiking neural networks to neuromorphic
hardware,” in LCTES, 2020.
[15] A. Balaji et al., “Mapping spiking neural networks to neuromorphic
hardware,” TVLSI, 2019.
[16] Y. Ji et al., “Bridge the gap between neural networks and neuromorphic
hardware with a neural network compiler,” in ASPLOS, 2018.
[17] A. Das et al., “Mapping of local and global synapses on spiking
neuromorphic hardware,” in DATE, 2018.
[18] A. Das and A. Kumar, “Dataflow-based mapping of spiking neural
networks on neuromorphic hardware,” in GLSVLSI, 2018.
[19] A. Balaji and A. Das, “A framework for the analysis of throughput-
constraints of SNNs on neuromorphic hardware,” in ISVLSI, 2019.
[20] A. Balaji et al., “A framework to explore workload-specific performance
and lifetime trade-offs in neuromorphic computing,” CAL, 2019.
[21] R. Eberhart et al., “A new optimizer using particle swarm theory,” in
MHS, 1995.
[22] E. Chicca et al., “A VLSI recurrent network of integrate-and-fire neurons
connected by plastic synapses with long-term memory,” TNNLS, 2003.
[23] Y. Dan and M.-m. Poo, “Spike timing-dependent plasticity of neural
circuits,” Neuron, 2004.
[24] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, 2015.
[25] A. Mallik et al., “Design-technology co-optimization for OxRRAM-
based synaptic processing unit,” in VLSIT, 2017.
[26] D. Garbin et al., “HfO2-based OxRAM devices as synapses for convo-
lutional neural networks,” TED, 2015.
[27] S. G. Ramasubramanian et al., “Spindle: Spintronic deep learning engine
for large-scale neuromorphic computing,” in ISLPED, 2014.
[28] G. Indiveri, “A low-power adaptive integrate-and-fire neuron circuit,” in
ISCAS, 2003.
[29] A. Das et al., “Execution trace–driven energy-reliability optimization for
multimedia mpsocs,” TRETS, 2015.
[30] A. Das et al., “Energy-aware task mapping and scheduling for reliable
embedded computing systems,” TECS, 2014.
[31] A. Das et al., “Aging-aware hardware-software task partitioning for
reliable reconfigurable multiprocessor systems,” in CASES, 2013.
[32] J. Srinivasan et al., “The case for lifetime reliability-aware micropro-
cessors,” in ISCA, 2004.
[33] P. J. Roussel et al., “New methodology for modelling MOL TDDB
coping with variability,” in IRPS, 2018.
[34] R. Gao et al., “NBTI-generated defects in nanoscaled devices: fast
characterization methodology and modeling,” TED, 2017.
[35] X. Wan et al., “HCI improvement on 14nm FinFET io device by
optimization of 3D junction profile,” in IRPS, 2019.
[36] A. Fenner et al., “Making the connection between physics of failure and
system-level reliability for medical devices,” in IRPS, 2018.
[37] S. V. Amari and R. B. Misra, “Closed-form expressions for distribution
of sum of exponential random variables,” TR, 1997.
[38] O. Bichler et al., “N2D2-neural network design & deployment,” 2017.
[Online]. Available: https://github.com/CEA-LIST/N2D2
[39] A. Balaji et al., “Power-accuracy trade-offs for heartbeat classification
on neural networks hardware,” JOLPE, 2018.
[40] A. Das et al., “Heartbeat classification in wearables using multi-layer
perceptron and time-frequency joint distribution of ecg,” in CHASE,
2018.
[41] A. Das et al., “Unsupervised heart-rate estimation in wearables with
liquid states and a probabilistic readout,” Neural Networks, 2018.
