A Secure Back-up and Restore for Resource-Constrained IoT based on
  Nanotechnology by Uddin, Mesbah et al.
A Secure Back-up and Restore for
Resource-Constrained IoT based on Nanotechnology
Mesbah Uddin, Md Badruddoja Majumder, Md Sakib Hasan, and Garrett S. Rose
Abstract—With the emergence of IoT (Internet of things), huge
amounts of sensitive data are being processed and transmitted
every day in edge devices with little to no security. Due to their
aggressive power management schemes, it is a common and
necessary technique to make a back-up of their program states
and other necessary data in a non-volatile memory (NVM) before
going to sleep or low power mode. However, this memory is
often left unprotected as adding robust security measures tends
to be expensive for these resource constrained systems. In this
paper, we propose a lightweight security system for NVM during
low power mode. This security architecture uses the memristor,
an emerging nanoscale device which is used to build hardware
security primitives like PUF (physical unclonable function) based
encryption-decryption, true random number generators (TRNG),
and memory integrity checking. A reliability enhancement tech-
nique for this PUF is also proposed which shows how this system
would work even with less-than-100% reliable PUF responses.
Together, with all these techniques, we have established a dual
layer security protocol (data encryption+integrity check) which
provides reasonable security to an embedded processor while
being very lightweight in terms of area, power, and computation
time. A complete system design is demonstrated with 65nm
CMOS and emerging memristive technology. With this, we have
provided a detailed and accurate estimation of resource overhead.
Analysis of the security of the whole system is also provided.
Index Terms—IoT security, hardware security, IoT, embedded
system security, physical unclonable function, PUF, emerging
nanotechnology, memristor, RRAM.
I. INTRODUCTION
Billions of smart IoT devices are introduced into our
lives every year [1]. In coming years, this number will
only increase and with it the amount of sensitive information
such systems carry. However, most of these devices are
equipped with very little to no security. Therefore, it is not
too difficult to tap into these devices and gather sensitive
information. Data can be leaked from hardware architectures
as well as the apps or software frameworks that these devices
use, thereby compromising safety and privacy [2], [3]. While
some works address how to restrict the use of such
data and mitigate data leaks [4], [5] to maintain privacy, we
are mainly concerned with the standalone data and memory
security of such devices in this work. There is a
growing need to employ security measures in these devices.
However, many IoT devices are resource constrained,
battery operated or even batteryless, and small in size. They
usually employ very aggressive power saving techniques and
often go into an ultra low power or sleep mode to save energy. It
is often necessary to back up their states, registers and other
information into a non-volatile memory (NVM) before going
to sleep so they can resume operation quickly when
needed. Moreover, for batteryless energy-harvesting devices,
power failures may occur frequently, so it is beneficial
to back up often [6]. An NVM retains its state with zero
standby power, making it the ideal choice for storing the
back-up data. However, during a power failure or when the device is
in energy-saving mode, data left on the non-volatile memory
may be unprotected and might reveal sensitive information
to an adversary. In this work, we have designed a system to
provide lightweight security for NVM.

This material is based upon work supported by the Air Force Office of
Scientific Research under award number FA9550-16-1-0301. Any opinions,
findings, and conclusions or recommendations expressed in this material are
those of the authors and do not necessarily reflect the views of the United
States Air Force.
M. Uddin, M. B. Majumder, and Garrett S. Rose are with the Department
of Electrical Engineering and Computer Science, University of Tennessee,
Knoxville, TN, 37996 USA (e-mail: {muddin6, mmajumde, garose}@utk.edu).
M. S. Hasan is with the Department of Electrical Engineering in the
University of Mississippi, Oxford, MS 38677 (e-mail: mhasan5@olemiss.edu).

Embedded systems require lightweight cryptographic
solutions, and several such cryptosystems are discussed in [7].
However, even these lightweight solutions incur an
overhead that can be considered too large for resource
constrained systems such as IoT devices. As we discuss later,
our proposed security solution requires both an
encryption-decryption and a hash or tag generation mechanism. Combining
traditional cryptographic algorithms like AES [8], PRESENT
[9], SIMON and hash functions like SHA-2, SHA-3, MD5
[7] etc. would present a large overhead to these small systems.
Moreover, we are also motivated by the notion of providing
a device-specific unique security solution to each such system.
Considering all these, we propose a lightweight encryption
scheme using a PUF as the key generator to secure the
non-volatile memory of an embedded processor.
A PUF is a circuit that can generate a unique signature or
key specific to a particular implementation of an integrated
circuit (IC) [10]. Due to uncontrollable manufacturing pro-
cess variation, there are some differences across different IC
implementations or even among different physical locations
within an IC. In this work, we use a memristor based PUF
which takes a very small area, has smaller delay, and also low
power consumption. The memristor (memory+resistor) [11] is
an emerging nanoscale device that exhibits a large process
variation, useful for implementing PUFs that are very small
in size [12]. We also consider memristors in the design for the
NVM of our demonstrated system in this work.
arXiv:2007.04570v1 [cs.CR] 9 Jul 2020

As PUF responses depend on tiny process variations,
reliability is a major concern which limits their usage in key
generation applications. Efficient error correction methods are,
therefore, essential in PUF based security applications. But
existing robust error correction techniques can be
computationally expensive as well as resource heavy, thus making them
unsuitable for embedded systems. In this work, we instead
propose a reliability enhancement technique to improve the
reliability of a memristor based PUF and also present a
security protocol that relaxes the required level of reliability of
a PUF response. After the reliability enhancement block, the
PUF response is used as the key to encrypt the back-up data.
The same key can be used for fast decryption as well. The
idea is to generate different keys from the PUF by applying
different challenges each time a back-up operation is needed.
To verify that the stored data has not been altered due to
a data error or maliciously by an adversary, we have also
used an integrity checking protocol during system wake-up,
thereby providing a second layer of security. The integrity
checking protocol considered here has in-memory tag genera-
tion capability leveraging sneak path currents in a memristive
(or any resistive) memory [13]. After performing a back-
up of the program state and other necessary data, a tag is
generated from the memory in order to verify data integrity
when the system wakes up. This integrity checking system for
memristive memory is a good fit for IoT systems as it provides
data integrity with significantly lower overhead. The PUF also
needs a random challenge on each backup operation. We have
decided to use a memristor based TRNG [14] to generate that
random challenge vector.
Among the contributions of this work are:
• proposition of a novel hardware based security protocol
for embedded system or IoT security,
• providing a complete security solution using emerging
nanotechnology, specifically memristive technology,
• design of a lightweight reliability enhancement technique
for memristor based PUFs while relaxing the reliability
requirement for unique key generation,
• complete transistor-level implementation of the whole
system for an accurate resource requirement estimation
and overhead analysis.
The paper is organized as follows: section II first gives
a quick introduction to memristors and PUFs. Section III
describes the key idea of this work and section IV presents
the proposed security protocol. Section V presents the theory
and analyses for reliability enhancement technique used in
this work. Section VI describes different components of our
proposed security architecture. Section VII presents a list
of probable attack scenarios, while section VIII provides a
security analysis of the individual components of our design.
Section IX provides a detailed security analysis against the
attacks listed in section VII, while section X discusses the
resource overhead. Finally, future prospects and concluding
remarks are provided in sections XI and XII, respectively.
II. BACKGROUND
A. Memristor
The memristor, also known as RRAM or ReRAM (resistive
random access memory), is one of the most promising
emerging nanoscale devices of the last few years [11]. A memristor is
a two terminal electrical device whose instantaneous resistance
is dependent on its previous history. In a bipolar memristor,
resistance can be switched between a high resistance state
(HRS) and a low resistance state (LRS) by applying an
appropriate bias voltage. Switching also requires that the bias
voltage be applied above a particular threshold level for at least
some minimum amount of time called the switching time. As
emerging nanoscale devices, memristors exhibit large
variability in their electrical characteristics, which makes them
suitable for hardware security applications like PUFs. They
are very small, fitting between the cross-points
of on-chip interconnect wires (and thus effectively taking zero
area), CMOS-compatible, non-volatile, fast (∼ ns), and
have good retention time and endurance [15], [16]. Memristors
have been considered for a host of different applications in
recent years including hardware security [17], [18], in-memory
computation [19], and persistent memories (NVM) [20], [21].

Fig. 1. A schematic of the XbarPUF showing memristor crossbar and the
peripheral circuitry [22]

B. PUF
Physical unclonable functions (PUF) are promising security
primitives useful for a number of security applications,
including mitigation of integrated circuit (IC) piracy, cloning, and
counterfeiting [23], [24]. A PUF is any circuit that can produce
a variety of unique outputs when implemented on different
devices corresponding to the same given input. The input
and output of a PUF are also known as the challenge and
response, respectively. PUFs implemented on different chips
provide unique challenge-response combinations which can be
used as the authenticating signatures for that chip. The PUF
concept usually relies on exploiting random and uncontrollable
physical entropy sources in a device that make the response
unique over various implementations [23]. A PUF that has
a large number of challenge-response pairs (CRPs) is called
a strong PUF; examples include the arbiter PUF (APUF) [23] and the
ring-oscillator PUF [10].
A memristive crossbar PUF or XbarPUF is a strong PUF
implemented by a crossbar array of memristors where the
switching variability among memristors is harnessed to
generate PUF responses. Originally proposed in [25], it was shown
to be very lightweight in terms of both area and energy. An
improved version of it was presented in [22]. This is the PUF
that we use in this work. A schematic diagram of an
XbarPUF is shown in Fig. 1.

Fig. 2. An embedded processor (either regular or energy harvesting) backs up
data in NVM for data-forwarding, but the NVM is left unprotected when there
is no power. The proposed idea is to secure this NVM using hardware security.

III. PROPOSED SECURITY SOLUTION FOR BACK-UP AND
RESTORE IN NVM
A. Security vulnerability of IoT/embedded processor
The goal of this work is to provide security for the unpro-
tected NVM of an IoT edge device after the system processor
goes into an ultra low power mode or has a power failure for
the case of an energy-harvesting device [26]. In both scenarios,
the NVM is cut off from power and is left unprotected, and is
thus vulnerable to being maliciously read by an adversary, which
would give away sensitive information. This is illustrated in Fig. 2. To
prevent that from happening, we are proposing to secure that
NVM by doing a secure back-up before powering down and
a secure restore after waking up.
The general idea is to apply a random challenge (e.g.
generated from a TRNG) to an on-chip PUF to generate a
secure cryptographic key which can be used to encrypt the
data before back-up. During wake-up, the same key can be
generated again by applying the same challenge to that PUF
and data can be decrypted again before restore. This is shown
in Fig. 3. If the challenge applied to a strong PUF is random,
the response is random as well and also unique across different
devices, thereby providing device-specific security.
In this work, we propose to encrypt the back-up data using
a cryptographic key generated by the XbarPUF. As a strong
PUF, an XbarPUF can generate a large number of responses to
be used as keys. We can generate a unique random key from
this strong PUF each time a back-up operation is needed,
thus effectively implementing a one-time pad (OTP) [27].
Since responses from a PUF might have some unreliable bits,
these bits are eliminated during run-time and only reliable
bits are used in the final key. Then the data is encrypted with
this key and stored in an NVM. A tag is also generated from
the stored data for integrity checking purposes. Both this tag
and the PUF challenge are also stored in an NVM (the same one
as the data or a separate secure memory). When power comes back on,
a new tag is first generated and checked against the stored
tag. If they match, the same stored
challenge is applied to the XbarPUF again. As in the back-up
operation, unreliable PUF response bits are discarded after
reliability enhancement, and a key (which should be equal
to the key generated during back-up) is regenerated. Using this
key, the encrypted data from the NVM are decrypted and then
restored into their respective registers or other memory
locations. The whole process is illustrated in Fig. 3.

Fig. 3. A random challenge is applied to a PUF to generate a cryptographic
key for secure back-up and restoration of data in an embedded processor. A
new unique key is generated each time, thus effectively implementing an OTP.

Since we are targeting IoT devices with limited
resources, we propose to use several smaller PUFs in a
time-multiplexed manner instead of a single large PUF.
This would increase the system delay,
with a reduction in power consumption. A trade-off can
be made depending on the resources of a particular device.
Besides, if the data to be saved is larger in size, several
successive challenges could be applied and the resulting
responses appended together to form a key as large as
this data. More details about this system implementation are
discussed later in this work.
B. System assumptions
The basic assumptions for implementing such a system are
as follows:
1) Random key: By definition, the response of a strong PUF
is random and unpredictable. The metrics used to evaluate a PUF,
like bit-aliasing, uniformity etc., are calculated from Monte
Carlo simulations to show how random the responses from
the PUF that we have used are.
2) Unique key: PUF responses vary from one
implementation to another even for the same challenge, thereby generating
a device-specific unique key. Thus, even if an adversary can gain
access to one chip, he/she won't be able to break the security
of another such chip. The uniqueness metric of a PUF represents
this quality.
Fig. 4. Flow graph for our proposed secure back-up and restore protocol
3) Key length: The key length should be equal to data
length for backup. This is only possible for small embedded
system where the amount of backup (only program states and
some registers) is small. For slightly larger systems, we can
perform back-up in regular intervals to reduce the amount of
back-up needed at a time, thereby meeting this requirement.
4) Large state-space: The PUF should be able to provide a
sufficient number of unique keys throughout the lifetime of
the device under protection. A strong PUF with a sufficiently
large number of CRPs would meet this requirement by definition.
5) Relaxed reliability requirement: The response of a PUF
doesn't need to be stable over a long period of time. It only
needs to be stable in successive clock cycles for one encryption
and decryption. If the response, i.e. the key, varies in a different
encryption-decryption operation, e.g. due to aging, that actually
helps the system to be more secure by making the key unique
even for the same PUF challenge, thus increasing the number of
available keys over the lifetime of a chip.
6) Temporary data security: Data is only valid for a short
period of time. The data only remains important until
just after the system wakes up from sleep or a power failure.
Thus we only need to provide security during this interval.
These assumptions should not be difficult to meet for the
types of system for which we are trying to provide security.
Instead of a large PUF, several smaller PUFs are activated one
by one to generate a large key while not increasing the power.
A reliability enhancement block is used to get clean bits for
cryptographic key from the PUF response bits. The challenge
for the PUF is generated from a TRNG.
IV. PROPOSED SECURITY PROTOCOL
Figure 4 shows the overall flow of our proposed security
protocol. It clearly shows the sequence of operations that a
system would perform with our proposed security solution
when a power failure occurs or low power warning is asserted.
The minimum level of this low power warning is set by the
amount of power or energy required to back-up necessary data
successfully and is expected to be handled at the processor
level. Figure 4 also shows that the key generation is required
both during back-up and restore as the key is not stored
anywhere and is generated during run-time. Thus, one major
benefit is that we don't need to physically store the key and
can generate it whenever it is needed. Moreover, for the
same seed (i.e. PUF challenge), a PUF circuit would produce
different keys in different ICs with the same circuit. Therefore,
even if an adversary can gain access to one device, he/she
would need to expend the same effort to gain access to another
such device. The integrity verification step is crucial as it ensures
that data has not been altered while the system sits at zero power.
The key generation, secure back-up, and restore protocols are
described below.
A. Key generation
Algorithm 1 describes the key generation process from a
PUF along with the reliability enhancement technique. As
mentioned before, this involves applying a challenge to the
PUF, storing responses and finally generating a clean key by
applying run-time reliability enhancement technique.
Responses are generated multiple times (at least twice)
from the XbarPUF to detect the more error-prone response
bits. The main source of unreliable bits in an
XbarPUF is the case where the cycle-to-cycle variation of two
memristors is larger than their process variation. During
device testing, severely unreliable bits, i.e. bits with very high
flipping probability, can be discarded. By applying the same
challenge multiple times, we should be able to find other cases
and reduce bit-errors during run-time. Temporary storage
such as SRAM is used to implement this. Different challenges
cause the response to depend on different sets of memristors
of the crossbar; thus, for different challenges, different bits
of the response would flip, but they should be the
same bits for the same challenge. In an extreme case, where the
bit-error rate is so high that the necessary number of key bits
cannot be produced from the response, the VALID_KEY token is set
to false, which indicates an unsuccessful key generation.
B. Secure back-up
The data is encrypted (XORed in the simplest case) with the
generated key and saved in an NVM. Then a tag is generated
for memory integrity purposes. These steps are carried out by a
hardware security (HS) module that is activated only when there is a
power failure or the processor wants to go into an ultra low
power mode. The sequence of operations when such a situation
occurs is described in Algorithm 2.
Since memristors are non-volatile and have long retention
time [15], the power failure phase can be long and still won't
affect the data.
C. Secure restore
When the power is back on, the system first needs to ensure that
there is sufficient power for the decryption of the data. Then
it generates a new tag from the stored data and checks whether it
matches the stored tag for the same data. If they match, the key
generation process is activated again by activating the PUF
along with the reliability enhancement technique to generate
a secure key. If there is a tag mismatch, the process
simply discards the data. The sequence of operations when
power is back on is described in Algorithm 3.

Input: PUF challenge, C
Result: Cryptographic key: KEY; status: VALID_KEY
Initialization:
    T ← no. of samples for reliability enhancement
    m ← no. of parallel small PUF blocks
    n ← no. of response bits from each small PUF block
    resp ← m × n                        // total no. of response bits
    nKey ← required no. of key bits
Function Key_Generator(C):
    Enable(PUFs)
    for i ← 0 to T − 1 do
        for j ← 0 to m − 1 do
            R[i,j] ← apply_challenge_to_PUF(j, C)
            save_in_SRAM(R[i,j])
        end
    end
    for k ← 0 to resp − 1 do
        if bit error at position k then // bit k differs among the T samples
            discard_bit(R[k])
        else
            KEY.append(R[k])
        end
        if length(KEY) == nKey then
            VALID_KEY ← TRUE
            break
        end
    end
    if length(KEY) < nKey then
        VALID_KEY ← FALSE
    end
    return KEY
End Function
Algorithm 1: Proposed secure and error-free key generation
from PUF

Data: States and other data to be backed-up, DAT
Result: Encrypted data, enc_dat, and tag in NVM
Secure backup:
if Low-power-warning is asserted then
    Enable(HS block)
    C ← TRNG()                          // random challenge
    KEY ← Key_Generator(C)
    enc_dat ← Encrypt(KEY, DAT)         // XOR(KEY, DAT)
    write_in_NVM(enc_dat)
    tag ← gen_tag(enc_dat)
    save(C, tag)
    Disable(HS block)
else
    Continue regular operation
end
Algorithm 2: Proposed secure data back-up operation

Data: Encrypted data: enc_dat, challenge: C, and tag
Result: Restored data, DAT back in registers
Restore data:
while P_available < P_threshold do
    wait
end
Enable(HS block)
enc_dat ← read_from_NVM(stored data)
C ← read_from_NVM(stored challenge)
old_tag ← read_from_NVM(tag)
new_tag ← gen_tag(enc_dat)
if new_tag = old_tag then
    KEY ← Key_Generator(C)              // also sets VALID_KEY
end
if new_tag = old_tag & VALID_KEY = TRUE then
    dec_dat ← Decrypt(KEY, enc_dat)     // XOR(KEY, enc_dat)
    write_back_in_registers(dec_dat)
else
    flush_data()
    restart_processor()
end
Disable(HS block)
Algorithm 3: Proposed secure data recovery operation
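The three algorithms can be exercised end to end in software. The following Python sketch is only an illustration under stated assumptions: `ToyPUF` is a hypothetical stand-in that derives responses from a hash of a per-device secret and toggles two fixed bits on every evaluation (it does not model the XbarPUF), the challenge comes from the operating system's random source rather than a memristive TRNG, and `gen_tag` uses SHA-256 as a placeholder for the in-memory sneak-path tag.

```python
import hashlib
import secrets


class ToyPUF:
    """Hypothetical software stand-in for a PUF (not a model of the XbarPUF)."""

    def __init__(self, device_secret, n_bits=32, unstable=(3, 17)):
        self.device_secret = device_secret  # plays the role of process variation
        self.n_bits = n_bits
        self.unstable = unstable            # bit positions that flip every evaluation
        self._evaluations = 0

    def respond(self, challenge):
        digest = hashlib.sha256(self.device_secret + challenge).digest()
        bits = [(digest[i // 8] >> (i % 8)) & 1 for i in range(self.n_bits)]
        self._evaluations += 1
        for i in self.unstable:
            bits[i] ^= self._evaluations & 1
        return bits


def key_generator(puf, challenge, n_samples=2, n_key=16):
    """Algorithm 1: keep only bits that agree in all samples."""
    samples = [puf.respond(challenge) for _ in range(n_samples)]
    key = []
    for k in range(puf.n_bits):
        if len({s[k] for s in samples}) > 1:
            continue  # bit error detected: discard this bit
        key.append(samples[0][k])
        if len(key) == n_key:
            return key, True
    return key, False  # not enough reliable bits: VALID_KEY is false


def gen_tag(bits):
    # Placeholder tag (assumption); the paper generates the tag in memory.
    return hashlib.sha256(bytes(bits)).hexdigest()


def secure_backup(puf, data_bits):
    """Algorithm 2: encrypt with a fresh PUF key, store data, challenge and tag."""
    challenge = secrets.token_bytes(4)
    key, valid = key_generator(puf, challenge, n_key=len(data_bits))
    if not valid:
        raise RuntimeError("not enough reliable PUF bits for a key")
    enc = [d ^ k for d, k in zip(data_bits, key)]
    return {"enc_dat": enc, "challenge": challenge, "tag": gen_tag(enc)}


def secure_restore(puf, nvm):
    """Algorithm 3: verify the tag, regenerate the key, decrypt."""
    if gen_tag(nvm["enc_dat"]) != nvm["tag"]:
        return None  # tag mismatch: flush data and restart
    key, valid = key_generator(puf, nvm["challenge"], n_key=len(nvm["enc_dat"]))
    if not valid:
        return None
    return [e ^ k for e, k in zip(nvm["enc_dat"], key)]


puf = ToyPUF(device_secret=b"chip-A")
state = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
nvm = secure_backup(puf, state)
restored = secure_restore(puf, nvm)
```

Because the toy unstable bits flip on every evaluation, two samples are enough to discard them, and the key regenerated at restore equals the key used at back-up. A real PUF flips bits probabilistically, which is why the undetected-error analysis of the next section matters.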
V. RELIABILITY ENHANCEMENT TECHNIQUE
To improve the reliability of our PUF response, we have
considered two different voting approaches to identify bit-flips
in a PUF response and eliminate those from the final key.
These two techniques, namely majority voting and all-agree
voting are described below.
A. All-agree voting
For this technique, we apply the same challenge
$N$ times to produce $N$ samples of the same response.
Some unstable bits in the response will flip,
and according to the all-agree voting technique, we only
accept a response bit as part of the final key if that bit has not
flipped even once in $N$ evaluations. One of the benefits of
this technique is that it can be implemented fairly easily with
minimal hardware, e.g. using XORs. However,
even after using this technique, some bit-flips may still remain
undetected. Our target is to make that probability very small.
With this technique, a bit-error results in a key
error if a bit happens to be stable during encryption, but flips
during decryption, resulting in a different key. Suppose the
probability of a particular bit taking its more stable value (stable '1'
or stable '0') in a single evaluation is $p$. The probability of that
bit taking the opposite value is then $1-p$. Thus, for $N$ evaluations,
the probability that the bit stays stable is $p^N$,
while the probability that it flips at least once is $(1-p^N)$. The
probability that an error due to such a bit-flip propagates without
being detected by all-agree voting is given by equation 1.

$$P_{E1} = p^N \left(1 - p^N\right) \tag{1}$$

Another way for an error to propagate is when the
bit remains stable in its less-stable state during encryption and
flips during decryption. The probability of that is given by
equation 2.

$$P_{E2} = (1-p)^N \left(1 - (1-p)^N\right) \tag{2}$$

Fig. 5. 'All-agree' reliability enhancement technique for different number of
evaluations for different bit-flip probability. It is more efficient at detecting
highly unstable bits with increasing number of evaluations. (Plot: probability
of an error being undetected versus probability of being stable, all-agree
(veto) voting, for N = 2, 4, 10, 20, 50, 100.)

Now in both of these cases, we consider that the bit remains
stable during encryption and flips during decryption. The same
equations hold for the case where a bit flips during encryption
but remains stable during decryption. Thus the probability of
an error being propagated into the cryptographic key is given
by equation 3:

$$P_{E1,2} = 2\left[p^N\left(1-p^N\right) + (1-p)^N\left(1-(1-p)^N\right)\right] \tag{3}$$

However, this equation counts two cases twice: the bit
produces all '1's during encryption but
all '0's during decryption, and vice-versa. The probability of
such a case is:

$$P_{E(\text{all zeros}+\text{all ones})} = 2\left[p^N (1-p)^N\right] \tag{4}$$

Both equations 1 and 2 include this situation, so it is
counted twice in the overall
error propagation probability given by equation 3. Subtracting
it from equation 3 gives the final expression
for the probability of a bit-flip being undetected by all-agree voting:

$$P_{E(\text{all-agree voting})} = 2\left[p^N\left(1-p^N\right) + (1-p)^N\left(1-(1-p)^N\right) - p^N (1-p)^N\right] \tag{5}$$

We have plotted this equation in Fig. 5 for a few different
‘N’ (no. of samples) and all different values of ‘p’ (probability
of a particular bit being stable at a binary value).
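As a quick numerical check of equation 5, the expression can be evaluated directly. The short Python sketch below is illustrative only; the function name `all_agree_undetected` is ours and not part of the original design. For $p = 0.5$ and $N = 2$ it evaluates to 0.625, and the value drops as $N$ grows, in line with the trend described for Fig. 5.

```python
def all_agree_undetected(p, n):
    """Equation (5): probability that a bit-flip escapes all-agree voting."""
    stable = p ** n          # all n samples show the more stable value
    opposite = (1 - p) ** n  # all n samples show the less stable value
    return 2 * (stable * (1 - stable)
                + opposite * (1 - opposite)
                - stable * opposite)


# Sweep a few sample counts for a maximally unstable bit (p = 0.5).
for n in (2, 4, 10):
    print(n, all_agree_undetected(0.5, n))
```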
In a PUF, there could be a few bits with a high bit-flip
probability while most of the other bits should remain stable.
In our particular XbarPUF, as we show later, out
of the 32-bit PUF response there are on average fewer than
2 bits with a high flipping probability, while the other bits
remain stable over a long period of time. Thus, if we can
eliminate these few bits with very high bit-flip probability,
the system will have a small probability of generating
different keys during encryption and decryption, and the
need to restart the system will arise less often.

Fig. 6. Effectiveness of majority voting for reliability enhancement technique
for different number of evaluations and with different bit-flip probability.
(Plot: probability of an error being undetected versus probability of being
stable, majority voting, for N = 3, 5, 11, 25, 51, 101.)

From Fig. 5, it is clear that with a larger number of samples
(N), the all-agree voting scheme is more efficient at eliminating
bits with high flipping probability. On the other hand, the majority
voting scheme performs poorly at identifying bits with high
flipping probability (near 50%) even when a large number of
samples is used, as shown in Fig. 6. Both of these results
are intuitive. Therefore, a good way to reduce bit-errors would
be to identify highly unstable bits during chip testing and
keep a record for each different challenge bit. Then, during
regular operation, the remaining bits in the PUF response would have a very
low probability of flipping, and flips can be detected using
low-cost all-agree voting with a very small number of samples
(i.e. 2). Increasing the number of samples would increase the
overhead but would be more efficient at eliminating higher
bit-error rates.
B. Majority voting
Majority voting is a very well-known error
correction technique. To employ this technique here, the same PUF
challenge is applied a fixed number of times
and a bit is taken as '1' if it produces '1'
more often than '0' among those evaluations, and as '0' otherwise. Thus,
unlike all-agree voting, bits are not discarded here; rather,
voting is used to determine their more stable binary value.
The probability of determining the correct binary value of a bit
using majority voting with $N$ samples
is given by the following equation:

$$P_{E(\text{majority voting})} = \sum_{r=\frac{N}{2}+1}^{N} \binom{N}{r} p^{r} (1-p)^{N-r} = 1 - \sum_{r=0}^{\frac{N}{2}} \binom{N}{r} (1-p)^{r} p^{N-r} \tag{6}$$

Fig. 6 shows the plot of this equation for different numbers
of samples and for different bit-flip probabilities. With larger
N, the curve gets narrower, i.e. it can detect more
bit-flips effectively. This technique, however, performs poorly for bits
which have a high flipping probability (0.4-0.6), as the peak of
the curve remains the same for any number of samples.
The overhead associated with majority voting could be high
for a larger number of evaluations. For example, for an N-sample
majority voting, we would need counters that can count up to
N for each response bit, and $\log_2 N$ bits of memory to store
the results for each response bit as well.
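The first sum of equation 6 is a binomial tail and is easy to evaluate numerically. The Python sketch below is an illustration only: the function name is ours, and it assumes an odd number of samples so that the lower limit $N/2 + 1$ can be read as the integer $\lfloor N/2 \rfloor + 1$. For $p = 0.9$ and $N = 3$ it gives 0.972, and for $p = 0.5$ it gives 0.5 for every odd $N$, which illustrates why extra samples do not help for bits that flip about half the time.

```python
from math import comb


def majority_value_probability(p, n):
    """First sum of Eq. (6): P(more than n/2 of n samples show the value
    that each sample shows with probability p). Assumes n is odd."""
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of samples")
    return sum(comb(n, r) * p ** r * (1 - p) ** (n - r)
               for r in range(n // 2 + 1, n + 1))


for n in (3, 5, 11):
    print(n, majority_value_probability(0.9, n))
```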
C. Proposed reliability enhancement technique
We have seen from Figs. 5 and 6 that all-agree voting is
effective at eliminating bits with high flipping probability, while
majority voting is good at handling bits with lower flipping
probability. Overhead increases with
an increasing number of samples for majority voting, whereas
all-agree voting can be implemented using XORs and one
additional bit per response bit for any number of evaluations.
To reap the benefit of both techniques, we propose to
use all-agree voting with larger N during chip functionality
testing to eliminate bits with high flipping probability, and to use
majority voting with small N during run-time to reduce
bit-errors. If all-agree voting can eliminate highly unreliable
bits beforehand, then majority voting can be used with very few
samples during run-time, thereby also ensuring lightweight
operation. However, in cases where some bits with
high flipping probability remain, all-agree voting should be used, as
shown in our hardware implementation later.
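The difference in bookkeeping between the two schemes can be sketched in a few lines of Python. This is a behavioural illustration under our own assumptions, not the transistor-level circuit: the all-agree filter keeps one reference sample plus one sticky flag bit per response bit (XOR to detect a disagreement, OR to remember it), while majority voting keeps one counter per response bit. The function names are hypothetical.

```python
def all_agree_filter(samples):
    """Return (reference, flipped); flipped[k] is 1 if bit k ever disagreed."""
    reference = list(samples[0])
    flipped = [0] * len(reference)  # one extra bit per response bit
    for sample in samples[1:]:
        for k, bit in enumerate(sample):
            flipped[k] |= reference[k] ^ bit  # XOR detects, OR keeps the flag set
    return reference, flipped


def majority_vote(samples):
    """Count the ones per bit position and compare with half the sample count."""
    n = len(samples)
    ones = [sum(column) for column in zip(*samples)]  # one counter per bit
    return [1 if 2 * count > n else 0 for count in ones]


samples = [[1, 0, 1, 1],
           [1, 0, 0, 1],
           [1, 0, 1, 1]]
reference, flipped = all_agree_filter(samples)
clean_bits = [b for b, f in zip(reference, flipped) if not f]
voted_bits = majority_vote(samples)
```

In this example the third bit disagrees once, so all-agree voting drops it and yields three clean bits, whereas majority voting keeps all four positions and resolves the third bit to '1'.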
Depending on the bit-flipping profile of a particular PUF,
the overhead associated with the implementation of either all-
agree or majority voting schemes along with the allowable
power, area, delay constraints and security requirements of a
system, we can choose a particular reliability enhancement
technique over the other. The idea is not to eliminate
bit-errors altogether, but to reduce their probability to such an extent
that the cost or overhead associated with full system restarts
(when there is a key error) would be offset by the smaller
overhead associated with that particular technique.
VI. SYSTEM IMPLEMENTATION
One of the main contributions of this work is that we have
implemented our proposed security protocol at the circuit level to
get a more realistic idea of the implementation complexity
and system overhead. Specifically, we have designed our
system at the transistor level with 65nm CMOS technology in
Cadence Virtuoso. Thus all our data, including security metrics
and resource overhead, are calculated at this level, thereby
giving us a more accurate estimation of a real-world
implementation. This prototype system generates a 16-bit key from a
32×32 XbarPUF. Fig. 7 displays the different components of
our proposed system and these are described in the following
subsections.

Fig. 7. System block diagram showing different components of our proposed
hardware security design.

A. True random number generator (TRNG)
We need to provide a random, preferably unique challenge to
the PUF during each back-up to generate a unique response.
Thus a random challenge acts as a random seed for
the key generation block. For that purpose, we propose
to use a true random number generator (TRNG) to produce
random challenge vectors. The TRNG output should not be
affected by environmental changes, as that might compromise
the randomness of the TRNG and thus the security of the
whole system. A deployable embedded device might go through
various environmental changes. Thus the TRNG needs to be
robust against them. Researchers have shown
good TRNGs built from memristors [18]. In this work, we
propose to use the memTRNG introduced in [14]. The required
area of this TRNG is small, it is compatible with our design as
it also uses memristors to generate random numbers, and,
importantly, it is robust against temperature and supply voltage
changes as analyzed in [14].
B. Time-multiplexed XbarPUF
For a memristive crossbar PUF, the major contributing factor
to total power consumption is its RESET phase [28], when all
the memristors are RESET from LRS to HRS and static current
flows through the memristor crossbar. Response generation
is parallel in the XbarPUF and, therefore, with an increasing
number of response bits, there is a linear increase in static
current as well. Thus power consumption in the RESET phase
might be the bottleneck of the system because of its larger
peak power demand. To reduce this power, we have divided
the XbarPUF into separate smaller PUFs, each operating
at a different time and generating one portion of the response
vector. For this prototype design, the 32×32 XbarPUF is
divided into four 32×8 XbarPUFs. Depending on the
system requirements and available resources, this number would
change. A state machine with four states for these four PUFs is
designed so that in each state, a single PUF is RESET, the
challenge is applied, and the response is generated and saved in a
temporary memory. This also implies that it requires 4×
more clock cycles to generate all the responses compared to a
single 32×32 PUF, but the average power during RESET is
also reduced approximately by a factor of four. Area is slightly
increased due to the need for separate read-write circuitry for
each XbarPUF block. Each XbarPUF design is identical and
implemented as shown in Fig. 1 [22], [28].
C. Reliability Enhancement Block
We have already introduced the theory and rationale behind
our reliability enhancement block. For our demonstration,
we are using a 32×32 PUF to generate a 32-bit
response from which a 16-bit key is extracted. The number
of extra bits to be used can be estimated by simulating the
reliability of the PUF in its worst-case operating condition. We
are implementing a simple two-count majority/all-agree voting
system here using two SRAMs. A single unique challenge is
applied to the PUF two times and the two resulting response
vectors are saved in two different temporary
memories, i.e. SRAMs. One SRAM holds the PUF response
generated in the first clock cycle while the second one holds
the response generated in the second cycle. We have designed
the two SRAM blocks ourselves, each sized 4×8 to hold the
8-bit response from each of the 4 PUFs. SRAM cells are sized
appropriately to ensure minimum delay and area as well as
a minimum chance of a read or write error. For a complete
system, existing volatile memory can be used for this purpose,
thus requiring no extra hardware. To increase the confidence
in the reliability of the key, we can repeat these steps
over multiple cycles for the same challenge to gather more
information about bit-flips, at the expense of larger overhead.
D. Stable Key Generation and Cryptographic Block
During the key generation phase, the contents of the two SRAMs
are compared bit by bit and the first 16 reliable bits are taken
as the key. A counter keeps track of the number
of reliable bits and once that count reaches 16, it stops and
the 16 ‘clean’ bits are ready to be used as the key. To reduce
delay, a whole row of 8-bit data is written into an SRAM at a
time. Thus it takes 4 clock cycles to write all 32 response bits
from the 4 XbarPUFs. The second SRAM takes another 4 cycles.
Since data access to the RRAM is 1 bit at a time, the two SRAMs
are also read bit by bit. Two bits from the same location of
the SRAMs are read at a time and XORed to check whether they
are the same, and a bit is only used as part of the key if they
are. Then one bit of the data is XORed with this valid key
bit and can be temporarily saved in a FIFO (first in first out).
Finally, data from the FIFO can be written into the non-volatile
memory, which is RRAM in this case. The 16-bit XOR block
is used as both the encryption and the decryption block. Thus our
proposed system implements a homomorphic encryption. If
the system can allow more resource overhead, then we can
swap this XOR based encryption with any robust encryption
system using that same key. Moreover, instead of comparing
bit by bit, we can also compare multiple bits at a time to
increase the speed of the back-up operation, although the speed
gain is limited by the memory access speed.
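The key extraction and XOR step described above can be sketched in Python as follows. This is an illustrative software model under stated assumptions, not the transistor-level design: the two SRAMs are modeled as bit lists, and the names `extract_key` and `xor_bits` and all bit values are hypothetical.

```python
from typing import List, Optional

KEY_BITS = 16  # key length used in the prototype


def extract_key(first: List[int], second: List[int],
                key_bits: int = KEY_BITS) -> Optional[List[int]]:
    """Keep the first `key_bits` positions where both readings agree."""
    key: List[int] = []
    for a, b in zip(first, second):
        if a ^ b == 0:                # XOR is 0 when the two readings match
            key.append(a)
            if len(key) == key_bits:  # the counter reaches 16: stop
                return key
    return None                       # not enough reliable bits


def xor_bits(data: List[int], key: List[int]) -> List[int]:
    """XOR data with the key; the same call encrypts and decrypts."""
    return [d ^ k for d, k in zip(data, key)]


# Hypothetical 32-bit response; the second reading flips bits 2 and 9.
first = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1,
         0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
second = list(first)
second[2] ^= 1
second[9] ^= 1

key = extract_key(first, second)
data = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0]
cipher = xor_bits(data, key)
recovered = xor_bits(cipher, key)
```

In this sketch 18 response bits are consumed to obtain 16 key bits, because the two disagreeing positions are skipped.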
E. RRAM as Non-Volatile Memory
We have used memristors to design two completely different
circuits in this work. The first one is the XbarPUF. But
we are also using memristors as the NVM of our system. We
need a minimum of 16 bits of RRAM to hold 16-bit encrypted
data. However, to keep things generic, we are assuming that
this NVM is being used for holding other information as well
and thus we are using a larger 5×12 RRAM. The HRS and
LRS of a memristor are considered as logic ‘0’ and logic
‘1’, respectively. This RRAM is a 1T1R structure, i.e. each
memory cell consists of a memristor and an access transistor.
The sizing of this access transistor depends on the HRS and
LRS values of the memristor and is chosen to maximize the
read-write noise margin while minimizing power, area, and delay.
More details about this RRAM design are skipped for brevity.
F. Sneak-path based Tag Generation for Data Integrity Verification
The purpose of the PUF based encryption is to provide
confidentiality to the system. It prevents an attacker from
reading out sensitive information from the system. However, it
doesn’t provide data integrity and therefore any modification
of the non-volatile memory in sleep mode would go
unnoticed. A lightweight integrity checking system targeted
at memristive memory is used here in order to detect
unauthorized modification of the system information saved in the
memory. This integrity checking scheme leverages sneak
path current based tag generation from a crossbar memristive
memory. In this tag generation method, sneak path currents in
a crossbar memristive array are read using multiple columns
and converted to digital bits. Details about this system can
be found in [13]. This scheme does not require an additional
hash function or message authentication code (MAC) since the
memory itself is used as the tag generator. It has therefore been
proposed as a security primitive for resource constrained
systems [29]. This integrity check is crucial for our system as it
is able to detect any data error, key error (including one caused
by unreliable key bits), or unauthorized
modification of backed-up data during the low power mode of
a processor. During wake-up, the system calculates a new tag and
compares it with the stored tag. If these two tags do not match,
then the processor rejects the back-up data and restarts
the whole computation.
G. State Machine and Control Logic Design
We have used Johnson counters and ring counters to control
different areas of the circuit and to design the whole state
diagram for our proposed system. For the case of a
resource constrained system, this security block is only enabled
whenever there is a low power warning or whenever the system
is recovering from a power failure. For regular embedded
processors, this security block would be enabled just before
a sleep and just after a wake-up to allow for encryption or
decryption, respectively. In the first two states, the XbarPUF is
activated twice to produce two sets of 32-bit responses using
the same challenge. To reduce the peak power consumption
while a whole matrix of memristors is written in these states,
the XbarPUF is divided into four smaller PUFs as mentioned
before. Therefore, in each of the first and second states of the
system, there are four sub-states, one for each smaller XbarPUF
block. Responses from each smaller XbarPUF are saved in the two
SRAMs. Each of these sub-states again has three different
phases: Reset-all, Challenge, and Read, to generate a response
vector from an XbarPUF block. The next few states together
are used to read from these SRAMs one bit at a time and
compare the two bits with each other using an XOR block. If they
match, the bit is accepted as a key bit and the data bit
encrypted with it is saved in the RRAM. In the
next state, the tag generation control block is activated and a
tag is generated from the RRAM. This tag needs to be saved
and is stored in a secure memory.
During wake-up, a new tag is generated from the RRAM
data and compared with the stored tag. If they do not match,
the system alerts the processor that the system
might be compromised, and all backup data are discarded.
But if the tags match, then the PUF is activated again and
the response and key are regenerated. Then data from the
RRAM is read bit by bit and XORed with the corrected PUF
responses (enc-dec key) from the two SRAMs to get the
original decrypted data bits. This process can easily be made
faster by allowing multiple-bit (e.g. a byte at a time) read-write
access into the RRAM.
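The back-up and wake-up sequence described above can be summarized with the following Python sketch. It is a protocol-level illustration only. The function `standin_tag` is a hypothetical placeholder built on SHA-256 and is not the analog sneak-path tag generation of [13]; it is also far wider than the tags used in this work. The key is passed in directly, whereas the real system regenerates it from the PUF.

```python
import hashlib
from typing import List, Optional, Tuple


def standin_tag(nvm_bits: List[int]) -> bytes:
    """Placeholder integrity tag; NOT the sneak-path method of [13]."""
    return hashlib.sha256(bytes(nvm_bits)).digest()


def back_up(data: List[int], key: List[int]) -> Tuple[List[int], bytes]:
    """Encrypt the data bit by bit, then tag the encrypted NVM contents."""
    nvm = [d ^ k for d, k in zip(data, key)]
    return nvm, standin_tag(nvm)


def restore(nvm: List[int], stored_tag: bytes,
            key: List[int]) -> Optional[List[int]]:
    """Check the tag first; decrypt only if the NVM is unmodified."""
    if standin_tag(nvm) != stored_tag:
        return None  # reject the back-up; the processor restarts
    return [c ^ k for c, k in zip(nvm, key)]


# Hypothetical 16-bit state and key.
data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1]
key = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0]

nvm, tag = back_up(data, key)
restored = restore(nvm, tag, key)

tampered = list(nvm)
tampered[3] ^= 1  # a single malicious bit flip during sleep
rejected = restore(tampered, tag, key)
```

The ordering matters: the tag is computed over the encrypted contents, so the integrity check at wake-up can run before the PUF is activated again.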
It is very important to ensure that no two states are active
at the same time, even for a very short period of time, because
that might enable multiple security blocks or PUFs.
Non-overlapping clock generation circuits are used at the output
of the state decoders to prevent this situation arising from slow
transitioning signals, which might create metastability in the
design. Two-flip-flop synchronizers are also used to
ensure reliable data transfer from one clock domain to another,
where both clocks have the same frequency and a constant
phase difference in this application.
VII. PROBABLE ATTACK SCENARIO
A. Malicious read
This is the main motivation behind this work. An attacker
might try to read the contents of the backup non-volatile
memory and thus gain sensitive information. Our PUF-key
based one time pad (OTP) encryption scheme encrypts the
data to prevent direct interpretation of sensitive information.
As we have explained before, we are effectively implementing
an OTP here. OTP is theoretically the most secure encryption
if we can fulfill its requirements: (1) a random key, (2) a key
that is changed for each encryption, and (3) a key as large as
the data. Our key is random as it comes from a PUF, and in a
small embedded system where the amount of backup is small, the
key can be made as large as the data. Moreover, since the backup
operation is infrequent and the state-space of a strong PUF
is large, a PUF key is unlikely to be repeated in a practical
time-frame and thus a replay attack is improbable. Key sharing
is another weakness of an OTP implementation, which is not a
concern here as we are not communicating with the outside world
using this key. Therefore, we are fulfilling all the requirements
of an OTP and ensuring maximum security with a unique
random key.
B. Malicious write
Instead of trying to read the information, an attacker might
try to change the contents of the RRAM arbitrarily, thereby
causing an erroneous calculation. A data error or key error can
also result in an erroneous calculation, especially during a power
failure or a time-sensitive back-up operation. Sneak-path based
one-way memory integrity checking should be able to detect any
kind of data alteration. During wake-up, the program calculates the
tag from the backup memory and checks it against its saved tag. If
they do not match, the processor simply rejects the backup data
and restarts the operation. This increases the overhead,
as the processor can’t make use of previous states and data, but
it makes the system robust against any erroneous or harmful
processing of information.
C. Readout or alteration of the PUF challenge or secure tag
The challenge used to generate the PUF response and the tag
are saved in a secure NVM, assumed to be inside the CPU of the
embedded processor. Here, we assume that an attacker wouldn’t
have access to these few bits of secure memory. If the
PUF is robust against modeling attacks, as shown in [30]
for the XbarPUF, just getting access to the challenge shouldn’t
compromise the security. Moreover, continuous access to the
PUF is not permitted in this system as the PUF is only used
at certain stages. Finally, illegally changing the PUF
challenge would almost definitely change the PUF response
and, therefore, the encryption key, which would result in
erroneous data. Pre-computing a tag or hash of the encrypted
data helps to prevent such scenarios and to detect alteration
of either the data or the key.
The tag is calculated from the sneak path currents
of the crossbar of memristors [13], i.e. this is an analog
in-memory computation. Because of the analog nature of
memristors and their die-to-die and cycle-to-cycle variation, it
should be very difficult to reproduce a memristor’s exact resistive
state in order to regenerate the same sneak-path current and the
same tag. Thus once the data memory and tag are written, any
attempt at writing should corrupt the data and the tag, and
the processor would be able to detect it easily.
D. Modeling attacks
One concern is that PUFs, especially those which can be
modeled with an additive linear delay model (like the arbiter PUF),
are susceptible to machine learning based modeling attacks
[31]. The idea is that an attacker gathers a subset of
challenge-response pairs from a strong PUF and, using that small
subset, builds a machine learning model that can predict
responses for unknown challenges. This effectively reduces the
number of useful keys available from a strong PUF. In previous
works, it was shown that memristor based PUFs are also
vulnerable to machine learning modeling attacks. However,
unlike authentication applications where a PUF is requested
to generate a few different responses from different challenges
to compare with the stored database for that PUF, the PUF in
our proposed system does not give any access to its keys
during operation because they are not saved anywhere.
In an extreme scenario, an adversary can gain control of
a device and cause it to back up and restore continuously
with known data values, so that it can gain information about
the responses by maliciously reading the back-up data
and challenge bits. To prevent this from happening, a PUF
should be able to resist modeling attacks. A simple and
lightweight modification is to introduce response bit XORing
and column shuffling, which can drastically reduce
the modeling accuracy of an XbarPUF, thereby increasing the
robustness against machine learning based modeling attacks.
These techniques are discussed in detail in [30]. Moreover,
any device with our proposed system should also have a
tamper detection mechanism so that the number of
back-ups in a given time cannot exceed a certain value. This
would prevent an attacker from building a database in a short
period of time.
Fig. 8. Average uniqueness results for 500 different chips. Even for different
challenges, this value is very close to the ideal value of 0.5. (x-axis: different
challenges; y-axis: uniqueness.)
VIII. SECURITY PROPERTIES
As we have mentioned before, we have implemented a
32×32 XbarPUF as the key generator. First, we have evaluated
this PUF in terms of several security metrics listed below [32].
• Uniqueness
• Uniformity
• Bit-Aliasing
• Diffuseness
• Reliability
• Steadiness
These metrics are defined and explained in detail in [32].
Interested readers are encouraged to consult it for the formal
definitions and equations. Uniqueness
measures how different the responses are from one chip to
another for the same challenge. Bit-aliasing determines the
evenness of 1’s and 0’s in each bit position of a response
across different chips. Thus both of these metrics evaluate a PUF’s
response across different devices. To evaluate uniqueness and
bit-aliasing, we have performed Monte Carlo analysis for 500
different chips, each with 25 different challenges. Uniqueness
results are shown in Fig. 8. Our designed XbarPUF displays a
near-ideal (50%) uniqueness value for all these different
challenges, which shows the strength of the implemented
system as an unclonable hardware security module. The results
for bit-aliasing for each of the 32 bits of a response vector for
different challenges are shown in a box-plot in Fig. 9. For all
bits, the bit-aliasing value is within the [0.45, 0.55] window
and the average bit-aliasing of all these bits is about 0.5, very
close to its ideal value.
Fig. 9. Summary of results for bit-aliasing for all 32 bits from 500 different
chips. The minimum and maximum value for each bit are also shown.
(x-axis: response bits; y-axis: bit-aliasing; series: average, maximum, minimum.)
Fig. 10. Summary of results for average uniformity, diffuseness and steadiness
from 500 different challenges and for 25 different chips. (x-axis: different chips;
series: uniformity, diffuseness, steadiness.)
Uniformity and diffuseness measure a PUF’s performance
across the challenge space. For the same chip, if different
challenges are applied, the responses should be as different
as possible and ideally these metrics should be equal to
0.5. Steadiness is a metric that evaluates the stability of a
PUF’s response. It represents the bias of individual response bits
of a PUF on average. Specifically, it measures the degree
of bias of a particular bit towards a ‘1’ or ‘0’ over many
different cycles as defined in [32]. To evaluate uniformity
and diffuseness, we have run Monte Carlo simulations for 500
random challenges, for 25 different chips. We have evaluated
steadiness for 25 different chips for 500 cycles. Results for
uniformity, diffuseness, and steadiness are shown in Fig. 10.
As can be seen, although these numbers deviate from the ideal
value for different chips, they do not show a very large
deviation and display an average uniformity of near 0.5 and
an average diffuseness of near 0.5. This shows the applicability
of this design as a strong PUF, capable of generating unique
keys for different challenges. It can also be seen from this
figure that the steadiness for these chips is very close to the
ideal value of 1.
Fig. 11. Detailed reliability results generated from 500 different cycles for
25 different chips and 25 different challenges.
One of the major concerns around PUFs is their reliability.
Because a PUF generates a device-specific signature from the
tiniest process variations, any small
change in environment or the presence of noise can make
a PUF’s response prone to change undesirably. Reliability,
which is closely related to stability (the distinction is explained
in [32]), describes how reliable a PUF’s response is when the same
challenge is applied again and again [32]. Because reliability
might be the most important and most concerning metric for
evaluating a PUF, we have shown a detailed reliability result
using Monte Carlo analysis for 500 different clock cycles,
for 25 different challenges in 25 different chips. This result
is shown in Fig. 11. This 3-D plot shows that for all chips
and all different challenges, the XbarPUF shows at least 92%
and on average 98% reliability. However, it is to be noted
that this result is for the XbarPUF directly, before applying
any reliability enhancement technique. With our proposed
reliability enhancement technique, the idea is to take only
reliable bits from the response.
Fig. 12. Average number of bits needed to produce 16 ‘good’ bits,
from 50 different chips. The results are generated for three different
cycle-to-cycle variations (2%, 5%, and 10%) and a 50°C temperature change.
(x-axis: challenge; y-axis: no. of bits.)
After evaluating the security of the PUF, we are now
interested in evaluating the security of the tag generation
method that we have used in this work. This is a sneak-
path based tag generation, specific to RRAM or memristive
memory, and the details can be found in [13]. Three metrics to
evaluate any tag generation or hashing method are: uniformity
(different from a PUF’s uniformity), diffusion, and avalanche
effect. Uniformity dictates that the probabilities of each tag bit
being 0 or 1 should be equal. To fulfill the
diffusion property, if 1 bit of the data is changed, each tag bit
should have a probability of 0.5 of being flipped.
Avalanche effect means that the output tags should be
very different for two very similar data, possibly differing by
just 1 bit. This means that even if 1 bit of data is changed, almost
half of the tag bits should change. Table I presents the
results for this tag generation method. This result is slightly
different from the one presented in [13] as it is regenerated
for the memristor type and crossbar used in this work.
TABLE I
SECURITY PROPERTIES OF THE TAG GENERATION METHOD
Tag size   Uniformity   Avalanche   Diffusion
6          0.9869       0.4953      0.4971
8          0.9637       0.4951      0.5062
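To make the avalanche and diffusion properties concrete, the following Python sketch measures them for a stand-in tag function. Everything here is an assumption for illustration: `standin_tag` uses a truncated SHA-256 digest and is not the sneak-path tag generation of [13], the 12-bit data width is chosen only to keep the exhaustive loop small, and the metric definitions follow the informal descriptions above rather than the formal ones in [13]. The numbers it produces are unrelated to Table I.

```python
import hashlib
from typing import List, Tuple

DATA_BITS = 12  # hypothetical data width, kept small for an exhaustive loop
TAG_BITS = 8


def standin_tag(data: int) -> int:
    """Placeholder 8-bit tag; NOT the sneak-path method of [13]."""
    return hashlib.sha256(data.to_bytes(2, "big")).digest()[0]


def flip_statistics() -> Tuple[float, List[float]]:
    """Flip every data bit of every data word and count tag-bit flips."""
    flips_per_tag_bit = [0] * TAG_BITS
    trials = 0
    for data in range(1 << DATA_BITS):
        base = standin_tag(data)
        for bit in range(DATA_BITS):
            diff = base ^ standin_tag(data ^ (1 << bit))
            for t in range(TAG_BITS):
                flips_per_tag_bit[t] += (diff >> t) & 1
            trials += 1
    # Diffusion: flip probability of each individual tag bit.
    diffusion = [count / trials for count in flips_per_tag_bit]
    # Avalanche: average fraction of tag bits that change per 1-bit edit.
    avalanche = sum(flips_per_tag_bit) / (trials * TAG_BITS)
    return avalanche, diffusion


avalanche, diffusion = flip_statistics()
```

For a well-behaved tag function both quantities should come out close to 0.5.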
IX. SECURITY EVALUATION
A. Malicious read
An attacker can read the stored back-up data and try to gain
sensitive information. Since our data is encrypted, this
increases the complexity of learning anything from the data. We
have implemented an OTP, or one time pad, here. Since an
OTP key is random and has the same length as the data itself, for
16-bit data, all $2^{16}$ possible combinations
are equally likely to be the key. Thus OTP is not vulnerable
to brute force attacks, because the attacker doesn’t gain
any new information from a brute force attack on data encrypted
using OTP, as explained in [33]. For example, with an N-bit
key, the numbers of possible key combinations, $N_{key}$, before
and after a brute force attack are:

$$N_{key,before} = 2^{N}, \qquad N_{key,after} = 2^{N} \tag{7}$$

Thus for any two different data, $d_1$ and $d_2$, in the space of all
possible data $D$, the probability of any ciphertext $c$ being equal
to either $d_1$ or $d_2$ is:

$$P[(d_1 \in D) = c] = P[(d_2 \in D) = c] \tag{8}$$

Since brute force doesn’t add any new information which
wasn’t already available to an attacker, OTP can maintain
perfect secrecy.
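The argument behind (7) can be checked exhaustively for a 16-bit key with a short Python sketch. The ciphertext value and the name `consistent_keys` are hypothetical; the sketch only enumerates which plaintexts remain consistent with an observed ciphertext.

```python
from typing import Dict

N = 16  # key and data width in bits


def consistent_keys(ciphertext: int, n: int = N) -> Dict[int, int]:
    """Map every candidate plaintext d to the unique key k with d XOR k = c."""
    return {d: ciphertext ^ d for d in range(1 << n)}


c = 0xBEEF  # an arbitrary observed ciphertext
table = consistent_keys(c)

# Every one of the 2^16 plaintexts is still possible, each under a
# different key, so trying all keys rules nothing out.
remaining_plaintexts = len(table)
distinct_keys = len(set(table.values()))
```

Because the key is uniformly random, each of these plaintext candidates is equally likely, which is why the key space has the same size before and after the brute force attempt.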
B. Malicious write
Another security goal of this work is to verify the integrity of
the backed-up data when power is regained. When the system goes
to a critically low power stage, the system state is saved to a
non-volatile memory. An attacker may be able to launch a
spoofing attack by connecting the non-volatile memory to
a different source in the network in order to perform a malicious
write. However, the integrity verification method described
earlier should be able to detect any offline modification of the
memory by generating a tag. The attacker’s goal in the spoofing
attack is to modify the memory content in such a way that
it matches the tag. The probability of success in this
attack depends on the uniformity of the tag distribution and also
on the number of trials the attacker performs before the memory is
updated by the authorized user. The tag generation method
exhibits a uniform distribution for generated tags. Therefore,
we analyze the probability of a successful spoofing attack as
a function of the number of trials, as shown in Fig. 13. It
can be seen that the probability increases with the number
of trials. For a given number of trials, the probability depends on
the tag size. A hypothetical scenario that an attacker may leverage
is that the back-up data stored in the non-volatile memory
was not changed for a few cycles of power down and regain. In
that case, an attacker can perform more spoofing trials on the
data in order to find a case where the data matches the tag.
However, the tag generation protocol randomly reconfigures its
reserved bits as a timestamp before every back-up stage and
generates a new tag even if the data is not changed at all.
Therefore, the effective number of trials for an attacker
performing a spoofing attack is 1. The success probability for
an 8-bit tag with a single trial is nearly 1/256. This is sufficient
considering that the tag is updated on each power
down stage regardless of the data and the attacker cannot get
multiple trials at guessing a data-tag pair. For the same
reason, this protocol can also prevent a replay attack where an
attacker remembers the tag on past data and replaces the
present (data, tag) pair with the past one. Each data has a large
number of variants of the tag depending on its timestamp. A
tag on a data at one instant would be completely different from
the tag on the same data at a different instant.
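The dependence of the spoofing success probability on tag size and number of trials can be sketched with a simple model. This is an assumption-laden illustration and not the analysis used to produce Fig. 13: it assumes perfectly uniform tags and independent trials, so that a single guess succeeds with probability $2^{-n}$ for an n-bit tag. The function name `spoof_success_probability` is hypothetical.

```python
def spoof_success_probability(tag_bits: int, trials: int) -> float:
    """Chance that at least one of `trials` independent guesses matches.

    Assumes ideally uniform tags, so one guess succeeds with 2**-tag_bits.
    """
    p_single = 1.0 / (1 << tag_bits)
    return 1.0 - (1.0 - p_single) ** trials


# With the timestamp-based tag refresh, the attacker gets a single trial.
p_8bit_single = spoof_success_probability(8, 1)
p_6bit_single = spoof_success_probability(6, 1)

# Without the refresh, repeated trials would raise the success chance.
p_8bit_many = spoof_success_probability(8, 100)
```

Under this model a single trial against an 8-bit tag succeeds with probability 1/256, matching the figure quoted above, and a smaller tag or more trials both increase the attacker's chance.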
C. Modeling attacks
In this work, we are assuming that the attacker can read
any non-volatile data, which means he can gain access to
the stored challenge vector as well. The response of
a PUF depends on the tiniest unintentional manufacturing
variations and is thus very difficult
to predict. However, researchers have shown that with the
help of machine learning techniques, PUFs can be modeled
using only a subset of the total CRPs [31], [34]. In a
previous work, we have also shown how a 32×2 XbarPUF
can be modeled and predicted with high accuracy, and
how to apply mitigation techniques [30] to reduce that
accuracy significantly. Here, we are recreating that work with
the memristor models and circuit parameters that we have used
in this work. We have used the machine learning toolbox from
Python scikit-learn [35] and present the results for four
different models in Table II. We have collected 5000
CRPs from an abstract model of our XbarPUF, developed in
[30], and then used two-thirds of the data for training and the
rest for testing. It is clear that the test accuracy is almost that
of a random guess or coin flip (50%) for all four models,
namely support vector machine (SVM) with Gaussian kernel
(RBF), logistic regression (LR), Gaussian naive Bayes, and
AdaBoost ensemble. The mitigation techniques, XORing and
column swapping, are discussed in [30].
TABLE II
RESULTS FROM MODELING ATTACKS FOR AN XBARPUF APPLIED FOR
DIFFERENT MACHINE LEARNING ALGORITHMS
Model           Train accuracy (%)   Test accuracy (%)
SVM (RBF)       51.27                49.48
L. R.           56.17                50.16
Gauss. N. B.    56.31                50.25
AdaB. Ensem.    56.16                50.24
Fig. 13. Probability of success in a spoofing attack with number of trials.
The more trials an attacker can perform, the higher the chance that the data
matches the tag. No. of effective trials is 1 in this protocol since the tag is
updated on each cycle.
D. Readout/alteration of secure information
The PUF challenge and the tag generated from the backup data
also need to be saved in NVM. Besides read/write of regular
backed-up data, we are pessimistically assuming that an
attacker can also gain access to this sensitive information.
Without a good prediction model, the challenge wouldn’t
reveal any information about the back-up data. As we just saw,
the prediction accuracy of modern ML models against
our PUF is no better than random, so this wouldn’t reveal
anything about the data itself. Moreover, changing the challenge
would change the response vector significantly, which is
confirmed by the very good uniformity and bit-aliasing values of
this PUF. Because of the good collision property of our tag, it
won’t be easy to find the data even if the tag is known. Finally,
because of the good avalanche property shown in Table I, even if
a single bit of the data is changed, almost half of the tag bits
would change, thus making the scheme robust against such
malicious changes.
Figure 14 shows a visual demonstration of an attacker trying
to perform a malicious read or write on an NVM. Maliciously
reading the data won’t reveal anything to an attacker, even
by performing a brute force attack, as the data is encrypted
beforehand using our OTP encryption. Altering the data
maliciously would change the tag generated from the encrypted
NVM and, since the old tag stored during back-up is compared
with a new tag generated during restore, this integrity checking
method would be able to detect any malicious write. This
means the data won’t be reused and the processor loses
information, but an attacker cannot inject any maliciously altered
data into the IoT device processor.
Fig. 14. Demonstration of malicious read and write attacks on an NVM
during power down or sleep. Attacker can read data maliciously but it won’t
give him any new information about the data which wasn’t already available
to him. Malicious write on the other hand is detected by the integrity checking
method as the altered data would produce a different tag. This results in a
data rejection and processor restart, which causes a data loss, but malicious
data doesn’t enter the system.
X. PERFORMANCE EVALUATION
We are generating a reliable cryptographic key from the
PUF response. We accept the fact that there will be
unreliable bits in the PUF response and expect to get at least
the required number of reliable bits (16 bits here) from the
PUF response (32 bits). These extra bits present an overhead
to the system. To measure the minimum number of such
extra bits, we have run Monte Carlo simulations of
this situation. A memristor’s cycle-to-cycle variation is the root
cause of unreliability in the XbarPUF. In cases where the
cycle-to-cycle variation of a pair of memristors dominates over
their process variation, that pair would have a higher bit-
flip probability in different cycles. Moreover, since our main
application here is deployable IoT or embedded systems, we
also have to consider rapid environmental changes. To emulate
this situation, we have considered the case when temperature
changes drastically from room temperature to 50°C above
room temperature between successive clock cycles. A change
in supply voltage would change the switching speed of
memristors, but since we are using a large enough switching time
to ensure complete state transition, this voltage change shouldn’t
affect the final memristive values and the PUF response.
Thus we have considered only cycle-to-cycle variation and
temperature change for this analysis.
Fig. 12 shows the average number of bits it takes to
produce 16 ‘clean’ bits to be used as the cryptographic key
for three different cycle-to-cycle variations of memristors. As
cycle-to-cycle variation is the main culprit behind
unreliable responses from this XbarPUF, we have used three
different values: 2%, 5%, and 10%. From Fig. 12, we can see
that with increasing cycle-to-cycle variation, the
minimum no. of required bits increases. However, because
of the robustness of our XbarPUF design, the amount of bit
loss is small (≈0.8) even for a 10% cycle-to-cycle variation
of memristance with a 50°C temperature change. The goal of
our system is to produce 16 error-free bits in two consecutive
cycles of encryption and decryption.
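How the number of consumed response bits grows with unreliability can be illustrated with a simplified model. This sketch is not the circuit-level Monte Carlo simulation behind Fig. 12: it assumes a hypothetical per-evaluation flip probability `p` that is identical and independent for every response bit, whereas real flip profiles are uneven and `p` is not the same quantity as the cycle-to-cycle memristance variation. It also ignores the 32-bit limit of the response. Under these assumptions two readings disagree with probability 2p(1-p), so the expected number of bits needed for 16 clean bits is 16 / (1 - 2p(1-p)).

```python
import random


def expected_bits_needed(p: float, key_bits: int = 16) -> float:
    """Analytic mean under the i.i.d. flip model (an assumption)."""
    p_disagree = 2.0 * p * (1.0 - p)
    return key_bits / (1.0 - p_disagree)


def simulate_bits_needed(p: float, key_bits: int = 16,
                         trials: int = 20000, seed: int = 1) -> float:
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        clean = used = 0
        while clean < key_bits:
            used += 1
            flip_first = rng.random() < p   # bit flipped in the first read
            flip_second = rng.random() < p  # bit flipped in the second read
            if flip_first == flip_second:   # the two readings agree
                clean += 1
        total += used
    return total / trials


analytic = expected_bits_needed(0.02)
simulated = simulate_bits_needed(0.02)
```

The same model can be used to budget extra response bits: a larger `p` directly increases the number of raw bits that must be read before 16 agreeing bits are found.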
To get an idea of the amount of bit-flip, we have applied
different challenges to the same chip, each 500 times.
The percentage of times that each bit flipped is shown in
Fig. 15 for 10 such challenges and for a particular chip, using
Monte Carlo analysis. As we can see from this figure, there
are one or two bits per challenge which might have a much
larger bit-flip probability; these can be detected and discarded
during functionality testing of the chip. The purpose of our
reliability enhancement block is to use an all-agree voting
scheme to mitigate the impact of bit errors that remain
undetected during testing and might cause a key error at run-time.
From this figure, we can also see that our XbarPUF-based
key generation method is able to produce 30 ‘clean’ key bits
on average from a 32-bit response. To account for higher
bit-error rates in some chips, the ratio of the number of key bits
to the number of response bits may be reduced, which would
increase the yield of the design.
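A per-bit flip rate of the kind plotted in Fig. 15 can be estimated as sketched below. This is a minimal Python illustration in which the PUF is replaced by a hypothetical noise model (an assumed flip probability per bit, `flip_prob`); it is not the circuit-level Monte Carlo simulation used for the reported results.

```python
import random


def estimate_flip_rates(reference, flip_prob, trials=500, seed=0):
    """Estimate how often each response bit differs from a reference reading.

    reference : list of 0/1 bits (the nominal response to one challenge)
    flip_prob : per-bit probability of a flip in any single evaluation
                (a stand-in noise model, assumed for illustration)
    """
    rng = random.Random(seed)
    flips = [0] * len(reference)
    for _ in range(trials):
        for i, bit in enumerate(reference):
            noisy = bit ^ (rng.random() < flip_prob[i])
            if noisy != bit:
                flips[i] += 1
    return [count / trials for count in flips]


# Hypothetical 32-bit response in which only bit 5 is noticeably unreliable.
reference = [i % 2 for i in range(32)]
flip_prob = [0.0] * 32
flip_prob[5] = 0.4
rates = estimate_flip_rates(reference, flip_prob)
unreliable = [i for i, r in enumerate(rates) if r > 0.1]
print(unreliable)
```

Bits whose estimated rate exceeds a chosen threshold (0.1 here, an arbitrary value) are the ones that functionality testing would flag and discard.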
The area, power, and delay overheads for different components
and in different phases of the system are shown in Tables
III, IV, and V, respectively. From Table III, it might seem that
the area overhead of our proposed system is large.
However, a closer look at this table shows that, except for the
XbarPUF blocks, most other units such as the SRAM, decoders,
RRAM, and counters are already parts of a regular
processor and memory and can be reused for our proposed
security implementation. We have designed all of these blocks
in 65nm CMOS technology, using 60nm and 120nm as the
minimum length and width of a transistor.
All digital gates and components are sized for minimum
area, with added buffers to match their drive strength to the
required load. All analog components such as sense amplifiers
and pass-gates are sized accordingly, usually larger than digital
ones, to reduce the impact of mismatch and noise. Table IV
presents the power consumption in terms of overall current in
Fig. 15. Probability of bit-flip for the same chip for 10 unique challenges. The plots show the bit-flip probability (y-axis, 0 to 0.5) for all 32 bits in a response (x-axis: response bit), evaluated for 500
cycles. Different challenges cause different bits to flip, i.e., there is no single set of globally unreliable bits.
TABLE III
TOTAL AREA/TRANSISTOR COUNT FOR DIFFERENT COMPONENTS OF THE SYSTEM
Design block | Component | Count | Comment
XbarPUF (×4) | Memristor | 64×16 | 10nm × 10nm
XbarPUF (×4) | Sense amplifier | ×8 | 9-T cell (base width 1.2µm) [22]
XbarPUF (×4) | Row controller | ×64 | 8 pass-gates
XbarPUF (×4) | Column controller | ×8 | 10-T 2-R
Encryption-Decryption block | XOR | ×16 |
Tag generation block | MUX2to1 (regular) | 3×12 | 120nm NMOS and PMOS
Tag generation block | MUX2to1 (wide) | 3×12 | 12µm NMOS and PMOS
Tag generation block | Sense amplifier | ×12 | 9-T cell (base width 1.2µm) [22]
Tag generation block | Resistor | ×12 | load resistor = √(HRS × LRS)
memTRNG | Memristor, NMOS | ×2 | (10nm × 10nm)
memTRNG | Differential op-amp | ×1 | 5-T cell (base width 1.2µm)
RRAM | 1T1R | 5×12 | 1 memristor (10nm × 10nm), 1-NMOS (4.8µm)
RRAM | Sense amplifier | ×12 | 9-T cell (base width 1.2µm) [22]
RRAM | Pass gate | ×12 | large (12µm) NMOS & NMOS
RRAM | Decoders | ×1 | 4to16 & 2to4 decoders
SRAM (×2) | SRAM cell | ×32 | 6-T cell (120nm, 240nm)
SRAM (×2) | Pre-charge | ×8 | 3-PMOS (1.2µm)
SRAM (×2) | Sense amplifier (SA) | ×8 | Current latched SA (9-T) [36]
SRAM (×2) | Column buffer | ×8 | 4-T (1.2µm), 2-NOT
SRAM (×2) | Address decoder | ×8 | 2to4 & 3to8 decoder
SRAM (×2) | Basic gates | ×8 | (AND, OR, NOT); min. width
Others | Counters, buffers, flip-flops, non-overlapping clock generator, state decoders, basic gates, etc. | |
TABLE IV
POWER CONSUMPTION OF THE SYSTEM IN DIFFERENT STAGES OF
OPERATION (STATES 1 AND 2 INVOLVE A RESET OF 64×64 MEMRISTORS
AT ONCE AND THUS HAVE LARGE STATIC CURRENT)
State | Average current (µA) | Comment
State 1 | 143.9 | PUF response generation + SRAM-1 write
State 2 | 151.8 | PUF response generation (again) + SRAM-2 write
States 3 to 10 | 0.143 | Both SRAM read + RRAM write
State 11 | 3.28 | Tag generation
Total | 24.80 | Overall average current
different states of the system. As discussed before, the
XbarPUF can have a larger power consumption due to the
state where all the memristors are reset to HRS. Therefore,
during states 1 and 2 the power consumption is high, but it is
fairly small at other times. Thus, the average current of the
TABLE V
DELAY OVERHEAD OF THE SYSTEM IN DIFFERENT STAGES OF OPERATION
State | Clock cycles | Comment
PUF RESET | 0.5 | 8×
PUF challenge | 0.25 | 8×, short spike
PUF read | 0.25 | 8×
SRAM write | 4 | overlapped with PUF read
RRAM write | 16+x | for 16 clean bits + ‘x’ bit errors
RRAM read | 16 | for 16 bits
Tag generation | 3 | 8-bit tag
whole system is roughly 24.80 µA with a 0.85 V supply
voltage. We can further reduce this current by utilizing a
dual-voltage scheme, with a smaller VDD for digital circuits and a
separate, larger VDD for analog and memristive components.
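As a quick consistency check, the average power follows from the average current and the supply voltage. The short Python sketch below assumes the power figure listed for this work in Table VI is simply the product of these two numbers; the function name `average_power_uw` is illustrative.

```python
def average_power_uw(avg_current_ua, supply_v):
    """Average power in microwatts from average current (uA) and supply voltage (V)."""
    return avg_current_ua * supply_v


power = average_power_uw(24.80, 0.85)
print(f"{power:.2f} uW")  # 21.08 uW
```

The result matches the 21.08 µW entry for this work in Table VI.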
Table V shows the delay of our system in terms of the required
number of clock cycles. We have 4 XbarPUFs in our design,
and each one is activated twice during either encryption or
decryption. Thus response generation from these four XbarPUFs
TABLE VI
PERFORMANCE COMPARISON WITH STATE-OF-THE-ART LIGHTWEIGHT HARDWARE SECURITY TECHNIQUES
(Columns 2 to 5: Encryption-Decryption; columns 6 to 8: Tag generation)
Overhead | nanoAES [37] | AES [8] | PRESENT [9] | This work | Hong et al. [38] | Yan et al. [39] | This work [13]
Avg. Power (µW) | 170 | 18.5 | - | 21.08 | 1920 | 3923 | 487
Delay (clock cycles/bit) | 2.62 | 1.75 | 0.5 | 0.27 | 7.5/128 | 15/128 | 3/128
Area (NAND G.E.) | 2090 | 2400 | 1570 | 856 | 2339 | 3835 | 864
twice would take m×4×2 clock cycles, where m is the number
of clock cycles required to generate a response from one
XbarPUF. Each XbarPUF is designed to have a delay of 1 clock
cycle (reset, challenge, and read use different phases of the
same clock cycle). One row of the SRAM is written at a time, and
thus our 4×8 SRAM requires 4 clock cycles for a complete write.
However, this is overlapped with the response generation phase
and thus does not add to the overall delay. The slowest phase
of our design is when the encrypted data is written one bit
at a time to the RRAM. For an ‘n’-bit key, it requires
at least ‘n’ clock cycles, and for this prototype system ‘n’ is
16. In practice, this stage requires more than n clock
cycles, as ‘x’ unreliable or noisy bits add
‘x’ extra clock cycles. However, this delay can be reduced by
writing multiple bits to the RRAM at once during back-up.
Tag generation takes about 3 clock cycles. During decryption,
the RRAM is read one bit at a time for a total of ‘n’ clock cycles
(again n=16 here). The time required for decryption is thus
almost the same as for encryption, as the operations
are very similar.
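The cycle counts described above can be combined into a rough delay estimate. The following Python sketch assumes that the phases run back to back and that the SRAM write is completely hidden behind PUF response generation; the function `backup_cycles` and its parameter names are hypothetical, and the summed total is an illustration rather than a reported figure.

```python
def backup_cycles(n_key_bits=16, noisy_bits=0, n_pufs=4, activations=2,
                  cycles_per_response=1, tag_cycles=3):
    """Estimate clock cycles for one back-up (encryption) pass.

    Assumes sequential phases, with the SRAM write overlapped with
    PUF response generation so that it adds no cycles of its own.
    """
    puf_cycles = cycles_per_response * n_pufs * activations   # m x 4 x 2
    rram_write_cycles = n_key_bits + noisy_bits               # n + x
    return puf_cycles + rram_write_cycles + tag_cycles


print(backup_cycles())               # 8 + 16 + 3 = 27
print(backup_cycles(noisy_bits=2))   # 29
```

Under these assumptions the RRAM write dominates the total, which is why writing several bits per cycle would be the most effective way to shorten the back-up.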
We compare the resource overhead of our proposed security
system with that of traditional security mechanisms
employed in embedded systems. Table VI lists the resource
overhead for traditional encryption algorithms (nanoAES [37],
AES [8], and PRESENT [9]) and for the tag generation schemes
presented in [38], [39], alongside our proposed security solution.
To calculate the area overhead, we have omitted the circuit
blocks that would already be present in a system (e.g. counters,
SRAMs, registers etc.). However, the delay and power from these
blocks are taken into consideration. Although our proposed system
can generate 30 clean bits on average, one can generate a
larger key by applying more challenges. This would increase
the delay and energy requirement while keeping the overall
area and average power the same. Alternatively, larger PUF
blocks can be used to generate a larger key with no additional
delay, at the expense of larger area and power.
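The first option, applying more challenges, can be sized with a simple estimate. The Python sketch below assumes every 32-bit response yields the average of 30 clean bits, which ignores chip-to-chip and challenge-to-challenge variation; `challenges_needed` is a hypothetical helper name.

```python
import math


def challenges_needed(key_bits, clean_bits_per_response=30):
    """Number of challenges needed, on average, to collect key_bits clean bits."""
    return math.ceil(key_bits / clean_bits_per_response)


for size in (16, 64, 128):
    print(size, challenges_needed(size))
```

Under this assumption a 128-bit key would need 5 challenges, so the PUF response generation delay would grow roughly fivefold while area stays fixed.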
It should be noted that the overhead for existing techniques
reported in Table VI excludes the overhead associated with
generating and storing the key. Thus their actual implementation
would have an even larger resource overhead. In contrast,
key generation is the main contributor to overhead
in our proposed system. This comes with the added benefit of
random, unique keys for each system, and the key does not need
to be stored physically. Tag generation results were presented in
[13] and are shown here again to highlight the lightweight nature
of this method compared to other tag generation schemes. The
overhead associated with this tag generation method is significantly
lower than that of the other comparable methods, as can be seen from
Table VI. Overall, the system proposed in this work provides
both confidentiality and integrity for the non-volatile back-up
memory of a resource-constrained system.
XI. FUTURE WORK
In this work, we have demonstrated an error correction
method to generate a secure cryptographic key from the
response of an XbarPUF. As a next step, this PUF-based encryption
plus integrity checking can be incorporated into an embedded
processor to emulate a full system which saves its processor
states and other relevant information securely in a non-volatile
memory when a power failure occurs. The overhead can then be
analyzed as a function of how frequently power failures occur
for a particular energy source. Finally, the overall
security gain vs. the cost of additional circuitry could be evaluated
for a practical device.
For energy harvesting devices where the power profile
is relatively predictable and power failures occur at regular
intervals, our proposed system can be tuned to improve
accuracy. For example, a system based only on solar power
would have a regular power failure at the end of each day, and
power would come back on at the beginning of the next day. To
make up for the temperature difference between these two
times (dawn and dusk), the first set of PUF responses can
be generated at the start of each day. Then, during power
failure, a second set of PUF responses can be generated and
compared with the first to obtain a better estimate of the
environmental conditions and thus a more reliable PUF response.
The PUF key can also be generated beforehand while the system
is running, thereby reducing the power and delay requirements
during a critical back-up situation.
XII. CONCLUSION
We have presented a novel two-layer security protocol for
resource-constrained IoT devices during power failure or
power-saving mode. PUF-based OTP encryption forms the first
layer of protection, and tag-based data integrity checking forms
the second. As data forward progression is a common and efficient
method for resource-limited systems, our proposed scheme
adds reasonable security to these systems. We have implemented
our whole system at the transistor level with emerging
memristive technology to get an accurate idea of the system
implementation. The key generated from a PUF is unclonable
and random by nature while being volatile, i.e., the key is not
saved anywhere and is generated only when a back-up operation
is needed. Thus we are able to provide unique device-specific
security for each IC, even among ICs with the same functionality.
The target application for our implemented system is
an embedded processor that goes to ‘sleep’ or
power-saving mode and saves state information for a quick
wake-up. It can also be useful for batteryless systems where
power failures can occur and data are backed up in
non-volatile memory for forward progression. Our scheme would
fit into any system where aggressive power-saving techniques
are employed and data is backed up in an NVM to avoid a
large computational penalty. With the increasing number of
edge devices with very limited resources and little to no security,
our proposed idea could be an effective solution that provides
a practical level of security that these systems can support.
ACKNOWLEDGMENT
The authors would like to thank Abhishek Bhandari and
Grayson Bruner of our research group for important discus-
sions on this topic.
REFERENCES
[1] S. R. Department, “Internet of things (iot) connected devices
installed base worldwide from 2015 to 2025 (in billions),” Nov
2016. [Online]. Available:
https://www.statista.com/statistics/471264/iot-number-of-connected-devices-worldwide/
[2] A. K. Sikder, G. Petracca, H. Aksu, T. Jaeger, and A. S. Uluagac, “A
survey on sensor-based threats to internet-of-things (IOT) devices and
applications,” arXiv preprint arXiv:1802.02041, 2018.
[3] A. Acar, H. Fereidooni, T. Abera, A. K. Sikder, M. Miettinen,
H. Aksu, M. Conti, A.-R. Sadeghi, and A. S. Uluagac, “Peek-a-boo:
I see your smart home activities, even encrypted!” arXiv preprint
arXiv:1808.02741, 2018.
[4] T. OConnor, R. Mohamed, M. Miettinen, W. Enck, B. Reaves, and A.-
R. Sadeghi, “Homesnitch: behavior transparency and control for smart
home iot devices,” in Proceedings of the 12th Conference on Security
and Privacy in Wireless and Mobile Networks, 2019, pp. 128–138.
[5] L. Babun, Z. B. Celik, P. McDaniel, and A. S. Uluagac, “Real-
time analysis of privacy-(un) aware IoT applications,” arXiv preprint
arXiv:1911.10461, 2019.
[6] K. Ma, Y. Zheng, S. Li, K. Swaminathan, X. Li, Y. Liu, J. Sampson,
Y. Xie, and V. Narayanan, “Architecture exploration for ambient energy
harvesting nonvolatile processors,” in 2015 IEEE 21st International Sym-
posium on High Performance Computer Architecture (HPCA). IEEE,
2015, pp. 526–537.
[7] C. Manifavas, G. Hatzivasilis, K. Fysarakis, and K. Rantos, “Lightweight
cryptography for embedded systems–a comparative analysis,” in Data
Privacy Management and Autonomous Spontaneous Security. Springer,
2013, pp. 333–349.
[8] A. Moradi, A. Poschmann, S. Ling, C. Paar, and H. Wang, “Pushing
the limits: A very compact and a threshold implementation of AES,”
in Advances in Cryptology – EUROCRYPT 2011, K. G. Paterson, Ed.
Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 69–88.
[9] A. Bogdanov, L. R. Knudsen, G. Leander, C. Paar, A. Poschmann, M. J.
Robshaw, Y. Seurin, and C. Vikkelsoe, “Present: An ultra-lightweight
block cipher,” in International workshop on cryptographic hardware and
embedded systems. Springer, 2007, pp. 450–466.
[10] G. E. Suh, C. W. O’Donnell, I. Sachdev, and S. Devadas, “Design
and implementation of the AEGIS single-chip secure processor using
physical random functions,” in Proc. of the 32nd Annual Int. Symp. on
Comput. Architecture, 2005, pp. 25–36.
[11] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, “The
missing memristor found,” Nature, vol. 453, pp. 80–83, May 2008.
[12] M. Uddin, M. B. Majumder, K. Beckmann, H. Manem, Z. Alamgir,
N. C. Cady, and G. S. Rose, “Design considerations for memristive
crossbar physical unclonable functions,” J. Emerg. Technol. Comput.
Syst., vol. 14, no. 1, pp. 2:1–2:23, sep 2017.
[13] M. B. Majumder, M. S. Hasan, M. Uddin, and G. S. Rose, “A
secure integrity checking system for nanoelectronic resistive ram,” IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, pp. 1–14,
2018.
[14] M. Uddin, M. S. Hasan, and G. S. Rose, “On the theoretical analysis of
memristor based true random number generator,” in Proceedings of the
2019 on Great Lakes Symposium on VLSI. ACM, 2019, pp. 21–26.
[15] J. J. Yang, M. Zhang, J. P. Strachan, F. Miao, M. D. Pickett, R. D. Kelley,
G. Medeiros-Ribeiro, and R. S. Williams, “High switching endurance
in TaOx memristive devices,” Applied Physics Letters, vol. 97, no. 23,
p. 232102, 2010.
[16] G. Medeiros-Ribeiro, F. Perner, R. Carter, H. Abdalla, M. D. Pickett, and
R. S. Williams, “Lognormal switching times for titanium dioxide bipolar
memristors: origin and resolution,” Nanotechnology, vol. 22, no. 9, p.
095702, 2011.
[17] G. S. Rose, N. McDonald, L. Yan, B. Wysocki, and K. Xu, “Foundations
of memristor based PUF architectures,” in Proc. of the IEEE/ACM Int.
Symp. on Nanoscale Architectures (NANOARCH), July 2013, pp. 52–57.
[18] H. Jiang, D. Belkin, S. E. Savel’ev, S. Lin, Z. Wang, Y. Li, S. Joshi,
R. Midya, C. Li, M. Rao et al., “A novel true random number generator
based on a stochastic diffusive memristor,” Nature Communications,
vol. 8, no. 1, p. 882, 2017.
[19] S. Hamdioui, L. Xie, H. A. D. Nguyen, M. Taouil, K. Bertels, H. Corpo-
raal, H. Jiao, F. Catthoor, D. Wouters, L. Eike et al., “Memristor based
computation-in-memory architecture for data-intensive applications,” in
Proceedings of the 2015 design, automation & test in Europe conference
& exhibition. EDA Consortium, 2015, pp. 1718–1725.
[20] C. Xu, X. Dong, N. P. Jouppi, and Y. Xie, “Design implications of
memristor-based rram cross-point structures,” in 2011 Design, Automa-
tion Test in Europe, March 2011, pp. 1–6.
[21] M. A. Zidan, J. P. Strachan, and W. D. Lu, “The future of electronics
based on memristive systems,” Nature Electronics, vol. 1, no. 1, p. 22,
2018.
[22] M. Uddin and G. S. Rose, “A practical sense amplifier design for
memristive crossbar circuits (puf),” in 2018 31st IEEE International
System-on-Chip Conference (SOCC), Sep. 2018, pp. 209–214.
[23] G. Suh and S. Devadas, “Physical unclonable functions for device
authentication and secret key generation,” in 44th ACM/EDAC/IEEE
Design Automation Conference (DAC), June 2007, pp. 9–14.
[24] Z. Paral and S. Devadas, “Reliable and efficient PUF-based key genera-
tion using pattern matching,” in IEEE Int. Symp. on Hardware-Oriented
Security and Trust (HOST). IEEE, 2011, pp. 128–133.
[25] G. Rose and C. Meade, “Performance analysis of a memristive crossbar
PUF design,” in 52nd ACM/EDAC/IEEE Design Automation Conference
(DAC), June 2015, pp. 1–6.
[26] K. Ma, X. Li, S. Li, Y. Liu, J. J. Sampson, Y. Xie, and V. Narayanan,
“Nonvolatile processor architecture exploration for energy-harvesting
applications,” IEEE Micro, vol. 35, no. 5, pp. 32–40, 2015.
[27] R. Horstmeyer, I. M. Vellekoop, S. Assawaworrarit, B. Judkewitz, and
C. Yang, “Physical key-protected one-time pad,” Scientific Reports,
Nature, vol. 3, no. 3543, December 2013.
[28] M. Uddin, M. B. Majumder, G. S. Rose, K. Beckmann, H. Manem,
Z. Alamgir, and N. C. Cady, “Techniques for improved reliability in
memristive crossbar PUF circuits,” in IEEE Comp. Society Annual Symp.
on VLSI (ISVLSI), July 2016, pp. 212–217.
[29] M. Uddin, B. Majumder, and G. S. Rose, “Nanoelectronic security
designs for resource-constrained internet of things devices: Finding
security solutions with nanoelectronic hardwares,” IEEE Consumer
Electronics Magazine, vol. 7, no. 6, pp. 15–22, Nov 2018.
[30] M. Uddin, M. B. Majumder, and G. S. Rose, “Robustness analysis of a
memristive crossbar PUF against modeling attacks,” IEEE Transactions
on Nanotechnology, vol. 16, no. 3, pp. 396–405, May 2017.
[31] U. Ruhrmair, J. Solter, F. Sehnke, and X. Xu, “PUF modeling attacks
on simulated and silicon data,” IEEE Trans. on Inform. Forensics and
Security, vol. 8, pp. 1876–1891, August 2013.
[32] A. Maiti, V. Gunreddy, and P. Schaumont, “A systematic method to eval-
uate and compare the performance of physical unclonable functions,” in
Embedded Systems Design with FPGAs, P. Athanas, D. Pnevmatikatos,
and N. Sklavos, Eds. Springer New York, 2013, pp. 245–267.
[33] Wikipedia, “One-time pad,” Feb 2020. [Online]. Available:
https://en.wikipedia.org/wiki/One-time_pad
[34] G. Hospodar, R. Maes, and I. Verbauwhede, “Machine learning attacks
on 65nm arbiter PUFs: Accurate modeling poses strict bounds on us-
ability,” in 2012 IEEE International Workshop on Information Forensics
and Security (WIFS), Dec 2012, pp. 37–42.
[35] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion,
O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vander-
plas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duch-
esnay, “Scikit-learn: Machine learning in Python,” Journal of Machine
Learning Research, vol. 12, pp. 2825–2830, 2011.
[36] T. Kobayashi, K. Nogami, T. Shirotori, and Y. Fujimoto, “A current-
controlled latch sense amplifier and a static power-saving input buffer
for low-power architecture,” IEEE J. of Solid-State Circuits, vol. 28,
no. 4, pp. 523–527, Apr 1993.
[37] S. Mathew, S. Satpathy, V. Suresh, M. Anders, H. Kaul, A. Agarwal,
S. Hsu, G. Chen, and R. Krishnamurthy, “340 mV–1.1 V, 289
Gbps/W, 2090-gate NanoAES hardware accelerator with area-optimized
encrypt/decrypt GF(2^4)^2 polynomials in 22 nm tri-gate CMOS,” IEEE
Journal of Solid-State Circuits, vol. 50, no. 4, pp. 1048–1058, 2015.
[38] M. Hong, H. Guo, and S. X. Hu, “A cost-effective tag design for memory
data authentication in embedded systems,” in Proceedings of the 2012
international conference on Compilers, architectures and synthesis for
embedded systems. ACM, 2012, pp. 17–26.
[39] C. Yan, D. Englender, M. Prvulovic, B. Rogers, and Y. Solihin,
“Improving cost, performance, and security of memory encryption
and authentication,” in ACM SIGARCH Computer Architecture News,
vol. 34, no. 2. IEEE Computer Society, 2006, pp. 179–190.
