Efficient Fault Tolerant SHA-2 Hash Functions for Space Applications. by Marcio Juliato et al.
Marcio Juliato Catherine Gebotys Reouven Elbaz
Department of Electrical and Computer Engineering
University of Waterloo
200 University Avenue West
Waterloo, ON, Canada, N2L 3G1
+1 (519) 888-4567
{mrjuliat,cgebotys,reouven}@uwaterloo.ca
Abstract—Satellites are extensively used by public and pri-
vate sectors to support a variety of services. Considering
the cost and the strategic importance of these spacecraft, it
is fundamental to utilize strong cryptographic primitives to
assure their security. However, it is of utmost importance
to consider fault tolerance in their designs due to the harsh
environment found in space, while keeping low area and
power consumption. Therefore, this paper proposes novel
fault tolerant schemes for the SHA-2 family of hash func-
tions and analyzes their resistance to SEUs. Results obtained
through FPGA implementation show that our best fault toler-
ant scheme for SHA-512 uses up to 32% less area and con-
sumes up to 43% less power than the commonly used TMR
technique. Moreover, its memory and registers are 435 and
175 times more resistant to SEUs than TMR. These results
are crucial for supporting low area and low power fault toler-
ant cryptographic primitives in satellites.
TABLE OF CONTENTS
1 INTRODUCTION
2 RELATED WORK
3 SHA-2 ALGORITHMS
4 HARDWARE DESIGN
5 ERROR DETECTION AND CORRECTION SCHEMES
6 EVALUATION OF ROBUSTNESS AGAINST SEUS
7 EXPERIMENTAL RESULTS
8 CONCLUSIONS
ACKNOWLEDGEMENTS
REFERENCES
BIOGRAPHY
1. INTRODUCTION
Satellites are extensively used by public and private sectors
to support communication services, conduct scientific experiments,
provide navigation and meteorological support, or increase
homeland security. Some countries also employ commercial
satellites for military communications [1]. The Consultative
Committee for Space Data Systems (CCSDS) has
highlighted in [2] that advances in technology allow complex
attacks to be easily carried out against satellites, making
security an important goal in satellite designs. Considering
the cost and the strategic importance of these spacecraft, it
is not recommended to assure their security by relying on the
uniqueness and obscurity of their designs. Indeed, due to
the lack of appropriate security, some satellites have already
been compromised [3], [4], [5]. Threats to satellites can pose
severe risks to communications infrastructures [1], [6], so
architectures must be designed to provide security services
such as authentication, data integrity and confidentiality, thus
increasing the security of those spacecraft.
Date: April 30, 2009.
Security architectures [7], [8] have been recently proposed
to authenticate communications between ground stations and
spacecraft or to provide security services to data processed
on-board. To do so, cryptographic primitives are utilized,
such as hash functions for integrity checking/authentication
and block encryption for conﬁdentiality. These architectures
encounter implementation challenges to protect the security
mechanisms from the harsh environment found in space. Ra-
diation coming from space can hit satellites’ circuitry and
cause errors known as Single Event Upsets (SEUs) [9]. SEUs
are forms of soft-errors, in the sense that they cause dynamic
bit ﬂips but are not damaging to the hardware.
Spacecraft include both ASICs and a variety of FPGAs in
their subsystems. ASICs and non-volatile FPGAs (Anti-Fuse
or Flash) offer increased SEU resistance compared
to FPGAs based on Static Random Access Memory (SRAM)
technology; the construction of SRAM cells makes them
more sensitive to SEUs. In some cases a single bit-flip in
a configuration element of an SRAM FPGA is able to disrupt
the entire functioning of the design implemented in the
chip. However, the low production volume of satellites makes
FPGAs an attractive alternative to reduce non-recurring engineering
costs. In addition, their reconfigurability allows post-launch
updates and patches to the satellite hardware. SRAM FPGAs
in particular provide both high density and high speed,
resulting in a good trade-off
between performance and flexibility. Thus, to overcome the
issue of SEUs occurring in configuration elements, designers
utilizing SRAM FPGAs implement methods like read-back,
CRC checking and reconfiguration. These methods basically
consist of reading the configuration from the FPGA, checking
its correctness using CRC, and then reconfiguring the device
with a correct bitstream. Device hardening is another technique
to make FPGAs more resistant to radiation.
Considering that the hardware description is protected against
SEUs by either the underlying technology (i.e. ASICs and
non-volatile FPGAs) or by the aforementioned techniques,
the present work focuses on protecting the data processed by
the device. Indeed, the issue of SEUs in registers and memory
persists for all kinds of FPGAs as well as for ASICs. There-
fore, whatever the underlying technology is, we assume that
the conﬁguration of the device is safe regarding errors. Thus,
the proposed techniques target error detection and correction
on the data processed within a device.
This work focuses on fault tolerant design of hash func-
tions for space applications. Hash functions are employed
in satellite systems in many different ways. They are used
as building blocks in authentication schemes like digital sig-
natures [10], [11], and Hash-based Message Authentication
Codes (HMAC) [12]. These cryptographic primitives allow
for the integrity checking of data received from a ground sta-
tion to assure that they were not accidentally or maliciously
compromised. Furthermore, they could also be employed in invasion
detection and recovery schemes [7]. By using those
schemes, it becomes possible, for example, to determine
whether an attacker, who may have broken into the satellite,
has tampered with the system’s program memories or FPGA
conﬁguration. Considering the properties of hash functions
and their applications it is clear that a single bit-ﬂip can have
disastrous consequences like provoking unjustiﬁed satellite
reset or intrusion alert. Therefore, SEUs should be prop-
erly addressed to guarantee the correct and reliable operation
of spacecraft employing cryptographic algorithms. For that
purpose, fault-tolerant designs for hash functions are required.
In this paper we investigate fault-tolerant architectures for the
family of Secure Hash Algorithms (SHA) [13], which is the
most commonly used family of hash functions in integrity checking and
authentication architectures. More specifically, the SHA-2
family of hash functions is recommended to be adopted by the
Consultative Committee for Space Data Systems (CCSDS)
as a standard for space systems [14]. Thus, we propose novel
fault-tolerant solutions for SHA-2 that combine existing techniques,
i.e. Triple Modular Redundancy (TMR) and Hamming
Codes (HC), in order to offer several performance and
cost trade-offs. Moreover, while the applicability of these solutions
is independent of the underlying technology, we provide
a detailed study on power consumption and area based
on FPGA implementations. We also carry out an analysis
to determine the robustness of each scheme against SEUs,
and we show that the proposed solutions offer a better resistance
to bit-flips when compared to the traditional TMR.
For instance, the best scheme proposed for SHA-512 uses
up to 32% less area and consumes up to 43% less power
than TMR. Furthermore, compared to TMR, its memory and
registers are, respectively, 435 and 175 times more resistant
to SEUs.
The remainder of this paper is organized as follows. Related
work is presented in Section 2. Section 3 describes the
SHA-2 algorithms, while Section 4 introduces non-fault-tolerant
designs for SHA-2. The fault-tolerant architectures proposed
in this work are presented in Section 5 and their robustness
against SEUs is evaluated in Section 6. Section 7
reports our experimental results in terms of power, area and
frequency of operation, and provides a comprehensive comparison
of the proposed schemes applied to the SHA-2 family
of hash functions. Our conclusions are presented in Section 8.
2. RELATED WORK
With the growing worldwide demand for satellite-based services,
the dependence on these spacecraft tends to increase.
Consequently, in case of a satellite failure, the risk of losses
tends to grow over the years. As a result, the disruption
of satellite services, whether intentional or not, can have a
major economic impact. This context motivated the United
States General Accounting Ofﬁce to issue a report [1] pre-
senting several threats to satellite systems. Its conclusion
stresses that the security of commercial satellites should be
more fully addressed in order to achieve higher levels of pro-
tection for the country’s critical infrastructure. CCSDS has
also highlighted [2] the importance of including security in
space missions. Actually, several proposals have been made
for CCSDS to standardize the use of strong cryptographic
mechanisms for integrity checking, authentication and en-
cryption [14], [15].
Several security architectures have been proposed for satel-
lites to remotely manage their hardware conﬁguration from
the ground station [16], and for purposes of key distribu-
tion [17] and management [18], [19], [20]. All of those works
make use of cryptographic primitives such as hash functions
to achieve their goals, but none of them considers the harsh
environment surrounding spacecraft and, more specifically,
the resistance to SEUs. Only in [7] is a security scheme for key
recovery in satellites proposed with SEU-resistant hardware.
The authors, however, employ existing techniques for
fault tolerance like TMR and HC, and do not specifically
investigate new approaches for fault tolerance.
Hardware implementations of hash functions have been pro-
posed by several works. In [21] a single chip implementation
of SHA-384 and SHA-512 based on FPGAs is introduced.
A SHA-256 processor is presented in [22], also employ-
ing FPGAs for its implementation. In [23], [24] the whole
SHA-2 family is implemented in FPGAs and compared in
terms of area, frequency of operation and throughput. Some
other works [25], [26], [27] provide further comparisons of
FPGA-based implementations of hash functions. Also, some
optimizations based on pipelining, loop unrolling, operation
rescheduling and hardware reutilization are proposed in [28],
[29]. Previous research has extensively addressed the implementation
of the SHA-2 family of hash functions. However,
none of these works considers resistance against SEUs, and they are
therefore not appropriate for space applications. Only in [30]
was error detection considered for SHA-512 in FPGAs. Since
this approach employs parity prediction for the internal hash
function operations, it can detect errors but is unable to correct them,
unlike the schemes proposed in this paper.
Fault-tolerant designs of cryptographic primitives have
mainly been proposed for block encryption algorithms like
the Advanced Encryption Standard (AES) [31]. In [8],
fault detection and correction capabilities are included in
AES implemented in FPGAs. SEUs are detected in each
round transformation by using parity prediction, and corrected
through the use of Hamming codes applied to the
round data matrix. Another approach is due to [32], in which
single bit-ﬂips in the substitution box of the AES algorithm
are detected by using look-up tables and parity prediction.
The hardware design of chips is made resistant to SEUs
through different techniques like radiation hardening, which
is the case of Actel’s Anti-Fuse [33] and Flash [34] FPGAs.
Anti-Fuse devices provide higher reliability compared to
Flash ones, but their main drawback is that they can be con-
ﬁgured only once. Xilinx also provides radiation-hardened
SRAM-based FPGAs [35] to meet the requirements of space
applications. SRAM-based FPGAs are available in higher
densities than the Anti-Fuse and Flash counterparts, but they
are more sensitive to SEUs because of the features of the SRAM
cell itself. Altera also proposes a strategy to protect the
conﬁguration of SRAM-based FPGAs against SEUs at run-
time. Some Altera FPGAs employ built-in Cyclic Redun-
dancy Check (CRC) circuitry [36]. Although the aforemen-
tioned strategy is able to check the internal FPGA’s conﬁgu-
ration, it does not detect errors in the user data stored or be-
ing processed within the device. Other designs can employ
ASICs to achieve high reliability of their hardware imple-
mentations at the cost of very high non-recurring engineer-
ing costs. However, whatever the underlying technology is,
the user data being processed within the device remain un-
protected against SEUs.
Aerospace applications have traditionally used techniques
based on modular redundancy for mitigating SEUs. For in-
stance, TMR together with FPGA reconﬁguration is proposed
in [37] to fully protect systems from SEUs. TMR can be
very costly, though. It requires the triplication of all archi-
tectural elements along with a voter, thus demanding consid-
erable amounts of resources. Attempts were made in [38] to
reduce the costs associated with fault mitigation by only ap-
plying TMR to the most critical components of a design.
In contrast to previous research, we propose novel fault-
tolerant designs of the SHA-2 family of hash functions.
Therefore, we investigate original combinations of TMR and
Hamming codes (HC) that support not only error detection,
but also correction. We explore different levels of granular-
ity involving TMR and HC, and analyze all the implementa-
tions in terms of area, frequency of operation, throughput and
power consumption. Additionally, this research provides an
analysis of the resistance of each new scheme against SEUs
and compares them to the traditional TMR.
3. SHA-2 ALGORITHMS
In this section, the SHA-2 family of hash algorithms [13] is
presented. It comprises four algorithms, namely, SHA-224,
SHA-256, SHA-384, and SHA-512. SHA-224 and SHA-256
have several commonalities, thus in the following descrip-
tion SHA224/256 is a shortcut that refers to both hash func-
tions. Similarly, SHA384/512 refers to both SHA-384 and
SHA-512. Moreover, the datapath width in bits of these func-
tions is denoted by D; D is equal to 32 bits for SHA224/256
and to 64 bits for SHA384/512.
These algorithms are one-way hash functions able to process
messages of up to $2^{64}$ bits for SHA224/256 and $2^{128}$
bits for SHA384/512. The output of the algorithms is a message
digest, which varies in size depending on the algorithm
used. For instance, SHA-224 produces a 224-bit message
digest, SHA-256 produces a 256-bit message digest, and so
on. These algorithms can be divided into two computational
parts, denoted preprocessing and hash computation.
Preprocessing
SHA-2 algorithms take as input blocks of different sizes:
SHA224/256 process 512-bit blocks of data, whereas SHA384/512
process 1024-bit blocks. The preprocessing stage first pads the
message so that its length is a multiple of the underlying block
size; as specified in [13], padding is always appended, even when
the message length is already a multiple of the block size. The
padded message is then split into N blocks, namely
$M^{(1)}, M^{(2)}, \ldots, M^{(N)}$. Next, eight initial hash values,
$H_0^{(0)}, \ldots, H_7^{(0)}$, are set. Each algorithm
uses a distinct set of initial hash values given in [13].
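The padding and splitting steps can be sketched in Python as follows. This is only a minimal illustration, assuming the padding rule of [13] (a single 1 bit, then zero bits, then the message length in a 64-bit field for SHA224/256 or a 128-bit field for SHA384/512). The function name `pad_and_split` and its parameters are hypothetical; they are not part of the hardware design discussed in this paper.

```python
def pad_and_split(message: bytes, block_bytes: int = 64) -> list:
    """Pad a byte string and split it into blocks M(1), ..., M(N).

    block_bytes is 64 for SHA224/256 (512-bit blocks) and
    128 for SHA384/512 (1024-bit blocks).
    """
    length_bytes = block_bytes // 8            # 8- or 16-byte length field
    bit_length = 8 * len(message)
    padded = message + b"\x80"                 # a single 1 bit, then seven 0 bits
    # Zero-fill so that, once the length field is appended, the total
    # length is a multiple of the block size.
    padded += b"\x00" * ((-len(padded) - length_bytes) % block_bytes)
    padded += bit_length.to_bytes(length_bytes, "big")
    return [padded[i:i + block_bytes] for i in range(0, len(padded), block_bytes)]


blocks = pad_and_split(b"abc")
print(len(blocks), len(blocks[0]))             # number of blocks, bytes per block
```

Under this rule a 56-byte message already needs two 512-bit blocks, because the 1 bit and the length field no longer fit in the first block.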
Hash Computation
The entire computation of the message digest is based on
operations over D-bit words. The SHA-2 algorithms comprise
sixty-four message schedule words ($W_0, \ldots, W_{63}$; eighty
words, $W_0, \ldots, W_{79}$, for SHA384/512), eight
working variables ($a, b, c, d, e, f, g, h$), and eight hash values
($H_0^{(i)}, \ldots, H_7^{(i)}$). In addition, SHA224/256 and SHA384/512
use respectively sixty-four ($K_0, \ldots, K_{63}$) and eighty D-bit
constants ($K_0, \ldots, K_{79}$), as specified in [13]. Furthermore,
six logical functions are also employed, and are shown below.
The operations $ROTR^n(x)$ and $SHR^n(x)$ are, respectively, rotation
and shift of x by n bits to the right.
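SHA-224 and SHA-256:
\[
\begin{aligned}
Ch(x,y,z) &= (x \wedge y) \oplus (\bar{x} \wedge z),\\
Maj(x,y,z) &= (x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z),\\
\Sigma_0(x) &= ROTR^{2}(x) \oplus ROTR^{13}(x) \oplus ROTR^{22}(x),\\
\Sigma_1(x) &= ROTR^{6}(x) \oplus ROTR^{11}(x) \oplus ROTR^{25}(x),\\
\sigma_0(x) &= ROTR^{7}(x) \oplus ROTR^{18}(x) \oplus SHR^{3}(x),\\
\sigma_1(x) &= ROTR^{17}(x) \oplus ROTR^{19}(x) \oplus SHR^{10}(x).
\end{aligned}
\]
SHA-384 and SHA-512:
\[
\begin{aligned}
Ch(x,y,z) &= (x \wedge y) \oplus (\bar{x} \wedge z),\\
Maj(x,y,z) &= (x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z),\\
\Sigma_0(x) &= ROTR^{28}(x) \oplus ROTR^{34}(x) \oplus ROTR^{39}(x),\\
\Sigma_1(x) &= ROTR^{14}(x) \oplus ROTR^{18}(x) \oplus ROTR^{41}(x),\\
\sigma_0(x) &= ROTR^{1}(x) \oplus ROTR^{8}(x) \oplus SHR^{7}(x),\\
\sigma_1(x) &= ROTR^{19}(x) \oplus ROTR^{61}(x) \oplus SHR^{6}(x).
\end{aligned}
\]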
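As an illustration, the six logical functions can be written in Python with the word size D as a parameter. This is a software sketch only; the names `AMOUNTS`, `rotr`, `big_sigma` and `small_sigma` are hypothetical, and the rotation and shift amounts are simply those listed above. Inputs are assumed to be non-negative integers that fit in D bits.

```python
# Rotation/shift amounts from the definitions above, keyed by word size D.
# "S" entries are the capital-sigma functions, "s" entries the small-sigma ones.
AMOUNTS = {
    32: {"S0": (2, 13, 22), "S1": (6, 11, 25), "s0": (7, 18, 3), "s1": (17, 19, 10)},
    64: {"S0": (28, 34, 39), "S1": (14, 18, 41), "s0": (1, 8, 7), "s1": (19, 61, 6)},
}

def rotr(x, n, d):
    """Rotate the d-bit word x right by n bits."""
    return ((x >> n) | (x << (d - n))) & ((1 << d) - 1)

def ch(x, y, z):
    """Choose: each bit of x selects the corresponding bit of y (1) or z (0)."""
    return (x & y) ^ (~x & z)

def maj(x, y, z):
    """Bitwise majority of x, y and z."""
    return (x & y) ^ (x & z) ^ (y & z)

def big_sigma(which, x, d):
    """Sigma_0 or Sigma_1 (which = 0 or 1): XOR of three rotations."""
    r1, r2, r3 = AMOUNTS[d]["S%d" % which]
    return rotr(x, r1, d) ^ rotr(x, r2, d) ^ rotr(x, r3, d)

def small_sigma(which, x, d):
    """sigma_0 or sigma_1 (which = 0 or 1): two rotations and one shift."""
    r1, r2, s = AMOUNTS[d]["s%d" % which]
    return rotr(x, r1, d) ^ rotr(x, r2, d) ^ (x >> s)

print(hex(big_sigma(0, 1, 32)))   # 0x40080400: bits 30, 19 and 10 are set
```

Since every function is a combination of bitwise operations on a single word, a bit-flip in one input bit changes only a few output bits at this stage; the wide propagation of errors comes from the additions and the repeated iterations.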
Further, for each message block i, $1 \le i \le N$, a four-step
digest round is performed as follows:
Step 1: Initialize the eight working variables
\[
\begin{aligned}
a &= H_0^{(i-1)}, & b &= H_1^{(i-1)}, & c &= H_2^{(i-1)}, & d &= H_3^{(i-1)},\\
e &= H_4^{(i-1)}, & f &= H_5^{(i-1)}, & g &= H_6^{(i-1)}, & h &= H_7^{(i-1)}.
\end{aligned}
\]
Step 2: Prepare the message schedule
\[
W_t =
\begin{cases}
M_t^{(i)}, & 0 \le t \le 15\\
\sigma_1(W_{t-2}) + W_{t-7} + \sigma_0(W_{t-15}) + W_{t-16}, & 16 \le t \le j-1.
\end{cases}
\]
The number of words processed by the message scheduler is
given by j. Actually, j corresponds to the number of itera-
tions performed by the algorithm. For SHA224/256 j = 64,
whereas for SHA384/512, j = 80.
Step 3: For $t = 0$ to $j - 1$ do:
\[
\begin{aligned}
T_1 &= h + \Sigma_1(e) + Ch(e,f,g) + K_t + W_t,\\
T_2 &= \Sigma_0(a) + Maj(a,b,c),\\
h &= g,\\
g &= f,\\
f &= e,\\
e &= d + T_1,\\
d &= c,\\
c &= b,\\
b &= a,\\
a &= T_1 + T_2.
\end{aligned}
\]
Step 4: Compute the ith intermediate hash value $H^{(i)}$:
\[
\begin{aligned}
H_0^{(i)} &= a + H_0^{(i-1)}, & H_1^{(i)} &= b + H_1^{(i-1)},\\
H_2^{(i)} &= c + H_2^{(i-1)}, & H_3^{(i)} &= d + H_3^{(i-1)},\\
H_4^{(i)} &= e + H_4^{(i-1)}, & H_5^{(i)} &= f + H_5^{(i-1)},\\
H_6^{(i)} &= g + H_6^{(i-1)}, & H_7^{(i)} &= h + H_7^{(i-1)}.
\end{aligned}
\]
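The four steps of a digest round can be sketched in Python for SHA-256 (D = 32, j = 64). This is a software illustration only, not the hardware design of this paper. It assumes, following [13], that all additions are taken modulo 2^32, that the constants K_t are the first 32 bits of the fractional parts of the cube roots of the first 64 primes, and that the initial hash values are derived in the same way from the square roots of the first 8 primes. The helper names (`icbrt`, `first_primes`, `digest_round`) are hypothetical.

```python
import hashlib
import math

MASK = 0xFFFFFFFF  # D = 32 bits; additions are reduced modulo 2**32

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def icbrt(n):
    """Floor of the cube root of n, found by binary search on integers."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def first_primes(count):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

PRIMES = first_primes(64)
K = [icbrt(p << 96) & MASK for p in PRIMES]                 # K_0 .. K_63
H_INIT = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]   # H_0^(0) .. H_7^(0)

def digest_round(h_prev, block):
    """Process one 64-byte block; return the next eight hash values."""
    # Step 1: initialize the working variables from the previous hash values.
    a, b, c, d, e, f, g, h = h_prev
    # Step 2: prepare the message schedule W_0 .. W_63.
    w = [int.from_bytes(block[4 * t:4 * t + 4], "big") for t in range(16)]
    for t in range(16, 64):
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((s1 + w[t - 7] + s0 + w[t - 16]) & MASK)
    # Step 3: the 64 compression iterations.
    for t in range(64):
        big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        t1 = (h + big_s1 + ((e & f) ^ (~e & g)) + K[t] + w[t]) & MASK
        t2 = (big_s0 + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
    # Step 4: add the working variables to the previous hash values.
    return [(x + y) & MASK for x, y in zip((a, b, c, d, e, f, g, h), h_prev)]

# The message "abc" fits in a single padded block (N = 1).
block = b"abc" + b"\x80" + b"\x00" * 52 + (24).to_bytes(8, "big")
h1 = digest_round(H_INIT, block)
digest = b"".join(word.to_bytes(4, "big") for word in h1)
print(digest == hashlib.sha256(b"abc").digest())
```

The comparison against Python's `hashlib` is used here only as a sanity check of the sketch. Flipping a single bit of any entry of `K`, `w` or the working variables inside the loop changes the final digest, which is the failure mode the fault-tolerant schemes are meant to prevent.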
After processing all N blocks of message M, the final message
digest is obtained by concatenating the hash values
($H_0^{(N)}, \ldots, H_7^{(N)}$). More precisely, the message digest for each
algorithm is given by the concatenations shown below. The
concatenation of words is represented by the symbol $\|$.
SHA-224:
\[
H_0^{(N)} \| H_1^{(N)} \| H_2^{(N)} \| H_3^{(N)} \| H_4^{(N)} \| H_5^{(N)} \| H_6^{(N)}.
\]
SHA-256 and SHA-512:
\[
H_0^{(N)} \| H_1^{(N)} \| H_2^{(N)} \| H_3^{(N)} \| H_4^{(N)} \| H_5^{(N)} \| H_6^{(N)} \| H_7^{(N)}.
\]
SHA-384:
\[
H_0^{(N)} \| H_1^{(N)} \| H_2^{(N)} \| H_3^{(N)} \| H_4^{(N)} \| H_5^{(N)}.
\]
4. HARDWARE DESIGN
In this section a non-fault tolerant hardware design for SHA-2
algorithms is presented; this design is used in following sec-
tions to support the description of the evaluated fault-tolerant
techniques. It basically consists of shift-registers, logical op-
erations, D-bit adders, and a memory to store the algorithm’s
initialization values and constants. Similarly to hash hard-
ware implementations mentioned in Section 2, we do not per-
form message padding in hardware. Our main focus is on the
hash computation datapath.
The architectural elements of the SHA-2 implementation, as
shown in Figure 1, can be divided into four main blocks: In-
termediate Hash Computation, Compressor, Message Sched-
uler, and Constants Memory. The constants memory utilizes
the FPGA’s RAM blocks. Since this design does not involve
any kind of fault tolerance, it is referred to as NoFT.
Figure 1. SHA-2 architectural elements.
The message scheduler’s registers $W_0, \ldots, W_{15}$ are initialized
by shifting in the first 16 words $M_t$ of the message M; this
processing takes 16 clock cycles. Simultaneously, the constants
memory provides the initialization values for the working
variables ($a, \ldots, h$). Initial hashes ($H_0, \ldots, H_7$) are also
set within this period of time. After that, the compressor employs
the current values of $a, \ldots, h$, as well as $W_t$ and $K_t$, to
determine the new values of $a, \ldots, h$. As described in Step 3
of Section 3, this is performed in j iterations and is controlled
internally by a counter. Again, SHA224/256 utilizes 64 iterations,
while SHA384/512 uses 80 iterations. In each of these
iterations, registers $W_0, \ldots, W_{15}$ and $a, \ldots, h$ are shifted in the
direction of the arrows shown in Figure 1.
At the end of the j iterations, the intermediate hash computation
must be performed. This operation could be executed in one
clock cycle, but that would require eight adders. Instead,
in order to save implementation area, only two adders
are utilized. This way, the computation of the intermediate
hash is spread over the last 4 iterations by computing two additions
per clock cycle. More precisely, the additions are performed
when $t = 60, \ldots, 63$ for SHA224/256. For instance,
in SHA224/256, when $t = 60$, $H_3$ and $H_7$ are computed,
when $t = 61$, $H_2$ and $H_6$ are computed, and so on. The
same strategy is followed by SHA384/512, but the additions
are executed when $t = 76, \ldots, 79$.
In case of a multi-block message, a new execution cycle ini-
tiates with 16 more words Mt being shifted into the module.
Then, the same procedure described above is executed. For
the last message block, read operations are performed to shift
out the message digest. Speciﬁcally, SHA-224, SHA-256,
SHA-384 and SHA-512 require, respectively, 7, 8, 6, and 8
read operations.
The total memory requirement to store the constants $K_t$ and
$H_0^{(0)}, \ldots, H_7^{(0)}$ is 2304 bits for SHA224/256, and 5632 bits
for SHA384/512, as shown in Table 1. Given the variables
$W_0, \ldots, W_{15}$, $a, \ldots, h$ and $H_0, \ldots, H_7$, the total register
requirements are 1024 bits for SHA224/256, and 2048 bits for
SHA384/512, as listed in Table 2.
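The figures above follow from simple word counts, which the short Python sketch below reproduces. The function name `noft_requirements` is hypothetical; the counts assume one constant $K_t$ per iteration, eight initial hash values, and the 16 + 8 + 8 registers named in the text.

```python
def noft_requirements(d, iterations):
    """Return (memory_bits, register_bits) for the unprotected design."""
    memory_words = iterations + 8      # constants K_t plus eight initial hash values
    register_words = 16 + 8 + 8        # W_0..W_15, a..h, H_0..H_7
    return memory_words * d, register_words * d

print(noft_requirements(32, 64))   # SHA224/256: (2304, 1024)
print(noft_requirements(64, 80))   # SHA384/512: (5632, 2048)
```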
Table 1. Memory Requirements
Scheme           SHA224/256 (bits)  SHA384/512 (bits)
NoFT             2304               5632
FullTMR          6912               16896
TMR&HCMem        2736               6248
TMRRegs&HCMem    2736               6248
HCAllRegs        2736               6248
HCMainRegs       2736               6248
Considering the number of iterations involved in a hash com-
putation, it is clear that a single bit-ﬂip in a memory that pro-
vides constants or input blocks, or in registers propagating
intermediate values, can be devastating for the applications
using the hash function (e.g., data integrity checking, authen-
tication). In order to make hash function designs appropriate
for space applications, error detection and correction schemes
must be incorporated so that SEUs do not compromise their normal
operation. This issue is addressed in Section 5.
Table 2. Register Requirements
Scheme           SHA224/256 (bits)  SHA384/512 (bits)
NoFT             1024               2048
FullTMR          3072               6144
TMR&HCMem        3072               6144
TMRRegs&HCMem    3072               6144
HCAllRegs        1035               2060
HCMainRegs       1216               2272
5. ERROR DETECTION AND CORRECTION SCHEMES
In this paper, we propose four new fault-tolerant schemes that
combine two existing techniques for error detection and cor-
rection: modular redundancy and information redundancy.
Modular redundancy is usually employed as the traditional
TMR: three identical modules are instantiated and their out-
puts are sent to a voter. The voter is then in charge of majority
voting the three intermediate results coming from modules.
After masking out a potential incorrect result, the voter out-
puts the ﬁnal result. Notice that if two modules fail, the voter
is unable to properly select the correct intermediate result,
i.e. it tolerates up to one module failure. The second tech-
nique employed is based on HCs. The Hamming codes used
in this work have Hamming distance of three, which means
that they are capable of either detecting double bit-ﬂips or
detecting and correcting a single bit-ﬂip. Since we are con-
cerned with error correction, we are using HCs to detect and
correct single bit-ﬂips.
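The voting step can be illustrated with a few lines of Python. This is a sketch under the assumption of a bitwise 2-out-of-3 majority voter; the paper does not specify the voter at this level of detail, and the name `majority_vote` and the example word are hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise 2-out-of-3 majority of three module outputs."""
    return (a & b) | (a & c) | (b & c)

good = 0x6A09E667            # value produced by a fault-free module
flipped = good ^ (1 << 5)    # same value with bit 5 upset

one_fault = majority_vote(good, flipped, good)      # one faulty module
two_faults = majority_vote(flipped, flipped, good)  # two modules, same bit upset
print(one_fault == good, two_faults == good)        # True False
```

With a single faulty module the error is masked; when two modules deliver the same wrong bit, the voter follows the majority and outputs the wrong value.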
In total, ﬁve fault-tolerant schemes are considered. The ﬁrst
one is the regular TMR that we called Full Triple Modular
Redundancy. The four schemes we introduce in this pa-
per are named TMR with Shared Encoded Memory, TMR for
Registers and Shared Encoded Memory, Encoding/Decoding
All Registers with Hamming Codes, and Encoding/Decoding
Main Registers with Hamming Codes. They are hybrid solu-
tions based on the two techniques aforementioned. The goal
is to decrease implementation area and power consumption,
as well as to make designs more resistant against SEUs. Alternatives
based on checkpoint and restart were not considered
due to their potential negative impact on the module’s
speed and power consumption.
To show the improvements offered by these new solutions,
we implemented these schemes for every hash function of
the SHA-2 family (i.e. SHA-224, SHA-256, SHA-384, and
SHA-512). In the following, when we refer to SHA-2, we
imply the four functions belonging to the family; when dis-
tinction between functions is required we speciﬁcally name
the hash function. We define a terminology for Hamming
codes: (w,v), where v is the number of data bits, and w is
the total number of bits, i.e. data bits along with parity bits. Furthermore,
Tables 1 and 2 summarize the memory and register requirements
discussed below.
Full Triple Modular Redundancy
As described above, triple modular redundancy consists in
triplicating the circuit and using a voter to determine the output.
In this work, three SHA-2 hardware modules are instantiated
and share the same inputs, as depicted in Figure 2. This
scheme is referred to as FullTMR for short. During implementation
of FullTMR, special attention was paid to the design
partitioning in order to prevent the synthesizer from merging common
circuitry and registers, which would lead to misleading
synthesis results. FullTMR is used only as a reference model
for the comparisons performed in the next sections.
Figure 2. FullTMR block diagram
Since FullTMR uses three instances of the NoFT module,
it triples the memory and register requirements. Precisely,
the memory requirement of FullTMR is 6912 bits
for SHA224/256, and 16896 bits for SHA384/512. The register
requirement is now 3072 bits for SHA224/256, and 6144
bits for SHA384/512. An advantage of FullTMR is that it
includes fault-tolerance without a big impact on the module’s
speed, since it employs three NoFT modules working in par-
allel. On the other hand, a drawback is the big area penalty
imposed by the use of replication. Thus, other schemes are
proposed to reduce the memory requirements and to achieve
smaller implementation area and lower power consumption.
Regarding SEU resistance, FullTMR has three times as
much memory and registers as the NoFT module. Considering
that these memories do not employ any fault-tolerant
mechanism, a single bit-flip in one of these memories compromises
the processing of the entire NoFT module that uses it. A second
bit-flip in any location of the other two memories
compromises the entire FullTMR module. The same failure
condition applies to the registers: one bit-flip in a register
of one of the modules, followed by a second bit-flip in any of the
registers of the other two modules. Considering that spacecraft
may have mission lifetimes reaching decades, multiple
bit-flips in the memory elements are very likely. As
a result, it is very important to protect memory against SEUs
while reducing memory requirements; this is the idea behind
the scheme presented in the following section.
TMR with Shared Encoded Memory
In order to scale down the number of memory bits used, as
well as the power consumption, the constants memory could
be shared among the three SHA-2 modules. This scheme,
named TMR&HCMem, is depicted in Figure 3. However, the
common memory needs to be protected against SEUs. This is
accomplished by encoding the memory of SHA224/256 with
a (38,32) Hamming code, whereas SHA384/512 employ a
(71,64) Hamming code. For each memory read, a Hamming
decoder detects and corrects any potential bit-flip, thus sending
the correct value to the modules.
Figure 3. TMR&HCMem block diagram.
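The encode-on-write and correct-on-read behaviour of the shared memory can be sketched in Python. The exact bit layout of the Hamming code used in the design is not given here, so this example assumes the classic construction in which parity bits sit at the power-of-two positions of the codeword. All function names are hypothetical.

```python
def parity_count(k):
    """Smallest r with 2**r >= k + r + 1 (enough to correct one bit)."""
    r = 0
    while (1 << r) < k + r + 1:
        r += 1
    return r

def is_data_position(pos):
    """Positions that are not powers of two carry data bits."""
    return (pos & (pos - 1)) != 0

def hamming_encode(data, k):
    """Encode a k-bit word; returns a bit list indexed 1..n (index 0 unused)."""
    r = parity_count(k)
    n = k + r
    code = [0] * (n + 1)
    bit = 0
    for pos in range(1, n + 1):
        if is_data_position(pos):
            code[pos] = (data >> bit) & 1
            bit += 1
    for i in range(r):
        p = 1 << i
        # Parity bit p makes the XOR of all positions with bit i set equal to 0.
        code[p] = sum(code[pos] for pos in range(1, n + 1) if pos & p) % 2
    return code

def hamming_decode(code):
    """Correct at most one flipped bit; returns (data, syndrome)."""
    syndrome = 0
    for pos in range(1, len(code)):
        if code[pos]:
            syndrome ^= pos          # a nonzero syndrome is the position of the flip
    fixed = list(code)
    if 0 < syndrome < len(fixed):
        fixed[syndrome] ^= 1
    data, bit = 0, 0
    for pos in range(1, len(fixed)):
        if is_data_position(pos):
            data |= fixed[pos] << bit
            bit += 1
    return data, syndrome

word = 0x428A2F98                    # a 32-bit constant to store
stored = hamming_encode(word, 32)    # 38-bit codeword, i.e. a (38,32) code
stored[17] ^= 1                      # emulate an SEU in one memory bit
recovered, position = hamming_decode(stored)
print(hex(recovered), position)      # 0x428a2f98 17
```

Calling `hamming_encode(word, 64)` on a 64-bit word yields a 71-bit codeword, matching the (71,64) code used for SHA384/512. Two flips in the same codeword produce a syndrome that points at a third position, so the decoder would miscorrect; this is the failure case discussed for the encoded memory.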
The use of a single memory in TMR&HCMem decreases the
memory overhead compared to FullTMR, even when parity
bits are attached to each memory element. As a result,
the memory requirement of TMR&HCMem is 2736 bits for
SHA224/256, and 6248 bits for SHA384/512. The consequence
of using HC, though, is the inclusion of the Hamming
decoder between the memory and the modules. This decoder
lengthens the critical path of the circuit and thus decreases its
frequency of operation. The register requirement is exactly
the same as in FullTMR, i.e. 3072 bits for SHA224/256 and
6144 bits for SHA384/512.
On the other hand, the main advantage of encoding the mem-
ory is that it now tolerates up to one bit-ﬂip in each of its
memory elements. In order to have a failure, two bit-ﬂips
must happen in the same memory element. Hence, the en-
coded memory is less likely to fail compared to the tripli-
cated, unprotected memory in FullTMR. A deeper analysis
of the probability of memory failure is provided in Section 6.
Due to its better resistance to bit-flips, encoded memory is
used in all of the schemes introduced next.
TMR for Registers and Shared Encoded Memory
Given our concern in protecting only the data being pro-
cessed, a further optimization to TMR&HCMem is to move
the redundancy to the register level instead of keeping it at
the modular level. This scheme is called TMRReg&HCMem.
More precisely, instead of triplicating the entire SHA-2 module,
only one SHA-2 module is used, but all its registers are
triplicated, as illustrated in Figure 4. The register requirement
is exactly the same as in FullTMR and TMR&HCMem,
i.e. 3072 bits for SHA-224 and SHA-256, and 6144 bits for
SHA-384 and SHA-512. But now, in order to mask out register
errors, one voter is used for each trio of registers. In total,
32 voters are needed, resulting in higher implementation area.
Figure 4. TMRReg&HCMem block diagram.
Similarly to TMR&HCMem, this approach uses an encoded
memory to keep SHA-2 constants protected against SEUs.
Given the Hamming decoder and the multiple voters used,
lower frequencies of operation are expected compared to
FullTMR and TMR&HCMem. The main advantage of this
scheme, however, is that errors are masked in every clock cycle.
In other words, the design fails only if two bit-flips occur in
the same register and in the same clock cycle. Since this is
very unlikely to happen, very high protection against SEUs
in registers is achieved with this scheme, as is more formally
discussed in Section 6.
Encoding/Decoding All Registers with Hamming Codes
As an alternative to replicating registers and modules, a
scheme named HCAllRegs is proposed. In this scheme,
Hamming codes are used in place of TMR to protect the reg-
isters contents. The goal is to detect and correct potential
bit-ﬂips in each clock cycle of SHA-2 functions. In order to
achieve that, register contents are encoded/decoded on every
clock cycle before each write/read operation. As a conse-
quence, registers are kept encoded all the time and therefore
protected against bit-ﬂips. This scheme is depicted in Fig-
ure 5, where the shaded and dark areas represent Hamming
encoders and decoders.
In order to reduce the number of parity bits used, all the 32
registers were merged and treated as a single register. For
example, if SHA224/256 encoded their thirty-two 32-bit registers
separately using a (38,32) Hamming code, a total of 192
parity bits would be necessary. By treating the registers
as one 1024-bit register, a (1035,1024) Hamming code
can be used and only 11 parity bits are needed. Likewise, if
SHA384/512 encoded their thirty-two 64-bit registers separately
using a (71,64) Hamming code, a total of 224 parity
bits would be needed. When the registers are merged into a
single 2048-bit register, a (2060,2048) Hamming code can be
employed, thus demanding only 12 parity bits. Notice that,
although Figure 5 shows each register associated with an encoder
and a decoder, a single encoder and a single decoder are
employed. In sum, the register requirements are 1035 bits for
SHA224/256, and 2060 bits for SHA384/512.
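The parity-bit counts quoted above can be checked with a short Python sketch. It assumes the usual single-error-correcting bound, namely that r parity bits protect k data bits when 2^r >= k + r + 1; the function name `parity_bits` is hypothetical.

```python
def parity_bits(k):
    """Smallest r such that 2**r >= k + r + 1."""
    r = 0
    while (1 << r) < k + r + 1:
        r += 1
    return r

for d in (32, 64):                       # word size D of SHA224/256 and SHA384/512
    separate = 32 * parity_bits(d)       # thirty-two registers encoded one by one
    merged = parity_bits(32 * d)         # all registers treated as one wide word
    print(d, separate, merged, 32 * d + merged)
# 32 192 11 1035
# 64 224 12 2060
```

The number of parity bits grows only logarithmically with the word width, which is why merging the registers saves so many bits. The cost is that the merged codeword tolerates a single bit-flip across all 32 registers between two decoding operations.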
Figure 5. HCAllRegs block diagram.
The main disadvantage of this approach is that the use of
Hamming encoders and decoders for all registers lengthens
the critical path of the SHA-2 modules and thus reduces the
frequency of operation of the design. However, since the
merged register is decoded and re-encoded in every clock cycle,
it fails only if two bit-flips happen in the
same clock cycle. As a consequence, this scheme provides
high protection against SEUs, as described in Section 6.
Encoding/Decoding Main Registers with Hamming Codes
Since HCAllRegs keeps all the registers always encoded,
they are protected against SEUs all the time. Although
HCAllRegs offers a high resistance against SEUs, that
comes at the cost of employing large Hamming encoders and
decoders to detect and correct errors in every clock cycle.
This can be translated to a higher demand on implementa-
tion area and power consumption. In this context, it would be
interesting to achieve a better trade-off between SEU protec-
tion, area and power consumption. This trade-off is explored
in the scheme described in this section.
By analyzing Figure 5, it can be noticed that, in a given
algorithm iteration, only registers a, ..., h, H0, H4, W0, W1, W9,
and W14 are involved in the SHA operations. Thus, those are
the only registers that need to be decoded for fault detection
and correction, while all the rest can remain encoded.
Analogously, only a, e, H0, H4, and W15 are updated with new
values, so only these registers must be re-encoded. This
approach, which encodes and decodes only the main registers,
is named HCMainRegs and is depicted in Figure 6.
Figure 6. HCMainRegs block diagram.
In this scheme, registers are not merged. Unlike HCAllRegs,
HCMainRegs gives each register its own Hamming
encoder/decoder (where one is needed). More precisely,
SHA224/256 encode their thirty-two 32-bit registers separately
using a (38,32) Hamming code, demanding 192 parity bits.
Similarly, SHA384/512 encode their thirty-two 64-bit registers
separately using a (71,64) Hamming code, demanding 224
parity bits. The register requirements are therefore 1216 bits
for SHA224/256, and 2272 bits for SHA384/512.
From Figure 6 it is easy to observe that error detection and
correction is only performed when reading registers a, ..., h, H0,
H4, W0, W1, W9, and W14. The data present in these
registers have been waiting for a number of clock cycles until
their use in the SHA-2 operation. The number of clock
cycles may vary from 1 (for a, ..., h) to 60 (for H0 and H4) in
the case of SHA224/256, or 76 in the case of SHA384/512
(the idle period i listed in Table 3).
A failure will occur if two bit-flips happen in the same data
word (a, ..., h, H0, H4, W0, W1, W9, W14) while it is in
its idle period. This case is detailed in Section 6.
Similarly to HCAllRegs, HCMainRegs uses an encoded
memory to keep the constants protected against SEUs.
Besides, it uses encoders and decoders in its datapath, which will
certainly increase the circuit's critical path. As a result, lower
frequencies of operation are expected for HCMainRegs
compared to the ones achieved in FullTMR.
6. EVALUATION OF ROBUSTNESS AGAINST
SEUS
In addition to factors such as implementation area and power
consumption, it is important to evaluate the robustness of the
proposed schemes against SEUs. A qualitative analysis, as
done in Section 5, gives us a good idea of how resilient each
scheme is. However, it is appropriate to conduct a
quantitative evaluation of the strategies proposed in this work, so
that we can better compare them with traditional schemes
such as TMR.
The evaluations conducted in this section analyze the prob-
ability of failure of two main elements: memory and regis-
ters. We assume that there is one scheme per device (ASIC
or FPGA), and that its implementation is spread uniformly
over the device resources. Therefore, we deﬁne the following
terms:
M : Total memory resources in bits,
R : Total number of registers in bits,
m : Used memory resources in bits, and
r : Used registers in bits.
In order to simplify the discussions, we assume that all
designs are performing at the same frequency of operation. We
consider the computation of 1 hash for our evaluations. We
further assume that the hardware devices are in the same
environment and subject to the same bit-flip rate. Then, we define:
$\mu$ : Bit-flip rate per memory bit per clock cycle,
$\rho$ : Bit-flip rate per register bit per clock cycle, and
n : Period of time in which bit-flips may occur expressed
in number of clock cycles.
Given that all schemes based on TMR and Hamming codes
tolerate one bit-ﬂip, we analyze the condition for failure,
which is to have two bit-ﬂips corrupting the SHA-2 computa-
tion. Then, we deﬁne the following terms:
$P(F_M)$ : Probability of a memory failure,
$P(F_R)$ : Probability of a register failure,
$P(X_1)$ : Probability of the first bit-flip in X, and
$P(X_2)$ : Probability of the second bit-flip in X,
where X can be either memory or register elements.
From these deﬁnitions it is possible to determine the proba-
bility of memory and register failures for each of the schemes
presented in Section 5.
Full Triple Modular Redundancy
The traditional TMR triplicates the memory requirements of
a non fault tolerant module. In order to have a failure in
FullTMR, it is necessary to have one bit-ﬂip in one of the
memories, and a second bit-ﬂip in any location of the other
two memories. Let the total memory usage of TMR be m
bits out of M bits available in the device. First analyzing
the case of a first bit-flip, we have that the probability of a
particle hitting one memory element on-chip is $1/M$. Given
that m bits are used for TMR, the probability of having this
particle hitting one memory element of the TMR design is
$m/M$. Further, the three modules take n clock cycles to
compute a SHA-2 operation. Moreover, while in operation
their memory elements suffer a bit-flip rate of $\mu$ per clock
cycle. Thus, the probability for the first bit-flip is given by
$P(M_1) = mn\mu/M$. Assume that a bit-flip happened in the
memory of one of the modules; this corrupted memory
occupies $m/3$ bits of the total memory requirements. So, in
order to have a failure, another bit-flip must happen in the
remaining $2m/3$ bits of the other two healthy modules. The
probability of this event to happen is then $2m/(3M)$.
Consequently, the probability for the second bit-flip is given by
$P(M_2) = 2mn\mu/(3M)$. Since both events must occur, the
final probability to have a memory failure in FullTMR is
$$P(F_M) = \frac{2m^2n^2\mu^2}{3M^2}. \qquad (1)$$
The probability of register failure in FullTMR is defined
exactly the same way as the memory one. A failure will happen
with the occurrence of one bit-flip in a register of one of the
modules, and a second bit-flip in any of the registers of the
other two modules. The total register usage of TMR is r bits
out of R bits available. Given that n clock cycles are needed
to compute a SHA-2 operation, and that the chip suffers a
register bit-flip rate of $\rho$ per clock cycle, the probability of
the first bit-flip in a register is $P(R_1) = rn\rho/R$. Furthermore,
the probability of the second bit-flip is given by
$P(R_2) = 2rn\rho/(3R)$. As a result, the final probability of
having a register failure in FullTMR is
$$P(F_R) = \frac{2r^2n^2\rho^2}{3R^2}. \qquad (2)$$
TMR with Shared Encoded Memory
In order to achieve a higher protection level for the memory,
each memory position was encoded using Hamming
codes. The condition for failure in TMR&HCMem is to have
two bit-flips happening in the same memory element. Let
the total encoded memory be m bits out of M bits available.
The probability of a particle hitting the encoded memory
is $m/M$. Since TMR&HCMem takes n clock cycles
to compute a SHA-2 operation, and considering a bit-flip rate of
$\mu$ per clock cycle, the probability of the first memory bit-flip
is $P(M_1) = mn\mu/M$. Given that each memory position is
individually encoded, the condition for failure is to have a
second bit-flip in the same memory location that received the
first bit-flip. Assume that each encoded memory location
utilizes l bits, and that the first bit-flip happened in one of these
l bits. Then, a failure will happen if a second bit-flip happens
in the remaining $(l-1)$ bits, and the probability of a particle
hitting one of them is $(l-1)/M$. As a consequence, the
probability of the second bit-flip is given by
$P(M_2) = (l-1)n\mu/M$. As a result, the final probability of
having a memory failure in TMR&HCMem is
$$P(F_M) = \frac{m(l-1)n^2\mu^2}{M^2}. \qquad (3)$$
Since TMR&HCMem is still a TMR-based scheme for registers,
the probability of having a register failure is exactly the same
as in FullTMR, i.e., it is given by Equation (2).
TMR for Registers and Shared Encoded Memory
TMRRegs&HCMem uses the same encoded memory as in
TMR&HCMem. Therefore the probability of a memory
failure is given by Equation (3). Despite the fact that
TMRRegs&HCMem is also a TMR-based scheme, the re-
dundancy is implemented at the register level. Thus, the
probability of register failure is slightly different from the
other TMR schemes. In order to have a register failure in
TMRRegs&HCMem, two bit-ﬂips must occur in the same reg-
ister and in the same clock cycle.
Consider a total register usage of r bits out of R bits available,
and a register bit-flip rate of $\rho$ per clock cycle. Notice that in
FullTMR and TMR&HCMem, errors occurring in the middle of
processing must wait n clock cycles to be masked out by the
voter. Now, in TMRRegs&HCMem, errors are masked in every
clock cycle for each trio of registers. So, the bit-flip analysis
is now performed in a single clock cycle. As a result, the
probability of the first register bit-flip is $P(R_1) = r\rho/R$. The
failure condition for TMRRegs&HCMem is to have two
registers corrupted in the trio, so that a voter would not be able to
decide for the correct result. Thus, assume that a trio of
registers occupies t bits, and that one of these registers suffered a
bit-flip. The probability of having a second register corrupted
in the trio is $2t/(3R)$. Hence, the probability of the second
bit-flip is given by $P(R_2) = 2t\rho/(3R)$. Thus, the
final probability of register failure in TMRRegs&HCMem is
$$P(F_R) = \frac{2tr\rho^2}{3R^2}. \qquad (4)$$
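The benefit of voting on every register trio in every clock cycle can be read off by dividing Equation (2) by Equation (4). The Python sketch below does this with exact fractions. The register usage r = 3 x 32 x w bits (thirty-two w-bit registers, triplicated) is an assumption made here because Table 2 is not reproduced in this section; R, n and t are the values listed in Table 3.

```python
from fractions import Fraction


def full_tmr_coeff(r, R, n):
    """Coefficient of rho**2 in Equation (2): 2 r^2 n^2 / (3 R^2)."""
    return Fraction(2 * r**2 * n**2, 3 * R**2)


def tmr_regs_coeff(r, R, t):
    """Coefficient of rho**2 in Equation (4): 2 t r / (3 R^2)."""
    return Fraction(2 * t * r, 3 * R**2)


def improvement(r, R, n, t):
    """How many times less likely a register failure is under Equation (4)."""
    return full_tmr_coeff(r, R, n) / tmr_regs_coeff(r, R, t)


R = 33216  # register bits available on the device (Table 3)
sha256 = improvement(r=3 * 32 * 32, R=R, n=64, t=96)   # r is an assumption
sha512 = improvement(r=3 * 32 * 64, R=R, n=80, t=192)  # r is an assumption
print(sha256, sha512)
```

The ratio simplifies to r n^2 / t: a factor n^2 from shrinking the exposure window from n cycles to one, and a factor r/t from confining the second bit-flip to a single trio. With the assumed register usage it evaluates to 131072 and 204800, the factors reported for TMRRegs&HCMem in Table 7.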
Encoding/Decoding All Registers with Hamming Codes
Even though HCAllRegs adopts a totally different
fault-tolerant technique, it uses the same encoded memory as in
TMR&HCMem. Then, the probability of having a memory
failure is given by Equation (3). To save parity bits, this scheme
merges all the registers before encoding. Then, the resulting
Hamming-encoded register can be treated as a single element
occupying r bits out of R bits available in the hardware
device. Since this single register is decoded and re-encoded in
every clock cycle, a failure will occur only if two bit-flips
happen in the encoded register during the time frame of 1 clock
cycle.
The probability of having a bit-flip in the encoded register is
$r/R$. By considering a single clock cycle period to have a
bit-flip and a bit-flip rate of $\rho$ per clock cycle, the
probability of the first register bit-flip is $P(R_1) = r\rho/R$.
Assuming that a first bit-flip happened, the condition for failure is
to have a second bit-flip in the encoded register. Now, there
are $(r-1)$ bits that may suffer a bit-flip. As a consequence,
the probability of having a second bit-flip in the encoded
register is $P(R_2) = (r-1)\rho/R$. Subsequently, the final
probability of register failure in HCAllRegs is
$$P(F_R) = \frac{(r^2 - r)\rho^2}{R^2}. \qquad (5)$$
Encoding/Decoding Main Registers with Hamming Codes
Although HCMainRegs is an optimization of HCAllRegs
which reduces the register requirements, it also uses the same
encoded memory as HCAllRegs and TMR&HCMem. Thus,
the probability of having a memory failure is given by
Equation (3). In HCMainRegs, though, all registers are encoded
individually through the use of Hamming codes.
The condition for a register failure is to have two bit-flips in
the same register (a, ..., h, H0, H4, W0, W1, W9 or W14)
while it is in its idle period. Suppose that all encoded
registers employ r bits out of the R bits available in the
hardware device. Then, the probability of a bit-flip in any
bit of the encoded registers is $r/R$. Further, assume a
bit-flip rate of $\rho$ per clock cycle. In this scheme the registers
are kept encoded all the time, but are not decoded and
re-encoded in every clock cycle. Instead, they have an
idle period of time, which we define as i. More precisely,
i is the maximum number of clock cycles without
performing error detection and correction on the data present in the
registers. Therefore, the probability of the first bit-flip in a
register is $P(R_1) = ri\rho/R$. To have a failure, a second
bit-flip must occur in the same encoded register during i
clock cycles. Furthermore, consider that a first bit-flip has
already occurred in an encoded register, and that each encoded
register uses g bits. Then, the second bit-flip must happen
in one of the $(g-1)$ remaining bits. Thus, the probability
of having a second bit-flip in the same encoded register is
$P(R_2) = (g-1)i\rho/R$. As a result, the final probability of
register failure in HCMainRegs is
$$P(F_R) = \frac{r(g-1)\rho^2 i^2}{R^2}. \qquad (6)$$
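Equation (6) grows with the square of the idle period i, which is the price HCMainRegs pays for not decoding every register in every cycle. The Python sketch below evaluates the coefficient of rho^2 for a few idle periods, using r = 1216 and g = 38 from Section 5 and R from Table 3; the intermediate values of i are illustrative only.

```python
def hc_main_regs_coeff(r, g, i, R):
    """Coefficient of rho**2 in Equation (6): r (g - 1) i^2 / R^2."""
    return r * (g - 1) * i**2 / R**2


R = 33216        # register bits available on the device (Table 3)
r, g = 1216, 38  # SHA224/256: thirty-two registers of 38 encoded bits

# Only i = 60 is the value used in the paper; 1, 15 and 30 are illustrative.
for i in (1, 15, 30, 60):
    print(i, round(hc_main_regs_coeff(r, g, i, R) * 1e3, 2))
```

For i = 60 the coefficient is about 146.81 x 10^-3, the SHA224/256 entry for HCMainRegs in Table 6; halving the idle period would cut it by a factor of four.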
Device-Speciﬁc Probability of Failure
In order to employ the aforementioned equations to perform
an analysis of device-specific probability of failure, it is
necessary to consider the parameters defining the target hardware
device, such as the total number of registers (R) and
memory (M) available. Given that we use an Altera CycloneII
EP2C35F672C6 FPGA to obtain our experimental results, the
same device was used to conduct the analysis in this section.
Moreover, information on the design of each scheme must
also be known. The memory and register requirements are
shown in Tables 1 and 2, respectively. The specific
values for all variables defined while determining the
probability equations are listed in Table 3.
Memory Analysis—The memory usage (m) for each scheme
is listed in Table 1. By using Table 3 and the equations pro-
vided in this section, it is possible to determine, in terms
of $\mu^2$, the probability of memory failure of all fault tolerant
schemes. The results for all schemes are organized in Table 4.
Table 3. Variables used for Resistance Analysis

| Variable | SHA224/256 | SHA384/512 |
|---|---|---|
| M (bits) | 483840 | 483840 |
| R (bits) | 33216 | 33216 |
| n (clock cycles) | 64 | 80 |
| i (clock cycles) | 60 | 76 |
| l (bits) | 38 | 71 |
| t (bits) | 96 | 192 |
| g (bits) | 38 | 71 |

From Equation (1), the probabilities of memory failure of
FullTMR are $557.28 \times 10^{-3}\mu^2$ and
$5202.99 \times 10^{-3}\mu^2$, respectively, for SHA224/256 and
SHA384/512. Since TMR&HCMem, as well as all the
remaining schemes, use a Hamming-encoded memory, their
probabilities of memory failure are given by Equation (3). More
precisely, the Hamming-encoded memory decreases the
probability of memory failure to $1.77 \times 10^{-3}\mu^2$ and
$11.96 \times 10^{-3}\mu^2$,
respectively, for SHA224/256 and SHA384/512.
Table 4. Probability of Memory Failure

| Scheme | SHA224/256 ($\times 10^{-3}\mu^2$) | SHA384/512 ($\times 10^{-3}\mu^2$) |
|---|---|---|
| FullTMR | 557.28 | 5202.99 |
| TMR&HCMem | 1.77 | 11.96 |
| TMRRegs&HCMem | 1.77 | 11.96 |
| HCAllRegs | 1.77 | 11.96 |
| HCMainRegs | 1.77 | 11.96 |

Table 5. Normalized Memory Resistance

| Scheme | SHA224/256 | SHA384/512 |
|---|---|---|
| FullTMR | 1 | 1 |
| TMR&HCMem | 314.63 | 435.15 |
| TMRRegs&HCMem | 314.63 | 435.15 |
| HCAllRegs | 314.63 | 435.15 |
| HCMainRegs | 314.63 | 435.15 |

In order to have a better picture of the resistance provided
by the Hamming-encoded memory, a normalized analysis
is performed taking as reference the triplicated memory of
FullTMR. From Table 5, it is possible to conclude that, by
encoding the memory, the memory resistance against SEUs
of SHA224/256 is increased 314.63 times. This increase
is even higher for SHA384/512, i.e. their memory becomes
435.15 times more resistant by using Hamming codes instead
of TMR.
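The memory figures can be cross-checked by evaluating Equations (1) and (3) directly, as in the Python sketch below. The memory sizes are assumptions: Table 1 is not reproduced in this section, so the sketch takes 72 words (SHA224/256) and 88 words (SHA384/512) per memory copy, a choice that matches the 6248 encoded memory bits quoted for SHA-512 in Section 8. M, n and l come from Table 3.

```python
M = 483840  # memory bits available on the device (Table 3)


def tmr_memory_coeff(m, n):
    """Coefficient of mu**2 in Equation (1): 2 m^2 n^2 / (3 M^2)."""
    return 2 * m**2 * n**2 / (3 * M**2)


def encoded_memory_coeff(m, l, n):
    """Coefficient of mu**2 in Equation (3): m (l - 1) n^2 / M^2."""
    return m * (l - 1) * n**2 / M**2


# (words per memory copy [assumed], word width, encoded width l, cycles n)
CONFIGS = {
    "SHA224/256": (72, 32, 38, 64),
    "SHA384/512": (88, 64, 71, 80),
}

for name, (words, width, l, n) in CONFIGS.items():
    tmr = tmr_memory_coeff(3 * words * width, n)   # three plain copies
    enc = encoded_memory_coeff(words * l, l, n)    # one Hamming-encoded copy
    print(f"{name}: TMR {tmr * 1e3:.2f}, encoded {enc * 1e3:.2f}, "
          f"gain {tmr / enc:.2f}")
```

With these assumed sizes the sketch gives the same coefficients as Table 4 and the same gains as Table 5. Note that n cancels in the gain, which reduces to 2 m_TMR^2 / (3 m_HC (l - 1)).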
Register Analysis—Similar analysis for the registers can be
done by utilizing the register usage (r) listed in Table 2
along with Table 3. Equation (2) provides us the
probability of register failure for FullTMR and TMR&HCMem.
When SHA224/256 and SHA384/512 employ those schemes,
the probabilities of register failure are
$23356.97 \times 10^{-3}\rho^2$ and $145981.04 \times 10^{-3}\rho^2$,
respectively. The probabilities of register failures for all
schemes are listed in Table 6.
Now, if TMRRegs&HCMem is used, Equation (4) determines
that the probabilities of register failures are
$0.18 \times 10^{-3}\rho^2$ and $0.71 \times 10^{-3}\rho^2$,
respectively, for SHA224/256 and
SHA384/512. Furthermore, when HCAllRegs is adopted,
Equation (5) determines that the probabilities of register
failure for SHA224/256 and SHA384/512 are, respectively,
$0.97 \times 10^{-3}\rho^2$ and $3.84 \times 10^{-3}\rho^2$.
Lastly, Equation (6) determines the probabilities of failure
for SHA224/256 and SHA384/512 when using HCMainRegs,
which are, respectively, $146.81 \times 10^{-3}\rho^2$ and
$832.61 \times 10^{-3}\rho^2$.
Table 6. Probability of Register Failure

| Scheme | SHA224/256 ($\times 10^{-3}\rho^2$) | SHA384/512 ($\times 10^{-3}\rho^2$) |
|---|---|---|
| FullTMR | 23356.97 | 145981.04 |
| TMR&HCMem | 23356.97 | 145981.04 |
| TMRRegs&HCMem | 0.18 | 0.71 |
| HCAllRegs | 0.97 | 3.84 |
| HCMainRegs | 146.81 | 832.61 |

Table 7. Normalized Register Resistance

| Scheme | SHA224/256 | SHA384/512 |
|---|---|---|
| FullTMR | 1 | 1 |
| TMR&HCMem | 1 | 1 |
| TMRRegs&HCMem | 131072 | 204800 |
| HCAllRegs | 24079.65 | 37972.36 |
| HCMainRegs | 159.1 | 175.33 |

If a normalized analysis is performed taking as refer-
ence FullTMR, it is easy to observe, from Table 7,
that TMRRegs&HCMem increases the register resistance of
SHA224/256 and SHA384/512 by a factor of 131072 and
204800, respectively. Another scheme with a good register
protection is HCAllRegs. Precisely, that scheme increases
the register resistance of SHA224/256 and SHA384/512,
respectively, by a factor of 24079.65 and 37972.36. If
HCMainRegs is employed, the register resistance is 159.1
and 175.33 times higher, respectively, for SHA224/256 and
SHA384/512.
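The register comparison can be gathered into one small Python sketch that evaluates Equations (2), (4), (5) and (6) side by side. The triplicated register usage (3 x 32 x w bits) is an assumption, since Table 2 is not reproduced here; the encoded sizes (1035/2060 and 1216/2272 bits) are those given in Section 5, and R, n, i and t follow Table 3.

```python
R = 33216  # register bits available on the device (Table 3)


def coeffs(width, n, i):
    """rho**2 coefficients of Equations (2), (4), (5), (6) for one word width."""
    regs = 32                                 # registers per SHA-2 core
    g = width + {32: 6, 64: 7}[width]         # (38,32) or (71,64) code word
    r_tmr = 3 * regs * width                  # assumed triplicated usage
    t = 3 * width                             # one trio of registers
    r_all = regs * width + {32: 11, 64: 12}[width]  # merged code word
    r_main = regs * g                         # individually encoded registers
    return {
        "FullTMR": 2 * r_tmr**2 * n**2 / (3 * R**2),
        "TMRRegs&HCMem": 2 * t * r_tmr / (3 * R**2),
        "HCAllRegs": (r_all**2 - r_all) / R**2,
        "HCMainRegs": r_main * (g - 1) * i**2 / R**2,
    }


for label, args in {"SHA224/256": (32, 64, 60), "SHA384/512": (64, 80, 76)}.items():
    c = coeffs(*args)
    for scheme, value in c.items():
        print(f"{label} {scheme}: {value * 1e3:.2f} x10^-3, "
              f"resistance {c['FullTMR'] / value:.2f}")
```

Under these assumptions the printed coefficients and resistance factors should agree with Tables 6 and 7 up to rounding in the last digit.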
7. EXPERIMENTAL RESULTS
In order to better analyze the advantages of each scheme
proposed in Section 5, it is necessary to evaluate them in terms of
implementation area, throughput, frequency of operation, and
power consumption. Although all the schemes discussed here
can be implemented using both ASICs and FPGAs (Flash,
Anti-Fuse, SRAM), we selected an SRAM FPGA capable
of performing CRC checks automatically: an Altera CycloneII
EP2C35F672C6 FPGA. Hence, we described the proposed
schemes using VHDL, and performed the FPGA
implementation. The tool employed in the description, synthesis,
simulation and power estimation of all hardware modules was
QuartusII [39] version 7.2, service pack 1.
We conducted the synthesis targeting low implementation
area and low power consumption. In order to minimize the
power consumption even further, we performed a two-step
synthesis and simulation procedure. Once the ﬁrst synthesis
and simulation were complete, a signal activity ﬁle was cre-
ated. This ﬁle was then fed back to the tool allowing for bet-
ter synthesis optimization, thus leading to additional power
savings. The data shown in Tables 8, 9, and 10 reﬂect the
results of the ﬁnal synthesis and simulation. Table 11 shows
dynamic power resulting from the simulation of the modules
at their maximum frequency of operation (Fmax), whereas
Table 12 provides the normalized power estimates with all
modules running at 33.33MHz.
Area Results
Implementation area is measured in terms of the number of
logic elements (LEs) used to implement a given scheme in the
FPGA. According to Table 8, the non-fault tolerant (NoFT)
SHA-224 occupies 1559 LEs. As expected, in compari-
son with the NoFT, FullTMR occupies slightly more than
three times as much area (4709 LEs). Now, considering the
ﬁrst design adopting shared encoded memory, TMR&HCMem,
it is easy to observe that it uses slightly more area than
FullTMR (4843 LEs). Supposedly, the TMRRegs&HCMem
design should be smaller than TMR&HCMem, given that the
SHA-2 datapath is not triplicated. However, due to the num-
ber of voters used, LEs are used exclusively for implementing
all the logic needed. It utilizes 6234 LEs, which means four
times the area of NoFT, and 1.3 times the area of FullTMR.
The HCAllRegs design is quite inefﬁcient in terms of area
(6507 LEs), i.e. it is more than four times bigger than NoFT,
and has 1.38 times the size of FullTMR. Finally, a de-
sign with relatively small implementation area (3617 LEs) is
HCMainRegs. HCMainRegs employs only 2.3 times as
much area as NoFT, and employs approximately 3/4 of the
area of FullTMR.
By analyzing Table 8, it is possible to observe certain trends
in the SHA-2 area utilization. FullTMR occupies about 3
times as much area as NoFT, while TMR&HCMem is slightly
bigger than FullTMR. Further, both TMRRegs&HCMem and
HCAllRegs employ, on average, 3.9 times as much area
as NoFT, and thus are less area-efficient than FullTMR.
HCMainRegs provides the highest area efficiency. On
average, it utilizes 2.2 times as much area as NoFT, and less
than 3/4 of the area of FullTMR.
Frequency Results
The frequency of operation of the schemes considered in this
work is presented in Table 9. By analyzing the SHA-224
results in that table, it is possible to notice that the fault-tolerant
scheme with the highest frequency of operation is FullTMR.
Due to the high parallelism involved, it can operate at 73.05MHz.
Because TMR&HCMem uses a Hamming decoder between the
memory and the TMR part of the module, its critical path
is impacted negatively. As a consequence, it operates at a
lower frequency than FullTMR, i.e., 47.44MHz. Besides a
memory decoder, TMRRegs&HCMem also has its frequency
of operation impacted negatively by voters, so that it
operates at 45.41MHz. Since HCAllRegs performs an encoding
and a decoding operation in each clock cycle, its frequency
of operation is dramatically reduced to 28.52MHz. By
encoding and decoding the main registers only, as is done
in HCMainRegs, a frequency of operation of 42.92MHz is
achieved.
Table 8. Area Results

| Scheme | SHA-224 (LEs) | SHA-256 (LEs) | SHA-384 (LEs) | SHA-512 (LEs) |
|---|---|---|---|---|
| NoFT | 1559 | 1591 | 3275 | 3341 |
| FullTMR | 4709 | 4807 | 9896 | 10084 |
| TMR&HCMem | 4843 | 4919 | 10161 | 10288 |
| TMRRegs&HCMem | 6234 | 6232 | 12379 | 12377 |
| HCAllRegs | 6507 | 6494 | 12419 | 12365 |
| HCMainRegs | 3617 | 3657 | 6914 | 6897 |

Table 9. Frequency Results

| Scheme | SHA-224 (MHz) | SHA-256 (MHz) | SHA-384 (MHz) | SHA-512 (MHz) |
|---|---|---|---|---|
| NoFT | 76.24 | 79.18 | 61.65 | 60.58 |
| FullTMR | 73.05 | 71.69 | 59.99 | 60.66 |
| TMR&HCMem | 47.44 | 46.93 | 38.71 | 38.80 |
| TMRRegs&HCMem | 45.41 | 45.28 | 39.08 | 35.58 |
| HCAllRegs | 28.52 | 28.93 | 17.42 | 20.73 |
| HCMainRegs | 42.92 | 42.93 | 37.73 | 35.03 |

Table 10. Throughput Results

| Scheme | SHA-224 (Mbps) | SHA-256 (Mbps) | SHA-384 (Mbps) | SHA-512 (Mbps) |
|---|---|---|---|---|
| NoFT | 443.58 | 455.51 | 612.91 | 590.80 |
| FullTMR | 425.02 | 412.42 | 596.41 | 591.58 |
| TMR&HCMem | 276.01 | 269.98 | 384.85 | 378.39 |
| TMRRegs&HCMem | 264.20 | 260.49 | 388.52 | 346.99 |
| HCAllRegs | 165.93 | 166.43 | 173.19 | 202.17 |
| HCMainRegs | 249.72 | 246.97 | 375.10 | 341.63 |

By observing the results for the other SHA-2 algorithms in
Table 9, it is possible to realize that FullTMR is in fact
the fault tolerant scheme that provides the highest frequencies
of operation. Although TMR&HCMem and TMRRegs&HCMem
have similar frequencies of operation, the former is slightly
faster than the latter in all cases except SHA-384. Further,
these two modules are followed quite closely by HCMainRegs.
The slowest scheme in all cases, though, is HCAllRegs.
Throughput Results
For the purpose of simulation and throughput estimation,
we consider the hash of a single message block. The
block size is 512 bits for SHA224/256, and 1024 bits for
SHA384/512. More precisely, the throughput is defined as:
$$\text{throughput} = \frac{\text{message block size}}{\#\text{cycles}/F_{max}}.$$
In order to compute a message digest, SHA-224, SHA-256,
SHA-384 and SHA-512 take, respectively, 88, 89, 103, and 105
clock cycles. The number of clock cycles reported includes the
complete SHA-2 processing, as well as the time spent
writing/reading data to/from the module.
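The throughput definition can be checked against the reported figures with a few lines of Python. The sketch below uses the block sizes and cycle counts stated above and the NoFT maximum frequencies from Table 9; the function and dictionary names are illustrative.

```python
def throughput_mbps(block_bits, cycles, fmax_mhz):
    """Block size divided by the time (cycles / Fmax) needed to hash it.

    With Fmax in MHz and the block size in bits, the result is in Mbps.
    """
    return block_bits / (cycles / fmax_mhz)


# (block size in bits, clock cycles per digest) as stated in the text
ALGOS = {"SHA-224": (512, 88), "SHA-256": (512, 89),
         "SHA-384": (1024, 103), "SHA-512": (1024, 105)}

# NoFT maximum frequencies in MHz, taken from Table 9
NOFT_FMAX = {"SHA-224": 76.24, "SHA-256": 79.18,
             "SHA-384": 61.65, "SHA-512": 60.58}

for name, (bits, cycles) in ALGOS.items():
    print(name, round(throughput_mbps(bits, cycles, NOFT_FMAX[name]), 2))
```

The four values agree with the NoFT row of Table 10 (443.58, 455.51, 612.91 and 590.80 Mbps).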
As shown in Table 10, the frequency of operation has a
strong influence on the throughput, i.e. the higher the
frequency, the higher the throughput. In fact, SHA-224
using FullTMR has the highest throughput among the fault
tolerant modules. Its throughput (425.02Mbps) is 96% of
the NoFT throughput (443.58Mbps). Further, TMR&HCMem
(276.01Mbps) and TMRRegs&HCMem (264.20Mbps) achieve,
respectively, 62% and 60% of the NoFT throughput.
HCAllRegs presents a relatively low
throughput (165.93Mbps), which is 37% of the NoFT
throughput. Moreover, HCMainRegs presents a
throughput of 249.72Mbps, which is comparable with the ones of
TMR&HCMem and TMRRegs&HCMem. This represents 56%
of the NoFT throughput.
Following the same analysis, the throughputs of SHA-256,
SHA-384 and SHA-512 using FullTMR are, on average,
higher than 90% of the NoFT throughput. On average,
TMR&HCMem, TMRRegs&HCMem and HCMainRegs have similar
relative throughput. Precisely, they achieve respectively 62%, 60%
and 57% of the NoFT throughput.
Table 11. Dynamic Power Consumption Results

| Scheme | SHA-224 (mW) | SHA-256 (mW) | SHA-384 (mW) | SHA-512 (mW) |
|---|---|---|---|---|
| NoFT | 78.17 | 81.73 | 147.59 | 143.95 |
| FullTMR | 222.04 | 228.27 | 430.52 | 426.18 |
| TMR&HCMem | 146.58 | 148.35 | 297.13 | 294.73 |
| TMRRegs&HCMem | 149.89 | 144.9 | 261.6 | 252.22 |
| HCAllRegs | 267.94 | 290.2 | 511.42 | 541.54 |
| HCMainRegs | 125.04 | 126.18 | 249.84 | 241.63 |

Table 12. Normalized Dynamic Power Consumption Results

| Scheme | SHA-224 (mW) | SHA-256 (mW) | SHA-384 (mW) | SHA-512 (mW) |
|---|---|---|---|---|
| NoFT | 36.06 | 34.96 | 83.9 | 81.69 |
| FullTMR | 103.57 | 107.18 | 244.3 | 241.83 |
| TMR&HCMem | 108.66 | 108.95 | 256.82 | 255.01 |
| TMRRegs&HCMem | 115.73 | 111.28 | 266.55 | 243.53 |
| HCAllRegs | - | - | - | - |
| HCMainRegs | 99.84 | 101.07 | 224.32 | 233.31 |

Power Results
Power consumption is another important factor to be
analyzed in space systems. Table 11 reports the dynamic
power consumption of the implementations performing one
hash computation at their maximum frequency of
operation. The SHA-224 NoFT design consumes 78.17mW. When
FullTMR is used, its power consumption is 2.8 times higher,
i.e. 222.04mW. This is an expected result, given that it
triplicates the SHA-224 datapath. By using an encoded
memory, TMR&HCMem and TMRRegs&HCMem each consume
about 1.9 times as much power as NoFT, namely 146.58mW
and 149.89mW, respectively. Moreover, HCAllRegs consumes
267.94mW, i.e. 3.4 times as much power as NoFT. The scheme
with the least power consumption (125.04mW) is HCMainRegs.
It consumes 1.6 times as much power as NoFT.
The overall power increase of FullTMR is slightly less
than 3 times the power of NoFT. In addition, TMR&HCMem
uses on average twice as much power as NoFT, whereas
TMRRegs&HCMem uses 1.9 times as much power as NoFT.
The power consumption of HCMainRegs is on average
1.6 times that of NoFT.
In order to provide a fair comparison among the
implementations, we performed a normalized power estimation by
running all the designs at a common frequency of 33.33MHz.
Given that HCAllRegs has a frequency of operation lower
than 33.33MHz, it was not included in this comparison.
At 33.33MHz, the throughputs of SHA-224, SHA-256,
SHA-384, and SHA-512 are 193.94, 191.76, 331.39, and
325.08Mbps, respectively.
The normalized power results are listed in Table 12. The non-
fault tolerant (NoFT) SHA-224 implementation consumes
36.06mW of dynamic power to generate a message digest.
Further, SHA-224 FullTMR consumes 103.57mW, which is
slightly less than three times the dynamic power of NoFT.
Furthermore, TMR&HCMem and TMRRegs&HCMem normal-
ized power consumptions are, respectively, 108.66mW and
115.73mW. This represents increases of 3 and 3.2 times in the
power consumption of TMR&HCMem and TMRRegs&HCMem,
respectively, compared to NoFT. Once again, HCMainRegs
provides the lowest power consumption among all fault
tolerant schemes, which is 99.84mW. In other words,
HCMainRegs consumes 2.8 times as much power as NoFT
at 33.33MHz.
By looking at the normalized power consumption of
SHA-256, SHA-384 and SHA-512 in Table 12, one can
conclude that FullTMR, TMR&HCMem and TMRRegs&HCMem
use, on average, 3 times as much normalized power as
NoFT. HCMainRegs, in turn, leads to an average
normalized power increase of 2.8 times, compared to NoFT.
Discussion
This section highlights the most important improvements
brought by the proposed schemes compared to TMR. The
benefits of using an encoded memory are twofold. First, as
shown in Table 1, it uses 60% less memory than FullTMR.
Second, as listed in Table 5, the memory becomes 314.63
times more resistant against SEUs in SHA224/256, and
435.15 times more resistant in SHA384/512.
TMRRegs&HCMem offers the highest level of protection
against SEUs. More precisely, as listed in Table 7,
the registers become 131072 times more resistant against
SEUs in SHA224/256, and 204800 times more resistant in
SHA384/512 when compared to FullTMR. However, this
scheme employs 1.3 times as much area as FullTMR to achieve
higher levels of resistance to SEUs.
HCAllRegs also offers a high protection against SEUs, but,
as can be noticed from Figures 7 and 8, it is the most
inefficient scheme in terms of area, throughput, and power
consumption.
Figure 7. Comparison of Implementation Area.
Figure 8. Comparison of Power Consumption.
The best trade-off among implementation area, power
consumption and protection against SEUs is achieved with
HCMainRegs. As depicted in Figures 7 and 8, this scheme
uses up to 32% less area and consumes up to 43% less power
than FullTMR. Further, it employs up to 63% fewer registers
than FullTMR. Additionally, when HCMainRegs is applied
to SHA224/256 and SHA384/512, their registers become,
respectively, 159.1 and 175.33 times more resistant against
SEUs than when using FullTMR. Also, the memory of
SHA224/256 and SHA384/512 becomes, respectively, 314.63
and 435.15 times more resistant than the one in FullTMR.
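The headline savings can be recomputed from the tables for the SHA-512 case. In the Python sketch below, the area and power values come from Tables 8 and 11, and the encoded register count (2272 bits) from Section 5; the FullTMR register count of 6144 bits (3 x 32 x 64) is an assumption, since Table 2 is not reproduced here.

```python
def percent_less(proposed, reference):
    """How much smaller `proposed` is than `reference`, in percent."""
    return 100 * (1 - proposed / reference)


# SHA-512 figures: HCMainRegs compared against FullTMR
area = percent_less(6897, 10084)      # LEs, Table 8
power = percent_less(241.63, 426.18)  # mW at Fmax, Table 11
registers = percent_less(2272, 6144)  # bits; 6144 is an assumed value

print(f"area {area:.1f}%, power {power:.1f}%, registers {registers:.1f}%")
```

Rounded to whole percentages, these reproduce the 32%, 43% and 63% figures quoted for HCMainRegs.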
8. CONCLUSIONS
This paper proposes fault tolerant schemes for the SHA-2
family of hash functions, providing both error detection
and correction. Although all schemes are applicable to
ASICs and FPGAs (SRAM, Flash, Anti-Fuse), we
implemented them in an SRAM FPGA to perform their
evaluation in terms of area, frequency of operation, throughput, and
power consumption. Additionally, a comprehensive analysis
of the resistance of the schemes against SEUs is performed.
For the sake of comparison we implemented the traditional
TMR and we showed that this fault tolerance technique
applied to SHA-2 hash functions demands three times as much
area as a non-fault tolerant approach. The proposed
scheme named HCMainRegs provides a better trade-off for
area and power consumption and improves the resistance
to errors caused by SEUs. For instance, SHA-512
adopting HCMainRegs employs 6897 LEs and 6248 memory
bits, and has a dynamic power consumption of 241.63mW.
Compared with NoFT, those results represent area and
power increases of 2.1 and 1.7 times, respectively. However,
HCMainRegs uses up to 32% less area and consumes up to
43% less power than FullTMR. Moreover, its memory and
registers are, respectively, 435 and 175 times more resistant
against SEUs than they would be by using FullTMR.
As a result, HCMainRegs can successfully replace TMR for
achieving fault tolerance in the SHA-2 family of hash func-
tions. Besides, it provides higher levels of protection against
SEUs, as well as favors low power and low implementation
area, which are crucial in space applications. To the best of
our knowledge, this is the ﬁrst implementation and analysis
of the SHA-2 family of hash functions providing both error
detection and correction reported in the literature.
ACKNOWLEDGMENTS
This research is supported in part by grants from NSERC and
OCE. We would like to thank Kryptus Technologies for pro-
viding some golden models used to validate our architectures.
REFERENCES
[1] “Critical infrastructure protection: Commercial satellite
security should be more fully addressed,” US General
Accounting Ofﬁce, Tech. Rep. GAO-02-781, 2002.
[2] “Security threats against space missions,” CCSDS, In-
formational Report 350.1-G-1, October 2006.
[3] “Hackers seize UK military satellite,” Online, The
Landfield Group, March 1999,
http://www.landfield.com/isn/mail-archive/1999/Mar/0001.html.
[4] S. Fleming, “Hacker inﬁltrates military satel-
lite,” Online, The Register, March 1999,
http://www.theregister.com/1999/03/01/hacker
inﬁltrates military satellite/.
[5] “British hackers attack MoD satellite,” Online,
Telegraph Newspaper, March 1999,
http://www.telegraph.co.uk/connected/main.jhtml?xml=/connected/1999/03/04/ecnhack04.xml.
[6] K. Poulsen, “Satellites at risk of hacks,” Security Focus,
October 2002.
[7] M. Juliato and C. Gebotys, “An approach for recover-
ing satellites and their cryptographic capabilities in the
presence of SEUs and attacks,” in Proceedings of the
NASA/ESA Conference on Adaptive Hardware and Sys-
tems (AHS-2008), June 2008, pp. 101–108.
[8] T. Vladimirova, R. Banu, and M. Sweeting, “On-board
security services in small satellites,” in MAPLD 2005
Proceedings, 2005.
[9] G. Messenger and M. Ash, Single Event Phenomena.
Springer, 2006.
[10] A. Menezes, P. Oorschot, and S. Vanstone, Handbook of
Applied Cryptography. CRC Press, 1996.
[11] D. Stinson, Cryptography Theory and Practice, 3rd ed.
CRC Press, 2005.
[12] “The keyed-hash message authentication code
(HMAC),” NIST, Federal Information Processing
Standards Publication FIPS PUB 198, March 2002.
[13] “Secure hash standard (SHS),” NIST, Federal Informa-
tion Processing Standards Publication FIPS 180-3, Oc-
tober 2008.
[14] “Authentication / integrity algorithm issues survey,”
CCSDS, Informational Report CCSDS 350.1-G-1,
March 2008, green Book.
[15] “Encryption algorithm trade survey,” CCSDS, Informa-
tional Report CCSDS 350.2-G-1, March 2008, green
Book.
[16] B. Badrignans, R. Elbaz, and L. Torres, “Secure FPGA
conﬁguration architecture preventing system down-
grade,” in Proceedings of the International Conference
on Field Programmable Logic and Applications, 2008,
pp. 317–322.
[17] I. Ingemarsson and C. Wong, “Encryption and authen-
tication in on-board processing satellite communica-
tion systems,” IEEE Transactions on Communications,
vol. 29, no. 11, pp. 1684–1687, November 1981.
[18] A. Roy-Chowdhury, J. Baras, M. Hadjitheodosiou, and
S. Papademetriou, “Security issues in hybrid networks
with a satellite component,” IEEE Wireless Communi-
cations, vol. 12, no. 6, pp. 50–61, 2005.
[19] M. Arslan and F. Alagoz, “Security issues and perfor-
mance study of key management techniques over satel-
litelinks,” inProceedingsofthe11thIntenationalWork-
shop on Computer-Aided Modeling, Analysis and De-
sign of Communication Links and Networks, 2006, pp.
122–128.
[20] D. Fischer, M. Merri, and T. Engel, “Key management
for CCSDS compliant space missions,” in Proceedings
of the International Conference on Space Operations
(Spaceops 2008), May 2008.
[21] M. McLoone and J. V. McCanny, “Efﬁcient single-chip
implementation of SHA-384 and SHA-512,” in Pro-
ceedings of the 2002 IEEE International Conference on
Field-Programmable Technology (FPT’02), 2002, pp.
311–314.
[22] K. K. Ting, S. C. L. Yuen, K. H. Lee, and P. H. W.
Leong, “An FPGA based SHA-256 processor,” in
Proceedings of the Reconﬁgurable Computing Is Go-
ing Mainstream, 12th International Conference on
Field-ProgrammableLogicandApplications(FPL’02).
Springer-Verlag, 2002, pp. 577–585.
[23] N. Sklavos and O. Koufopavlou, “On the hardware im-
plementations of the SHA-2 (256, 384, 512) hash func-
tions,” Proceedings of the 2003 International Sympo-
sium on Circuits and Systems (ISCAS’03)., vol. 5, pp.
V–153–V–156, May 2003.
[24] ——, “Implementation of the SHA-2 hash family stan-
dard using FPGAs,” The Journal of Supercomputing,
vol. 31, no. 3, pp. 227–248, 2005.
[25] T.Grembowski, R.Lien, K.Gaj, N.Nguyen, P.Bellows,
J. Flidr, T. Lehman, and B. Schott, “Comparative analy-
sis of the hardware implementations of hash functions
SHA-1 and SHA-512,” in Proceedings of the 5th In-
ternational Conference on Information Security ISC’02.
Springer-Verlag, 2002, pp. 75–89.
[26] I. Ahmad and A. S. Das, “Hardware implementa-
tion analysis of SHA-256 and SHA-512 algorithms
on FPGAs,” Computers and Electrical Engineering,
vol. 31, no. 6, pp. 345–360, 2005.
[27] R. P. McEvoy., F. M. Crowe, C. C. Murphy, and W. P.
Marnane, “Optimisation of the SHA-2 family of hash
functions on FPGAs,” IEEE Computer Society Annual
Symposium on Emerging VLSI Technologies and Archi-
tectures, 2006, vol. 00, p. 6 pp., March 2006.
[28] H. Michail, A. Kakarountas, G. N. Selimis, and C. E.
Goutis, “Optimizing SHA-1 hash function for high
throughput with a partial unrolling study,” in PATMOS
2005 Proceedings, 2005, pp. 591–600.
[29] R. Chaves, G. Kuzmanov, L. Sousa, and S. Vassiliadis,
“Improving SHA-2 hardware implementations,” in Pro-
ceedings of the Cryptographic Hardware and Embed-
ded Systems (CHES 2006), 2006, pp. 298–310.
[30] I. Ahmad and A. S. Das, “Analysis and detection of
errors in implementation of SHA-512 algorithms on
FPGAs,” TheComputerJournal, vol.50, no.6, pp.728–
738, 2007.
[31] “Advanced encryption standard (AES),” NIST, Fed-
eral Information Processing Standards Publication FIPS
PUB 197, November 2001.
[32] S. Ghaznavi and C. Gebotys, “A SEU-resistant, FPGA-
15based implementation of the substitution transformation
in AES for security on satellites,” in Proceedings of the
Tenth International Workshop on Signal Processing for
Space Communications (SPSC 2008), October 2008.
[33] RTAX-S/SL RadTolerant FPGAs, 5th ed., Actel Corpo-
ration, October 2008.
[34] Radiation-Tolerant ProASIC3 Low-Power Space-Flight
Flash FPGAs, Actel Corporation, September 2008.
[35] Radiation-Hardened Virtex-4 QPro-V Family Overview,
1st ed., Xilinx Inc., March 2008, DS653.
[36] Error Detection and Recovery Using CRC in Altera
FPGA Devices, Altera Corporation, July 2008, AN 357.
[37] P.Blain, C.Carmichael, E.Fuller, andM.Caffrey, “SEU
mitigation techniques for Virtex FPGAs in space appli-
cations,” in MAPLD 1999 Proceedings, 1999.
[38] P. Samudrala, J. Ramos, and S. Katkoori, “Selec-
tive triple modular redundancy (STMR) based single-
event upset (SEU) tolerant synthesis for FPGAs,” IEEE
Transactions on Nuclear Science, vol. 51, pp. 2957–
2969, 2004.
[39] Quartus II Version 8.0 Handbook, Altera Corporation,
May 2008.
BIOGRAPHY
Marcio Juliato received his bachelor’s degree in Computer Engineering in
2003, and his M.Sc. degree in Computer Science in 2006, both from the
University of Campinas, Campinas, Brazil. Marcio is currently a PhD
candidate in Computer Engineering in the Dept. of Electrical and Computer
Engineering, University of Waterloo, Waterloo, Canada. He is also a
member of the Center for Applied Cryptography Research
(CACR) at the University of Waterloo. His research interests
include cryptography, computer architecture, fault tolerance,
and efficient implementations of security mechanisms for
embedded and space systems.
Catherine Gebotys received her B.A.Sc. degree in Engineering Science in
1982 and her M.A.Sc. degree in Electrical Engineering in 1984, both from
the University of Toronto, Toronto, Canada. She received her PhD degree in
Electrical Engineering in 1991 from the University of Waterloo, Waterloo,
Canada. She is currently a Professor in the Dept. of Electrical and
Computer Engineering, University of Waterloo, Canada. Her
research interests include embedded systems security, side
channel (power/electromagnetic) analysis of cryptographic
algorithms, and reconfigurable architectures.
Reouven Elbaz received his M.Sc. degree in Electrical Engineering in 2003
from Polytech’Montpellier, France, and his PhD degree in Electrical and
Computer Engineering in 2006 from the University of Montpellier, France.
During his PhD studies, he was also with STMicroelectronics, within the
Security Group of AST (Advanced System Technology) in Rousset, France,
working on trusted computing in embedded systems. He then
joined the Dept. of Electrical Engineering at Princeton
University, USA, for eighteen months as a research associate.
He is now a research associate in the Dept. of Electrical
and Computer Engineering, University of Waterloo, Waterloo,
Canada. His research interests include trusted computing
and security, cryptography, reconfigurable architectures,
computer architecture, and parallel processing.