Lightweight Hardware Architectures for Efficient Secure Hash Functions
  ECHO and Fugue by Kermani, Mehran Mozaffari et al.
arXiv:1804.06497v1 [cs.CR] 17 Apr 2018
Lightweight Hardware Architectures for Efficient
Secure Hash Functions ECHO and Fugue
Mehran Mozaffari Kermani, Reza Azarderakhsh, Siavash Bayat-Sarmadi
Abstract—In cryptographic engineering, extensive attention
has been devoted to ameliorating the performance and security
of the algorithms within. Nonetheless, in the state-of-the-art,
the approaches for increasing the reliability of the efficient
hash functions ECHO and Fugue have not been presented to
date. We propose efficient fault detection schemes by presenting
closed formulations for the predicted signatures of different
transformations in these algorithms. These signatures are derived
to achieve low overhead for the specific transformations and
can be tailored to include byte/word-wide predicted signatures.
Through simulations, we show that the proposed fault detection
schemes are highly capable of detecting natural hardware failures
and of reducing the effectiveness of malicious
fault attacks. The proposed reliable hardware architectures are
implemented on the application-specific integrated circuit (ASIC)
platform using a 65-nm standard technology to benchmark
their hardware and timing characteristics. The results of our
simulations and implementations show very high error coverage
with acceptable overhead for the proposed schemes.
I. INTRODUCTION
Cryptographic hash functions take arbitrary-length inputs
and generate fixed-length outputs. The output of a hash function
is then used to provide authentication and integrity for
the transferred data. In this paper, due to the efficiency of
the algorithms ECHO [1] and Fugue [2] (which has been
improved to Fugue 2.0), and the fact that these are inspired by
the widely-utilized Advanced Encryption Standard (AES), we
present their respective fault detection schemes. These AES-
inspired hash functions (which have been part of the NIST
competition) have received much attention in the literature.
For instance, in [3] and [4], differential and side-channel
analysis attacks for ECHO are presented. Moreover, much
effort has been put into developing high-performance and
efficient hardware implementations of these algorithms, see,
for instance, [5], [6], and [7]. As discussed in [8], one
important feature of these hash functions is that one can share
some resources between the AES and these hash algorithms.
Thus, low-complexity implementations are achieved.
Fault attacks pose serious threats to the implementations
of crypto-algorithms. Therefore, many fault detection
schemes have been proposed to date for cryptographic and
arithmetic entities; see, for instance, [9], [10], [11], [12],
[13], [14], [15], [16], [17], and [18].
Nonetheless, to the best of our knowledge, schemes for
increasing the reliability of these algorithms have not been
presented in the open literature. Effective fault detection schemes
with minimal overhead on these algorithms are essential for
achieving reliable hardware architectures.
Mehran Mozaffari Kermani is with Department of Computer Science
and Engineering, University of South Florida, Tampa, FL 33620, Email:
mehran2@usf.edu.
Reza Azarderakhsh is with Department of ECE and Computer Science,
Florida Atlantic University, Boca Raton, FL 14623, Email:
razarderakhsh@fau.edu.
Siavash Bayat-Sarmadi is with Department of Computer Engineering, Sharif
University of Technology, Tehran, Iran, Email: sbayat@sharif.edu.
The summary of our contributions is presented in the
following.
• We have obtained new formulations for the predicted sig-
natures of different transformations for hash algorithms,
i.e., ECHO [1] and Fugue [2]. The presented closed
formulations are used for proposing high-performance
and effective fault detection schemes.
• Our simulation results show high fault detection capability
for the proposed schemes for both algorithms. This
makes the proposed architectures reliable in practice.
• We have used ASIC implementations to benchmark the
hardware and timing characteristics of the proposed
schemes. The high efficiency of the proposed schemes
makes the proposed architectures suitable for high-
performance applications.
II. PRELIMINARIES
ECHO (presented by Benadjila et al.) [1] supports any hash
output length from 128 to 512 bits and takes a message and a
salt as input. Although the output can be of any length in this
range, the four output lengths for the NIST competition were
224, 256, 384, and 512 bits. For an output size (Hsize) of at
most 256 bits, i.e., 128 ≤ Hsize ≤ 256, ECHO uses the
compression function called Compress512. For 257 ≤ Hsize ≤ 512,
the compression function is called Compress1024, which is
very similar to Compress512 [1]. More details are presented
throughout the paper as needed.
In what follows, we explain the hash function Fugue (presented
by IBM) [2]. Fugue-256 generates a 256-bit output
H for the message M, which is split into 32-bit blocks $m_i$,
1 ≤ i ≤ t. The chaining value of Fugue-256 (denoted by h)
is also split into 32-bit blocks denoted by $S_i$, 0 ≤ i ≤ 29. The
following transformation sequence is used for updating h from
$m_i$: TIX, ROR3, CMIX, SMIX, ROR3, CMIX, and SMIX
(called one round R). The sequence ROR3, CMIX, SMIX is
called a sub-round. Therefore, a round R consists of the TIX
transformation followed by two sub-rounds [2]. More details
are presented throughout the paper as needed.
[Figure 1, block diagram: t chained Compress512 blocks. Block i takes the counter Ci, the salt, the four chaining values from block i−1, and twelve 128-bit message blocks, and applies BIG.SubWords, BIG.ShiftRows, BIG.MixColumns, and BIG.Final. The last block feeds a block labeled T, which outputs h.]
Fig. 1. The ECHO algorithm for 128 ≤ Hsize ≤ 256 [1].
III. THE PROPOSED FAULT DIAGNOSIS APPROACHES
In what follows, for each of the algorithms presented in this
paper, we propose respective fault detection schemes.
A. ECHO
An overview of the ECHO algorithm for 128 ≤ Hsize ≤
256 including the Compress512 functions is presented in Fig.
1. As seen in Fig. 1, each of the t Compress512 functions gets
the 128-bit salt, a 4×4 state of 128-bit entries, and the counter
Ci, 1 ≤ i ≤ t (used to count the number of message bits being
hashed). The first column of the state consists of four 128-bit
values which construct the chaining variable of the previous
Compress512, i.e., $V_{i-1} = (v^0_{i-1}, v^1_{i-1}, v^2_{i-1}, v^3_{i-1})$, 1 ≤ i ≤ t.
The other three columns include the 128-bit blocks of the input
message. Therefore, in total, there are 12× t 128-bit message
blocks to be processed to give the output (see Fig. 1).
As seen in Fig. 1, each Compress512 consists of four
different transformations, i.e., BIG.SubWords, BIG.ShiftRows,
BIG.MixColumns, and BIG.Final. Each BIG.SubWords contains
two AES rounds. The first transformation, SubBytes,
which includes 16 S-boxes, is the only nonlinear AES
transformation. In the AES S-box, the irreducible polynomial
$M(x) = x^8 + x^4 + x^3 + x + 1$ is used to construct the
binary field $GF(2^8)$. Let $X \in GF(2^8)$ and $Y \in GF(2^8)$
be the 8-bit input and output of each S-box, respectively.
Then, the S-box consists of a multiplicative inversion, i.e.,
$X^{-1} \in GF(2^8)$, followed by an affine transformation to
obtain $Y \in GF(2^8)$. The S-boxes are implemented using
look-up tables (LUTs) or composite fields; polynomial basis,
normal basis, mixed basis, and redundant basis are among the
approaches for the low-area composite-field variant [19], [20],
[21], [22]. In general, with composite field
realizations, a transformation matrix first transforms a field
element in the binary field $GF(2^8)$ to the corresponding
representation in the composite fields $GF(2^8)/GF(((2^2)^2)^2)$.
Then, a multiplicative inversion consisting of composite field
operations in the sub-field $GF((2^2)^2)$ is performed. Finally,
through an inverse transformation matrix, the inverted output
is obtained. Error detection for the S-boxes has been studied
extensively in prior work and, for the sake of brevity,
we do not discuss it here.
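The S-box structure described above (inversion in $GF(2^8)$ followed by an affine transformation) can be illustrated with a short software sketch. This is only a behavioral model under stated assumptions, not the LUT or composite-field hardware discussed in the text: the helper names `gf_mul`, `gf_inv`, `rotl8`, and `sbox` are hypothetical, the inverse is computed as $X^{254}$, and the affine step uses the standard AES rotate-and-XOR form with the constant {63}h.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8) modulo M(x) = x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:      # reduce as soon as the degree reaches 8
            a ^= poly
        b >>= 1
    return result


def gf_inv(x):
    """Multiplicative inverse computed as x^254; 0 maps to 0 as in the AES S-box."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, x)
    return result


def rotl8(x, n):
    """Rotate an 8-bit value left by n positions."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox(x):
    """AES S-box: inversion in GF(2^8) followed by the affine transformation."""
    inv = gf_inv(x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63
```

A hardware design would replace `gf_inv` with a LUT or a composite-field inverter; the sketch is only meant to make the two-stage structure concrete.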
The next transformation used in BIG.SubWords of ECHO is
ShiftRows, whose fault detection is straightforward and is
realized by rewiring. Moreover, for the two final linear
transformations, i.e., MixColumns and AddRoundKey, the 32-bit
error indication flag $E_c = \sum_{r=0}^{3} (in_{r,c} + k_{r,c} + out_{r,c})$,
0 ≤ c ≤ 3, can be used. Here, $in_{r,c}$, $k_{r,c}$, and $out_{r,c}$ are the
input to MixColumns, the round key, and the output of AddRoundKey,
respectively. In the error-free case $E_c$ is zero, because the
entries in each column of the MixColumns matrix add up to {1}h.
This error indication flag can be compressed so
that an n-bit, 1 ≤ n ≤ 32, error indication flag for these two
transformations is achieved. Finally, after two rounds of the
AES, the output of BIG.SubWords is derived.
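The error indication flag for MixColumns followed by AddRoundKey can be checked on a single column with a small behavioral sketch. The function names, the example column, and the example round-key bytes are illustrative assumptions; the sketch only shows that the modulo-2 sum of the input, key, and output bytes of a column is zero without faults and nonzero after a single-byte fault.

```python
def xtime(a):
    """Multiply a byte by {2}h in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a


def mix_column(col):
    """Apply the AES MixColumns matrix to one 4-byte column."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


def error_flag(col_in, key, col_out):
    """8-bit flag for one column: modulo-2 sum of in, k, and out over the rows."""
    flag = 0
    for i, k, o in zip(col_in, key, col_out):
        flag ^= i ^ k ^ o
    return flag


# Illustrative values (assumptions, not taken from the paper).
col_in = [0xDB, 0x13, 0x53, 0x45]
key = [0x2B, 0x7E, 0x15, 0x16]
col_out = [m ^ k for m, k in zip(mix_column(col_in), key)]

faulty_out = list(col_out)
faulty_out[2] ^= 0x01                              # inject a single-bit fault
flag_ok = error_flag(col_in, key, col_out)         # zero when fault-free
flag_faulty = error_flag(col_in, key, faulty_out)  # nonzero after the fault
```

Each column yields an 8-bit flag, so the four columns together give the 32-bit flag mentioned in the text.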
Fault detection for the next transformation in ECHO,
BIG.ShiftRows, is by permutation. The last transformation in
BIG.Round, i.e., BIG.MixColumns, is an expansion of MixColumns
of the AES. Specifically, the output state of BIG.SubWords (input
state of BIG.MixColumns) is arranged as a 4-row, 64-column
matrix. Then, each 4 × 4 sub-matrix is multiplied by the fixed
MixColumns matrix. Therefore, we obtain the error indication
flags of the BIG.MixColumns (B.MC) transformation for the
sub-matrices j, 0 ≤ j ≤ 15, as follows
$$E^j_c(\mathrm{B.MC}) = \sum_{r=0}^{3} (in_{r,c} + out_{r,c}), \quad 4j \le c \le 4j + 3, \tag{1}$$
where in the sub-matrices, $in_{r,c}$ and $out_{r,c}$ are the input and
output of BIG.MixColumns, respectively, for which 0 ≤ r ≤ 3
and 0 ≤ c ≤ 63.
Finally, the BIG.Final transformation is performed as the
last transformation in each Compress512 (see Fig. 1) of ECHO.
This transformation includes modulo-2 addition of the input
state of the Compress512 and the output state of the eighth
BIG.MixColumns. We present the following lemma for ob-
taining the predicted parities of this transformation.
Lemma 1: Let $M^j_i$, 0 ≤ j ≤ 11, be the 128-bit message
blocks and $A^j_i$, 0 ≤ j ≤ 15, be the 128-bit outputs of the
eighth BIG.MixColumns of the ith Compress512 in Fig. 1.
In addition, let $v^j_{i-1}$, 0 ≤ j ≤ 3, be the previous chaining
values. Then, the predicted parities of $v^j_i$, 0 ≤ j ≤ 3
(the current chaining values), after performing the BIG.Final
transformation are obtained as
$$\hat{P}(v^j_i) = \sum_{j=0}^{3} P(v^j_{i-1} + A^{4j}_i) + \sum_{j=0}^{2} P(M^{4j}_i). \tag{2}$$
Proof. According to [1], we have
$v^j_i = \sum_{j=0}^{3} v^j_{i-1} + \sum_{j=0}^{15} A^j_i + \sum_{j=0}^{11} M^j_i$.
Since the parity of a modulo-2 sum equals the modulo-2 sum of the
parities, for the predicted parity we reach
$\hat{P}(v^j_i) = \sum_{j=0}^{3} P(v^j_{i-1}) + \sum_{j=0}^{15} P(A^j_i) + \sum_{j=0}^{11} P(M^j_i)$
and after rearranging, the proof is complete.
It is interesting to note that one can also obtain multiple
parities for $v^j_i$ by applying the parity derivation function (P)
to selected bits of the arguments $v^j_{i-1} + A^{4j}_i$ and $M^{4j}_i$.
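The property the proof of Lemma 1 relies on, namely that the parity of a modulo-2 sum equals the modulo-2 sum of the parities, can be illustrated with a small sketch. This is a simplified stand-in and not the full BIG.Final transformation: it combines one chaining value, one BIG.MixColumns output, and one message block (all random 128-bit values), and the helper names are hypothetical. It also shows the multiple-parity variant, here with one parity bit per byte.

```python
import random


def parity(x):
    """Return the modulo-2 sum of all bits of x."""
    return bin(x).count("1") & 1


def bytewise_parities(x, width_bits=128):
    """Return one parity bit per byte of a width_bits-wide value."""
    return [parity((x >> (8 * i)) & 0xFF) for i in range(width_bits // 8)]


rng = random.Random(1)  # fixed seed so the example is reproducible
v_prev, a_blk, m_blk = (rng.getrandbits(128) for _ in range(3))

# Simplified modulo-2 addition standing in for BIG.Final (an assumption).
v_next = v_prev ^ a_blk ^ m_blk

# Single predicted parity, computed from the arguments only.
predicted = parity(v_prev ^ a_blk) ^ parity(m_blk)

# Multiple predicted parities: one bit per byte of the 128-bit value.
predicted_bytes = [
    p ^ q
    for p, q in zip(bytewise_parities(v_prev ^ a_blk), bytewise_parities(m_blk))
]
```

Comparing `predicted` with `parity(v_next)` detects any odd number of bit errors in the result; the byte-wise variant trades extra comparison logic for finer-grained detection.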
B. Fugue
To propose a fault detection scheme for Fugue, we observe
that the Fugue transformations can be divided into three types.
The first type is the rotation transformations, i.e., ROR3,
ROR14, and ROR15. The second category contains the two
linear transformations TIX and CMIX. Finally, the last one is
the nonlinear transformation SMIX.
Each Fugue round has the following sequence: TIX, ROR3,
CMIX, SMIX, ROR3, CMIX, and SMIX. First, we propose
the following theorem for the first three transformations TIX,
ROR3, and CMIX in the round sequence. Then, we propose
the fault detection scheme for the nonlinear transformation
SMIX.
Theorem 1: Let $\sigma_{S_i} = \sum_{i=0}^{29} S_i$ be the 32-bit result of
modulo-2 additions of $S_i$, 0 ≤ i ≤ 29 (called word-wide
signature). Then, the predicted word-wide signature of the
transformation sequence TIX, ROR3, and CMIX ($\hat{\sigma}_{TRC}$) in
the Fugue round is obtained as
$$\hat{\sigma}_{TRC} = \sigma_{S_i} + S_{24}. \tag{3}$$
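Proof. For TIX, the following substitutions are performed:
$S_{10} \leftarrow S_{10} + S_0$, $S_0 \leftarrow m_i$, $S_8 \leftarrow S_8 + m_i$, and $S_1 \leftarrow S_1 + S_{24}$.
Therefore, we have $\hat{\sigma}_{TIX} = \sigma_{S_i} + S_{10} + S_{10} + S_0 + S_0 +
m_i + S_8 + S_8 + m_i + S_1 + S_1 + S_{24} = \sigma_{S_i} + S_{24}$. The ROR3
transformation, which is just a rotation of the state by three
positions to the right, does not change $\hat{\sigma}_{TIX} = \sigma_{S_i} + S_{24}$.
Moreover, for CMIX, we have $S_0 \leftarrow S_0 + S_4$, $S_1 \leftarrow S_1 + S_5$,
$S_2 \leftarrow S_2 + S_6$, $S_{15} \leftarrow S_{15} + S_4$, $S_{16} \leftarrow S_{16} + S_5$, and
$S_{17} \leftarrow S_{17} + S_6$. Consequently, with $\sigma_{S_i}$ denoting the signature
of the CMIX input, we reach $\hat{\sigma}_{CMIX} = \sigma_{S_i} + S_0 + S_0 + S_4 + S_1 + S_1 + S_5 + S_2 +
S_2 + S_6 + S_{15} + S_{15} + S_4 + S_{16} + S_{16} + S_5 + S_{17} + S_{17} + S_6 = \sigma_{S_i}$,
i.e., CMIX leaves the signature unchanged.
Therefore, one reaches $\hat{\sigma}_{TRC} = \sigma_{S_i} + S_{24}$ and the proof is
complete.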
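Theorem 1 can be checked numerically with a behavioral sketch of the three transformations, using exactly the substitutions listed in the proof. The function names, the random state, and the injected fault position are assumptions made for illustration; the 30-word state is modeled as a Python list of 32-bit integers.

```python
import random
from functools import reduce
from operator import xor


def signature(state):
    """Word-wide signature: modulo-2 sum of all 32-bit state words."""
    return reduce(xor, state, 0)


def tix(state, m):
    """TIX substitutions as listed in the proof of Theorem 1."""
    s = list(state)
    s[10] ^= s[0]
    s[0] = m
    s[8] ^= m
    s[1] ^= s[24]
    return s


def ror3(state):
    """Rotate the state three word positions to the right."""
    return state[-3:] + state[:-3]


def cmix(state):
    """CMIX substitutions as listed in the proof of Theorem 1."""
    s = list(state)
    for dst, src in ((0, 4), (1, 5), (2, 6), (15, 4), (16, 5), (17, 6)):
        s[dst] ^= s[src]
    return s


def predicted_signature(state):
    """Predicted signature of TIX, ROR3, CMIX: sigma + S24 (Equation (3))."""
    return signature(state) ^ state[24]


rng = random.Random(2018)  # fixed seed for reproducibility
state = [rng.getrandbits(32) for _ in range(30)]
m = rng.getrandbits(32)

out = cmix(ror3(tix(state, m)))
fault_free_ok = signature(out) == predicted_signature(state)

faulty = list(out)
faulty[7] ^= 1 << 13  # hypothetical single-bit fault in one output word
fault_detected = signature(faulty) != predicted_signature(state)
```

Note that the predicted signature depends only on the input state, not on the message word, since the two contributions of the message word cancel in the modulo-2 sum.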
The nonlinear transformation SMIX in Fugue consists of
two functions. The first one substitutes each byte through the
S-box, and the second one is the linear Super-Mix function.
The Super-Mix function consists of multiplication of S0-S3
(as a 16-byte input vector) with the following 16 × 16
matrix N with hexadecimal entries to derive a 16-byte output
N =


1 4 7 1 1 0 0 0 1 0 0 0 1 0 0 0
0 1 0 0 1 1 4 7 0 1 0 0 0 1 0 0
0 0 1 0 0 0 1 0 7 1 1 4 0 0 1 0
0 0 0 1 0 0 0 1 0 0 0 1 4 7 1 1
0 0 0 0 0 4 7 1 1 0 0 0 1 0 0 0
0 1 0 0 0 0 0 0 1 0 4 7 0 1 0 0
0 0 1 0 0 0 1 0 0 0 0 0 7 1 0 4
4 7 1 0 0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 7 0 0 0 6 4 7 1 7 0 0 0
0 7 0 0 0 0 0 0 0 7 0 0 1 6 4 7
7 1 6 4 0 0 7 0 0 0 0 0 0 0 7 0
0 0 0 7 4 7 1 6 0 0 0 7 0 0 0 0
0 0 0 0 4 0 0 0 4 0 0 0 5 4 7 1
1 5 4 7 0 0 0 0 0 4 0 0 0 4 0 0
0 0 4 0 7 1 5 4 0 0 0 0 0 0 4 0
0 0 0 4 0 0 0 4 4 7 1 5 0 0 0 0


. (4)
We propose the following theorem for the predicted parity
of the Super-Mix function.
Theorem 2: Let $I_i \in GF(2^8)$ and $O_i \in GF(2^8)$, 0 ≤
i ≤ 15, be the 16-byte input and output of the Super-Mix
function in Fugue, respectively. Then, the predicted parity for
this function, i.e., $\hat{P}_{SM}$, is derived as follows (we note that
parity is just an example and any other detecting codes can
be utilized)
$$\hat{P}_{SM} = \{3\}_h (I_0 + I_5 + I_{10} + I_{15}), \tag{5}$$
where the multiplication is performed using the irreducible
polynomial $M(x) = x^8 + x^4 + x^3 + x + 1$.
Proof. We add the elements of the columns of N to reach the
predicted parity $\hat{P}_{SM}$. It is interesting to note that adding the
elements in any column other than columns 0, 5, 10, and
15 results in zero. For instance, if one adds the elements in
column 1 of N (modulo-2), the result is {4}h + {1}h +
{1}h + {7}h + {7}h + {1}h + {5}h = 0. For columns 0, 5,
10, and 15, the addition of elements results in {1}h + {4}h +
{7}h + {1}h = {4}h + {7}h = {3}h and this completes the
proof. We note that the multiplication with {3}h = {2}h +
{1}h is derived by the addition of $I_0 + I_5 + I_{10} + I_{15}$ with
$x(I_0 + I_5 + I_{10} + I_{15}) \bmod M(x)$.
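The column-sum argument in the proof of Theorem 2 can be reproduced with a short sketch that uses the matrix N of Equation (4). The helper names and the random inputs are assumptions; the byte-wide "parity" is modeled as the modulo-2 sum of the 16 output bytes, and multiplication by {3}h is realized as the addition of the operand with its `xtime` result, as noted at the end of the proof.

```python
import random
from functools import reduce
from operator import xor

# Matrix N of Equation (4), hexadecimal entries.
N = [
    [1, 4, 7, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 1, 4, 7, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0, 7, 1, 1, 4, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 4, 7, 1, 1],
    [0, 0, 0, 0, 0, 4, 7, 1, 1, 0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 4, 7, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 7, 1, 0, 4],
    [4, 7, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 7, 0, 0, 0, 6, 4, 7, 1, 7, 0, 0, 0],
    [0, 7, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 1, 6, 4, 7],
    [7, 1, 6, 4, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 7, 0],
    [0, 0, 0, 7, 4, 7, 1, 6, 0, 0, 0, 7, 0, 0, 0, 0],
    [0, 0, 0, 0, 4, 0, 0, 0, 4, 0, 0, 0, 5, 4, 7, 1],
    [1, 5, 4, 7, 0, 0, 0, 0, 0, 4, 0, 0, 0, 4, 0, 0],
    [0, 0, 4, 0, 7, 1, 5, 4, 0, 0, 0, 0, 0, 0, 4, 0],
    [0, 0, 0, 4, 0, 0, 0, 4, 4, 7, 1, 5, 0, 0, 0, 0],
]


def xtime(a):
    """Multiply a byte by x, i.e., {2}h, modulo M(x) = x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo M(x)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def super_mix(inp):
    """Multiply the 16-byte input vector by N over GF(2^8)."""
    return [
        reduce(xor, (gf_mul(N[r][c], inp[c]) for c in range(16)), 0)
        for r in range(16)
    ]


def predicted_parity(inp):
    """Equation (5): {3}h (I0 + I5 + I10 + I15), computed as t + x*t."""
    t = inp[0] ^ inp[5] ^ inp[10] ^ inp[15]
    return t ^ xtime(t)


# Modulo-2 sum of the entries of each column of N.
column_sums = [reduce(xor, (N[r][c] for r in range(16)), 0) for c in range(16)]

rng = random.Random(5)  # fixed seed for reproducibility
inp = [rng.getrandbits(8) for _ in range(16)]
actual_parity = reduce(xor, super_mix(inp), 0)
```

Only four input bytes and one multiplication by x enter the prediction, which is where the low overhead of this check comes from.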
IV. SIMULATION RESULTS AND ASIC IMPLEMENTATIONS
The proposed error detection architectures have been simu-
lated after injecting faults. The proposed architectures have the
capability of detecting both permanent and transient faults (this
covers both natural and malicious faults). In this paper, we use
the stuck-at error model. The objective in using this model is to
cover the malicious errors injected by the attackers to break
the algorithm (by injecting one or more incorrect bits) and
to detect natural errors caused by bit flips. The stuck-at error
forces one bit (for single stuck-at error model) or multiple bits
(for multiple stuck-at error model) to be stuck at logic one or
zero. This makes the result value independent of the error-free
intended value.
In fault attacks, single error injection is the ideal case
for gaining the maximum information. Nevertheless, due to
technological constraints, a more realistic error model is to
inject multiple errors. Therefore, for covering both natural
errors and fault attacks, multiple errors need to be considered.
The proposed diagnosis schemes in this paper are independent
of the life-time of errors. Therefore, both permanent and
transient stuck-at errors lead to the same error coverage. We
also note that intelligent attackers do not get confined to just
multiple stuck-at faults and thus the ability to detect single
faults is important.
The fault model used to test the proposed architectures is
created using external feedback linear-feedback shift registers
(LFSRs) to generate pseudo-random fault vectors that can
flip random bits in the output of the gates and at random
intervals. For the architectures presented, we have injected up
to 80,000 faults and recorded the number of errors. We have
also used the redundant-basis S-boxes in composite field where
applicable. Moreover, the false alarm ratios are derived. The
error coverage in all the cases is more than 99% (and for the
case of single stuck-at faults, 100% if we harden the error
indication flag comparison units), with a relatively low false
alarm ratio of 0.1%-0.3% across the cases. As we inject more
faults, the error detection results change only slightly, which
indicates the relatively high accuracy of
the results.
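The idea of driving fault injection from an external-feedback (Fibonacci) LFSR can be sketched in software as follows. This is a toy model and not the testbench used for the reported results: the 16-bit feedback polynomial $x^{16} + x^{14} + x^{13} + x^{11} + 1$, the seed, the function names, and the choice to flip bits in one 32-bit word of a 30-word state are all assumptions. Detection is modeled by comparing the word-wide XOR signature before and after injection.

```python
from functools import reduce
from operator import xor


def lfsr16_step(state):
    """One step of a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def fault_campaign(words, num_faults, seed=0xACE1):
    """Inject num_faults pseudo-random bit-flip vectors, one word at a time.

    Returns how many injections change the word-wide XOR signature.
    """
    golden = reduce(xor, words, 0)
    lfsr = seed
    detected = 0
    for _ in range(num_faults):
        lfsr = lfsr16_step(lfsr)
        index = lfsr % len(words)        # which word is hit
        lfsr = lfsr16_step(lfsr)
        mask = lfsr << 16                # upper half of the 32-bit fault vector
        lfsr = lfsr16_step(lfsr)
        mask |= lfsr                     # lower half of the fault vector
        faulty = list(words)
        faulty[index] ^= mask            # flip the selected bits
        if reduce(xor, faulty, 0) != golden:
            detected += 1
    return detected


detected = fault_campaign([0] * 30, num_faults=1000)
```

In this simplified setting every injection hits a single word with a nonzero vector, so the signature always changes. A gate-level campaign, with faults inside the logic and in several places at once, is what produces coverage figures below 100% and false alarms.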
Through ASIC implementations of the 256-bit constructions of
the algorithms, we also present the performance and
implementation metrics of the presented constructions. The
benchmarking is performed for the error detection architectures
using the TSMC 65-nm library and Synopsys Design Compiler
(shown in Table I for area, frequency, throughput, and
efficiency [throughput over GE]). We note that in Table I, in
order to make the area results meaningful when switching
technologies, we have also provided the NAND-gate
equivalency (gate equivalents: GE). This is performed using the
area of a NAND gate in the utilized TSMC 65-nm CMOS
library, which is 1.41 µm². The results presented in Table I
show acceptable overhead (degradation) for performance and
implementation metrics; the percentages in parentheses are
relative to the original design in the preceding row. We also note
that the utilized platform is merely for benchmarking and we
expect similar results on field-programmable gate arrays (FPGAs)
or different ASIC libraries.

TABLE I
BENCHMARK FOR THE PROPOSED ERROR DETECTION SCHEMES FOR THE HASH ALGORITHMS ON ASIC (65NM TSMC)
Algorithm          Block (bits)  Area [GE]        Frequency [MHz]  Throughput [Gbps]  Efficiency [kbps/GE]
ECHO-256           1,536         145,912          389              6.48               44.40
  Proposed scheme  1,536         187,098 (28%)    370 (4.9%)       6.18 (4.6%)        33.03 (25.6%)
Fugue-256          32            49,040           547              8.77               178.8
  Proposed scheme  32            57,900 (18.1%)   519 (5.1%)       8.33 (5.1%)        141.1 (21.1%)
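As a worked example of how the derived columns of Table I relate to each other, the following sketch recomputes the efficiency and overhead figures for the ECHO-256 rows. The function names are hypothetical, and the unit is an inference from the arithmetic: dividing the throughput by the area in GE reproduces the tabulated efficiency values when the result is expressed in kbps per GE.

```python
NAND_AREA_UM2 = 1.41  # area of one NAND gate in the 65-nm library (from the text)


def efficiency_kbps_per_ge(throughput_gbps, area_ge):
    """Throughput over area, with Gbps converted to kbps."""
    return throughput_gbps * 1e6 / area_ge


def overhead_percent(baseline, proposed):
    """Relative increase of proposed over baseline, in percent."""
    return 100.0 * (proposed - baseline) / baseline


def area_um2(area_ge):
    """Convert gate equivalents back to silicon area in square micrometers."""
    return area_ge * NAND_AREA_UM2


# ECHO-256 rows of Table I.
echo_eff = efficiency_kbps_per_ge(6.48, 145_912)
echo_eff_protected = efficiency_kbps_per_ge(6.18, 187_098)
echo_area_overhead = overhead_percent(145_912, 187_098)
echo_freq_degradation = -overhead_percent(389, 370)
echo_eff_degradation = -overhead_percent(echo_eff, echo_eff_protected)
```

The same helpers can be applied to the Fugue-256 rows; small deviations from the tabulated values are to be expected because the table entries are rounded.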
V. CONCLUSIONS
In this paper, we have proposed efficient fault detection
schemes by presenting closed formulations for the predicted
signatures of different transformations in the hash algorithms
ECHO and Fugue. These signatures are derived to achieve low overhead
for the specific transformations and can be tailored to include
byte/word-wide predicted signatures. Through simulations, we
have shown that the proposed fault detection schemes are
highly capable of detecting natural hardware failures and of
reducing the effectiveness of malicious fault
attacks. The proposed reliable hardware architectures have
also been implemented on an ASIC platform using a 65-nm
standard technology to benchmark their hardware and timing
characteristics. The high efficiency of the proposed schemes
makes the proposed reliable architectures suitable for
high-performance applications.
REFERENCES
[1] R. Benadjila, O. Billet, H. Gilbert, G. Macario-Rat, T. Peyrin,
M. Robshaw, and Y. Seurin, “ECHO hash function,” available:
http://crypto.rd.francetelecom.com/echo/ , accessed March 2018.
[2] S. Halevi, W. E. Hall, and C. S. Jutla, “The hash function Fugue,”
Cryptology ePrint Archive, IACR, https://eprint.iacr.org/2014/423.pdf,
2014, accessed March 2018.
[3] T. Peyrin, “Improved differential attacks for ECHO and Grøstl,” Cryp-
tology ePrint Archive, Report 2010/223, 2010.
[4] O. Benoît and T. Peyrin, “Side-channel analysis of six SHA-3
candidates,” in Proc. CHES, 2010, pp. 140-157.
[5] J.-L. Beuchat, E. Okamoto, and T. Yamazaki, “A compact FPGA
implementation of the SHA-3 candidate ECHO,” Cryptology ePrint
Archive, Report 2010/364, 2010.
[6] K. Gaj, E. Homsirikamol, and M. Rogawski, “Fair and comprehensive
methodology for comparing hardware performance of fourteen round
two SHA-3 candidates using FPGAs,” in Proc. CHES, 2010, pp. 264-
278.
[7] S. Tillich, M. Feldhofer, M. Kirschbaum, T. Plos, J.-M. Schmidt, and A.
Szekely, “Uniform evaluation of hardware implementations of the round-
two SHA-3 candidates,” The Second SHA-3 Candidate Conference, Aug.
2010.
[8] K. Järvinen, “Sharing resources between AES and the SHA-3 second
round candidates Fugue and Grøstl,” The Second SHA-3 Candidate
Conference, Aug. 2010.
[9] M. Mozaffari Kermani and A. Reyhani-Masoleh, “Concurrent Structure-
Independent Fault Detection Schemes for the Advanced Encryption
Standard,” IEEE Trans. Computers, vol. 59, no. 5, pp. 608-622, May
2010 (special issue on System Level Design of Reliable Architectures).
[10] M. Mozaffari Kermani and A. Reyhani-Masoleh, “A High-Performance
Fault Diagnosis Approach for the AES SubBytes Utilizing Mixed
Bases,” in Proc. IEEE Workshop Fault Diagnosis and Tolerance in
Cryptography (FDTC), pp. 80-87, Nara, Japan, Sep. 2011.
[11] M. Mozaffari Kermani and A. Reyhani-Masoleh, “A Lightweight High-
Performance Fault Detection Scheme for the Advanced Encryption
Standard Using Composite Fields,” IEEE Trans. Very Large Scale
Integrated (VLSI) Systems, vol. 19, no. 1, pp. 85-91, Jan. 2011.
[12] M. Mozaffari Kermani, V. Singh, and R. Azarderakhsh, “Reliable
low-latency Viterbi algorithm architectures benchmarked on ASIC and
FPGA,” IEEE Transactions on Circuits and Systems I: Regular Papers,
vol. 64, no. 1, pp. 208-216, 2017.
[13] M. Mozaffari Kermani, R. Azarderakhsh, and A. Aghaie, “Fault de-
tection architectures for post-quantum cryptographic stateless hash-
based secure signatures benchmarked on ASIC,” ACM Trans. Embedded
Computing Syst. (special issue on Embedded Device Forensics and
Security: State of the Art Advances), vol. 16, no. 2, pp. 59:1-19, Dec.
2016.
[14] S. Patranabis, A. Chakraborty, D. Mukhopadhyay, and P. P. Chakrabarti,
“Fault space transformation: A generic approach to counter differential
fault analysis and differential fault intensity analysis on AES-like block
ciphers,” IEEE Trans. Information Forensics and Security, vol. 12, no.
5, May 2017.
[15] M. Mozaffari Kermani, R. Azarderakhsh, C. Lee, and S. Bayat-Sarmadi,
“Reliable concurrent error detection architectures for extended
Euclidean-based division over $GF(2^m)$,” IEEE Trans. Very Large Scale
Integrated (VLSI) Systems, vol. 22, no. 5, pp. 995-1003, May 2014.
[16] M. Mozaffari Kermani, R. Azarderakhsh, and A. Aghaie, “Reliable
and error detection architectures of Pomaranch for false-alarm-sensitive
cryptographic applications,” IEEE Trans. Very Large Scale Integrated
(VLSI) Systems, vol. 23, no. 12, pp. 2804-2812, Dec. 2015.
[17] M. Mozaffari Kermani, K. Tian, R. Azarderakhsh, and S. Bayat-
Sarmadi, “Fault-resilient lightweight cryptographic block ciphers for
secure embedded systems,” IEEE Embedded Systems, vol. 6, no. 4, pp.
89-92, Dec. 2014.
[18] S. Bayat-Sarmadi, M. Mozaffari Kermani, and A. Reyhani-Masoleh,
“Efficient and concurrent reliable realization of the secure cryptographic
SHA-3 algorithm,” IEEE Trans. Computer-Aided Design Integr.
Circuits Syst., vol. 33, no. 7, pp. 1105-1109, Jul. 2014.
[19] A. Hodjat and I. Verbauwhede, “Area-Throughput Trade-Offs for Fully
Pipelined 30 to 70 Gbits/s AES Processors,” IEEE Trans. Computers,
vol. 55, no. 4, pp. 366-372, April 2006.
[20] A. Satoh, S. Morioka, K. Takano, and S. Munetoh, “A compact Rijndael
hardware architecture with S-Box optimization,” in Proc. ASIACRYPT,
2001, pp. 239-254.
[21] D. Canright, “A very compact S-Box for AES,” in Proc. CHES, 2005,
pp. 441-455.
[22] Y. Nogami, K. Nekado, T. Toyota, N. Hongo, and Y. Morikawa, “Mixed
bases for efficient inversion in $F_{((2^2)^2)^2}$ and conversion matrices of
SubBytes of AES,” in Proc. CHES, 2010, pp. 234-247.
