Compact Implementation of BLUE MIDNIGHT WISH-256 Hash Function on Xilinx FPGA Platform
Mohamed El Hadedy 1,3, Danilo Gligoroski 2, Svein J. Knapskog 1 and Martin Margala 3
1 The Norwegian Center of Excellence for Quantifiable Quality of Service in Communication Systems (Q2S),
Norwegian University of Science and Technology (NTNU),
O.S.Bragstads plass 2E, N-7491 Trondheim, Norway
mohamed.elhadedy@q2s.ntnu.no, Knapskog@q2s.ntnu.no
2Department of Telematics, Faculty of Information Technology, Mathematics and Electrical Engineering,
The Norwegian University of Science and Technology (NTNU),
O.S.Bragstads plass 2E, N-7491 Trondheim, Norway
danilog@item.ntnu.no
3 Department of Electrical and Computer Engineering, University of Massachusetts Lowell,
Ball 301, One University Ave, Lowell, MA 01854, USA
Mohamed Aly@uml.edu, Martin Margala@uml.edu
Abstract: In cryptography and information security, hash functions are regarded as the "Swiss army knife": they are used in countless protocols and algorithms. In 2005, we witnessed a significant theoretical breakthrough in breaking the current cryptographic standard SHA-1. Although another family of standardized hash functions, SHA-2, is ready to replace SHA-1, at the end of 2007 the National Institute of Standards and Technology (NIST) decided to start a four-year world-wide development process, including a competition for the superior algorithm design, to choose the next cryptographic hash standard, SHA-3. Blue Midnight Wish is one of the proposed new designs that continues in the Second Round of the SHA-3 competition. This paper presents the design and analysis of an area-efficient Blue Midnight Wish compression function with a digest size of 256 bits (BMW-256) on FPGA platforms. The proposed architecture improves system throughput while reducing area. We demonstrate the performance of the proposed BMW hash function core using a Virtex-5 FPGA implementation. The new BMW hash function design allows for a 16x speed-up in performance while consuming significantly less area than the previously reported implementation.
Keywords: Hash Function Standard, SHA-2, Blue Midnight Wish
I. Introduction
Cryptographic hash functions play a fundamental role in modern cryptography. While related to conventional hash functions commonly used in non-cryptographic computer applications (in both cases, larger domains are mapped to smaller ranges), they differ in several important aspects. Hash functions take a message as input and produce an output referred to as a hash code, hash result, hash value, or simply hash. More precisely, a hash function h maps bit strings of arbitrary finite length to strings of fixed length, say n bits. For a domain D and range R with $h: D \to R$ and $|D| > |R|$, the function is many-to-one, implying that the existence of collisions (pairs of inputs with identical output) is unavoidable. Indeed, restricting h to a domain of t-bit inputs (t > n), if h were random in the sense that all outputs were essentially equiprobable, then about $2^{t-n}$ inputs would map to each output, and two randomly chosen inputs would yield the same output with probability $2^{-n}$. The basic idea of cryptographic hash functions is that a hash value serves as a compact representative image (sometimes called an imprint, digital fingerprint, or message digest) of an input string, and can be used as if it were uniquely identifiable with that string [1].
As hash functions are widely used in many security applications, it is very important that they fulfill certain security properties. These are as follows:
• Pre-image resistance: for essentially all pre-specified outputs, it is computationally infeasible to find any input which hashes to that output, i.e., to find any pre-image M such that H = Hash(M) when given any H for which a corresponding input is not known.
• Second pre-image resistance: it is computationally infeasible to find a second input which has the same output as a given input, i.e., given M0 and Hash(M0), to find M1 different from M0 such that Hash(M0) = Hash(M1).
• Collision resistance: it is computationally infeasible to find any two distinct inputs M0, M1 which hash to the same output, i.e., such that Hash(M0) = Hash(M1).
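To make the collision property concrete, the following minimal sketch truncates SHA-256 to n = 24 bits and searches for two distinct inputs with the same truncated digest. The truncation length, the counter-based inputs and the function names are illustrative assumptions, not part of this paper; the point is only that a collision appears after roughly $2^{n/2}$ trials, which is why practical digest sizes must be much larger.

```python
import hashlib

def truncated_hash(message: bytes, n_bits: int) -> int:
    """Toy hash for illustration: the top n_bits of SHA-256(message) as an integer."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return digest >> (256 - n_bits)

def find_collision(n_bits: int):
    """Birthday search: hash successive counter values until two share a digest."""
    seen = {}          # truncated digest -> first message that produced it
    counter = 0
    while True:
        message = counter.to_bytes(8, "big")
        h = truncated_hash(message, n_bits)
        if h in seen:
            return seen[h], message, h
        seen[h] = message
        counter += 1

m0, m1, h = find_collision(24)
print(m0.hex(), m1.hex(), hex(h))
```

The search is guaranteed to terminate, because at most $2^{24}$ distinct truncated digests exist.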
A function that has the first two properties is called a one-way hash function. If all three properties are met, the hash function is considered collision resistant. Finding collisions in a specific hash function is the most common way of attacking it. There are many hash function designs; they are based on cellular automata, block ciphers, modular arithmetic, knapsack and lattice problems, algebraic matrices, etc. Two SHA algorithm families have been introduced, SHA-1 and SHA-2; although they have some similarities, they also have significant differences [1,2]. SHA-1 is the most used member of the SHA hash family, employed in hundreds of different applications and protocols. However, in 2005, we witnessed a significant theoretical breakthrough in breaking the current cryptographic standard SHA-1 [1]. Although another family of standardized hash functions, SHA-2, is ready to replace SHA-1, the discovered mathematical weakness, which might also exist in SHA-2, indicates the need for stronger hash functions [2].
Journal of Information Assurance and Security 5 (2010) 626-636
Received June 14, 2010 1554-1010 $ 03.50 Dynamic Publishers, Inc.
The SHA-2 family consists of four algorithms that differ from each other in digest size, initial values and word size. The digest sizes are 224, 256, 384 and 512 bits. Although no attacks have yet been reported on the SHA-2 variants, they are algorithmically similar to SHA-1, and the National Institute of Standards and Technology (NIST) has felt the need to develop an improved new family of hash functions [2, 3]. At the end of 2007, NIST decided to start a four-year world-wide development process, including a competition for the superior algorithm design, to choose the next cryptographic hash standard, SHA-3. The new hash standard SHA-3 is currently under development; the function will be selected via an open competition running between 2008 and 2012. The Blue Midnight Wish (BMW) hash function is one of the Second Round candidates of the competition and one of the fastest proposed new designs in software [4]. In this paper, we show that the proposed architecture is simple and area efficient, and that it provides significant throughput improvements over previous work. The proposed BMW-256 hash function core is evaluated on Xilinx Virtex XCV300-6PQ240 and Virtex-5 XC5VLX110 FPGA devices [5, 6]. The rest of the paper is organized as follows. Section 2 briefly describes the compression function of the BMW-256 algorithm with the new tweaks. Section 3 gives background on the Xilinx Virtex-5 FPGA. Section 4 highlights the architecture and FPGA implementation of the proposed BLUE MIDNIGHT WISH hash function core sub-system. Section 5 gives the synthesis results of the FPGA implementation and compares them with related work. Finally, Section 6 discusses conclusions, observations and future work.
II. BLUE MIDNIGHT WISH (BMW)
The Blue Midnight Wish hash function is one of the 14 candidates in the second round of NIST's SHA-3 competition [7]. It was tweaked in order to resist attacks by Thomsen [8]. BMW is a family of hash functions containing four major instances, BMW-n for n = 224, 256, 384, 512, where n is the size of the hash output. BMW is a wide-pipe Merkle-Damgård hash construction [9] with an unconventional compression function, where the nonlinearity is derived from the overlap of modular addition and XOR operations. The most innovative parts of the design are the compression function construction and the design of the permutations; much of the design is novel and unique amongst the second-round candidates. BMW has very good performance and appears to be suitable for a wide range of platforms. It has modest memory requirements. BMW uses four different operations in the hash computation stage: bit-wise logical word XOR, word addition and subtraction, shift operations (left or right), and rotate-left operations. BMW uses a double-pipe design to increase the resistance against generic multi-collision attacks and length extension attacks [10,11]. In the double-pipe design, the sizes of the inputs to the compression function are twice the message digest size.
Figure. 1: Representation of the compression function in Blue Midnight Wish
Figure. 2: A graphic representation of the Blue Midnight Wish Hash function
As shown in Fig. 1, the compression function of BMW takes the 16-word chaining value $H^{(i-1)}_0, H^{(i-1)}_1, \ldots, H^{(i-1)}_{15}$ and a 16-word message block $M^{(i)}_0, M^{(i)}_1, \ldots, M^{(i)}_{15}$ as input, and produces the updated chaining value $H^{(i)}_0, H^{(i)}_1, \ldots, H^{(i)}_{15}$. The size of a word is 32 bits for BMW-224/256 and 64 bits for BMW-384/512. The new BMW compression function [5,6] uses two main parts, as shown in Fig. 2. The first one comprises three functions, called $f_0$, $f_1$ and $f_2$, applied in sequence to generate $H^{(i)}$, as described in the following. The function $f_0$ takes two arguments, as shown in Fig. 1. The first argument consists of sixteen 32-bit words, the double pipe values $H^{(i-1)}_0, H^{(i-1)}_1, \ldots, H^{(i-1)}_{15}$, whose initial values for BMW-256 are given in Table 2. The second argument consists of sixteen 32-bit words that represent the input message block $M^{(i)}_0, M^{(i)}_1, \ldots, M^{(i)}_{15}$. As shown in Table 1, the function $f_0(M^{(i)}, H^{(i-1)})$ computes $M^{(i)} \oplus H^{(i-1)}$, produces the temporary values $W^{(i)}_j$, $j = 0, 1, \ldots, 15$, and from them the first part of the quadruple pipe, $Q^{(i)}_a = (Q^{(i)}_0, Q^{(i)}_1, \ldots, Q^{(i)}_{15})$. The second function $f_1$ takes the same message block $M^{(i)}$ and $Q^{(i)}_a$ (the output of $f_0$) as input, and generates the second part of the quadruple pipe, $Q^{(i)}_b = (Q^{(i)}_{16}, Q^{(i)}_{17}, \ldots, Q^{(i)}_{31})$, through the AddElement, Expand1 and Expand2 functions, as shown in Eq. (1) and Table 3. The expansion functions make use of the s-transforms along with the r-transforms, as shown in Eq. (2) and Eq. (3).
Table 1: Definition of the function $f_0$ of BLUE MIDNIGHT WISH
1. Bijective transform of $M^{(i)} \oplus H^{(i-1)}$
\[
\begin{aligned}
W^{(i)}_{0} &= (M^{(i)}_{5} \oplus H^{(i-1)}_{5}) - (M^{(i)}_{7} \oplus H^{(i-1)}_{7}) + (M^{(i)}_{10} \oplus H^{(i-1)}_{10}) + (M^{(i)}_{13} \oplus H^{(i-1)}_{13}) + (M^{(i)}_{14} \oplus H^{(i-1)}_{14}) \\
W^{(i)}_{1} &= (M^{(i)}_{6} \oplus H^{(i-1)}_{6}) - (M^{(i)}_{8} \oplus H^{(i-1)}_{8}) + (M^{(i)}_{11} \oplus H^{(i-1)}_{11}) + (M^{(i)}_{14} \oplus H^{(i-1)}_{14}) - (M^{(i)}_{15} \oplus H^{(i-1)}_{15}) \\
W^{(i)}_{2} &= (M^{(i)}_{0} \oplus H^{(i-1)}_{0}) + (M^{(i)}_{7} \oplus H^{(i-1)}_{7}) + (M^{(i)}_{9} \oplus H^{(i-1)}_{9}) - (M^{(i)}_{12} \oplus H^{(i-1)}_{12}) + (M^{(i)}_{15} \oplus H^{(i-1)}_{15}) \\
W^{(i)}_{3} &= (M^{(i)}_{0} \oplus H^{(i-1)}_{0}) - (M^{(i)}_{1} \oplus H^{(i-1)}_{1}) + (M^{(i)}_{8} \oplus H^{(i-1)}_{8}) - (M^{(i)}_{10} \oplus H^{(i-1)}_{10}) + (M^{(i)}_{13} \oplus H^{(i-1)}_{13}) \\
W^{(i)}_{4} &= (M^{(i)}_{1} \oplus H^{(i-1)}_{1}) + (M^{(i)}_{2} \oplus H^{(i-1)}_{2}) + (M^{(i)}_{9} \oplus H^{(i-1)}_{9}) - (M^{(i)}_{11} \oplus H^{(i-1)}_{11}) - (M^{(i)}_{14} \oplus H^{(i-1)}_{14}) \\
W^{(i)}_{5} &= (M^{(i)}_{3} \oplus H^{(i-1)}_{3}) - (M^{(i)}_{2} \oplus H^{(i-1)}_{2}) + (M^{(i)}_{10} \oplus H^{(i-1)}_{10}) - (M^{(i)}_{12} \oplus H^{(i-1)}_{12}) + (M^{(i)}_{15} \oplus H^{(i-1)}_{15}) \\
W^{(i)}_{6} &= (M^{(i)}_{4} \oplus H^{(i-1)}_{4}) - (M^{(i)}_{0} \oplus H^{(i-1)}_{0}) - (M^{(i)}_{3} \oplus H^{(i-1)}_{3}) - (M^{(i)}_{11} \oplus H^{(i-1)}_{11}) + (M^{(i)}_{13} \oplus H^{(i-1)}_{13}) \\
W^{(i)}_{7} &= (M^{(i)}_{1} \oplus H^{(i-1)}_{1}) - (M^{(i)}_{4} \oplus H^{(i-1)}_{4}) - (M^{(i)}_{5} \oplus H^{(i-1)}_{5}) - (M^{(i)}_{12} \oplus H^{(i-1)}_{12}) - (M^{(i)}_{14} \oplus H^{(i-1)}_{14}) \\
W^{(i)}_{8} &= (M^{(i)}_{2} \oplus H^{(i-1)}_{2}) - (M^{(i)}_{5} \oplus H^{(i-1)}_{5}) - (M^{(i)}_{6} \oplus H^{(i-1)}_{6}) + (M^{(i)}_{13} \oplus H^{(i-1)}_{13}) - (M^{(i)}_{15} \oplus H^{(i-1)}_{15}) \\
W^{(i)}_{9} &= (M^{(i)}_{0} \oplus H^{(i-1)}_{0}) - (M^{(i)}_{3} \oplus H^{(i-1)}_{3}) + (M^{(i)}_{6} \oplus H^{(i-1)}_{6}) - (M^{(i)}_{7} \oplus H^{(i-1)}_{7}) + (M^{(i)}_{14} \oplus H^{(i-1)}_{14}) \\
W^{(i)}_{10} &= (M^{(i)}_{8} \oplus H^{(i-1)}_{8}) - (M^{(i)}_{1} \oplus H^{(i-1)}_{1}) - (M^{(i)}_{4} \oplus H^{(i-1)}_{4}) - (M^{(i)}_{7} \oplus H^{(i-1)}_{7}) + (M^{(i)}_{15} \oplus H^{(i-1)}_{15}) \\
W^{(i)}_{11} &= (M^{(i)}_{8} \oplus H^{(i-1)}_{8}) - (M^{(i)}_{0} \oplus H^{(i-1)}_{0}) - (M^{(i)}_{2} \oplus H^{(i-1)}_{2}) - (M^{(i)}_{5} \oplus H^{(i-1)}_{5}) + (M^{(i)}_{9} \oplus H^{(i-1)}_{9}) \\
W^{(i)}_{12} &= (M^{(i)}_{1} \oplus H^{(i-1)}_{1}) + (M^{(i)}_{3} \oplus H^{(i-1)}_{3}) - (M^{(i)}_{6} \oplus H^{(i-1)}_{6}) - (M^{(i)}_{9} \oplus H^{(i-1)}_{9}) + (M^{(i)}_{10} \oplus H^{(i-1)}_{10}) \\
W^{(i)}_{13} &= (M^{(i)}_{2} \oplus H^{(i-1)}_{2}) + (M^{(i)}_{4} \oplus H^{(i-1)}_{4}) + (M^{(i)}_{7} \oplus H^{(i-1)}_{7}) + (M^{(i)}_{10} \oplus H^{(i-1)}_{10}) + (M^{(i)}_{11} \oplus H^{(i-1)}_{11}) \\
W^{(i)}_{14} &= (M^{(i)}_{3} \oplus H^{(i-1)}_{3}) - (M^{(i)}_{5} \oplus H^{(i-1)}_{5}) + (M^{(i)}_{8} \oplus H^{(i-1)}_{8}) - (M^{(i)}_{11} \oplus H^{(i-1)}_{11}) - (M^{(i)}_{12} \oplus H^{(i-1)}_{12}) \\
W^{(i)}_{15} &= (M^{(i)}_{12} \oplus H^{(i-1)}_{12}) - (M^{(i)}_{4} \oplus H^{(i-1)}_{4}) - (M^{(i)}_{6} \oplus H^{(i-1)}_{6}) - (M^{(i)}_{9} \oplus H^{(i-1)}_{9}) + (M^{(i)}_{13} \oplus H^{(i-1)}_{13})
\end{aligned}
\]
2. Further bijective transform of $W^{(i)}_j$, $j = 0, 1, \ldots, 15$
\[
\begin{aligned}
Q^{(i)}_{0} &= S_0(W^{(i)}_{0}) + H^{(i-1)}_{1}, & Q^{(i)}_{1} &= S_1(W^{(i)}_{1}) + H^{(i-1)}_{2}, \\
Q^{(i)}_{2} &= S_2(W^{(i)}_{2}) + H^{(i-1)}_{3}, & Q^{(i)}_{3} &= S_3(W^{(i)}_{3}) + H^{(i-1)}_{4}, \\
Q^{(i)}_{4} &= S_4(W^{(i)}_{4}) + H^{(i-1)}_{5}, & Q^{(i)}_{5} &= S_0(W^{(i)}_{5}) + H^{(i-1)}_{6}, \\
Q^{(i)}_{6} &= S_1(W^{(i)}_{6}) + H^{(i-1)}_{7}, & Q^{(i)}_{7} &= S_2(W^{(i)}_{7}) + H^{(i-1)}_{8}, \\
Q^{(i)}_{8} &= S_3(W^{(i)}_{8}) + H^{(i-1)}_{9}, & Q^{(i)}_{9} &= S_4(W^{(i)}_{9}) + H^{(i-1)}_{10}, \\
Q^{(i)}_{10} &= S_0(W^{(i)}_{10}) + H^{(i-1)}_{11}, & Q^{(i)}_{11} &= S_1(W^{(i)}_{11}) + H^{(i-1)}_{12}, \\
Q^{(i)}_{12} &= S_2(W^{(i)}_{12}) + H^{(i-1)}_{13}, & Q^{(i)}_{13} &= S_3(W^{(i)}_{13}) + H^{(i-1)}_{14}, \\
Q^{(i)}_{14} &= S_4(W^{(i)}_{14}) + H^{(i-1)}_{15}, & Q^{(i)}_{15} &= S_0(W^{(i)}_{15}) + H^{(i-1)}_{0}
\end{aligned}
\]
3. S-transforms used in the $f_0$ function ($S_4$ is defined in Eq. (3))
\[
\begin{aligned}
S_0(x) &= \mathrm{SHR}^{1}(x) \oplus \mathrm{SHL}^{3}(x) \oplus \mathrm{ROTL}^{4}(x) \oplus \mathrm{ROTL}^{19}(x) \\
S_1(x) &= \mathrm{SHR}^{1}(x) \oplus \mathrm{SHL}^{2}(x) \oplus \mathrm{ROTL}^{8}(x) \oplus \mathrm{ROTL}^{23}(x) \\
S_2(x) &= \mathrm{SHR}^{2}(x) \oplus \mathrm{SHL}^{1}(x) \oplus \mathrm{ROTL}^{12}(x) \oplus \mathrm{ROTL}^{25}(x) \\
S_3(x) &= \mathrm{SHR}^{2}(x) \oplus \mathrm{SHL}^{2}(x) \oplus \mathrm{ROTL}^{15}(x) \oplus \mathrm{ROTL}^{29}(x)
\end{aligned}
\]
Table 2: Initial double pipe $H^{(i-1)}$ for BMW-256 (hexadecimal values)
BLUE MIDNIGHT WISH-256
j     $H^{(i-1)}_j$    j     $H^{(i-1)}_j$
0     0x40414243       8     0x60616263
1     0x44454647       9     0x64656667
2     0x48494A4B       10    0x68696A6B
3     0x4C4D4E4F       11    0x6C6D6E6F
4     0x50515253       12    0x70717273
5     0x54555657       13    0x74757677
6     0x58595A5B       14    0x78797A7B
7     0x5C5D5E5F       15    0x7C7D7E7F
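The definition of $f_0$ in Table 1 can be sketched in a few lines of Python. This is a minimal software model, not the hardware implementation described later in this paper: the signed index lists are transcribed from part 1 of Table 1, and the second step assumes the regular pattern $Q_j = S_{j \bmod 5}(W_j) + H_{(j+1) \bmod 16}$. The helper names (`shl`, `shr`, `rotl`, `W_TERMS`, `f0`) and the test words are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # all arithmetic is modulo 2**32 for BMW-256

def shl(x, n):
    return (x << n) & MASK32

def shr(x, n):
    return x >> n

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK32

# S-transforms S0..S3 from Table 1 and S4 from Eq. (3).
S = [
    lambda x: shr(x, 1) ^ shl(x, 3) ^ rotl(x, 4) ^ rotl(x, 19),
    lambda x: shr(x, 1) ^ shl(x, 2) ^ rotl(x, 8) ^ rotl(x, 23),
    lambda x: shr(x, 2) ^ shl(x, 1) ^ rotl(x, 12) ^ rotl(x, 25),
    lambda x: shr(x, 2) ^ shl(x, 2) ^ rotl(x, 15) ^ rotl(x, 29),
    lambda x: shr(x, 1) ^ x,
]

# (sign, k) pairs for W_0..W_15: each pair adds or subtracts the word M_k XOR H_k.
W_TERMS = [
    [(1, 5), (-1, 7), (1, 10), (1, 13), (1, 14)],
    [(1, 6), (-1, 8), (1, 11), (1, 14), (-1, 15)],
    [(1, 0), (1, 7), (1, 9), (-1, 12), (1, 15)],
    [(1, 0), (-1, 1), (1, 8), (-1, 10), (1, 13)],
    [(1, 1), (1, 2), (1, 9), (-1, 11), (-1, 14)],
    [(1, 3), (-1, 2), (1, 10), (-1, 12), (1, 15)],
    [(1, 4), (-1, 0), (-1, 3), (-1, 11), (1, 13)],
    [(1, 1), (-1, 4), (-1, 5), (-1, 12), (-1, 14)],
    [(1, 2), (-1, 5), (-1, 6), (1, 13), (-1, 15)],
    [(1, 0), (-1, 3), (1, 6), (-1, 7), (1, 14)],
    [(1, 8), (-1, 1), (-1, 4), (-1, 7), (1, 15)],
    [(1, 8), (-1, 0), (-1, 2), (-1, 5), (1, 9)],
    [(1, 1), (1, 3), (-1, 6), (-1, 9), (1, 10)],
    [(1, 2), (1, 4), (1, 7), (1, 10), (1, 11)],
    [(1, 3), (-1, 5), (1, 8), (-1, 11), (-1, 12)],
    [(1, 12), (-1, 4), (-1, 6), (-1, 9), (1, 13)],
]

def f0(M, H):
    """Return Q_a = (Q_0..Q_15) from 16 message words M and 16 chaining words H."""
    T = [m ^ h for m, h in zip(M, H)]
    W = [sum(sign * T[k] for sign, k in terms) & MASK32 for terms in W_TERMS]
    return [(S[j % 5](W[j]) + H[(j + 1) % 16]) & MASK32 for j in range(16)]

H = [(0x9E3779B9 * (j + 1)) & MASK32 for j in range(16)]  # arbitrary test words
M = [0] * 16
Q_a = f0(M, H)
print([hex(q) for q in Q_a])
```

A quick sanity check of the structure: when M equals H, every XOR term is zero, so all $W_j$ vanish and $Q_j$ reduces to $H_{(j+1) \bmod 16}$.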
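\[
\begin{aligned}
&\text{For } j = 16, 17:\quad Q^{(i)}_{j} = \mathrm{Expand}_1(j), \\
&\mathrm{Expand}_1(j) = S_1(Q^{(i)}_{j-16}) + S_2(Q^{(i)}_{j-15}) + S_3(Q^{(i)}_{j-14}) + S_0(Q^{(i)}_{j-13}) \\
&\qquad + S_1(Q^{(i)}_{j-12}) + S_2(Q^{(i)}_{j-11}) + S_3(Q^{(i)}_{j-10}) + S_0(Q^{(i)}_{j-9}) \\
&\qquad + S_1(Q^{(i)}_{j-8}) + S_2(Q^{(i)}_{j-7}) + S_3(Q^{(i)}_{j-6}) + S_0(Q^{(i)}_{j-5}) \\
&\qquad + S_1(Q^{(i)}_{j-4}) + S_2(Q^{(i)}_{j-3}) + S_3(Q^{(i)}_{j-2}) + S_0(Q^{(i)}_{j-1}) + \mathrm{AddElement}(j-16) \\
&\text{For } j = 18, 19, \ldots, 31:\quad Q^{(i)}_{j} = \mathrm{Expand}_2(j), \\
&\mathrm{Expand}_2(j) = Q^{(i)}_{j-16} + r_1(Q^{(i)}_{j-15}) + Q^{(i)}_{j-14} + r_2(Q^{(i)}_{j-13}) \\
&\qquad + Q^{(i)}_{j-12} + r_3(Q^{(i)}_{j-11}) + Q^{(i)}_{j-10} + r_4(Q^{(i)}_{j-9}) \\
&\qquad + Q^{(i)}_{j-8} + r_5(Q^{(i)}_{j-7}) + Q^{(i)}_{j-6} + r_6(Q^{(i)}_{j-5}) \\
&\qquad + Q^{(i)}_{j-4} + r_7(Q^{(i)}_{j-3}) + S_4(Q^{(i)}_{j-2}) + S_5(Q^{(i)}_{j-1}) + \mathrm{AddElement}(j-16)
\end{aligned}
\tag{1}
\]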
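Note that in AddElement(j) the index expressions involving the variable j, for the left rotations and for M and H, are computed modulo 16. The constants $K_{j+16}$ are the values listed in Table 3, where they are labeled $K_0, \ldots, K_{15}$.
\[
\mathrm{AddElement}(j) = \Big( \mathrm{ROTL}^{(j \bmod 16)+1}(M^{(i)}_{j}) + \mathrm{ROTL}^{((j+3) \bmod 16)+1}(M^{(i)}_{j+3}) - \mathrm{ROTL}^{((j+10) \bmod 16)+1}(M^{(i)}_{j+10}) + K_{j+16} \Big) \oplus H^{(i-1)}_{j+7}
\tag{2}
\]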
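\[
\begin{aligned}
r_1(x) &= \mathrm{ROTL}^{3}(x), & r_2(x) &= \mathrm{ROTL}^{7}(x), \\
r_3(x) &= \mathrm{ROTL}^{13}(x), & r_4(x) &= \mathrm{ROTL}^{16}(x), \\
r_5(x) &= \mathrm{ROTL}^{19}(x), & r_6(x) &= \mathrm{ROTL}^{23}(x), \\
r_7(x) &= \mathrm{ROTL}^{27}(x), & S_4(x) &= \mathrm{SHR}^{1}(x) \oplus x, \\
S_5(x) &= \mathrm{SHR}^{2}(x) \oplus x
\end{aligned}
\tag{3}
\]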
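Equations (1) to (3) together define the expansion function $f_1$. The sketch below is one possible software reading of them, under stated assumptions: rotation amounts in AddElement are taken as (index mod 16) + 1, the third rotation term is subtracted as in the tweaked BMW specification, H denotes the incoming double pipe $H^{(i-1)}$, and the constant is generated as $K_{j+16} = (j+16) \cdot \mathrm{0x05555555}$, which reproduces the values printed in Table 3. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def shl(x, n): return (x << n) & MASK32
def shr(x, n): return x >> n
def rotl(x, n): return ((x << n) | (x >> (32 - n))) & MASK32

def s0(x): return shr(x, 1) ^ shl(x, 3) ^ rotl(x, 4) ^ rotl(x, 19)
def s1(x): return shr(x, 1) ^ shl(x, 2) ^ rotl(x, 8) ^ rotl(x, 23)
def s2(x): return shr(x, 2) ^ shl(x, 1) ^ rotl(x, 12) ^ rotl(x, 25)
def s3(x): return shr(x, 2) ^ shl(x, 2) ^ rotl(x, 15) ^ rotl(x, 29)
def s4(x): return shr(x, 1) ^ x
def s5(x): return shr(x, 2) ^ x

R_AMOUNTS = [3, 7, 13, 16, 19, 23, 27]  # rotation amounts of r1..r7, Eq. (3)

def add_element(j, M, H):
    """AddElement(j) for j = 0..15, Eq. (2); indices are taken modulo 16."""
    k = ((j + 16) * 0x05555555) & MASK32  # assumed generator for K_{j+16}
    acc = (rotl(M[j % 16], (j % 16) + 1)
           + rotl(M[(j + 3) % 16], ((j + 3) % 16) + 1)
           - rotl(M[(j + 10) % 16], ((j + 10) % 16) + 1)
           + k) & MASK32
    return acc ^ H[(j + 7) % 16]

def expand1(j, Q, M, H):
    """Expand1(j): S1, S2, S3, S0 applied cyclically to Q_{j-16}..Q_{j-1}."""
    cycle = [s1, s2, s3, s0]
    acc = sum(cycle[t % 4](Q[j - 16 + t]) for t in range(16))
    return (acc + add_element(j - 16, M, H)) & MASK32

def expand2(j, Q, M, H):
    """Expand2(j): plain and rotated words alternate, then S4 and S5."""
    acc = 0
    for t in range(14):  # covers Q_{j-16} .. Q_{j-3}
        q = Q[j - 16 + t]
        acc += q if t % 2 == 0 else rotl(q, R_AMOUNTS[t // 2])
    acc += s4(Q[j - 2]) + s5(Q[j - 1])
    return (acc + add_element(j - 16, M, H)) & MASK32

def f1(M, H, Qa):
    """Return Q_b = (Q_16..Q_31): Expand1 for j = 16, 17 and Expand2 for j = 18..31."""
    Q = list(Qa)
    for j in range(16, 32):
        step = expand1 if j < 18 else expand2
        Q.append(step(j, Q, M, H))
    return Q[16:]

Qb = f1([0] * 16, [0] * 16, [0] * 16)
print(hex(Qb[0]))
```

With all-zero inputs every S and r term vanishes for j = 16, so $Q_{16}$ equals the first constant of Table 3.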
Table 3: $K_j$ for BLUE MIDNIGHT WISH (hexadecimal values)
BLUE MIDNIGHT WISH-256
j     $K_j$         j     $K_j$
0     0x55555550    8     0x7ffffff8
1     0x5aaaaaa5    9     0x8555554d
2     0x5ffffffa    10    0x8aaaaaa2
3     0x6555554f    11    0x8ffffff7
4     0x6aaaaaa4    12    0x9555554c
5     0x6ffffff9    13    0x9aaaaaa1
6     0x7555554e    14    0x9ffffff6
7     0x7aaaaaa3    15    0xa555554b
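The constants in Table 3 follow a simple arithmetic pattern: entry j equals $(j + 16) \cdot \mathrm{0x05555555}$ modulo $2^{32}$. This observation is not stated in the text above; the short check below, with hypothetical names, confirms it against the tabulated values, which suggests that a hardware design could trade the 16-word constant ROM for one adder if area were more critical than cycles.

```python
MASK32 = 0xFFFFFFFF

# Values transcribed from Table 3, in order j = 0..15.
TABLE_3 = [
    0x55555550, 0x5AAAAAA5, 0x5FFFFFFA, 0x6555554F,
    0x6AAAAAA4, 0x6FFFFFF9, 0x7555554E, 0x7AAAAAA3,
    0x7FFFFFF8, 0x8555554D, 0x8AAAAAA2, 0x8FFFFFF7,
    0x9555554C, 0x9AAAAAA1, 0x9FFFFFF6, 0xA555554B,
]

# Candidate generator: multiples 16..31 of 0x05555555, reduced modulo 2**32.
generated = [((j + 16) * 0x05555555) & MASK32 for j in range(16)]

print(generated == TABLE_3)
```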
As shown in Table 4, the third function $f_2$ takes three arguments, the message block $M^{(i)}$ and the two parts of the quadruple pipe, $Q^{(i)}_a$ and $Q^{(i)}_b$, and generates the new value of the double pipe $H^{(i)}$. The second part contains the same functions, but instead of the double pipe values $H^{(i-1)}_0, H^{(i-1)}_1, H^{(i-1)}_2, \ldots, H^{(i-1)}_{15}$ it uses the constant values $\mathrm{CONST}^{final}_j = (\mathrm{CONST}^{final}_0, \mathrm{CONST}^{final}_1, \mathrm{CONST}^{final}_2, \ldots, \mathrm{CONST}^{final}_{15})$ shown in Table 5, and the input message block is the new double pipe $H^{(i)} = (H^{(i)}_0, H^{(i)}_1, H^{(i)}_2, \ldots, H^{(i)}_{15})$. The reason for using the $\mathrm{CONST}^{final}_j$ values is to remove one degree of freedom from attackers who try to find pseudo-collisions and pseudo-preimages. Additionally, the final invocation of the compression function is a countermeasure against any attack in which an attacker can find near-collisions or near-pseudo-collisions of the compression function of BMW [5,6].
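The folding function $f_2$ referred to above (Table 4) can be modeled compactly with two small shift tables. The sketch below is an assumption-laden software model rather than the hardware data path: the shift directions and amounts are taken from the tweaked BMW specification [5,6] as understood here and should be checked against Table 4, and the names `sh`, `LOW` and `HIGH` are hypothetical. Positive entries denote left shifts, negative entries right shifts, and 0 means no shift.

```python
MASK32 = 0xFFFFFFFF

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK32

def sh(x, n):
    """Shift left for n > 0, right for n < 0, identity for n == 0."""
    if n > 0:
        return (x << n) & MASK32
    return x >> -n

# (shift of XH, shift of Q_{16+j}) for H_0..H_7.
LOW = [(5, -5), (-7, 8), (-5, 5), (-1, 5), (-3, 0), (6, -6), (-4, 6), (-11, 2)]
# Shift of XL for H_8..H_15.
HIGH = [8, -6, 6, 4, -3, -4, -7, -2]

def f2(M, Q):
    """Fold the message block M and the quadruple pipe Q_0..Q_31 into H_0..H_15."""
    XL = 0
    for q in Q[16:24]:
        XL ^= q
    XH = XL
    for q in Q[24:32]:
        XH ^= q
    H = [0] * 16
    for j in range(8):
        a, b = LOW[j]
        H[j] = ((sh(XH, a) ^ sh(Q[16 + j], b) ^ M[j])
                + (XL ^ Q[24 + j] ^ Q[j])) & MASK32
    for j in range(8):
        H[8 + j] = (rotl(H[(j + 4) % 8], 9 + j)
                    + (XH ^ Q[24 + j] ^ M[8 + j])
                    + (sh(XL, HIGH[j]) ^ Q[16 + (j + 7) % 8] ^ Q[8 + j])) & MASK32
    return H

M = [1] + [0] * 15
H_new = f2(M, [0] * 32)
print([hex(h) for h in H_new])
```

With an all-zero quadruple pipe and only $M_0 = 1$, the model returns $H_0 = 1$ and $H_{12} = \mathrm{ROTL}^{13}(H_0)$, which shows how the lower half of the double pipe feeds the upper half.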
Table 5: $\mathrm{CONST}^{final}_j$ for BMW-256 (hexadecimal values)
BLUE MIDNIGHT WISH-256
j     $\mathrm{CONST}^{final}_j$    j     $\mathrm{CONST}^{final}_j$
0     0xaaaaaaa0                    8     0xaaaaaaa8
1     0xaaaaaaa1                    9     0xaaaaaaa9
2     0xaaaaaaa2                    10    0xaaaaaaaa
3     0xaaaaaaa3                    11    0xaaaaaaab
4     0xaaaaaaa4                    12    0xaaaaaaac
5     0xaaaaaaa5                    13    0xaaaaaaad
6     0xaaaaaaa6                    14    0xaaaaaaae
7     0xaaaaaaa7                    15    0xaaaaaaaf
Figure. 3: Xilinx Virtex 5 FPGA components [12,13]
Figure. 4: Arrangement of slices in a CLB of Virtex-5 FPGA
[13]
III. Background of Xilinx Virtex-5 FPGA Architecture
Virtex-5 FPGAs are offered as a system integration platform for demanding designs. Compared with the previous generation of 90 nm FPGAs, they are on average 30 percent faster and provide 65 percent more capacity, while reducing dynamic power consumption by 35 percent and consuming 45 percent less area. A description of the parts of the Virtex-5 FPGA architecture helps to understand how the BMW-256 compression function is implemented. Fig. 3 shows the major components of the Xilinx Virtex-5 FPGA. The Configurable Logic Blocks (CLBs) are the main logic resources for implementing sequential as well as combinatorial circuits. Each CLB element is connected to a switch matrix for access to the general routing matrix (shown in Fig. 4). A CLB element contains a pair of slices. These two slices do not have direct connections to each other, and each slice is organized as a column. Each slice in a column has an independent carry chain. For each CLB, the slice in the bottom of the CLB is labeled SLICE(0), and the slice in the top of the CLB is labeled SLICE(1).
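Table 4: Definition of the folding function $f_2$ of BLUE MIDNIGHT WISH
1. Cumulative temporary variables XL and XH
\[
\begin{aligned}
XL &= Q^{(i)}_{16} \oplus Q^{(i)}_{17} \oplus Q^{(i)}_{18} \oplus Q^{(i)}_{19} \oplus Q^{(i)}_{20} \oplus Q^{(i)}_{21} \oplus Q^{(i)}_{22} \oplus Q^{(i)}_{23} \\
XH &= XL \oplus Q^{(i)}_{24} \oplus Q^{(i)}_{25} \oplus Q^{(i)}_{26} \oplus Q^{(i)}_{27} \oplus Q^{(i)}_{28} \oplus Q^{(i)}_{29} \oplus Q^{(i)}_{30} \oplus Q^{(i)}_{31}
\end{aligned}
\]
2. The new double pipe $H^{(i)}$
\[
\begin{aligned}
H^{(i)}_{0} &= (\mathrm{SHL}^{5}(XH) \oplus \mathrm{SHR}^{5}(Q^{(i)}_{16}) \oplus M^{(i)}_{0}) + (XL \oplus Q^{(i)}_{24} \oplus Q^{(i)}_{0}) \\
H^{(i)}_{1} &= (\mathrm{SHR}^{7}(XH) \oplus \mathrm{SHL}^{8}(Q^{(i)}_{17}) \oplus M^{(i)}_{1}) + (XL \oplus Q^{(i)}_{25} \oplus Q^{(i)}_{1}) \\
H^{(i)}_{2} &= (\mathrm{SHR}^{5}(XH) \oplus \mathrm{SHL}^{5}(Q^{(i)}_{18}) \oplus M^{(i)}_{2}) + (XL \oplus Q^{(i)}_{26} \oplus Q^{(i)}_{2}) \\
H^{(i)}_{3} &= (\mathrm{SHR}^{1}(XH) \oplus \mathrm{SHL}^{5}(Q^{(i)}_{19}) \oplus M^{(i)}_{3}) + (XL \oplus Q^{(i)}_{27} \oplus Q^{(i)}_{3}) \\
H^{(i)}_{4} &= (\mathrm{SHR}^{3}(XH) \oplus Q^{(i)}_{20} \oplus M^{(i)}_{4}) + (XL \oplus Q^{(i)}_{28} \oplus Q^{(i)}_{4}) \\
H^{(i)}_{5} &= (\mathrm{SHL}^{6}(XH) \oplus \mathrm{SHR}^{6}(Q^{(i)}_{21}) \oplus M^{(i)}_{5}) + (XL \oplus Q^{(i)}_{29} \oplus Q^{(i)}_{5}) \\
H^{(i)}_{6} &= (\mathrm{SHR}^{4}(XH) \oplus \mathrm{SHL}^{6}(Q^{(i)}_{22}) \oplus M^{(i)}_{6}) + (XL \oplus Q^{(i)}_{30} \oplus Q^{(i)}_{6}) \\
H^{(i)}_{7} &= (\mathrm{SHR}^{11}(XH) \oplus \mathrm{SHL}^{2}(Q^{(i)}_{23}) \oplus M^{(i)}_{7}) + (XL \oplus Q^{(i)}_{31} \oplus Q^{(i)}_{7}) \\
H^{(i)}_{8} &= \mathrm{ROTL}^{9}(H^{(i)}_{4}) + (XH \oplus Q^{(i)}_{24} \oplus M^{(i)}_{8}) + (\mathrm{SHL}^{8}(XL) \oplus Q^{(i)}_{23} \oplus Q^{(i)}_{8}) \\
H^{(i)}_{9} &= \mathrm{ROTL}^{10}(H^{(i)}_{5}) + (XH \oplus Q^{(i)}_{25} \oplus M^{(i)}_{9}) + (\mathrm{SHR}^{6}(XL) \oplus Q^{(i)}_{16} \oplus Q^{(i)}_{9}) \\
H^{(i)}_{10} &= \mathrm{ROTL}^{11}(H^{(i)}_{6}) + (XH \oplus Q^{(i)}_{26} \oplus M^{(i)}_{10}) + (\mathrm{SHL}^{6}(XL) \oplus Q^{(i)}_{17} \oplus Q^{(i)}_{10}) \\
H^{(i)}_{11} &= \mathrm{ROTL}^{12}(H^{(i)}_{7}) + (XH \oplus Q^{(i)}_{27} \oplus M^{(i)}_{11}) + (\mathrm{SHL}^{4}(XL) \oplus Q^{(i)}_{18} \oplus Q^{(i)}_{11}) \\
H^{(i)}_{12} &= \mathrm{ROTL}^{13}(H^{(i)}_{0}) + (XH \oplus Q^{(i)}_{28} \oplus M^{(i)}_{12}) + (\mathrm{SHR}^{3}(XL) \oplus Q^{(i)}_{19} \oplus Q^{(i)}_{12}) \\
H^{(i)}_{13} &= \mathrm{ROTL}^{14}(H^{(i)}_{1}) + (XH \oplus Q^{(i)}_{29} \oplus M^{(i)}_{13}) + (\mathrm{SHR}^{4}(XL) \oplus Q^{(i)}_{20} \oplus Q^{(i)}_{13}) \\
H^{(i)}_{14} &= \mathrm{ROTL}^{15}(H^{(i)}_{2}) + (XH \oplus Q^{(i)}_{30} \oplus M^{(i)}_{14}) + (\mathrm{SHR}^{7}(XL) \oplus Q^{(i)}_{21} \oplus Q^{(i)}_{14}) \\
H^{(i)}_{15} &= \mathrm{ROTL}^{16}(H^{(i)}_{3}) + (XH \oplus Q^{(i)}_{31} \oplus M^{(i)}_{15}) + (\mathrm{SHR}^{2}(XL) \oplus Q^{(i)}_{22} \oplus Q^{(i)}_{15})
\end{aligned}
\]
The Xilinx tools designate slices with the following definitions. An "X" followed by a number identifies the position of each slice in a pair as well as the column position of the slice. The "X" number counts slices starting from the bottom in sequence 0, 1 (the first CLB column); 2, 3 (the second CLB column); etc. A "Y" followed by a number identifies a row of slices. The number remains the same within a CLB, but counts up in sequence from one CLB row to the next CLB row, starting from the bottom. Fig. 5 shows four CLBs located in the bottom-left corner of the die [13]. In the Xilinx Virtex-5, every slice contains four look-up tables (LUTs), four storage elements, wide-function multiplexers, and carry logic. These elements are used by all slices to provide logic, arithmetic, and ROM functions.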
In addition, some slices support two further functions: storing data using distributed RAM and shifting data with 32-bit registers. Slices that support these additional functions are called SLICEM; the others are called SLICEL, as shown in Fig. 6 and Fig. 7, respectively. The storage elements in a slice can be configured as either edge-triggered D-type flip-flops or level-sensitive latches. The D input can be driven directly by a LUT output via AFFMUX, BFFMUX, CFFMUX or DFFMUX, or by the BYPASS slice inputs, bypassing the function generators via the AX, BX, CX, or DX input. When configured as a latch, the latch is transparent when CLK is Low.
The LUT is the basic unit of reconfigurable logic. A LUT can implement any digital logic truth table, constrained only by the number of signal inputs and outputs. In addition to the basic LUTs, slices contain three multiplexers (F7AMUX, F7BMUX, and F8MUX). These multiplexers are used to combine up to four function generators to provide any function of seven or eight inputs in a slice. F7AMUX and F7BMUX are used to generate seven-input functions from LUTs A and B, or C and D, while F8MUX is used to combine all LUTs to generate eight-input functions. Functions with more than eight inputs can be implemented using multiple slices. There are no direct connections between slices to form function generators greater than eight inputs within a CLB or between slices. Each CLB contains 2 slices, 8 LUTs, 8 flip-flops, 2 arithmetic and carry chains, 256 bits of distributed RAM and 128 bits of shift registers.
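The way four LUTs and the three wide multiplexers combine into one eight-input function can be illustrated with a small behavioral model. The sketch assumes six-input LUTs (implied by the seven-input functions built from two LUTs), stores each truth table as a 64-bit integer, and uses hypothetical helper names; it models the logic only, not timing or routing.

```python
def make_lut6(func):
    """Build the 64-entry truth table of a 6-input Boolean function as an integer."""
    init = 0
    for addr in range(64):
        bits = [(addr >> k) & 1 for k in range(6)]
        if func(*bits):
            init |= 1 << addr
    return init

def lut6(init, addr):
    """Look up one bit of the truth table; addr packs the six inputs."""
    return (init >> addr) & 1

def mux2(sel, d0, d1):
    return d1 if sel else d0

def build_slice_function(f8):
    """Map an 8-input function onto LUTs A-D, two F7 muxes and one F8 mux."""
    luts = []
    for x7 in (0, 1):
        for x6 in (0, 1):
            # Each LUT holds f8 with inputs x6 and x7 fixed to one combination.
            luts.append(make_lut6(lambda *b, x6=x6, x7=x7: f8(*b, x6, x7)))

    def evaluate(x):
        low6, x6, x7 = x & 0x3F, (x >> 6) & 1, (x >> 7) & 1
        a, b, c, d = (lut6(init, low6) for init in luts)
        f7a = mux2(x6, a, b)       # seven-input function from LUTs A and B
        f7b = mux2(x6, c, d)       # seven-input function from LUTs C and D
        return mux2(x7, f7a, f7b)  # F8 mux combines all four LUTs
    return evaluate

def parity8(*bits):
    return sum(bits) & 1

parity_in_slice = build_slice_function(parity8)
print(parity_in_slice(0b10110001))
```

The same decomposition works for any eight-input function, since the two extra inputs only select which of the four precomputed six-input truth tables is read.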
Figure. 5: The relations between slices and CLBs [13]
IV. BMW-256 core Architecture
The BMW-256 core hardware architecture is best described in three sections. In the first section, we give an overview of the hardware block as a whole. In the second section, we describe in detail the internal components of the BMW-256 core. In the third section, we describe the BMW-256 hashing core operations.
A. Hardware Overview
As shown in Fig. 8, the BMW-256 Core comprises the following four blocks:
• Memory block: it contains block RAM (832 bytes), block ROM (192 bytes) and a Memory Data Bus block that organizes data flows between the ROM and RAM blocks and the Hash Computation Core.
• Hash Computation Core: it comprises a 32-bit ALU (Arithmetic Logic Unit) and a 32-bit parallel Shift/Rotate block. The Hash Computation Core receives data from the Memory block and transmits data to the Memory block and the Output Shift Register, as directed by the Controller block.
• Output Shift Register: it receives the final double pipe hash values $H^{(i)}_0, H^{(i)}_1, H^{(i)}_2, \ldots, H^{(i)}_{15}$ from the Hash Computation Core, as directed by the Controller.
• Controller: it contains a Moore Finite State Machine (MFSM) that controls the data flow between the BMW-256 Core components according to the BMW-256 algorithm [5,6].
Figure. 6: SLICEM Internal Architecture [13]
Figure. 7: SLICEL Internal Architecture [13]
Figure. 8: BLUE MIDNIGHT WISH-256 Core Architecture
B. BMW-256 Core internal components
In this section, we describe the BMW-256 Core components in detail:
• Memory Block: to implement the BMW-256 Core Memory block, we used a 256 x 32-bit FPGA block RAM, as shown in Fig. 9. The Memory block interface is given in Table 6 and the Memory Block truth table in Table 7.
Table 6: BLUE MIDNIGHT WISH-256 Core Memory Block Interface
Signal              I/O    Description
CLK                 IN     Global clock
RAM ROM             IN     Active low: selects the RAM for writing and reading data; active high: selects the ROM for reading data
WR                  IN     Active high: writes data into RAM rows
ADD(7:0)            IN     Value present on the address bus
DATA IN (31:0)      IN     Data present on the 32-bit RAM data input bus
DATA OUT(31:0)      OUT    Data present on the 32-bit RAM data output bus
As mentioned in Section 4.1, the Memory Block contains ROM to store the initial double pipe $H^{(i-1)}$, the BMW-256 constants $K_j$, j = 0, 1, ..., 15, and $\mathrm{Const}^{final}_j$, as shown in Table 8, Table 9 and Table 10, respectively. In addition, the Memory Block contains RAM to store the BMW-256 input message block ($M^{(i)}_0, M^{(i)}_1, \ldots, M^{(i)}_{15}$), the intermediate values of the BMW hash function, and the final double pipe values $H^{(i)} = (H^{(i)}_0, H^{(i)}_1, H^{(i)}_2, \ldots, H^{(i)}_{15})$.
Figure. 9: BLUE MIDNIGHT WISH-256 Core Memory Block
Table 7: BLUE MIDNIGHT WISH-256 Core Memory Block Operation
WR                RAM ROM    Operation
X (don't care)    1          ROM (Read)
0                 0          RAM (Read)
1                 0          RAM (Write)
• Hash Computation Core: as shown in Fig. 8, the Hash Computation Core contains the following three operative components:
• Parallel Shifter/Rotator: as shown in Fig. 10, the Parallel Shifter/Rotator contains a 5 x 32 matrix of 2x1 multiplexers (Mux) with a large encoder (5 x 11). This component is responsible for the shift and rotation operations on the 32-bit word. It receives 32-bit parallel data from the Memory Block and transmits 32-bit parallel data to the ALU, as determined by the value of the shifter control word. Because the BMW hash core uses 46 shifter operations, the shifter control word is 6 bits wide, as shown in Table 11.
• Arithmetic and Logic Unit (ALU): as shown in Fig. 11 and Fig. 12, the ALU contains 32 cells, all working together according to the truth table in Table 12. The ALU supports three operations in the hash computation stage: bit-wise logical word XOR, word addition and word subtraction (modulo $2^{32}$). The ALU receives 32-bit data words from the Parallel Shifter/Rotator and the Temporary Register, and transmits its output to the Temporary Register, which works as a parallel accumulator. The operation is selected by the ALU control word (a 2-bit word).
• Temporary Register: it contains 32 2x1 multiplexers and a shift register, as shown in Fig. 13 and Table 13. The Temporary Register works as an accumulator. It receives 32-bit words from the Memory unit and the ALU, and transmits 32-bit words to the ALU and the output stage, as determined by the TMP control word (a 2-bit word).
• Controller: the Controller has been designed as a Moore FSM, as shown in Fig. 14. It contains six operative parts, all working together to produce the memory address word, which controls the Memory block traffic with the other BMW-256 sub-systems, and the control word, which controls the data flow between the BMW-256 Core sub-systems. The controller subsystems work as follows. Input Message Control: once the Start signal becomes high, the Input Message Control starts to place the sixteen input message words in RAM locations. After that, the Round Enable signal becomes high, which makes the BMW Round Control start executing $f_0$, $f_1$ and $f_2$ according to the BMW-256 algorithm. Finally, in the final message round, when the Final Round signal becomes high, the Final Process Control block transfers $\mathrm{Const}^{final}$ into the message locations and starts the BMW Round Control block to produce the final hash output. Both the Control Selector and the Address Bus Selector are multiplexers controlled by a combinational circuit block called Bus Selector Control.
Table 8: BLUE MIDNIGHT WISH-256 Core ROM ($H^{(i-1)}$ locations)
RAM 256 x 32 row location    Double pipe initial value    ADD(7:0)
208                          $H^{(i-1)}_{0}$              11010000
209                          $H^{(i-1)}_{1}$              11010001
210                          $H^{(i-1)}_{2}$              11010010
211                          $H^{(i-1)}_{3}$              11010011
212                          $H^{(i-1)}_{4}$              11010100
213                          $H^{(i-1)}_{5}$              11010101
214                          $H^{(i-1)}_{6}$              11010110
215                          $H^{(i-1)}_{7}$              11010111
216                          $H^{(i-1)}_{8}$              11011000
217                          $H^{(i-1)}_{9}$              11011001
218                          $H^{(i-1)}_{10}$             11011010
219                          $H^{(i-1)}_{11}$             11011011
220                          $H^{(i-1)}_{12}$             11011100
221                          $H^{(i-1)}_{13}$             11011101
222                          $H^{(i-1)}_{14}$             11011110
223                          $H^{(i-1)}_{15}$             11011111
Table 9: BMW-256 Core ROM ($K_j$ locations)
RAM 256 x 32 row location    Constant    ADD(7:0)
224                          $K_{0}$     11100000
225                          $K_{1}$     11100001
226                          $K_{2}$     11100010
227                          $K_{3}$     11100011
228                          $K_{4}$     11100100
229                          $K_{5}$     11100101
230                          $K_{6}$     11100110
231                          $K_{7}$     11100111
232                          $K_{8}$     11101000
233                          $K_{9}$     11101001
234                          $K_{10}$    11101010
235                          $K_{11}$    11101011
236                          $K_{12}$    11101100
237                          $K_{13}$    11101101
238                          $K_{14}$    11101110
239                          $K_{15}$    11101111
Table 10: BLUE MIDNIGHT WISH-256 Core ROM ($\mathrm{Const}^{final}_j$ locations)
RAM 256 x 32 row location    Constant                         ADD(7:0)
240                          $\mathrm{Const}^{final}_{0}$     11110000
241                          $\mathrm{Const}^{final}_{1}$     11110001
242                          $\mathrm{Const}^{final}_{2}$     11110010
243                          $\mathrm{Const}^{final}_{3}$     11110011
244                          $\mathrm{Const}^{final}_{4}$     11110100
245                          $\mathrm{Const}^{final}_{5}$     11110101
246                          $\mathrm{Const}^{final}_{6}$     11110110
247                          $\mathrm{Const}^{final}_{7}$     11110111
248                          $\mathrm{Const}^{final}_{8}$     11111000
249                          $\mathrm{Const}^{final}_{9}$     11111001
250                          $\mathrm{Const}^{final}_{10}$    11111010
251                          $\mathrm{Const}^{final}_{11}$    11111011
252                          $\mathrm{Const}^{final}_{12}$    11111100
253                          $\mathrm{Const}^{final}_{13}$    11111101
254                          $\mathrm{Const}^{final}_{14}$    11111110
255                          $\mathrm{Const}^{final}_{15}$    11111111
Figure. 10: Parallel Shifter/Rotator schematic diagram
Figure. 11: Parallel 32 bit ALU (Arithmetic and Logic Unit)
Figure. 12: ALU Cell (Arithmetic and Logic Unit)
Table 11: Parallel Shifter/Rotator operations in BLUE MIDNIGHT WISH-256 Core
S/R control    Operation    S/R control    Operation
000000         LOAD         010111         ROL8
000001         SHL1         011000         ROL9
000010         SHL2         011001         ROL10
000011         SHL3         011010         ROL11
000100         SHL4         011011         ROL12
000101         SHL5         011100         ROL13
000110         SHL6         011101         ROL14
000111         SHL8         011110         ROL15
001000         SHR1         011111         ROL16
001001         SHR2         100000         ROL17
001010         SHR3         100001         ROL18
001011         SHR4         100010         ROL19
001100         SHR5         100011         ROL20
001101         SHR6         100100         ROL21
001110         SHR7         100101         ROL22
001111         SHR11        100110         ROL23
010000         ROL1         100111         ROL24
010001         ROL2         101000         ROL25
010010         ROL3         101001         ROL26
010011         ROL4         101010         ROL27
010100         ROL5         101011         ROL28
010101         ROL6         101100         ROL29
010110         ROL7         101101         ROL30
Table 12: Arithmetic and Logic Unit Cell (Truth Table)
LOG    ARTH              ALU operation
1      X (don't care)    XOR
0      0                 ADD
0      1                 Subtract
Figure. 13: Temporary Register Unit
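The control encodings of Table 11 and Table 12 can be captured in a short behavioral model of the Parallel Shifter/Rotator and the ALU. This is a sketch in Python rather than the VHDL of the actual core; the function names are hypothetical, and the operand order for subtraction (accumulator minus shifter output) is an assumption, since the tables do not state it.

```python
MASK32 = 0xFFFFFFFF

SHL_AMOUNTS = [1, 2, 3, 4, 5, 6, 8]      # control codes 000001..000111
SHR_AMOUNTS = [1, 2, 3, 4, 5, 6, 7, 11]  # control codes 001000..001111

def decode_sr(control):
    """Decode the 6-bit S/R control word of Table 11 into (operation, amount)."""
    if control == 0:
        return ("LOAD", 0)
    if 1 <= control <= 7:
        return ("SHL", SHL_AMOUNTS[control - 1])
    if 8 <= control <= 15:
        return ("SHR", SHR_AMOUNTS[control - 8])
    if 16 <= control <= 45:
        return ("ROL", control - 15)     # 010000 -> ROL1 ... 101101 -> ROL30
    raise ValueError("unused S/R control word")

def shifter(control, x):
    """Apply the selected shift or rotation to a 32-bit word."""
    kind, n = decode_sr(control)
    if kind == "LOAD":
        return x
    if kind == "SHL":
        return (x << n) & MASK32
    if kind == "SHR":
        return x >> n
    return ((x << n) | (x >> (32 - n))) & MASK32

def alu(log, arth, a, b):
    """Table 12: LOG=1 selects XOR; LOG=0 selects ADD (ARTH=0) or subtract (ARTH=1)."""
    if log:
        return a ^ b
    if arth:
        return (a - b) & MASK32
    return (a + b) & MASK32

def s0_via_datapath(x):
    """Compute S0(x) as four shifter passes accumulated through the ALU."""
    tmp = shifter(0b001000, x)                      # SHR1 into the accumulator
    for control in (0b000011, 0b010011, 0b100010):  # SHL3, ROL4, ROL19
        tmp = alu(1, 0, tmp, shifter(control, x))
    return tmp

print(hex(s0_via_datapath(0x80000001)))
```

Under this model the S0 transform needs four passes through the shifter, which is consistent with the four cycles listed for S0 in Table 15.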
Table 13: TMP Register (Truth Table)
Shift Register Enable    MUX Selector      Temp Output
0                        X (don't care)    Idle
1                        0                 Memory TMP Bus
1                        1                 ALU Output
Figure. 14: Controller
C. The BMW-256 hashing core operations
In this section we describe how the Hash Computation Core executes the internal functions of BMW-256, whose cycle counts are shown in Table 14. For example, suppose we want to XOR the two data words in memory locations 4 and 5 and write the result to location 7. First, the Controller orders the Memory Block to select location 4. Then the Controller asks the Temporary Register to pick up the data from the data bus. The same operation is repeated for location 5, but this time the Parallel Shifter/Rotator picks up the data instead of the Temporary Register. Next, the Controller asks the operation encoder unit to order the ALU to XOR these data and save the result in the Temporary Register. Finally, the Controller orders the Memory Block to pick up the result and store it in location 7.
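The XOR example above can be traced with a small behavioral model of the Memory Block and the accumulator path. The sketch assumes the address map of Tables 8 to 10 (ROM constants in rows 208 to 255, leaving 208 RAM rows, i.e. 832 bytes) and the control encoding of Table 7; the class and function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
ROM_BASE = 208  # rows 208..255 hold the ROM constants (Tables 8-10)

class MemoryBlock:
    """256 x 32-bit memory; a write requires RAM ROM = 0 and WR = 1 (Table 7)."""

    def __init__(self):
        self.rows = [0] * 256

    def access(self, ram_rom, wr, addr, data_in=0):
        if ram_rom == 0 and wr == 1:
            if addr >= ROM_BASE:
                raise ValueError("rows 208..255 are read-only constants")
            self.rows[addr] = data_in & MASK32
        return self.rows[addr]

def xor_rows(mem, src_a, src_b, dst):
    """XOR two RAM rows and store the result, following the four steps in the text."""
    tmp = mem.access(0, 0, src_a)      # Temporary Register picks up row src_a
    shifted = mem.access(0, 0, src_b)  # Shifter/Rotator passes row src_b unchanged (LOAD)
    tmp = tmp ^ shifted                # ALU in XOR mode, result back in the Temporary Register
    mem.access(0, 1, dst, tmp)         # Temporary Register is written to row dst
    return tmp

mem = MemoryBlock()
mem.access(0, 1, 4, 0x0F0F0F0F)
mem.access(0, 1, 5, 0x00FF00FF)
xor_rows(mem, 4, 5, 7)
print(hex(mem.rows[7]))
```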
Table 14: BLUE MIDNIGHT WISH internal functions (execution times)
BMW function    No. of cycles
F0              413
F1              476
F2              171
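Table 14 gives a rough way to relate clock frequency to throughput. The sketch below sums the three cycle counts and divides the 512-bit message block by the resulting time. It ignores message loading, the final invocation of the compression function and any controller overhead, so it should be read as an upper bound under those assumptions; the 50 MHz clock is a hypothetical value, not a figure reported in this paper.

```python
CYCLES = {"f0": 413, "f1": 476, "f2": 171}  # cycle counts from Table 14
BLOCK_BITS = 16 * 32                         # one BMW-256 message block

def cycles_per_block():
    """Cycles for one pass through f0, f1 and f2."""
    return sum(CYCLES.values())

def throughput_mbps(clock_hz):
    """Upper-bound throughput in Mbps for a given clock frequency in Hz."""
    return BLOCK_BITS * clock_hz / cycles_per_block() / 1e6

print(cycles_per_block())
print(round(throughput_mbps(50e6), 2))
```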
V. Performance Evaluation
The BMW-256 Core was designed in VHDL [14] and implemented (synthesis, placement and routing) using ISE Foundation 10.1 [15] for the Xilinx Virtex XCV300-6PQ240 and Virtex-5 XC5VLX110 devices. As shown in Table 15, the comparison with the previous implementation of BMW-256 [16] shows a reduction in the number of cycles for every operation except Load. The proposed design uses a parallel shifter/rotator and a parallel 32-bit ALU instead of a single 1-bit cell; in this way, the new design reduces the number of cycles needed for each operation.
Table 15: BLUE MIDNIGHT WISH-256 Hashing Core Operations (execution times, in cycles)
Operation  Proposed  BMW-256 [16]
Load       1         1
XOR        3         32
ADD        1         32
SUB        1         32
S0         4         127
S1         4         128
S2         4         129
S3         4         132
S4         4         34
S5         2         34
R1         1         3
R2         1         7
R3         1         13
R4         1         16
R5         1         19
R6         1         23
R7         1         27
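The R1 to R7 counts reported for [16] in Table 15 (3, 7, 13, 16, 19, 23, 27) are what one would expect if the serial design moves one bit position per cycle and these values are the rotation distances; that reading is an assumption made for this sketch. Under it, the difference between a bit-serial and a parallel rotator can be modelled as follows (function names are illustrative):

```python
MASK32 = 0xFFFFFFFF  # 32-bit words


def rotl32_serial(x, n):
    """Rotate left one bit per step, as a bit-serial datapath would."""
    cycles = 0
    for _ in range(n):
        x = ((x << 1) | (x >> 31)) & MASK32
        cycles += 1
    return x, cycles


def rotl32_parallel(x, n):
    """Rotate left by n positions in one step, as a parallel rotator would."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32, 1


# Assumed rotation distances for R1..R7 (taken from the [16] column).
for distance in (3, 7, 13, 16, 19, 23, 27):
    serial_value, serial_cycles = rotl32_serial(0x80000001, distance)
    parallel_value, parallel_cycles = rotl32_parallel(0x80000001, distance)
    print(distance, serial_cycles, parallel_cycles, serial_value == parallel_value)
```

Both functions return the same word; only the number of modelled steps differs, which is the effect the Proposed column shows for R1 to R7.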
In Table 16, we compare the area and throughput of the
proposed design with SHA-2 designs [17,18] and with the
earlier implementation of the SHA-3 candidate BMW-256 [16]
on Virtex and Virtex5 Xilinx devices.
Table 16: BLUE MIDNIGHT WISH-256 Performance results
Algorithm     FPGA Type          Area (Slices)  Throughput
Proposed      Virtex XCV300      1314           6 Mbps
Proposed      Virtex5 XC5VLX110  445            21 Mbps
BMW-256 [16]  Virtex XCV300      2147           1.07 Mbps
BMW-256 [16]  Virtex5 XC5VLX110  1980           5 Mbps
SHA-256 [17]  Virtex XCV200      4768           291 Mbps
SHA-256 [18]  Virtex E XCV600    5828           -
As shown in Table 16, on the Virtex XCV300 device the
proposed design occupies around 38% less area than the
previous BMW-256 design [16] while raising throughput by a
factor of about 6. On the Virtex5 XC5VLX110 device it
occupies around 77% less area, and the throughput figures
in Table 16 (21 Mbps against 5 Mbps) correspond to a gain
of about 4 times.
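The relative figures can be recomputed directly from the slice counts and throughputs in Table 16. A minimal sketch, with the dictionary layout and the helper `compare` chosen here for illustration:

```python
# (slices, throughput in Mbps) copied from Table 16.
TABLE16 = {
    ("Proposed", "Virtex XCV300"): (1314, 6.0),
    ("Proposed", "Virtex5 XC5VLX110"): (445, 21.0),
    ("BMW-256 [16]", "Virtex XCV300"): (2147, 1.07),
    ("BMW-256 [16]", "Virtex5 XC5VLX110"): (1980, 5.0),
}


def compare(device):
    """Return (area saving in percent, throughput ratio) versus [16]."""
    new_area, new_tp = TABLE16[("Proposed", device)]
    old_area, old_tp = TABLE16[("BMW-256 [16]", device)]
    area_saving_pct = 100.0 * (old_area - new_area) / old_area
    throughput_gain = new_tp / old_tp
    return area_saving_pct, throughput_gain


for device in ("Virtex XCV300", "Virtex5 XC5VLX110"):
    saving, gain = compare(device)
    print(f"{device}: {saving:.1f}% fewer slices, {gain:.1f}x throughput")
```

The same helper can be pointed at other rows of the table, provided both designs are reported on the same device.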
VI. Conclusion and Future Work
In this paper we presented an FPGA implementation of a new
BMW-256 hashing core structure with a 256-bit message
digest, using a parallel shifter/rotator and a parallel
32-bit arithmetic logic unit (ALU). The BMW-256 core
receives 16 message words of 32 bits and processes them.
The goal was to use as small an area as possible in order
to minimize the hardware cost. We have achieved around 72%
lower area compared to the SHA-256 implementation in [17]
(1314 slices on a Virtex XCV300 against 4768 slices on a
Virtex XCV200).
References
[1] X. Wang, A. C. Yao, and F. Yao, "Cryptanalysis on SHA-1 hash function". In Proceedings of the Cryptographic Hash Workshop, National Institute of Standards and Technology, November 2005.
[2] NIST, "NIST Comments on Cryptanalytic Attacks on SHA-1", 2006.
[3] W. E. Burr, "Cryptographic Hash Standards: Where Do We Go from Here?", IEEE Security and Privacy, Vol. 4, No. 2, pp. 88-91, Mar./Apr. 2006, doi:10.1109/MSP.2006.37
[4] eBACS, "ECRYPT Benchmarking of Cryptographic Systems", 2010.
[5] D. Gligoroski, V. Klima, S. J. Knapskog, M. El-Hadedy, J. Amundsen, "Blue Midnight Wish". In Proceedings of the First SHA-3 Candidate Conference, Leuven, Belgium, February 2009.
[6] D. Gligoroski, V. Klima, "A Document describing all modifications made on the Blue Midnight Wish cryptographic hash function before entering the Second Round of SHA-3 hash competition".
[7] National Institute of Standards and Technology, "Announcing Request for Candidate Algorithm Nominations for a New Cryptographic Hash Algorithm (SHA-3) Family", Federal Register, 72(212):62212-62220, November 2007.
[8] S. S. Thomsen, "Pseudo-cryptanalysis of the Original Blue Midnight Wish". In S. Hong and T. Iwata, editors, Fast Software Encryption, LNCS, Seoul, South Korea, 2010. Springer. To appear.
[9] S. Lucks, "Design Principles for Iterated Hash Functions", e-print, September 29, 2004.
[10] A. Joux, "Multicollisions in iterated hash functions. Application to cascaded constructions". In Proceedings of CRYPTO 2004, LNCS, vol. 3152, pp. 430-440, 2004.
[11] S. Lucks, "A failure-friendly design principle for hash functions". In Proceedings of ASIACRYPT, 2005.
[12] M. Long, "Implementing Skein Hash Function on Xilinx Virtex-5 FPGA Platform", 02-02-2009, Version 0.7.
[13] Xilinx, "Virtex-5 FPGA User Guide", UG190 (v5.2), November 5, 2009.
[14] Model Technology, "ModelSim PE/PLUS User's Manual", 2008.
[15] Xilinx, "Device Package User Guide", 2010.
[16] M. El Hadedy, D. Gligoroski, S. J. Knapskog, "Low Area Implementation of the Hash Function 'Blue Midnight Wish - 256' for FPGA platforms". In Proceedings of the International Conference on Intelligent Networking and Collaborative Systems, IEEE Computer Society, 2009, ISBN 978-0-7695-3858-7.
[17] N. Sklavos, O. Koufopavlou, "Implementation of the SHA-2 Hash Family Standard Using FPGAs", The Journal of Supercomputing, 31(3), pp. 227-248, 2005.
[18] M. McLoone, J. V. McCanny, "Efficient single-chip implementation of SHA-384 & SHA-512". In Proceedings of the International Conference on Field-Programmable Technology (FPT), pp. 311-314, 2002.
Author Biographies
Mohamed El-Hadedy is a PhD student at the Centre for
Quantifiable Quality of Service in Communication Systems
(Q2S) at the Norwegian University of Science and
Technology, Trondheim, Norway, and a Visiting Researcher at
the Electrical and Computer Engineering Department at the
University of Massachusetts Lowell, Massachusetts, USA. He
received his B.Sc., rated Very Good with Honors (ranked
fourth), in Electronic and Communication Engineering from
the Electronic and Communications Department, Faculty of
Engineering, Mansoura University, Egypt, in 2002. He
received his M.Sc. in Electronic and Communication
Engineering from the Electronic and Communications
Department, Faculty of Engineering, Mansoura University,
Egypt, in 2006. The title of his Master's thesis is
"Improvement of digital image watermarking techniques
based on FPGA Implementation". He worked as an assistant
researcher in the Electronic and Communication Department,
Faculty of Engineering, Mansoura University, Egypt, from
2002 to 2003, and as an assistant researcher at the Atomic
Energy Authority in Cairo, Egypt, from 2004 to 2008. His
research interests are cryptography, computer security,
computer architecture design, FPGA and ASIC
implementations, watermarking and digital rights
management.
Danilo Gligoroski is a professor at the Department of
Telematics at the Norwegian University of Science and
Technology, Trondheim, Norway. He received the PhD degree
in Computer Science from the Institute of Informatics,
Faculty of Natural Sciences and Mathematics, at the
University of Skopje, Macedonia, in 1997. His research
interests are cryptography, computer security, discrete
algorithms, and information theory and coding.
Svein Johan Knapskog received his Siv.ing. degree
(M.S.E.E.) from the Norwegian Institute of Technology
(NTH), Trondheim, Norway, in 1972. Since 2001, he has been
Professor at the Department of Telematics, Norwegian
University of Science and Technology (NTNU). He is
presently principal academic at the Norwegian Centre of
Excellence (CoE) for Quantifiable Quality of Service in
Communication Systems (Q2S). He has previously held various
positions in the Norwegian public sector, SINTEF and
industry. From 1982 to 2000, he was Associate Professor
 
at NTH (later NTNU), Department of Computer Science and
Telematics, where he also served a three-year term as Head
of Department. In the academic year 2005-2006 he was acting
Head of the Department of Telematics. His fields of
interest include information security and QoS as well as
related communication system architectural issues. His
current research focus is on information security
primitives, protocols and services in distributed
autonomous telecommunication systems and networks, and
security evaluation thereof. Prof. Knapskog has been active
in a number of conference program committees and has
authored or co-authored a number of technical reports,
research papers and a textbook (in Norwegian).
Martin Margala (IEEE SM'04) received the M.S. degree in
microelectronics from Slovak Technical University,
Slovakia, in 1990 and the Ph.D. degree in electrical and
computer engineering from the University of Alberta,
Canada, in 1998. He is currently an Associate Professor
with the Electrical and Computer Engineering Department,
University of Massachusetts, Lowell. Previously, he was
with the University of Rochester, Rochester, NY, and with
the University of Alberta. From 1998 to 2003, he was an
adjunct scientist with the Telecommunications Research
Labs, Edmonton, Canada. He holds three patents (two others
pending) and is an author or coauthor of more than 120
publications in peer-reviewed journals and conference
proceedings on high-frequency circuit design and test. His
main research interests are energy-efficient low-voltage
circuit design, high-bandwidth and data-processing
architectures and adaptive built-in-self-test systems.
Dr. Margala is a member of program committees of many
conferences and symposia in design and test.
