Data encryption standard simulation and a bit-slice architecture design by Sixel, Ricardo Girardi et al.

Data Encryption Standard Simulation and a
Bit-Slice Architecture Design
R. G. Sixel, R. S. Monteiro, and M. L. Anido





This paper presents a high-level language (HLL) implementation of the Data Encryption
Standard (DES) and discusses a design that employs a bit-sliced architecture. The HLL
implementation was written in Borland® Delphi 4™ and proved highly
valuable for obtaining the intermediate results required for debugging. The key
objectives of this work were to make DES available to system applications written in
Delphi 4™ and to discuss the design of a bit-sliced DES architecture suitable
for applications requiring low silicon area.
Keywords: DES, data encryption, ciphering, DES architectures
I. Introduction
For many years, cryptography was the domain of the diplomatic and military world [1,2].
Thanks to the microelectronics revolution, a need for commercial cryptography has
emerged. This need has become even more pressing with the rapid growth of telephone
communications, computer network applications such as e-commerce, and many other
applications that require some form of security.
Until a few years ago, a 64-bit wide hardware implementation of the DES algorithm [3,4]
demanded a considerable amount of hardware, making software implementations more
attractive. This was because of the large number of 32-bit load/shift registers, buses and
ROMs required. Naturally, such hardware demands limited the number of applications that
embodied cryptography for security purposes, particularly those requiring
high speed and very low hardware cost.
With the advent of microelectronics, some chips have been developed that implement the
standard [5,6,7,8]. Many of them were designed using full-custom or standard-cell
design approaches and implemented in CMOS technology. Silicon
compilation has allowed the DES standard to be described in high-level languages
such as VHDL and the description to be synthesized into several technologies, such as
standard cell or FPGAs. Microelectronics and silicon compilation allow the development of new
and more powerful chips supporting highly demanding interactive applications with
cryptography support, such as real-time video and audio, fast and secure disk access, and
real-time remote control.
The DES algorithm protects data in two ways. First, privacy is protected. After encryption,
the sender can be sure that the message, sent over an insecure communication channel such
as electronic mail, is read only by the intended receiver. A second and often more
important demand is that of authentication. After decryption, the receiver can be sure that
the message received came from the original sender and no one else. Both sender and
receiver want to be sure that the integrity of the message is guaranteed, i.e., that an
opponent did not change, insert, or delete parts of the message.
Section II describes the general characteristics of the DES algorithm using block
diagrams that represent the basic operations. Section III addresses the implementation of
the algorithm in the Delphi™ language. This implementation had two objectives: first,
testing the whole algorithm prior to a VHDL description for future synthesis, and second,
making DES available to other applications requiring a software implementation (possibly
in the form of a DLL or a Delphi component). Section IV discusses a bit-sliced
implementation of the DES algorithm, which is targeted at systems that have a limited
silicon budget, and Section V presents the main conclusions of this work.
II. The DES Algorithm and its Hardware
The DES algorithm was meant to provide cryptographic protection to computer data both
in transmission and whilst in storage. As shown in Fig. 1, a single DES calculation is a
sequence of a 64-bit initial permutation, a consecutive calculation of 16 rounds, and a
64-bit inverse initial permutation. The algorithm takes a block of eight bytes of data
through 18 stages of manipulation using substitution and transposition techniques.
The encryption (or decryption) of the data is controlled by a 56-bit key. Sixteen of the
stages are identical, except that they use 16 different internal subkeys derived from the
56 bits of the main key.
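The round structure described above can be sketched in a few lines. This is a minimal Python illustration rather than the paper's Delphi code: `toy_f` is a hypothetical placeholder for f(R, K), not the real DES function, and the initial and inverse initial permutations are left out. It only shows why the same datapath decrypts when the 16 subkeys are applied in reverse order.

```python
MASK32 = 0xFFFFFFFF


def toy_f(r, k):
    """Hypothetical stand-in for f(R, K); NOT the real DES round function."""
    return ((r * 0x9E3779B1) ^ k) & MASK32


def feistel(block64, subkeys, f=toy_f):
    """Run the rounds of a Feistel network over a 64-bit block.

    Each round computes L(n+1) = R(n) and R(n+1) = L(n) xor f(R(n), K(n+1)).
    The halves are swapped once at the end, as DES does before its
    inverse initial permutation.
    """
    left, right = (block64 >> 32) & MASK32, block64 & MASK32
    for k in subkeys:
        left, right = right, left ^ f(right, k)
    return (right << 32) | left


def encrypt(block64, subkeys):
    return feistel(block64, subkeys)


def decrypt(block64, subkeys):
    # Same datapath, subkeys applied in reverse order.
    return feistel(block64, list(reversed(subkeys)))
```

Because each round only XORs one half with a function of the other, the round function itself never has to be inverted; this is what lets a single hardware round serve both directions.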
Figure 2 illustrates the calculation of the function f(R, K) and contains the hardware for one
DES round. It consists of 32- and 48-bit modulo-2 adders (XORs Add1 and Add2), eight
nonlinear substitution functions with six inputs and four outputs (S boxes), an expansion








Fig. 1 - Major flow of the DES algorithm
Fig. 2 - Calculation of the function f(R, K)
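As an illustration of the six-input, four-output S boxes used in f(R, K), the sketch below looks up one value in S1. It is written in Python rather than the paper's Delphi, and `sbox_lookup` is an illustrative name. The S1 table is the one published in FIPS 46 [3]; the outer two bits of the input select the row and the inner four bits select the column.

```python
# S1 as published in FIPS 46: 4 rows of 16 four-bit values.
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_lookup(box, six_bits):
    """Map a 6-bit input to a 4-bit output using one S box."""
    # Row index: first (most significant) and last (least significant) bit.
    row = ((six_bits >> 4) & 0b10) | (six_bits & 1)
    # Column index: the four bits in between.
    col = (six_bits >> 1) & 0xF
    return box[row][col]


# Input 011011 selects row 01 and column 1101.
print(sbox_lookup(S1, 0b011011))
```

In hardware each S box is typically a 64 x 4-bit ROM, which is one source of the ROM count mentioned in the introduction.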
Fig. 3 - Subkey calculation block diagram
Figure 3 depicts the architecture for subkey calculation. For each DES round, a 48-bit
subkey has to be generated. The input key is 64 bits wide, of which 8 bits are used for parity
checking. After an initial key permutation (PC-1), the 16 subkeys, one for each round, are
derived from the 56-bit key selected for encryption. Each subkey is obtained after a left
or right rotation (left for encryption, right for decryption) followed by a 56- to 48-bit
permutation and selection (PC-2).
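The subkey calculation just described can be sketched as follows. This is a Python illustration, not the paper's Delphi procedure; the helper names `permute`, `rotl28` and `subkeys` are illustrative, and only the left-rotating (encryption) direction is shown. The PC-1 and PC-2 tables and the shift schedule are those of FIPS 46 [3], with bit 1 taken as the most significant bit.

```python
# Tables from FIPS 46. Bit 1 is the most significant (leftmost) bit.
PC1 = [57, 49, 41, 33, 25, 17, 9, 1, 58, 50, 42, 34, 26, 18,
       10, 2, 59, 51, 43, 35, 27, 19, 11, 3, 60, 52, 44, 36,
       63, 55, 47, 39, 31, 23, 15, 7, 62, 54, 46, 38, 30, 22,
       14, 6, 61, 53, 45, 37, 29, 21, 13, 5, 28, 20, 12, 4]
PC2 = [14, 17, 11, 24, 1, 5, 3, 28, 15, 6, 21, 10,
       23, 19, 12, 4, 26, 8, 16, 7, 27, 20, 13, 2,
       41, 52, 31, 37, 47, 55, 30, 40, 51, 45, 33, 48,
       44, 49, 39, 56, 34, 53, 46, 42, 50, 36, 29, 32]
LEFT_SHIFTS = [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]


def permute(value, table, in_width):
    """Build an output word whose bits are the input bits listed in `table`."""
    out = 0
    for pos in table:
        out = (out << 1) | ((value >> (in_width - pos)) & 1)
    return out


def rotl28(x, n):
    """Rotate a 28-bit register left by n positions."""
    return ((x << n) | (x >> (28 - n))) & 0xFFFFFFF


def subkeys(key64):
    """Return the 16 48-bit subkeys for a 64-bit key (parity bits ignored)."""
    cd = permute(key64, PC1, 64)            # 64 -> 56 bits, drops parity
    c, d = cd >> 28, cd & 0xFFFFFFF         # two 28-bit halves C and D
    keys = []
    for shift in LEFT_SHIFTS:
        c, d = rotl28(c, shift), rotl28(d, shift)
        keys.append(permute((c << 28) | d, PC2, 56))  # 56 -> 48 bits
    return keys


for n, k in enumerate(subkeys(0x133457799BBCDFF1), start=1):
    print(f"K{n:<2} = {k:012X}")
```

The shift amounts sum to 28, so C and D return to their initial values after the sixteenth round; this is why a hardware subkey generator can keep the key resident between blocks.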
III. DES Implementation in the Delphi™ Language
When the decision was taken to implement the DES algorithm in a high-level language such
as Delphi, there were two major objectives. First, it was necessary to test
the implementation of the algorithm, obtaining intermediate results that could be used for
comparison in a later VHDL bit-slice implementation. Second, such an implementation
can be used in software applications for which the DES algorithm is adequate,
possibly in the form of a DLL or a Delphi component.
Figures 4, 5 and 6 illustrate part of the Delphi™ code used to implement the DES algorithm.




















SUB_SK1 := SK1;
SUB_SK2 := SK2;





EX_HIGH_S := (EX_HIGH shl 8) or ((EX_LOW and $FF000000) shr 24);
EX_LOW_S := (EX_LOW and $00FFFFFF);
Result_P := P(S(SUB_SK1 xor EX_HIGH_S, SUB_SK2 xor EX_LOW_S));










LoadInterface(L, R, EX_High, EX_Low, state);
end;
end;
Fig. 4 - Main procedure of the DES algorithm
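The regrouping step visible in Fig. 4 can be illustrated in isolation. The Python sketch below mirrors the three surviving Delphi statements under one assumption: that the 48 expanded bits arrive as a 16-bit upper word (`EX_HIGH`) and a 32-bit lower word (`EX_LOW`), which is what the masks and shifts suggest. The function name `regroup_48` is illustrative. The two words are regrouped into 24-bit halves so that each can be XORed with a 24-bit subkey half.

```python
def regroup_48(ex_high16, ex_low32):
    """Regroup a 48-bit value held as 16 + 32 bits into 24 + 24 bits."""
    # Upper half: the 16 high bits followed by the top byte of the low word.
    ex_high_s = ((ex_high16 << 8) | ((ex_low32 & 0xFF000000) >> 24)) & 0xFFFFFF
    # Lower half: the remaining 24 bits of the low word.
    ex_low_s = ex_low32 & 0x00FFFFFF
    return ex_high_s, ex_low_s


# Example values: a 48-bit expansion result and a 48-bit subkey split in two.
ex_high_s, ex_low_s = regroup_48(0x7A15, 0x557A1555)
sub_sk1, sub_sk2 = 0x1B02EF, 0xFC7072
s_box_input = ((sub_sk1 ^ ex_high_s) << 24) | (sub_sk2 ^ ex_low_s)
print(f"{s_box_input:012X}")
```

Splitting the 48-bit quantity into two 24-bit words keeps every intermediate value within a 32-bit `cardinal`, which is presumably why the Delphi code is organized this way.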
LeftShifts: array[1..16] of byte = (1,1,2,2,2,2,2,2,1,2,2,2,2,2,2,1);
{ K generates subkey n, where the 64-bit key is represented as:
    - sk1 (32 bits, MSB);   OUTPUT: - sk1 (24 bits, MSB)
    - sk2 (32 bits, LSB);           - sk2 (24 bits, LSB)
}
procedure K(n: byte; var sk1, sk2: cardinal);
implementation
uses math, uDef;







for k := 1 to 2 do // Reading vector
  for i := 1 to 28 do
  begin
    if k = 1 then
      c := c or (read_bit_word64(sk1, sk2, PC1[k,i]) * (1 shl (i-1)))
    else
      d := d or (read_bit_word64(sk1, sk2, PC1[k,i]) * (1 shl (i-1)));
  end;
for i := 1 to n do
begin
  for j := 1 to LeftShifts[i] do
  begin
    C := ((read_bit_word32(C,1) shl 27) or (C shr 1)) and $FFFFFFF;






for k := 1 to 2 do // Reading vector
  for i := 1 to 24 do
  begin
    if k = 1 then
      sk2 := sk2 or (read_bit_word56(d, c, PC2[k,i]) * (1 shl (i-1)))
    else




Fig. 5 - Procedure for subkey calculation
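The rotation statement in Fig. 5 reads bit 1 of C, places it at bit position 27 and shifts the rest down by one. This only rotates the 28-bit register if bit 1 denotes the least significant bit. The Python sketch below makes that reading explicit. It is an interpretation, not the authors' helper: the body of `read_bit_word32` is not shown in the paper, so the version here is a hypothetical reconstruction, and `rotate28_by_one` is an illustrative name.

```python
def read_bit_word32(word, pos):
    """Return bit `pos` of a word, counting from 1 at the least significant bit.

    Assumed convention; the paper does not show this helper's body.
    """
    return (word >> (pos - 1)) & 1


def rotate28_by_one(c):
    """Mirror of the Fig. 5 expression for a 28-bit register."""
    return ((read_bit_word32(c, 1) << 27) | (c >> 1)) & 0xFFFFFFF


c = 0b1
c = rotate28_by_one(c)
print(f"{c:07X}")  # the single set bit has wrapped to the top of the register
```

Under this convention, with DES bit 1 stored in the least significant position, the shift-right expression corresponds to the left rotation that the standard specifies for C and D.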






for i:=1 to 64 do
begin






procedure expand(R: LongWord; var EX_High, EX_Low: LongWord);
var i, value: LongWord;
begin
  EX_High := 0; EX_Low := 0;
  for i := 1 to 48 do
  begin
    { Copy the bit of R selected by the expansion table into output bit i }
    value := read_bit_word32(R, array_expansao[i]);
    write_bit_word64(EX_High, EX_Low, i, value);
  end;
end;






for i := 1 to 64 do
begin
  value := read_bit_word64(L, R, array_final_permutation[i]);
  write_bit_word64(Out_High, Out_Low, i, value);
end;
end;
Fig. 6 - Some additional procedures: expansion, initial and final permutations
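The three procedures in Fig. 6 are all table-driven bit selections, so one helper can express them. The Python sketch below is an illustration, not the paper's Delphi code. It uses the E and IP tables of FIPS 46 [3], with bit 1 taken as the most significant bit; this may differ from the bit numbering inside the authors' helpers. The names `permute` and `invert` are illustrative, and the final permutation is derived by inverting IP instead of being typed in.

```python
E = [32, 1, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9,
     8, 9, 10, 11, 12, 13, 12, 13, 14, 15, 16, 17,
     16, 17, 18, 19, 20, 21, 20, 21, 22, 23, 24, 25,
     24, 25, 26, 27, 28, 29, 28, 29, 30, 31, 32, 1]
IP = [58, 50, 42, 34, 26, 18, 10, 2, 60, 52, 44, 36, 28, 20, 12, 4,
      62, 54, 46, 38, 30, 22, 14, 6, 64, 56, 48, 40, 32, 24, 16, 8,
      57, 49, 41, 33, 25, 17, 9, 1, 59, 51, 43, 35, 27, 19, 11, 3,
      61, 53, 45, 37, 29, 21, 13, 5, 63, 55, 47, 39, 31, 23, 15, 7]


def permute(value, table, in_width):
    """Output bit i is input bit table[i]; bit 1 is the most significant."""
    out = 0
    for pos in table:
        out = (out << 1) | ((value >> (in_width - pos)) & 1)
    return out


def invert(table):
    """Return the table of the inverse permutation."""
    inv = [0] * len(table)
    for out_pos, in_pos in enumerate(table, start=1):
        inv[in_pos - 1] = out_pos
    return inv


IP_INV = invert(IP)  # the final (inverse initial) permutation

block = 0x0123456789ABCDEF
after_ip = permute(block, IP, 64)
left, right = after_ip >> 32, after_ip & 0xFFFFFFFF
print(f"L0 = {left:08X}, R0 = {right:08X}")
print(f"E(R0) = {permute(right, E, 32):012X}")
```

Intermediate values such as L0, R0 and E(R0) are the kind of quantities the paper reports using for debugging and for later comparison with the VHDL description.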
Fig. 7 - Intermediate results (Li, Ri and Ei) for one combination of input data and key.
IV. A Bit-Slice Architecture for the Implementation of the DES Algorithm
Despite the enormous increase in silicon density in recent years, a 64-bit wide
implementation of the DES algorithm still takes a considerable silicon area. However,
there are many applications where data encryption is just one small part of a much larger
system. In these cases, silicon area is a precious resource and has to be used judiciously.
This section discusses a bit-slice (or nibble) implementation of the DES algorithm.
In the bit-sliced architecture illustrated in Fig. 8, each byte of the data is loaded into the
L and R registers, one nibble in L and one nibble in R. As soon as the 64-bit data (eight
4-bit registers in each of L and R) is loaded into L & R, and the 64-bit key into the subkey
generator, encryption (decryption) begins by operating on one nibble (4 bits) at a time in a
nibble-serial manner [9,10]; that is, the registers operate as a cyclic register chain. The key
is loaded only once, and then remains in the subkey generator throughout the process. The
encrypted data appears at the outputs of the L & R registers after every 186 cycles. It is
worth noting that the loading of the data input and the unloading of the finished output can
be pipelined if a separate data bus is provided (178 cycles are required in this case). If the
same data bus is used to load both data and key, the key must be loaded first. Additionally,
as the data is manipulated in two blocks of 32 bits, each operation involving 32 bits requires
eight cycles (8 x 4 bits). The overall operation of the architecture is:
1. Load eight bytes of main key into the subkey generator.
2. Load eight bytes of data into the L & R registers.
3. Allow 2 wait states to line up nibbles inside L & R.
4. Repeat 16 times:
   - 2 cycles of invalid E-bits,
   - 8 cycles to generate 8 sets of E-bits (each 6 bits wide).
   This allows the permuter to generate the correct
   P-bits and hence perform 15 stages of f(Rn, Kn).
   Thus, L0R0 → L15R15.
5. Eight additional cycles to generate the next block of E-bits and take L15R15 to L16R16.
6. Eight cycles to unload the encrypted/decrypted data.
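Summing the steps above reproduces the cycle totals quoted in this section. The short Python tally below assumes one byte is transferred per cycle during loading and unloading, which is implied by "eight bytes" taking "eight cycles" but is not stated outright. The constant and function names are illustrative.

```python
KEY_LOAD = 8                   # step 1: eight bytes of key
DATA_LOAD = 8                  # step 2: eight bytes of data
WAIT_STATES = 2                # step 3
ITERATIONS = 16                # step 4 is repeated 16 times
CYCLES_PER_ITERATION = 2 + 8   # 2 invalid cycles + 8 sets of E-bits
FINAL_E_BITS = 8               # step 5
UNLOAD = 8                     # step 6


def total_cycles(include_key_load, overlap_load_and_unload):
    """Cycles needed to process one 64-bit block."""
    total = (DATA_LOAD + WAIT_STATES
             + ITERATIONS * CYCLES_PER_ITERATION
             + FINAL_E_BITS + UNLOAD)
    if include_key_load:
        total += KEY_LOAD
    if overlap_load_and_unload:
        # With a second data bus, unloading overlaps the next block's load.
        total -= UNLOAD
    return total


print(total_cycles(True, False))    # key + data + processing
print(total_cycles(False, False))   # key already resident
print(total_cycles(False, True))    # key resident, load/unload pipelined
```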
During the first subkey cycle of step (4) above, the first ten cycles do not operate on the
L & R registers, as the permuter has not yet been loaded. In subsequent cycles, however, each
iteration of step (4) takes LnRn to Ln+1Rn+1.
At step (6), the next eight bytes of data can be loaded simultaneously if there are two data
buses, in which case proceed to step (3). If not, continue from step (2).
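The nibble-serial operation of the cyclic register chain can be modelled in software. The Python sketch below is a behavioural illustration only, not a description of the actual datapath in Fig. 8: it performs a 32-bit XOR four bits at a time, recirculating each nibble through a chain of eight 4-bit registers. The nibble ordering (most significant first) and the function names are assumptions.

```python
from collections import deque


def to_nibbles(word32):
    """Split a 32-bit word into eight 4-bit registers, most significant first."""
    return deque((word32 >> shift) & 0xF for shift in range(28, -1, -4))


def nibble_serial_xor(a32, b32):
    """XOR two 32-bit words one nibble per cycle; return (result, cycles)."""
    reg_a, reg_b = to_nibbles(a32), to_nibbles(b32)
    cycles = 0
    for _ in range(len(reg_a)):
        a, b = reg_a.popleft(), reg_b.popleft()
        reg_a.append(a ^ b)  # the result recirculates into the chain
        reg_b.append(b)      # the operand recirculates unchanged
        cycles += 1
    result = 0
    for nibble in reg_a:
        result = (result << 4) | nibble
    return result, cycles


value, cycles = nibble_serial_xor(0xCC00CCFF, 0x234AA9BB)
print(f"{value:08X} after {cycles} cycles")
```

The trade-off is the one the section describes: a 4-bit datapath needs roughly one eighth of the XOR and routing hardware of a 32-bit one, at the cost of eight cycles per 32-bit operation.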
Loading the key and the data and operating on them to produce eight bytes of
encrypted/decrypted output requires 194 clock cycles. However, the time required to











Fig. 8 - Bit-sliced architecture for DES implementation
V. Conclusions
This paper described a high-level language implementation of the Data Encryption
Standard (DES) in the Delphi™ language. This implementation had two objectives: first,
testing the whole algorithm prior to a VHDL description for future synthesis, and second,
making DES available to other applications requiring a software implementation (possibly
in the form of a DLL or a Delphi component).
This paper also focused on the design description of a bit-slice architecture (using 4-bit
nibbles) that can be used when there are silicon area constraints. This is the case in many
situations where the ciphering/deciphering of data is just one small part of a much larger
problem and silicon area has to be used judiciously. The designer has to balance silicon




[1] W. Diffie and M. E. Hellman, "Privacy and authentication: An introduction to
cryptography," Proc. IEEE, vol. 67, no. 3, pp. 397-427, Mar. 1979.
[2] B. Beckett, "Introduction to Cryptology and PC Security," McGraw-Hill Book Co.,
2000.
[3] Data Encryption Standard, Federal Information Processing Standard (FIPS) 46, Nat.
Bur. Stand., Jan. 1977.
[4] DES Modes of Operation, Federal Information Processing Standard (FIPS) 81, Nat.
Bur. Stand., Dec. 1980.
[5] M. Davio, Y. Desmedt, J. Goubert, F. Hoornaert and J. J. Quisquater, "Efficient
hardware and software implementations of the DES," in Advances in Cryptology, Proc.
Crypto 84, Aug. 1984.
[6] I. Verbauwhede, F. Hoornaert, J. Vandewalle and H. J. De Man, "Security and
Performance Optimization of a New DES Data Encryption Chip," IEEE Journal of
Solid-State Circuits, vol. 23, no. 3, June 1988.
[7] D. MacMillan, "Single chip encrypts data at 14 Mb/s," Electronics, vol. 54, pp. 161-
165, June 16, 1981.
[8] R. C. Fairfield, A. Matusevich, and J. Plany, "An LSI digital encryption processor
(DEP)," IEEE Commun. Mag., vol. 23, no. 7, pp. 30-41, July 1985.
[9] C. Mistry and E. J. Zaluska, "A VLSI DES Implementation: Subkey Generation,"
M.Sc. thesis, Dept. of Electronics and Computer Science, Univ. of Southampton, UK,
1987.
[10] H. S. Gill and E. J. Zaluska, "Part Implementation of a Data Encryption Standard
Chip Set," M.Sc. thesis, Dept. of Electronics and Computer Science, Univ. of
Southampton, UK, 1987.
