OPTIMIZATION OF TWOFISH ENCRYPTION ALGORITHM ON FPGA
by DORE RAJA, ANAND A RAJA
CERTIFICATION OF APPROVAL 




by
ANAND A RAJA A/L DORE RAJA
A project dissertation submitted to the
Electrical & Electronics Engineering Programme
Universiti Teknologi PETRONAS
In partial fulfillment of the requirement for the
BACHELOR OF ENGINEERING (Hons)
(ELECTRICAL & ELECTRONICS ENGINEERING)

Fawnizu Azmadi Husain
Lecturer
Electrical & Electronics Engineering
New Academic Block No 22
Universiti Teknologi PETRONAS
31750 Tronoh
Perak Darul Ridzuan, MALAYSIA.

UNIVERSITI TEKNOLOGI PETRONAS
TRONOH, PERAK
DECEMBER 2004
CERTIFICATION OF ORIGINALITY 
This is to certify that I am responsible for the work submitted in this project, that the 
original work is my own except as specified in the references and acknowledgements and 
that the original work contained herein have not been undertaken or done by unspecified 




ABSTRACT
The demand for efficient and secure ciphers has given rise to a new generation of block ciphers
capable of providing increased protection at lower cost. Among these new algorithms is Twofish,
a promising 128-bit block cipher that was one of the five finalists in the competition organized by
the National Institute of Standards and Technology (NIST) to select the Advanced Encryption
Standard (AES). The aim of the competition was to find a suitable candidate to replace DES at
the core of many encryption systems worldwide.
Twofish supports variable key lengths of 128, 192 or 256 bits. This report discusses only the
128-bit key version. Twofish has six main building blocks: Feistel networks, whitening, S-boxes,
MDS matrices, Pseudo-Hadamard Transforms and the key schedule. Twofish is a 16-round Feistel
network with a bijective F function, which corresponds to 8 cycles. The whitening technique
employed substantially increases the difficulty of key-search attacks against the remainder of
the cipher. Twofish uses four different, bijective, key-dependent, 8-by-8-bit S-boxes. It uses a
single 4-by-4 MDS matrix over GF(2^8), which is one of the two main diffusion elements of
Twofish. (There is also a Reed-Solomon code with the MDS property used in the key schedule;
this does not add diffusion to the cipher but does add diffusion to the key schedule.) In
addition, Twofish uses a 32-bit Pseudo-Hadamard Transform to mix the outputs of its two parallel
32-bit g functions. Finally, Twofish needs a lot of key material and has a complicated key
schedule. To facilitate analysis, the key schedule uses the same primitives as the round
function. Except for two additional rotations, each pair of expanded key words is constructed by
applying the Twofish round function (with key-dependent S-boxes) to a fixed input.
For this project, two different optimized designs were implemented. The first design
(Design 1) uses minimum hardware resources, with a single modified F function, and was
optimized for reasonable latency, throughput and throughput per gate. The second design
(Design 2) uses four units of the modified F function of Design 1, keeping hardware resource
usage reasonably low while achieving very small latency, very high throughput and very high
throughput per gate. Both Design 1 and Design 2 were implemented with zero keying and function
as encryptor/decryptor. Both were written in VHDL, simulated using ALDEC, synthesized using
the Xilinx synthesis tools, implemented using the Xilinx ISE 6.2i implementation tools and
downloaded onto the Spartan 2 FPGA board using the BEDLOAD utility program.
In conclusion, this Final Year Project was successful: all of its objectives were met.
ACKNOWLEDGEMENT 
My biggest thanks go to my Final Year Supervisor, Mr. Noohul Basheer, who found time
and energy in his busy schedule to guide me towards completing this project successfully.
I thank my co-supervisor, Mr. Fawnizu, for giving me valuable advice to improve my
design.
I also thank all the UTP technicians who lent their help while I was completing
this project.
Special thanks are due to the following individuals, who gave valuable technical
advice on hardware description languages and hardware programming that helped me to
complete my project successfully.
• Mr. Ettikan Karuppiah Kandasamy
• Mr. Avinansh Ravindranathan
• Miss Vincci Lee Hoong Ming 





Finally, I am grateful to UTP for providing necessary equipment and facilities that helped 
in the completion of my project. 
TABLE OF CONTENTS
CERTIFICATION OF APPROVAL
CERTIFICATION OF ORIGINALITY
ABSTRACT
ACKNOWLEDGEMENT
1. CHAPTER 1: INTRODUCTION
1.1 BACKGROUND STUDY
1.2 PROBLEM STATEMENT
1.3 OBJECTIVES
1.4 SCOPE OF STUDY
2. CHAPTER 2: LITERATURE REVIEW AND THEORY
2.1 INTRODUCTION
2.2 TWOFISH DESIGN GOALS
2.3 TWOFISH BUILDING BLOCKS
2.3.1 Feistel Networks
2.3.2 S-Boxes
2.3.3 MDS Matrices
2.3.4 Pseudo-Hadamard Transforms
2.3.5 Whitening
2.3.6 Key Schedule
2.4 TWOFISH ENCRYPTION ALGORITHM - A DETAILED DESCRIPTION
2.4.1 The Function F
2.4.2 The Function g
2.4.3 The Key Schedule
2.4.3.1 Additional Key Lengths
2.4.3.2 The Function h
2.4.3.3 Key-dependent S-boxes
2.4.3.4 The Expanded Key Words Kj
2.4.3.5 The Permutations q0 and q1
2.4.4 Round Function Overview
3. CHAPTER 3: METHODOLOGY
3.1 PROCEDURE IDENTIFICATION
3.2 TOOLS
4. CHAPTER 4: PROJECT WORK
4.1 DESIGN IMPLEMENTATION
4.1.1 Design Decisions
4.1.1.1 Building Blocks
4.1.1.1.1 Q-Permutations
4.1.1.1.2 S-Boxes
4.1.1.1.3 Maximum Distance Separable Matrix
4.1.1.1.4 Reed-Solomon Matrix
4.1.1.1.5 Operation Selector
4.1.2 Integration and Overall Structure
4.1.2.1 Input Register Module
4.1.2.2 Output Register Module
4.1.2.3 Key Register Module
4.1.2.4 Modified F-Function
4.1.2.5 Design Overall Structure
4.1.2.6 Controller
4.2 PROGRAMMING STRATEGY
4.2.1 TWOFISHCORE
4.2.2 FULL ADDER
4.2.3 CIPHERTEXT
4.2.4 CLEARTEXT/PLAINTEXT
4.2.5 CONTROLLER
4.2.6 KEYMODULE
4.2.7 MODIFIED F
4.2.8 OPSELECT
4.2.9 32 BIT REGISTER
4.2.10 WRAPPER
5. CHAPTER 5: RESULTS & DISCUSSION
5.1 SIMULATION
5.2 TEST VECTOR 1 - DESIGN 1
5.2.1 Encryption
5.2.1.1 Load Key
5.2.1.2 Start Encryption
5.2.2 Decryption
5.2.2.1 Load Key
5.2.2.2 Start of Decryption
5.3 TEST VECTOR 2 - DESIGN 1
5.3.1 Encryption
5.3.2 Decryption
5.4 TEST VECTOR 1 - DESIGN 2
5.4.1 Encryption
5.4.1.1 Load Key
5.4.1.2 Start Encryption
5.4.2 Decryption
5.4.2.1 Load Key
5.4.2.2 Start of Decryption
5.5 TEST VECTOR 2 - DESIGN 2
5.5.1 Encryption
5.5.2 Decryption
5.6 DESIGN IMPLEMENTATION
5.7 GENERATE PROGRAMMING FILE & DOWNLOADING
6. CHAPTER 6: DISCUSSION
6.1 ORIGINAL DESIGN BY THE AUTHOR
6.2 DESIGN 1
6.3 DESIGN 2
6.4 PERFORMANCE DIFFERENCE BETWEEN DESIGN 1 AND DESIGN 2 ON SPARTAN 2 - XC2S200-5PQ208
6.5 PERFORMANCE OF DESIGN 1 AND DESIGN 2 WHEN IMPLEMENTED ON SPARTAN 3 - XC3S400-5FG456
6.6 PERFORMANCE COMPARISON OF DESIGN 1 WHEN IMPLEMENTED ON SPARTAN 2 - XC2S200-5PQ208 &
SPARTAN 3 - XC3S400-5FG456
6.7 PERFORMANCE COMPARISON OF DESIGN 2 WHEN IMPLEMENTED ON SPARTAN 2 - XC2S200-5PQ208 &
SPARTAN 3 - XC3S400-5FG456
6.8 FACTORS CAUSING DIFFERENCES BETWEEN ONE IMPLEMENTATION AND ANOTHER IMPLEMENTATION OF
FPGA
6.9 PERFORMANCE COMPARISON
7. CHAPTER 7: CONCLUSION AND RECOMMENDATION
7.1 CONCLUSION
7.2 RECOMMENDATION
8. CHAPTER 8: REFERENCE
9. APPENDIX A: MDS MATRIX
10. APPENDIX B: REED SOLOMON MATRIX
11. APPENDIX C: SOURCE CODE FOR DESIGN 2
LIST OF FIGURES
Figure 1: Schematic Diagram of Twofish Encryption Algorithm
Figure 2: Function h
Figure 3: A View of a Single Round F Function (128-bit Key)
Figure 4: Project Development Block
Figure 5: Q-Permutation
Figure 6: S-boxes
Figure 7: MDS
Figure 8: Reed-Solomon Matrix
Figure 9: Operation Selector
Figure 10: Building Block for Operation Selection
Figure 11: Input Register Module for Design 1
Figure 12: Output Register Module for Design 1
Figure 13: Output Register Module for Design 2
Figure 14: Key Register Module
Figure 15: Generalized F-Function of Design 1
Figure 16: Generalized F-Function of Design 2
Figure 17: Structure of the Cipher of Design 1
Figure 18: Structure of the Cipher of Design 2
Figure 19: The Controller of Design 1
Figure 20: The Controller of Design 2
Figure 21: Hierarchy of the Design Flow
Figure 22: Block Diagram of the Twofish Core - with I/O pins
Figure 23: Block Diagram of the Full Adder
Figure 24: Block Diagram of the Ciphertext of Design 1
Figure 25: Block Diagram of the Ciphertext of Design 2
Figure 26: Block Diagram of the Cleartext/Plaintext
Figure 27: Finite State Machine of the Design 1
Figure 28: Overall Design with some of the Control Signals for Design 1
Figure 29: Block Diagram of the Controller for Design 1
Figure 30: Finite State Machine of the Design 2
Figure 31: Overall Design with some of the Control Signals for Design 2
Figure 32: Block Diagram of the Controller for Design 2
Figure 33: Block Diagram of the Keymodule
Figure 34: Block Diagram of the Modified_F of Design 1
Figure 35: Block Diagram of Evenkeygenerator
Figure 36: Block Diagram of Oddkeygenerator
Figure 37: Block Diagram of Evenplaintextgenerator
Figure 38: Block Diagram of Oddplaintextgenerator
Figure 39: Block Diagram of the Opselect
Figure 40: Block Diagram of the 32-bit Register
Figure 41: Wrapper for Both Design 1 and Design 2
Figure 42: Idle - Test Vector 1
Figure 43: Load Key - Test Vector 1
Figure 44: Start Encryption - Test Vector 1
Figure 45: End of Encryption - Test Vector 1
Figure 46: Initialization of Decryption - Test Vector 1
Figure 47: Decryption - Loading Key - Test Vector 1
Figure 48: Start of Decryption - Test Vector
Figure 49: End of Decryption - Test Vector
Figure 50: Initialization of Input Values - Test Vector 2
Figure 51: End of Encryption - Test Vector 2
Figure 52: Initialization of Input Values - Test Vector 2
Figure 53: End of Decryption - Test Vector 2
Figure 54: Idle - Test Vector 1
Figure 55: Load Key - Test Vector 1
Figure 56: Start of Encryption - Test Vector 1
Figure 57: End of Encryption - Test Vector 1
Figure 58: Initialization of Decryption - Test Vector 1
Figure 59: Decryption - Loading Key - Test Vector 1
Figure 60: Start of Decryption - Test Vector
Figure 61: End of Decryption - Test Vector
Figure 62: Initialization of Input Values - Test Vector 2
Figure 63: End of Encryption - Test Vector 2
Figure 64: Initialization of Input Values - Test Vector 2
Figure 65: End of Decryption - Test Vector 2
Figure 66: B3-SPARTAN2+ Board
Figure 67: A View of a Complete Single Round F-Function of the Original Design as
Proposed by the Author
Figure 68: Generalized Modified F-Function
Figure 69: A View of a Complete Single Round F-Function of Design 2
Figure 70: Generalized F-Function of Design 2
LIST OF TABLES
Table 1: Twofish Core
Table 2: Full Adder
Table 3: Ciphertext - Design 1
Table 4: Ciphertext - Design 2
Table 5: Cleartext/Plaintext
Table 6: Controller - Design 1
Table 7: Generated Control Signals for the Corresponding States for Design 1
Table 8: Controller - Design 2
Table 9: Keymodule
Table 10: Modified_F - Design 1
Table 11: Evenkeygenerator
Table 12: Oddkeygenerator
Table 13: Evenplaintextgenerator
Table 14: Oddplaintextgenerator
Table 15: Opselect
Table 16: 32-bit Register
Table 17: RAM for Storing Data Blocks
Table 18: Wrapper
Table 19: Pin Assignment to the Corresponding I/O
Table 20: Generated Sequence of Values Based on Design 1
Table 21: Generated Sequence of Values Based on Design 1
Table 22: Output Performance of Design 1 on Spartan 2 - XC2S200-5PQ208
Table 23: Generated Sequence of Values Based on Design 2
Table 24: Output Performance of Design 2 on Spartan 2 - XC2S200-5PQ208
Table 25: Performance Difference of Design 1 and Design 2 on Spartan 2 - XC2S200-5PQ208
Table 26: Performance Difference of Design 1 and Design 2 on Spartan 3 - XC3S400-5FG456
Table 27: Performance Difference of Design 1 on Spartan 2 - XC2S200-5PQ208 &
Spartan 3 - XC3S400-5FG456
Table 28: Performance Difference of Design 2 on Spartan 2 - XC2S200-5PQ208 &
Spartan 3 - XC3S400-5FG456
Table 29: Performance Comparison With Other Implementations
CHAPTER 1: INTRODUCTION 
1.1 BACKGROUND STUDY 
With the explosive growth in computer systems and their interconnections via
networks, both organizations and individuals have become increasingly dependent on the
information stored and communicated using these systems. In these networking
environments, however, there is no guarantee that information
(databases, video programs, telecommunications, etc.) is safe from unauthorized
access, because the transmission medium is open: anyone with the
appropriate protocol analyzer can eavesdrop. This in turn has led to a
heightened awareness of the need to protect data and resources from disclosure, to
guarantee the authenticity of data and messages, and to protect systems from
network-based attacks. Many of the cryptographic algorithms that have been developed are
used in software implementations on computers (e.g. to protect
coded user passwords). For low-complexity applications, such as the
protection of information in files and databases, this is probably the most economical
solution. However, a number of applications require such high throughput for the
encryption/decryption process that they cannot be executed on a normal
general-purpose microprocessor. These applications require dedicated ASIC or FPGA
implementations. In the past, many VLSI implementations of block ciphers such as
DES, IDEA and SAFER have been proposed. In this report, I present the
implementation of the Twofish encryption algorithm on an FPGA. This architecture can be
used in high-speed networking.
1.2 PROBLEM STATEMENT 
The large and growing number of Internet and wireless communication users has
led to an increasing demand for security measures and devices to protect user
data transmitted over open channels. Two kinds of cryptographic systems can be
used for that purpose: symmetric-key and asymmetric-key cryptosystems.
Symmetric-key cryptography (such as DES, AES, Blowfish and Twofish) uses an
identical key at the sender and receiver, both to encrypt the message text and to
decrypt the ciphertext. Asymmetric-key cryptography (such as RSA) uses
different keys for encryption and decryption, eliminating the key-transport
dilemma. Because of its high speed, symmetric-key cryptography is more suitable
for encrypting large amounts of data. For almost 30 years, DES (Data
Encryption Standard) has been the standard encryption algorithm [1]. Despite
its popularity, the DES key length is too short for acceptable commercial security. The
64-bit block length shared by DES and most well-known ciphers opens them up to attacks
when large amounts of data are encrypted under the same key. NIST (the National Institute
of Standards and Technology) specified that new-generation block ciphers
should have a longer key length, larger block size, faster speed and greater flexibility.
The Twofish encryption algorithm was one of the AES finalists. It is a 128-bit
symmetric block cipher with variable key lengths of 128, 192 and 256 bits.
Furthermore, it has no weak keys. For this project, I use the Twofish encryption
algorithm as the implementation algorithm.
Twofish has many qualities that make it interesting for this project. It was
designed with hardware implementation in mind and can thus be mapped efficiently
to hardware devices such as FPGAs and smart cards. It also offers different
trade-offs between space and speed and can be pipelined. The estimated gate count
ranges from 14,000 to 80,000. This project looks at the implementation of
the Twofish encryption algorithm on an FPGA.
1.3 OBJECTIVES 
The objectives of this project are as follows: 
• To understand the Twofish encryption algorithm.
• To understand the mathematics that makes up the algorithm, especially Galois
field arithmetic and permutations.
• To understand how to translate the Twofish encryption algorithm into a
hardware module.
• To implement the Twofish encryption algorithm on an FPGA.
• To optimize the design of the algorithm on the FPGA.
• To use VHDL in coding the algorithm for the FPGA.
• To understand the nature of the FPGA.
1.4 SCOPE OF STUDY 
The scope of study covers the implementation of the Twofish encryption
algorithm and its building blocks, namely the Feistel network, whitening,
S-boxes, MDS matrices, PHT and key schedule. This is followed by exploring and
then implementing an optimized design on the FPGA. The design is
coded in VHDL.


















CHAPTER 2: LITERATURE REVIEW AND THEORY
Figure 1: Schematic Diagram of Twofish Encryption Algorithm
2.1 INTRODUCTION 
DES, the Data Encryption Standard, is one of the most widely used encryption
algorithms in the world. It originated between 1972 and 1974, when the National
Bureau of Standards (NBS) issued a public request for an encryption standard. Even though
it was a very successful and popular algorithm, it was plagued with controversy,
because many cryptanalysts objected to the closed-door manner in which DES
was developed. They feared that NBS had embedded a
backdoor in the algorithm. In short, the development was not transparent. The debate
about whether the DES key is too short for acceptable commercial security has been
going on for many years. Recent advances in distributed key search techniques have
left no doubt that its key is simply too short for today's security
applications, especially with the advent of powerful computers. Triple-DES, which
applies DES three times in succession, has emerged as an interim solution in many
high-security applications, such as banking. The disadvantage of Triple-DES is that it is
slow. Another inherent problem with DES is that it can
be broken if large amounts of data have been encrypted using the same key. In addition,
in 1998 a purpose-built machine broke DES by brute force. With the
growing request from the academic and industrial world to replace DES, the
National Institute of Standards and Technology (NIST) announced a new program
called the Advanced Encryption Standard (AES) program in 1997. Comments
and feedback from the public were collected on the proposed standard
before NIST eventually issued a call for algorithms to satisfy the standard. The
intention was for NIST to make all submissions public and eventually, through a process
of public review and comment, choose a new encryption standard to replace DES.
NIST's call requested a block cipher. Block ciphers can be used to design stream
ciphers with a variety of synchronization and error extension properties, one-way hash
functions, message authentication codes, and pseudo-random number generators.
Because of this flexibility, they are the workhorse of modern cryptography. NIST
specified several other design criteria: a longer key length, larger block size, faster
speed, and greater flexibility. While no single algorithm can be optimized for all needs,
NIST intends AES to become the standard symmetric algorithm of the next decade.
Twofish is an encryption algorithm that was jointly created by Bruce Schneier,
John Kelsey, Doug Whiting, David Wagner, Chris Hall and Niels Ferguson and
submitted to the AES selection process. It meets all the required NIST
criteria (128-bit block; 128-, 192- and 256-bit keys; efficiency on various platforms; etc.)
as well as some strenuous design requirements, both in performance and in cryptographic
strength. Twofish can:
• Encrypt data at 285 clock cycles per block on a Pentium Pro, after a 
12700 clock-cycle key setup. 
• Encrypt data at 860 clock cycles per block on a Pentium Pro, after a 1250 
clock-cycle key setup. 
• Encrypt data at 26500 clock cycles per block on a 6805 smart card, after a 
1750 clock-cycle key setup. 
2.2 TWOFISH DESIGN GOALS
The Twofish encryption algorithm was designed carefully to meet all the criteria set by
NIST [2, 9]. The major design criteria specified by NIST are as follows:
• A 128-bit symmetric block cipher. 
• Key lengths of 128 bits, 192 bits, and 256 bits. 
• No weak keys. 
• Highly efficient on both the Intel Pentium Pro and other software and 
hardware platforms. 
• Flexible design: e.g., accept additional key lengths; be implementable on a 
wide variety of platforms and applications; and be suitable for a stream 
cipher, hash function, and MAC. 
• Simple design, both to facilitate ease of analysis and ease of 
implementation. 
Additionally, the authors imposed the following performance criteria on their design
[2]:
• Accept any key length up to 256 bits.
• Encrypt data in less than 500 clock cycles per block on an Intel Pentium, 
Pentium Pro, and Pentium II, for a fully optimized version of the 
algorithm. 
• Shall be capable of setting up a 128-bit key (for optimal encryption speed) 
in less than the time required to encrypt 32 blocks on a Pentium, Pentium 
Pro, and Pentium II. 
• Encrypt data in less than 5000 clock cycles per block on a Pentium, 
Pentium Pro, and Pentium II with no key setup time. 
• Shall not contain any operations that make it inefficient on other 32-bit 
microprocessors. 
• Shall not contain any operations that make it inefficient on 8-bit and 16-
bit microprocessors. 
• Shall not contain any operations that reduce its efficiency on proposed 64-
bit microprocessors. 
• Shall not include any elements that make it inefficient in hardware. 
• Have a variety of performance tradeoffs with respect to the key schedule. 
• Encrypt data in less than 10 milliseconds on a commodity 8-bit
microprocessor.
• Shall be implementable on an 8-bit microprocessor with only 64 bytes of 
RAM. 
• Be implementable in hardware using less than 20,000 gates. 
Besides that, the authors' cryptographic goals were as follows:
• 16-round Twofish (without whitening) should have no chosen-plaintext
attack requiring fewer than 2^80 chosen plaintexts and less than 2^N time,
where N is the key length.
• 12-round Twofish (without whitening) should have no related-key attack
requiring fewer than 2^64 chosen plaintexts and less than 2^(N/2) time, where
N is the key length.
Finally, the authors imposed the following flexibility goals:
• Should have variants with a variable number of rounds.
• Should have a key schedule that can be precomputed for maximum speed,
or computed on-the-fly for maximum agility and minimum memory
requirements. Additionally, it should be suitable for dedicated hardware
applications: e.g., no large tables.
• Shall be suitable as a stream cipher, one-way hash function, MAC, and
pseudo-random number generator, using well-understood construction
methods.
• Shall have a family-key variant to allow for different, non-interoperable
versions of the cipher.
2.3 TWOFISH BUILDING BLOCKS
2.3.1 Feistel Networks
A Feistel network is a general method of transforming any
function (usually called the F function) into a permutation. The Feistel network
was invented by Horst Feistel, who first used the method in an encryption
algorithm of his own called Lucifer. It gained popularity and was
further popularized when it was used in DES. Among the major algorithms





The fundamental building block of a Feistel network is the F function: a
key-dependent mapping of an input string onto an output string. An F function is
always non-linear and possibly non-surjective:
\[ F : \{0,1\}^{n/2} \times \{0,1\}^{N} \rightarrow \{0,1\}^{n/2} \]
where n is the block size of the Feistel network, and F is a function taking
n/2 bits of the block and N bits of a key as input, and producing an output of
length n/2 bits. In each round, the "source block" is the input to F, and the
output of F is xored with the "target block", after which these two blocks swap
places for the next round. The important observation here is
that an F function which may be a very weak encryption algorithm on its own
becomes stronger and stronger when it is placed in a Feistel network and
repeatedly iterated. This is one method of increasing the resistance of an algorithm
to potential attacks. In a Feistel network, a cycle consists of 2 rounds. In one
cycle, every bit of the text block has been modified once. Twofish is a
16-round Feistel network with a bijective F function, which corresponds to 8
cycles.
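The Feistel construction described above can be illustrated with a minimal Python sketch. It is not part of the project's VHDL design: the round function toy_f, the 32-bit half-block width and the round keys are hypothetical choices made only for illustration. The sketch shows that decryption works even though toy_f itself is not invertible, which is the property that lets a Feistel network turn an arbitrary F function into a permutation.

```python
MASK32 = 0xFFFFFFFF  # assumed half-block width of 32 bits (illustrative only)


def toy_f(half, round_key):
    """Hypothetical round function: non-linear and deliberately not invertible."""
    return ((half * half) ^ round_key) & MASK32


def feistel_encrypt(left, right, round_keys):
    """Run one Feistel round per round key."""
    for k in round_keys:
        # XOR F(source block) into the target block, then swap the two blocks.
        left, right = right ^ toy_f(left, k), left
    return left, right


def feistel_decrypt(left, right, round_keys):
    """Undo feistel_encrypt by replaying the rounds with the keys reversed."""
    for k in reversed(round_keys):
        # The old source block is now on the right; recompute F from it.
        left, right = right, left ^ toy_f(right, k)
    return left, right


if __name__ == "__main__":
    keys = [(0x9E3779B9 * (i + 1)) & MASK32 for i in range(16)]  # made-up keys
    ct = feistel_encrypt(0x01234567, 0x89ABCDEF, keys)
    print([hex(w) for w in ct])
    print([hex(w) for w in feistel_decrypt(ct[0], ct[1], keys)])
```

With 16 round keys the loop performs 16 rounds, i.e. 8 cycles in the terminology used above.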
2.3.2 S-Boxes
Twofish uses key-dependent S-boxes. An S-box is a table-driven
non-linear substitution operation used in most block ciphers. Fixed S-boxes
(e.g. in DES) allow attackers to study the S-boxes and find weak points, but with
key-dependent S-boxes the attacker does not know what the S-boxes are. This is
essentially a defense against unknown attacks. S-boxes vary in both input size and
output size, and can be created either randomly or algorithmically. S-boxes
were first used in Lucifer, then DES, and afterwards in most encryption
algorithms. Twofish uses four different, bijective, key-dependent, 8-by-8-bit
S-boxes. These S-boxes are built using two fixed 8-by-8-bit permutations and
key material. They can either be precomputed for a specific key or
computed on the fly for every required value. This provides a lot of flexibility
and tradeoffs in both hardware and software.
2.3.3 MDS Matrices 
In pure mathematics, a maximum distance separable (MDS) code over a field is
defined to be a linear mapping from a field elements to b field elements,
producing a composite vector of a + b elements, with the property that the
minimum number of non-zero elements in any non-zero vector is at least b +
1. Put another way, the "distance" (i.e., the number of elements that differ)
between any two distinct vectors produced by the MDS mapping is at least b +
1. It can easily be shown that no mapping can have a larger minimum distance
between two distinct vectors, hence the term maximum distance separable.
MDS mappings can be represented by an MDS matrix consisting of a x b
elements. Reed-Solomon (RS) error-correcting codes are known to be MDS. A
necessary and sufficient condition for an a x b matrix to be MDS is that all
possible square submatrices, obtained by discarding rows or columns, are
non-singular. Twofish uses a single 4-by-4 MDS matrix over GF(2^8). This is one
of the main diffusion elements of Twofish. (There is also an RS code with the
MDS property used in the key schedule; this does not add diffusion to the
cipher, but does add diffusion to the key schedule.)
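As a hedged illustration of what multiplying by a 4-by-4 MDS matrix over GF(2^8) involves, the Python sketch below multiplies a 4-byte vector by the matrix. The matrix entries and the field polynomial x^8 + x^6 + x^5 + x^3 + 1 are taken from the published Twofish specification rather than from this section, and the function names gf_mul and mds_multiply are illustrative only. It is a software sketch, not the hardware structure used in this project.

```python
MDS_POLY = 0x169  # x^8 + x^6 + x^5 + x^3 + 1 (from the Twofish specification)

# MDS matrix of the Twofish specification, row by row.
MDS = [
    [0x01, 0xEF, 0x5B, 0x5B],
    [0x5B, 0xEF, 0xEF, 0x01],
    [0xEF, 0x5B, 0x01, 0xEF],
    [0xEF, 0x01, 0xEF, 0x5B],
]


def gf_mul(a, b, poly=MDS_POLY):
    """Multiply two bytes in GF(2^8) by shift-and-add, reducing modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly        # reduce as soon as the degree reaches 8
        b >>= 1
    return result


def mds_multiply(vec):
    """Return MDS * vec, where vec is a list of four bytes."""
    out = []
    for row in MDS:
        acc = 0
        for coeff, x in zip(row, vec):
            acc ^= gf_mul(coeff, x)
        out.append(acc)
    return out


if __name__ == "__main__":
    print([hex(b) for b in mds_multiply([0x01, 0x00, 0x00, 0x00])])
```

Because every matrix entry is non-zero, changing a single input byte changes all four output bytes, which is the diffusion property referred to above.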
2.3.4 Pseudo-Hadamard Transforms
A pseudo-Hadamard transform (PHT) is very important in encryption
algorithms. A PHT is defined to be a simple mixing operation that runs quickly in
software. Given two inputs, a and b, the 32-bit PHT is defined as:
\[ a' = (a + b) \bmod 2^{32} \]
\[ b' = (a + 2b) \bmod 2^{32} \]
Twofish uses a 32-bit PHT to mix the outputs from its two parallel 32-bit g
functions. This PHT can be executed in two opcodes on most modern
microprocessors, including the Pentium family. This is the second main
diffusion element in Twofish.
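The 32-bit PHT can be written in a few lines of Python, shown here as a minimal sketch. The function names pht and pht_inverse are illustrative, and the inverse is included only to show that the transform loses no information; it is not described in the text above.

```python
MASK32 = 0xFFFFFFFF  # reduction modulo 2^32


def pht(a, b):
    """32-bit pseudo-Hadamard transform: a' = a + b, b' = a + 2b (mod 2^32)."""
    return (a + b) & MASK32, (a + 2 * b) & MASK32


def pht_inverse(a2, b2):
    """Recover (a, b) from (a', b'): b = b' - a', a = a' - b (mod 2^32)."""
    b = (b2 - a2) & MASK32
    a = (a2 - b) & MASK32
    return a, b


if __name__ == "__main__":
    print(pht(1, 2))          # (3, 5)
    print(pht_inverse(3, 5))  # (1, 2)
```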
2.3.5 Whitening 
In the world of cryptography, whitening is very important. Whitening is
the technique of xoring key material before the first round and
after the last round. It has been shown that whitening
substantially increases the difficulty of key-search attacks against the
remainder of the cipher. In their attacks on reduced-round Twofish
variants, the authors discovered that whitening substantially increased the
difficulty of attacking the cipher by hiding from an attacker the specific inputs
to the first and last rounds' F functions. Twofish xors 128 bits of subkey
before the first Feistel round and another 128 bits after the last Feistel round.
These subkeys are calculated in the same manner as the round subkeys, but are
not used anywhere else in the cipher.
2.3.6 Key Schedule 
Another important building block of Twofish is the key schedule. The key
schedule is the means by which the key bits are turned into round keys that the
cipher can use. The requirements call for a variable-length key of
128, 192 or 256 bits. The easiest way to support this is to
have a key schedule that expands a variable-length key to a fixed set of
expanded key values. Twofish needs a lot of key material, and has a
complicated key schedule. To facilitate analysis, the key schedule uses the
same primitives as the round function. Except for 2 additional rotations, each
pair of expanded key words is constructed by applying the Twofish round
function (with key-dependent S-boxes) to a fixed input.
2.4 TWOFISH ENCRYPTION ALGORITHM - A DETAILED EXPLANATION
Twofish is one of the leading encryption algorithms in the
world. It was designed with many considerations in mind. Its authors have great
experience in cryptography and, as a result, developed a very strong
algorithm that meets all of NIST's requirements. Figure 1 above shows an overview of
the Twofish block cipher. Twofish uses a 16-round Feistel-like structure with
additional whitening of the input and output. The only non-Feistel elements are the
1-bit rotations. The rotations can be moved into the F function to create a pure Feistel
structure, but this requires an additional rotation of the words just before the output
whitening step. The plaintext is split into four 32-bit words. In the input
whitening step, these are xored with four key words. This is followed by sixteen
rounds. In each round, the two words on the left are used as input to the g functions.
(One of them is rotated by 8 bits first.) The g function consists of four byte-wide
key-dependent S-boxes, followed by a linear mixing step based on an MDS matrix. Next,
the results of the two g functions are combined using a Pseudo-Hadamard Transform
(PHT), and two key words are added. These two results are then xored into the words
on the right (one of which is rotated left by 1 bit first, the other is rotated right
afterwards). The left and right halves are then swapped for the next round. After all
the rounds, the swap of the last round is reversed, and the four words are xored with
four more key words to produce the ciphertext. More formally, the 16 bytes of
plaintext p_0, ..., p_15 are first split into 4 words P_0, ..., P_3 of 32 bits each using the
little-endian convention rather than the big-endian convention. The little-endian
method is widely used in cryptography.
\[ P_i = \sum_{j=0}^{3} p_{(4i+j)} \cdot 2^{8j} \qquad i = 0, \ldots, 3 \]
These words are xored with 4 words of the expanded key in the input whitening
step.
\[ R_{0,i} = P_i \oplus K_i \qquad i = 0, \ldots, 3 \]
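The little-endian word split and the input whitening step can be sketched in Python as follows. This is only an illustration of the two formulas above: the function names are hypothetical, and the key words in the usage example are made-up values, not the output of the Twofish key schedule.

```python
def bytes_to_words(block):
    """Split 16 bytes p_0..p_15 into four little-endian 32-bit words P_0..P_3."""
    if len(block) != 16:
        raise ValueError("a Twofish block is 16 bytes")
    return [sum(block[4 * i + j] << (8 * j) for j in range(4)) for i in range(4)]


def words_to_bytes(words):
    """Inverse of bytes_to_words: write each word least-significant byte first."""
    return bytes((w >> (8 * j)) & 0xFF for w in words for j in range(4))


def whiten(words, key_words):
    """XOR each data word with the corresponding key word."""
    return [w ^ k for w, k in zip(words, key_words)]


if __name__ == "__main__":
    plaintext = bytes(range(16))
    P = bytes_to_words(plaintext)
    print([hex(w) for w in P])
    K = [0x11111111, 0x22222222, 0x33333333, 0x44444444]  # hypothetical K_0..K_3
    R0 = whiten(P, K)
    print([hex(w) for w in R0])
```

Since whitening is a plain XOR, applying whiten a second time with the same key words returns the original words.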
Furthermore, in each of the 16 rounds, the first two words are used as input to the 
function F, which also takes the round number as input. The third word is xored with 
the first output of F and then rotated right by one bit. The fourth word is rotated left 
by one bit and then xored with the second output word of F. Finally, the two halves 
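are exchanged. Thus,
\[ (F_{r,0}, F_{r,1}) = F(R_{r,0}, R_{r,1}, r) \]
\[ R_{r+1,0} = \mathrm{ROR}(R_{r,2} \oplus F_{r,0}, 1) \]
\[ R_{r+1,1} = \mathrm{ROL}(R_{r,3}, 1) \oplus F_{r,1} \]
\[ R_{r+1,2} = R_{r,0} \]
\[ R_{r+1,3} = R_{r,1} \]
for r = 0, ..., 15, where ROR and ROL are functions that rotate their first
argument (a 32-bit word) right or left, respectively, by the number of bits indicated
by their second argument. The output whitening step undoes the 'swap' of the last
round, and xors the data words with 4 words of the expanded key.
\[ C_i = R_{16,(i+2) \bmod 4} \oplus K_{i+4}, \qquad i = 0, \ldots, 3 \]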
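The four words of ciphertext are then written as 16 bytes $c_0, \ldots, c_{15}$ using
the same little-endian conversion used for the plaintext.
\[ c_i = \left\lfloor \frac{C_{\lfloor i/4 \rfloor}}{2^{8(i \bmod 4)}} \right\rfloor \bmod 2^8, \qquad i = 0, \ldots, 15 \]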
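The whitening, round and swap steps described above can be sketched in software as follows. This is only an illustrative model, not the hardware design of this project: the helper names (`encrypt_block`, `decrypt_block`, `toy_F`) are assumptions made for the example, the list `K` holds only the eight whitening words, and `toy_F` is a placeholder that stands in for the real F function so that the skeleton can be run on its own.

```python
MASK32 = 0xFFFFFFFF

def rol(x, n):
    """Rotate a 32-bit word left by n bits (0 < n < 32)."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def ror(x, n):
    """Rotate a 32-bit word right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32

def bytes_to_words(block):
    """Split 16 bytes into four little-endian 32-bit words."""
    return [int.from_bytes(block[4 * i:4 * i + 4], "little") for i in range(4)]

def words_to_bytes(words):
    """Write four 32-bit words back as 16 little-endian bytes."""
    return b"".join(w.to_bytes(4, "little") for w in words)

def encrypt_block(block, K, F):
    # Input whitening: R_{0,i} = P_i xor K_i
    R = [p ^ K[i] for i, p in enumerate(bytes_to_words(block))]
    for r in range(16):
        F0, F1 = F(R[0], R[1], r)
        # The modified right half becomes the new left half (the swap).
        R = [ror(R[2] ^ F0, 1), rol(R[3], 1) ^ F1, R[0], R[1]]
    # Output whitening undoes the last swap: C_i = R_{16,(i+2) mod 4} xor K_{i+4}
    return words_to_bytes([R[(i + 2) % 4] ^ K[i + 4] for i in range(4)])

def decrypt_block(block, K, F):
    C = bytes_to_words(block)
    R = [0] * 4
    for i in range(4):
        R[(i + 2) % 4] = C[i] ^ K[i + 4]
    for r in reversed(range(16)):
        R0, R1 = R[2], R[3]          # these are R_{r,0} and R_{r,1}
        F0, F1 = F(R0, R1, r)
        R = [R0, R1, rol(R[0], 1) ^ F0, ror(R[1] ^ F1, 1)]
    return words_to_bytes([R[i] ^ K[i] for i in range(4)])

def toy_F(r0, r1, r):
    """Placeholder round function for the demo; NOT the Twofish F function."""
    return (r0 + r1 + r) & MASK32, (r0 + 2 * r1 + r) & MASK32

if __name__ == "__main__":
    K = list(range(1, 9))            # eight made-up whitening words
    pt = bytes(range(16))
    ct = encrypt_block(pt, K, toy_F)
    print(decrypt_block(ct, K, toy_F) == pt)
```

Because the structure is a Feistel network, decryption works for any choice of F, which is why a placeholder is enough to exercise the skeleton.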
2.4.1 The Function F 
The function F is the most important part of the encryption algorithm. It performs
most of the mixing, transformation and other mathematical operations in the
algorithm, so it must be understood thoroughly before proceeding to any
implementation. The function has several subparts, each of which needs to be
understood in detail. The function F is a key-dependent permutation on 64-bit
values. It takes three arguments: two input words $R_0$ and $R_1$, and the round
number r used to select the appropriate subkeys. $R_0$ is passed through the g
function, which yields $T_0$. $R_1$ is rotated left by 8 bits and then passed through
the g function to yield $T_1$. The results $T_0$ and $T_1$ are then combined in a PHT
and two words of the expanded key are added.
\[ T_0 = g(R_0) \]
\[ T_1 = g(\mathrm{ROL}(R_1, 8)) \]
\[ F_0 = (T_0 + T_1 + K_{2r+8}) \bmod 2^{32} \]
\[ F_1 = (T_0 + 2T_1 + K_{2r+9}) \bmod 2^{32} \]
where $(F_0, F_1)$ is the result of F. We also define the function F' for use in the
analysis. F' is identical to the F function, except that it does not add any key
blocks to the output. (The Pseudo-Hadamard Transform is still performed.)
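The PHT and the key addition inside F can be sketched as below. The function g and the expanded key list `K` are passed in as parameters; both are assumptions made for the example, so the block shows only the arithmetic of F and not a full cipher.

```python
MASK32 = 0xFFFFFFFF

def rol(x, n):
    """Rotate a 32-bit word left by n bits (0 < n < 32)."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def pht(a, b):
    """Pseudo-Hadamard Transform on two 32-bit words."""
    return (a + b) & MASK32, (a + 2 * b) & MASK32

def F(R0, R1, r, K, g):
    """F function: two g evaluations, a PHT, then addition of K[2r+8] and K[2r+9]."""
    T0 = g(R0)
    T1 = g(rol(R1, 8))
    P0, P1 = pht(T0, T1)
    return (P0 + K[2 * r + 8]) & MASK32, (P1 + K[2 * r + 9]) & MASK32

if __name__ == "__main__":
    identity_g = lambda x: x         # stand-in for g, for illustration only
    K = [0] * 40
    K[8], K[9] = 5, 7
    print(F(1, 0x01000000, 0, K, identity_g))
```

With the identity stand-in for g, the call above gives $T_0 = T_1 = 1$, so the PHT outputs are 2 and 3 before the two key words are added.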
2.4.2 The Function g 
Within the F function lies the g function, which forms the heart of Twofish. The
input word X is split into four bytes. Each byte is run through its own
key-dependent S-box. Each S-box is bijective, takes 8 bits of input, and produces 8
bits of output. The four results are interpreted as a vector of length 4 over
GF(2^8), and multiplied by the 4 x 4 MDS matrix (using the field GF(2^8) for the
computations). The resulting vector is interpreted as a 32-bit word which is the
result of g.
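\[ x_i = \lfloor X / 2^{8i} \rfloor \bmod 2^8, \qquad i = 0, \ldots, 3 \]
\[ y_i = s_i[x_i], \qquad i = 0, \ldots, 3 \]
\[ \begin{pmatrix} z_0 \\ z_1 \\ z_2 \\ z_3 \end{pmatrix} = \mathrm{MDS} \cdot \begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{pmatrix} \]
\[ Z = \sum_{i=0}^{3} z_i \cdot 2^{8i} \]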
where $s_i$ are the key-dependent S-boxes and Z is the result of g. For this to be
well-defined, the correspondence between byte values and the field elements of
GF(2^8) needs to be specified. GF(2^8) is represented as GF(2)[x]/v(x), where
$v(x) = x^8 + x^6 + x^5 + x^3 + 1$ is a primitive polynomial of degree 8 over GF(2).
The field element
\[ a = \sum_{i=0}^{7} a_i x^i, \qquad a_i \in GF(2) \]
is identified with the byte value
\[ \sum_{i=0}^{7} a_i 2^i \]
This is in some sense the natural mapping; addition in GF(2^8) corresponds to a xor
of the bytes. The MDS matrix is given by:
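\[ \mathrm{MDS} = \begin{pmatrix}
\mathtt{01} & \mathtt{EF} & \mathtt{5B} & \mathtt{5B} \\
\mathtt{5B} & \mathtt{EF} & \mathtt{EF} & \mathtt{01} \\
\mathtt{EF} & \mathtt{5B} & \mathtt{01} & \mathtt{EF} \\
\mathtt{EF} & \mathtt{01} & \mathtt{EF} & \mathtt{5B}
\end{pmatrix} \]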
where the elements have been written as hexadecimal byte values using the 
above defined correspondence. 
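The byte-to-polynomial mapping and the MDS multiply can be illustrated with a short software model. This is a sketch only: the names `gf_mul`, `mds_multiply` and `g_from_sbox_outputs` are chosen for the example, and the S-box outputs are taken as given inputs rather than computed.

```python
MDS_POLY = 0x169  # v(x) = x^8 + x^6 + x^5 + x^3 + 1 written as a bit pattern

MDS = [
    [0x01, 0xEF, 0x5B, 0x5B],
    [0x5B, 0xEF, 0xEF, 0x01],
    [0xEF, 0x5B, 0x01, 0xEF],
    [0xEF, 0x01, 0xEF, 0x5B],
]

def gf_mul(a, b, poly=MDS_POLY):
    """Multiply two bytes as polynomials over GF(2), reduced modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is a xor
        a <<= 1
        if a & 0x100:            # degree 8 reached: subtract (xor) the modulus
            a ^= poly
        b >>= 1
    return result

def mds_multiply(y):
    """Multiply the byte vector (y0, y1, y2, y3) by the MDS matrix."""
    z = []
    for row in MDS:
        acc = 0
        for coeff, byte in zip(row, y):
            acc ^= gf_mul(coeff, byte)
        z.append(acc)
    return z

def g_from_sbox_outputs(y):
    """Pack the MDS result into the 32-bit word Z = sum z_i * 2^(8i)."""
    return sum(zi << (8 * i) for i, zi in enumerate(mds_multiply(y)))

if __name__ == "__main__":
    # The unit vector (1, 0, 0, 0) picks out the first column of the matrix.
    print(hex(g_from_sbox_outputs([1, 0, 0, 0])))
```

Multiplying by the unit vector is a convenient sanity check, since the result must be a column of the MDS matrix packed in little-endian order.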
2.4.3 The Key Schedule 
Another very important building block is the key schedule. The complexity of the
key schedule depends on the length of the key. The key schedule has to provide 40
words of expanded key $K_0, \ldots, K_{39}$, and the 4 key-dependent S-boxes used in
the g function. Twofish is defined for keys of length N = 128, N = 192, and N = 256.
Keys of any length shorter than 256 bits can be used by padding them with
zeroes until the next larger defined key length. We define k = N/64. The key M
consists of 8k bytes $m_0, \ldots, m_{8k-1}$. The bytes are first converted into 2k
words of 32 bits each
\[ M_i = \sum_{j=0}^{3} m_{(4i+j)} \cdot 2^{8j}, \qquad i = 0, \ldots, 2k-1 \]
and then into two word vectors of length k.
\[ M_e = (M_0, M_2, \ldots, M_{2k-2}) \]
\[ M_o = (M_1, M_3, \ldots, M_{2k-1}) \]
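A third word vector of length k is also derived from the key. This is done by
taking the key bytes in groups of 8, interpreting them as a vector over GF(2^8),
and multiplying them by a 4 x 8 matrix derived from an RS code. Each result
of 4 bytes is then interpreted as a 32-bit word. These words make up the third
vector.
\[ \begin{pmatrix} s_{i,0} \\ s_{i,1} \\ s_{i,2} \\ s_{i,3} \end{pmatrix} = \mathrm{RS} \cdot \begin{pmatrix} m_{8i} \\ m_{8i+1} \\ m_{8i+2} \\ m_{8i+3} \\ m_{8i+4} \\ m_{8i+5} \\ m_{8i+6} \\ m_{8i+7} \end{pmatrix} \]
\[ S_i = \sum_{j=0}^{3} s_{i,j} \cdot 2^{8j} \]
for i = 0, ..., k - 1 and
\[ S = (S_{k-1}, S_{k-2}, \ldots, S_0) \]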
It is very important to note that S lists the words in "reverse" order. For the RS
matrix multiply, GF(2^8) is represented by GF(2)[x]/w(x), where
$w(x) = x^8 + x^6 + x^3 + x^2 + 1$ is another primitive polynomial of degree 8 over
GF(2). The mapping between byte values and elements of GF(2^8) uses the same
definition as used for the MDS matrix multiply. Using this mapping, the RS matrix is
given by:
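\[ \mathrm{RS} = \begin{pmatrix}
\mathtt{01} & \mathtt{A4} & \mathtt{55} & \mathtt{87} & \mathtt{5A} & \mathtt{58} & \mathtt{DB} & \mathtt{9E} \\
\mathtt{A4} & \mathtt{56} & \mathtt{82} & \mathtt{F3} & \mathtt{1E} & \mathtt{C6} & \mathtt{68} & \mathtt{E5} \\
\mathtt{02} & \mathtt{A1} & \mathtt{FC} & \mathtt{C1} & \mathtt{47} & \mathtt{AE} & \mathtt{3D} & \mathtt{19} \\
\mathtt{A4} & \mathtt{55} & \mathtt{87} & \mathtt{5A} & \mathtt{58} & \mathtt{DB} & \mathtt{9E} & \mathtt{03}
\end{pmatrix} \]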
The three vectors $M_e$, $M_o$, and S form the basis of the key schedule.
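The derivation of the three vectors can be sketched as follows. The function name `key_vectors` and the helper `gf_mul` are assumptions made for the example; the key is assumed to be already padded to 16, 24 or 32 bytes.

```python
RS_POLY = 0x14D  # w(x) = x^8 + x^6 + x^3 + x^2 + 1 written as a bit pattern

RS = [
    [0x01, 0xA4, 0x55, 0x87, 0x5A, 0x58, 0xDB, 0x9E],
    [0xA4, 0x56, 0x82, 0xF3, 0x1E, 0xC6, 0x68, 0xE5],
    [0x02, 0xA1, 0xFC, 0xC1, 0x47, 0xAE, 0x3D, 0x19],
    [0xA4, 0x55, 0x87, 0x5A, 0x58, 0xDB, 0x9E, 0x03],
]

def gf_mul(a, b, poly):
    """Multiply two bytes as polynomials over GF(2), reduced modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def key_vectors(key):
    """Return (Me, Mo, S) for a key of 16, 24 or 32 bytes."""
    if len(key) not in (16, 24, 32):
        raise ValueError("key must be padded to 128, 192 or 256 bits first")
    k = len(key) // 8
    # 2k little-endian words, split into even- and odd-indexed vectors.
    M = [int.from_bytes(key[4 * i:4 * i + 4], "little") for i in range(2 * k)]
    Me, Mo = M[0::2], M[1::2]
    S = []
    for i in range(k):
        group = key[8 * i:8 * i + 8]
        word = 0
        for j, row in enumerate(RS):
            s = 0
            for coeff, m in zip(row, group):
                s ^= gf_mul(coeff, m, RS_POLY)
            word |= s << (8 * j)
        S.append(word)
    S.reverse()                  # S = (S_{k-1}, ..., S_0)
    return Me, Mo, S

if __name__ == "__main__":
    Me, Mo, S = key_vectors(bytes([1]) + bytes(15))
    print([hex(w) for w in Me], [hex(w) for w in Mo], [hex(w) for w in S])
```

With only the first key byte set to 1, $S_0$ is simply the first column of the RS matrix packed into a word, which makes the reversed ordering of S easy to see.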
2.4.3.1 Additional Key Lengths 
Twofish is flexible with respect to key size: it can accept keys of any byte length
up to 256 bits. For key sizes that are not defined above, the key is padded at the
end with zero bytes to the next larger length that is defined. For example, an
80-bit key $m_0, \ldots, m_9$ would be extended by setting $m_i = 0$ for
i = 10, ..., 15 and treating it as a 128-bit key.
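The padding rule can be written in a few lines. The function name `pad_key` is an assumption made for the example.

```python
def pad_key(key):
    """Pad a key with zero bytes up to the next defined length (16, 24 or 32 bytes)."""
    if len(key) > 32:
        raise ValueError("Twofish keys are at most 256 bits long")
    for size in (16, 24, 32):
        if len(key) <= size:
            return key + bytes(size - len(key))

if __name__ == "__main__":
    # An 80-bit (10-byte) key is treated as a 128-bit key.
    print(pad_key(bytes(range(10))).hex())
```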
2.4.3.2 The Function h 
Another important component, used both inside the F function and in the key
schedule, is the function h. The figure below shows an overview of the function h.
This is a function that takes two inputs, a 32-bit word X and a list
$L = (L_0, \ldots, L_{k-1})$ of 32-bit words of length k, and produces one word of
output. This function works in k stages. In each stage, the four bytes are each
passed through a fixed S-box, and xored with a byte derived from the list. Finally,
the bytes are once again passed through a fixed S-box and the four bytes are
multiplied by the MDS matrix just as in g. More formally, we split the words into
bytes:
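\[ l_{i,j} = \lfloor L_i / 2^{8j} \rfloor \bmod 2^8 \]
\[ x_j = \lfloor X / 2^{8j} \rfloor \bmod 2^8 \]
for i = 0, ..., k - 1 and j = 0, ..., 3. Then the sequence of substitutions and xors
is applied.
\[ y_{k,j} = x_j, \qquad j = 0, \ldots, 3 \]
If k = 4 we have
\[ y_{3,0} = q_1[y_{4,0}] \oplus l_{3,0} \]
\[ y_{3,1} = q_0[y_{4,1}] \oplus l_{3,1} \]
\[ y_{3,2} = q_0[y_{4,2}] \oplus l_{3,2} \]
\[ y_{3,3} = q_1[y_{4,3}] \oplus l_{3,3} \]
If k >= 3 we have
\[ y_{2,0} = q_1[y_{3,0}] \oplus l_{2,0} \]
\[ y_{2,1} = q_1[y_{3,1}] \oplus l_{2,1} \]
\[ y_{2,2} = q_0[y_{3,2}] \oplus l_{2,2} \]
\[ y_{2,3} = q_0[y_{3,3}] \oplus l_{2,3} \]
In all cases we have
\[ y_0 = q_1[q_0[q_0[y_{2,0}] \oplus l_{1,0}] \oplus l_{0,0}] \]
\[ y_1 = q_0[q_0[q_1[y_{2,1}] \oplus l_{1,1}] \oplus l_{0,1}] \]
\[ y_2 = q_1[q_1[q_0[y_{2,2}] \oplus l_{1,2}] \oplus l_{0,2}] \]
\[ y_3 = q_0[q_1[q_1[y_{2,3}] \oplus l_{1,3}] \oplus l_{0,3}] \]
Here, $q_0$ and $q_1$ are fixed permutations on 8-bit values that will be
defined shortly. The resulting vector of $y_j$'s is multiplied by the MDS matrix,
just as in the g function.
\[ \begin{pmatrix} z_0 \\ z_1 \\ z_2 \\ z_3 \end{pmatrix} = \mathrm{MDS} \cdot \begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{pmatrix} \]
\[ Z = \sum_{i=0}^{3} z_i \cdot 2^{8i} \]
where Z is the result of h.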
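The staged structure of h can be modelled in software as shown below. This is a sketch under stated assumptions: the permutations `q0` and `q1` are passed in as 256-entry lookup tables (they are defined later in this chapter), and the names `h` and `gf_mul` are chosen for the example.

```python
MDS_POLY = 0x169  # x^8 + x^6 + x^5 + x^3 + 1
MDS = [
    [0x01, 0xEF, 0x5B, 0x5B],
    [0x5B, 0xEF, 0xEF, 0x01],
    [0xEF, 0x5B, 0x01, 0xEF],
    [0xEF, 0x01, 0xEF, 0x5B],
]

def gf_mul(a, b, poly=MDS_POLY):
    """Multiply two bytes as polynomials over GF(2), reduced modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def h(X, L, q0, q1):
    """h function for a list L of k = 2, 3 or 4 words; q0 and q1 are byte tables."""
    k = len(L)
    y = [(X >> (8 * j)) & 0xFF for j in range(4)]
    l = [[(Li >> (8 * j)) & 0xFF for j in range(4)] for Li in L]
    if k == 4:
        y = [q1[y[0]] ^ l[3][0], q0[y[1]] ^ l[3][1],
             q0[y[2]] ^ l[3][2], q1[y[3]] ^ l[3][3]]
    if k >= 3:
        y = [q1[y[0]] ^ l[2][0], q1[y[1]] ^ l[2][1],
             q0[y[2]] ^ l[2][2], q0[y[3]] ^ l[2][3]]
    # Stages common to all key lengths.
    y = [
        q1[q0[q0[y[0]] ^ l[1][0]] ^ l[0][0]],
        q0[q0[q1[y[1]] ^ l[1][1]] ^ l[0][1]],
        q1[q1[q0[y[2]] ^ l[1][2]] ^ l[0][2]],
        q0[q1[q1[y[3]] ^ l[1][3]] ^ l[0][3]],
    ]
    Z = 0
    for i, row in enumerate(MDS):
        z = 0
        for coeff, byte in zip(row, y):
            z ^= gf_mul(coeff, byte)
        Z |= z << (8 * i)
    return Z

if __name__ == "__main__":
    identity = list(range(256))   # stand-in tables, for illustration only
    print(hex(h(1, [0, 0], identity, identity)))
```

Using identity tables in place of $q_0$ and $q_1$ reduces h to a plain MDS multiply, which is a convenient way to check the byte ordering before the real permutations are plugged in.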
2.4.3.3 Key-dependent S-boxes
Key-dependent S-boxes are very important in Twofish: they prevent attackers from
knowing the contents of the S-boxes in advance. The complexity of the keyed S-boxes
depends on the length of the key. The S-boxes in the function g can be defined by
\[ g(X) = h(X, S) \]
That is, for i = 0, ..., 3, the key-dependent S-box $s_i$ is formed by the mapping
from $x_i$ to $y_i$ in the h function, where the list L is equal to the vector S
derived from the key. The downside of this approach is that key setup takes longer,
since the S-boxes have to be built for each key.
2.4.3.4 The Expanded Key Words Kj 
The words of the expanded key are defined using the h function. 
\[ \rho = 2^{24} + 2^{16} + 2^{8} + 2^{0} \]
\[ A_i = h(2i\rho, M_e) \]
\[ B_i = \mathrm{ROL}(h((2i+1)\rho, M_o), 8) \]
\[ K_{2i} = (A_i + B_i) \bmod 2^{32} \]
\[ K_{2i+1} = \mathrm{ROL}((A_i + 2B_i) \bmod 2^{32}, 9) \]
The constant $\rho$ is used here to duplicate bytes; it has the property that for
i = 0, ..., 255, the word $i\rho$ consists of four equal bytes, each with the value i.
The function h is applied to words of this type. For $A_i$ the byte values are 2i,
and the second argument of h is $M_e$. $B_i$ is computed similarly using 2i + 1 as
the byte value and $M_o$ as the second argument, with an extra rotate over 8 bits.
The values $A_i$ and $B_i$ are combined in a PHT. One of the results is further
rotated by 9 bits. The two results form two words of the expanded key.
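The subkey computation can be sketched as follows, with the function h passed in as a parameter so that the block stays self-contained. The name `expanded_key` and the stand-in h used in the demo are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF
RHO = 0x01010101  # 2^24 + 2^16 + 2^8 + 2^0

def rol(x, n):
    """Rotate a 32-bit word left by n bits (0 < n < 32)."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def expanded_key(Me, Mo, h):
    """Return the 40 expanded key words K[0..39]; h(X, L) must return a 32-bit word."""
    K = []
    for i in range(20):
        A = h((2 * i * RHO) & MASK32, Me)
        B = rol(h(((2 * i + 1) * RHO) & MASK32, Mo), 8)
        K.append((A + B) & MASK32)                   # K_{2i}
        K.append(rol((A + 2 * B) & MASK32, 9))       # K_{2i+1}
    return K

if __name__ == "__main__":
    stand_in_h = lambda X, L: X    # NOT the real h; it only exposes the data flow
    K = expanded_key([0, 0], [0, 0], stand_in_h)
    print(len(K), hex(K[0]), hex(K[1]))
```

With the stand-in h the first pair comes out as 0x01010101 and 0x04040404, which shows the byte duplication by $\rho$ and the effect of the 9-bit rotation.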
2.4.3.5 The Permutations qo and q1 
Permutations are very important in cryptography. The permutations qo and q1 
are fixed permutations on 8-bit values. They are constructed from four 
different 4-bit permutations each. For the input value x, the corresponding 
output value y is defined as follows: 
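\[ a_0, b_0 = \lfloor x/16 \rfloor, \; x \bmod 16 \]
\[ a_1 = a_0 \oplus b_0 \]
\[ b_1 = a_0 \oplus \mathrm{ROR}_4(b_0, 1) \oplus 8a_0 \bmod 16 \]
\[ a_2, b_2 = t_0[a_1], \; t_1[b_1] \]
\[ a_3 = a_2 \oplus b_2 \]
\[ b_3 = a_2 \oplus \mathrm{ROR}_4(b_2, 1) \oplus 8a_2 \bmod 16 \]
\[ a_4, b_4 = t_2[a_3], \; t_3[b_3] \]
\[ y = 16 b_4 + a_4 \]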
where $\mathrm{ROR}_4$ is a function similar to ROR that rotates 4-bit values. First,
the byte is split into two nibbles. These are combined in a bijective mixing step.
Each nibble is then passed through its own 4-bit fixed S-box. This is followed
by another mixing step and S-box lookup. Finally, the two nibbles are
recombined into a byte. For the permutation $q_0$ the 4-bit S-boxes are given by
t0 = [8 1 7 D 6 F 3 2 0 B 5 9 E C A 4]
t1 = [E C B 8 1 2 3 5 F 4 A 6 7 0 9 D]
t2 = [B A 5 E 6 D 9 0 C 8 F 3 2 4 7 1]
t3 = [D 7 F 4 1 2 6 E 9 B 3 0 8 5 C A]
where each 4-bit S-box is represented by a list of the entries using
hexadecimal notation. (The entries for the inputs 0, 1, ..., 15 are listed in
order.) Similarly, for $q_1$ the 4-bit S-boxes are given by
t0 = [2 8 B D F 7 6 E 3 1 9 4 0 A C 5]
t1 = [1 E 2 B 4 C 3 7 6 D A 5 F 9 0 8]
t2 = [4 C 7 5 1 6 9 A 0 E D 8 2 B 3 F]
t3 = [B 9 5 1 C 3 D E 6 4 7 F 2 0 8 A]
One interesting point to note is the use of the 1-bit rotations in the Twofish
round. They break up the byte-aligned nature of the operations: each of the 4
32-bit quantities in the block is used once in each of the 8 possible bit
positions (mod 8).
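The construction of the two permutations from the 4-bit tables can be checked with a short script. The names `make_q`, `ror4`, `Q0_T` and `Q1_T` are assumptions made for the example; the table contents are the ones listed above.

```python
Q0_T = [
    [0x8, 0x1, 0x7, 0xD, 0x6, 0xF, 0x3, 0x2, 0x0, 0xB, 0x5, 0x9, 0xE, 0xC, 0xA, 0x4],
    [0xE, 0xC, 0xB, 0x8, 0x1, 0x2, 0x3, 0x5, 0xF, 0x4, 0xA, 0x6, 0x7, 0x0, 0x9, 0xD],
    [0xB, 0xA, 0x5, 0xE, 0x6, 0xD, 0x9, 0x0, 0xC, 0x8, 0xF, 0x3, 0x2, 0x4, 0x7, 0x1],
    [0xD, 0x7, 0xF, 0x4, 0x1, 0x2, 0x6, 0xE, 0x9, 0xB, 0x3, 0x0, 0x8, 0x5, 0xC, 0xA],
]
Q1_T = [
    [0x2, 0x8, 0xB, 0xD, 0xF, 0x7, 0x6, 0xE, 0x3, 0x1, 0x9, 0x4, 0x0, 0xA, 0xC, 0x5],
    [0x1, 0xE, 0x2, 0xB, 0x4, 0xC, 0x3, 0x7, 0x6, 0xD, 0xA, 0x5, 0xF, 0x9, 0x0, 0x8],
    [0x4, 0xC, 0x7, 0x5, 0x1, 0x6, 0x9, 0xA, 0x0, 0xE, 0xD, 0x8, 0x2, 0xB, 0x3, 0xF],
    [0xB, 0x9, 0x5, 0x1, 0xC, 0x3, 0xD, 0xE, 0x6, 0x4, 0x7, 0xF, 0x2, 0x0, 0x8, 0xA],
]

def ror4(x, n):
    """Rotate a 4-bit value right by n bits (0 < n < 4)."""
    return ((x >> n) | (x << (4 - n))) & 0xF

def make_q(t):
    """Build the 256-entry permutation from its four 4-bit tables t[0..3]."""
    table = []
    for x in range(256):
        a0, b0 = x >> 4, x & 0xF
        a1 = a0 ^ b0
        b1 = a0 ^ ror4(b0, 1) ^ ((8 * a0) & 0xF)
        a2, b2 = t[0][a1], t[1][b1]
        a3 = a2 ^ b2
        b3 = a2 ^ ror4(b2, 1) ^ ((8 * a2) & 0xF)
        a4, b4 = t[2][a3], t[3][b3]
        table.append(16 * b4 + a4)
    return table

q0 = make_q(Q0_T)
q1 = make_q(Q1_T)

if __name__ == "__main__":
    print(hex(q0[0]), hex(q1[0]))                 # first entries of each table
    print(sorted(q0) == list(range(256)))         # q0 is a permutation
```

In hardware the same structure maps onto four small 16-entry lookup tables and a few xor gates per permutation, rather than one 256-entry table.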
2.4.4 Round Function Overview 
The figure below shows a more detailed view of how the function F is
computed in each round when the key length is 128 bits. Incorporating the S-box
and round subkey generation makes the Twofish round function look more
complicated, but is useful for visualizing exactly how the algorithm works.
Figure 3: A view of a Single Round F Function (128-bit Key)
CHAPTER 3: METHODOLOGY
3.1 PROCEDURE IDENTIFICATION 
Figure 4: Project Development Block (blocks shown: Understanding the Algorithm, with
mathematical concepts and theory and the ability to explain in detail to prove that
understanding is clear; VHDL; Tools and FPGA; Block optimization)
From the chart above, the project is broken into 3 main blocks, namely the Twofish
algorithm, FPGA and VLSI techniques, and optimization. The short-term priority is to
understand the algorithm, including the mathematical concepts and theory. This step
must be completed before the next one is considered: once the nature of the
algorithm is fully understood, implementing the other steps is more
straightforward. The medium-term and long-term objectives are to learn VHDL, to
understand the tools and the FPGA, and to combine the digital design with the block
optimization.
3.2 TOOLS 
The tools required are as follows: 
• VHDL Simulator- ALDEC 
• FPGA Synthesizer - XILINX Synthesizing Tools 
• FPGA- Spartan2 - XC2S200-5PQ208 
CHAPTER 4: PROJECT WORK 
This is the heart of the report. The project work is divided into 2 main
categories:
• Design Implementation - This section covers all the design decisions that have
been made to implement the project. It covers the major building blocks of
the design and how the various blocks have been integrated to achieve an
overall complete design.
• Programming Strategy - This section covers how the programming work was
carried out with a top-down approach. It only explains the top-level blocks
without going deep into each block, because each block consists of multi-level
blocks and is too long to explain in this report.
4.1 DESIGN IMPLEMENTATION 
Over the course of this project, 2 different designs with different initial
objectives were implemented. The first design was implemented in the first
semester and the second design in the second semester. The general objectives of
both designs are described below:
• Design 1
o Minimum hardware resource usage
o Using a single F-Function (modified version)
o Optimized design with reasonable latency, throughput and throughput
per gate.
• Design 2
o Reasonably low hardware resource usage
o Using 4 units of the modified F-Function of Design 1
o Very small latency
o Very high throughput
o Very high throughput per gate
Both designs keep the general design architecture (Figure 1), with the F-Function
modified according to the objectives of Design 1 and Design 2. The design
implementation covers 2 main parts, namely the design decisions and the integration
and overall structure.
Note: The explanation begins with the common design modules; where necessary, the
differences between Design 1 and Design 2 are highlighted.
4.1.1 Design Decisions 
The Twofish structure offers a great deal of flexibility in terms of space versus
speed tradeoffs. I have decided to go for a minimum hardware implementation
of the algorithm, with the aim of fitting the circuit on the Xilinx Spartan2
XC2S200 FPGA. The design decisions are as follows:
• One h-function, instead of 2, is used in computing the K-subkeys and the
encrypted data (Refer Figure 2). This means the modified F-Function consists
of only 1 h-function and not 2 h-functions as proposed by the authors of
Twofish. (Applicable to Design 1 only)
• Four individual h-function units are used, with only 1 h-function in each
F-Function. In other words, 4 units of the modified F-Function of Design 1
are used to construct 1 major F-Function unit of Design 2. (Applicable to
Design 2 only)
• Zero keying is implemented, since the RAM needed otherwise would consume too
much space on the FPGA.
• An encryptor/decryptor is implemented.
4.1.1.1 Building Blocks
• Q-Permutations
• S-Boxes
• Maximum Distance Separable (MDS)
• Reed-Solomon Matrix
• Operation Selector
4.1.1.1.1 Q-Permutations 








Figure 5: Q-Permutation (the input byte is split into its least significant 4 bits
and most significant 4 bits)
The Q-Permutation is one of the most important elements at the core of the
design of Twofish. The permutation is executed on a byte of input, which is split
into two nibbles that are rotated and modified along different paths. The main
operations of the permutation are executed with four lookup tables labeled t0,
t1, t2 and t3. Each lookup table takes 4 bits of input and produces a 4-bit
value, so each has 16 entries of 4 bits. There are four lookup tables per
q-permutation and two different q-permutations, q0 and q1, each with its own set
of lookup tables. The lookup tables of the q-permutations could have been
implemented using ROM, but that would have taken up a lot of space, because a
large number of q-permutations have to operate in parallel. A separate ROM would
thus have been needed for each instance
of the q permutation. To save space and make it more efficient, the lookup 







Figure 6: S-boxes 
The S-Box operates on a 32-bit word. Each byte of the word passes through
three Q-Permutations, as the figure above shows. After each of the first two
banks of q-permutations, the outputs are recombined into a word and XOR-ed with a
32-bit value [7]. These two values are derived from the key material, and
Twofish's S-boxes are thus referred to as "key-dependent" S-boxes.
4.1.1.1.3 Maximum Distance Separable Matrix 
MDS= 
01 EF 5B 5B 
5B EF EF 01 
EF 5B 01 EF 
EF 01 EF 5B 
Figure 7: MDS 
One of the most important diffusion elements is the Maximum Distance
Separable (MDS) matrix. It is a 4x4 matrix of bytes that multiplies a vector of
four bytes. Multiplications are carried out in the Galois field GF(2^8) with the
primitive polynomial x^8 + x^6 + x^5 + x^3 + 1. Each byte is converted into a
polynomial in which the power x^p is present only if the p-th bit is 1. A
multiplication in GF(2^8) amounts to a multiplication of polynomials followed by
reduction modulo the primitive polynomial (the remainder of the division is
kept), with all coefficients taken modulo 2. The result is converted back to a
bit vector by setting a bit to 1 if the corresponding power of x has an odd
coefficient and 0 otherwise. In this case the computations are fairly
straightforward since there are only three coefficients: 0x01, 0xEF and 0x5B.
The result of a multiplication can be reduced to a series of XORs for each bit
of the output. For example, multiplying a by 5B results in byte b [4-5]:
b0 = a2 xor a0
b1 = a3 xor a1 xor a0
b2 = a4 xor a2 xor a1
b3 = a5 xor a3 xor a0
b4 = a6 xor a4 xor a1 xor a0
b5 = a7 xor a5 xor a1
b6 = a6 xor a0
b7 = a7 xor a1
NOTE: The coding cannot continue without generating these tables, because
the multiplications can only be implemented in hardware once they have
been reduced to XOR expressions. It is therefore very important to
generate the tables in a proper sequence and to make sure they are free
from errors. If any of the tables contains errors, the
encryption/decryption process will not produce correct results.
Refer to APPENDIX A for the minimization of the remaining multiplicands
and for how the minimization was implemented through a structured table.
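Since multiplication by a constant is linear over GF(2), the XOR expression for each output bit can also be generated mechanically and used to cross-check a hand-derived table. The sketch below is one possible way to do this in software; the names `gf_mul` and `xor_network` are assumptions made for the example and are not part of the VHDL design.

```python
MDS_POLY = 0x169  # x^8 + x^6 + x^5 + x^3 + 1

def gf_mul(a, b, poly):
    """Multiply two bytes as polynomials over GF(2), reduced modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def xor_network(c, poly):
    """For each output bit i, list the input bits that are XOR-ed to produce it."""
    # Image of each single input bit under multiplication by the constant c.
    columns = [gf_mul(1 << j, c, poly) for j in range(8)]
    return [[j for j in range(8) if (columns[j] >> i) & 1] for i in range(8)]

if __name__ == "__main__":
    for i, terms in enumerate(xor_network(0x5B, MDS_POLY)):
        print("b%d = %s" % (i, " xor ".join("a%d" % j for j in reversed(terms))))
```

Run for the constant 5B, this prints the same eight equations as listed above, so the same routine could be used to check the tables for EF and for the RS multiplicands (with the RS polynomial in place of `MDS_POLY`).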
4.1.1.1.4 Reed-Solomon Matrix 
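\[ \begin{pmatrix} s_{i,0} \\ s_{i,1} \\ s_{i,2} \\ s_{i,3} \end{pmatrix} =
\begin{pmatrix}
\mathtt{01} & \mathtt{A4} & \mathtt{55} & \mathtt{87} & \mathtt{5A} & \mathtt{58} & \mathtt{DB} & \mathtt{9E} \\
\mathtt{A4} & \mathtt{56} & \mathtt{82} & \mathtt{F3} & \mathtt{1E} & \mathtt{C6} & \mathtt{68} & \mathtt{E5} \\
\mathtt{02} & \mathtt{A1} & \mathtt{FC} & \mathtt{C1} & \mathtt{47} & \mathtt{AE} & \mathtt{3D} & \mathtt{19} \\
\mathtt{A4} & \mathtt{55} & \mathtt{87} & \mathtt{5A} & \mathtt{58} & \mathtt{DB} & \mathtt{9E} & \mathtt{03}
\end{pmatrix} \cdot
\begin{pmatrix} m_{8i} \\ m_{8i+1} \\ m_{8i+2} \\ m_{8i+3} \\ m_{8i+4} \\ m_{8i+5} \\ m_{8i+6} \\ m_{8i+7} \end{pmatrix} \]
Figure 8: Reed-Solomon Matrix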
The Reed-Solomon (RS) matrix is another diffusion element, but it acts on the key
material and not on the data. The RS matrix is similar to the MDS matrix. In this
case the multiplication is executed between a 4x8 matrix and a vector of 8
bytes. The computations are done in GF(2^8) with a different primitive
polynomial: x^8 + x^6 + x^3 + x^2 + 1. Unlike the MDS matrix, the RS matrix has a
large number of different multiplicands. In order to minimize the resources
used, a series of XORs was derived for each of the 23 multiplicands to give
equations similar to those seen for the MDS matrix. These equations were not
given and had to be derived [6]. (Refer APPENDIX B)
NOTE: As with the MDS matrix, the coding cannot continue without generating
these tables, because the multiplications can only be implemented in
hardware once they have been reduced to XOR expressions. The tables must
be generated in a proper sequence and must be free from errors. If any of
the tables contains errors, the encryption/decryption process will not
produce correct results.
Refer to APPENDIX B for the minimization of the remaining multiplicands
and for how the minimization was implemented through a structured table.
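The number of distinct multiplicands can be confirmed directly from the matrix. The short check below counts the distinct entries of the RS matrix, leaving out 01, which needs no logic; the variable names are chosen for the example.

```python
RS = [
    [0x01, 0xA4, 0x55, 0x87, 0x5A, 0x58, 0xDB, 0x9E],
    [0xA4, 0x56, 0x82, 0xF3, 0x1E, 0xC6, 0x68, 0xE5],
    [0x02, 0xA1, 0xFC, 0xC1, 0x47, 0xAE, 0x3D, 0x19],
    [0xA4, 0x55, 0x87, 0x5A, 0x58, 0xDB, 0x9E, 0x03],
]

# Distinct constants that need their own XOR network (multiplying by 01 is a wire).
multiplicands = sorted({c for row in RS for c in row} - {0x01})

if __name__ == "__main__":
    print(len(multiplicands))
    print(" ".join("%02X" % c for c in multiplicands))
```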
4.1.1.1.5 Operation Selector 
Figure 9: Operation Selector (panels: ENCRYPTION and DECRYPTION; each output stage
combines a 1-bit rotation with an xor)
The authors developed the Twofish encryption algorithm to be highly symmetric:
encryption and decryption can be executed with almost all the same pieces of
hardware, as the figure above shows. The first difference is that the sub-keys
must be used in reverse order. The second difference is at the output stage of
the round. As shown above, the output stage
consists of a rotation and an XOR. Basically the block diagram of the 








Figure 10: Building Block for Operation Selection 
4.1.2 Integration and Overall Structure 
The integration and overall structure covers the following sections:
• Input Register Module
• Output Register Module
• Key Register Module
• Modified F-Function
• Design Overall Structure
• Controller
4.1.2.1 Input Register Module 
Figure 11: Input Register Module for Design 1 (128-bit data register with a latch
input; the four data words are xored with K0 to K3, giving P0 xor K0, P1 xor K1,
P2 xor K2 and P3 xor K3)
The above diagram shows the input register module. The input register module
combines the data register with the input whitening stage. For this project, the
data block is 128 bits wide. A signal is needed to force data to be latched into
the register. The input whitening is done asynchronously: the sub-keys are xored
with the 32-bit data words. The output of this module is four 32-bit words, which
correspond to the input whitening of the 4 words of the input data. This output
is valid when the 4 sub-keys are set correctly on the input. Sub-keys K0, K1, K2 and K3

Figure 12: Output Register Module for Design 1 (least significant 64 bits output
latched by LATCH LS, most significant 64 bits output latched by LATCH MS)
The diagram above shows the output register module. The output register
module incorporates the output whitening stage and the output register. Four
words of input are XOR-ed with 4 sub-keys to produce the output. The sub-keys
must be set to K4, K5, K6 and K7 for encryption and K0, K1, K2 and K3
for decryption. Due to the structure of the cipher, only two keys are
available at any time. The output is thus latched in two steps with the signals
latchLS and latchMS. In short, at any time, two 64-bit output values are
available at the output registers, of which only 1 corresponds to the
instantaneous value. This depends on which pair of subkeys is fed at













Figure 13: Output Register Module for Design 2
Design 1 and Design 2 are quite similar. The only difference is that in
Design 2, sending a LATCH_CIPHERTEXT signal while the output described above is
present makes the complete 128-bit ciphertext available at the output register.
The benefit is that only 1 clock cycle is needed to achieve this result, unlike
the 2 clock cycles of Design 1.
4.1.2.3 Key Register Module 
Another important part of the design is the key register module. It is used to
store the 128-bit key material and to produce two derived values (S0 and S1) using
the RS matrix. The key is simply latched into the register when the 'Latchkey'
signal is raised. The two S-values are computed in two steps by reusing a single
RS matrix. During the clock cycle in which the key is latched, a multiplexer sets
the input of the RS matrix to the least-significant double word of the key and
the result is stored in the S0 register. At all other times the input is set to
the most-significant double




Figure 14: Key Register Module (128-bit key material split into MS and LS double
words, multiplexed into a single RS matrix that produces S0 and S1)
4.1.2.4 Modified F-Function 
Design 1 
One of the fundamental design decisions is to minimize the hardware
resources. As a result, the structure of the cipher had to be modified in order to
be able to use the same h-function for all computation purposes. The diagram
below shows the modified F-function that was designed to meet the
design requirements. As the diagram shows, the input and a rotated
version of the input are multiplexed to reproduce both inputs that can be
presented at the input of an h-function. Moreover, in order to compute the
result of the PHT, two registers had to be added to hold the value of the first
h-function result while the second is being computed. These two registers thus
act as pipeline registers. The other two multiplexers are used to choose
between the path used for computing a key and the path used to encrypt or
decrypt data. The signal INPUT_F can carry either key data or plaintext
data, and in each case the data can be even or odd. A complete key cycle has
taken place once the signals OUT_EVEN and OUT_ODD have been obtained. This takes
2 clock cycles, because in any clock cycle only one data word appears at the
INPUT_F signal. At this time, OUT_EVEN carries the output even key signal and
OUT_ODD the output odd key signal. The same applies to the plaintext: a complete
plaintext cycle takes two clock cycles, after which OUT_EVEN carries the output
even plaintext signal and OUT_ODD the output odd plaintext signal.
As for the overall design, a complete cycle for a round only takes place after 4
clock cycles, that is, after obtaining the signal values of output even key signal,













Figure 15: Generalized F-Function of Design 1
Design 2 
The modified F-Function of Design 2 builds on the flexibility of the
modified F-Function of Design 1. As can be seen in the diagram, there are 4
input signals, namely evenkeygenerator, oddkeygenerator,
evenplaintextgenerator and oddplaintextgenerator. These 4 input signals
generate 4 corresponding output signals, namely output even key signal, output
odd key signal, output even plaintext signal and output odd plaintext signal.
The complete cycle for a round, which takes 4 clock cycles in Design 1, takes only
1 clock cycle in Design 2. Another interesting point is that the lower 2 blocks,
Block 3 and Block 4, can also process the evenkeygenerator and oddkeygenerator
inputs. These 2 blocks are specially modified so that they can perform the
functions of Block 1 and Block 2 in addition to their own. This is done by
routing the input signal evenkeygenerator to evenplaintextgenerator and
oddkeygenerator to oddplaintextgenerator. This capability is needed during the
input and output whitening processes, in which only keys are calculated. Without
it, Block 3 and Block 4 would be idle during whitening because they would not be
able to calculate keys. With it, no block is idle, which reduces the latency and
increases the throughput.
NOTE: The terms evenplaintextgenerator and oddplaintextgenerator
are used extensively throughout the report. The plaintext term that
is mentioned here actually refers to the intermediate value and not the
original value. This is because there are 16 rounds, and the intermediate





Figure 16: Generalized F-Function of Design 2 (four h-function blocks, Block 1 to
Block 4, with the S-boxes; signals shown include ODDKEYGENERATOR,
ODDPLAINTEXTGENERATOR, EVEN_KEY, EVEN_PLAINTEXT and ODD_PLAINTEXT)
4.1.2.5 Design Overall Structure 
Design 1 
One of the most challenging parts is combining all the individual parts and
laying them out to form a complete architecture, as shown in the diagram
below. The modified F-function is the most important element and it needs to
be carefully arranged with the other blocks. The four multiplexers shown on top
select between the data from the clear text (plaintext) module, used at the
beginning of an encryption/decryption cycle, and the data obtained after each
round of encryption/decryption. The registers at the output of the modified
F-function are also "pipeline" registers. They store the values of the K-subkeys
(K2r+8 and K2r+9, r ranging from 0 to 15) used in each encryption/decryption
cycle. After each round the data is latched into 4 registers. This step also
performs the required swap after each round. A finite state machine was
implemented to control the system. The design had to be carefully implemented so
that the modified F-function is kept busy and continuously processes information.
Figure 17: Structure of the Cipher of Design 1










Figure 18: Structure of the Cipher of Design 2 (blocks shown include the plaintext
module, the registers, the modified F-function, the selector and the controller)
As can be seen in the above diagram, Design 2 is almost identical to Design 1;
the difference is the modified F-Function. In this design there are 4 inputs to
the modified F-Function, namely inputevenkeygenerator, inputoddkeygenerator,
inputevenplaintextgenerator and inputoddplaintextgenerator, compared to Design 1
where there is only 1 input. This feature reduces the number of clock cycles
needed per round by a factor of 4. In addition, no registers are needed at the
output of the modified F-Function to store temporary values. Other than that, the
overall architecture is quite similar to Design 1. A significant performance
improvement was observed with this design.
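The factor of 4 can be illustrated with a back-of-envelope cycle count. The model below is an assumption made for illustration and not a measured result: it counts only h-function evaluations, assumes each evaluation occupies one h unit for one clock cycle, and ignores key loading, latching and controller overhead.

```python
from math import ceil

def estimated_cycles(h_units):
    """Estimated clock cycles per block when h evaluations are the only cost."""
    input_whitening = 4     # K0..K3: two (A_i, B_i) pairs
    per_round = 4           # A_i and B_i for the round keys, plus T0 and T1
    output_whitening = 4    # K4..K7: two (A_i, B_i) pairs
    phases = [input_whitening] + [per_round] * 16 + [output_whitening]
    return sum(ceil(evaluations / h_units) for evaluations in phases)

if __name__ == "__main__":
    design1 = estimated_cycles(1)   # one shared h-function
    design2 = estimated_cycles(4)   # four h-function units
    print(design1, design2, design1 / design2)
```

Under these assumptions the model gives 72 cycles for one h unit and 18 cycles for four, which matches the per-round figures of 4 cycles and 1 cycle quoted in the text; the actual latency of either design depends on the overhead the model leaves out.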
4.1.2.6 Controller 
Design 1 
One of the most challenging and tedious elements is the controller. Once the
architecture has been laid out, a controller is needed to generate the control
signals for the proper operation of the algorithm. The cipher is controlled by a
relatively complex finite state machine. While in the idle state, the controller
waits for a user request and advertises its availability by raising a signal. The
'loadKey' signal makes the controller go to the 'loadKey' state, in which it
latches the key material from the 128-bit input port.
Figure 19: The Controller of Design 1 (state diagram with the states Idle, Load Key,
Compute Input Whitening Key, Compute Even Key, Compute Odd Key, Compute Even Block,
Compute Odd Block and Compute Output Whitening Key)
The 'start' signal puts the controller in encryption or decryption mode. It is
important to note that the controller operates almost exactly the same way
whether in encryption or decryption mode. The first step is to compute the four
input whitening sub-keys. After each pair is computed, two registers are
latched with the output of the input register module: the two registers on the
top left of Figure 15 after the first pair, and the two registers on the right
after the second pair. The controller then goes through 16 rounds of
encryption or decryption. Each round consists of four states. The even and odd
keys for the round are computed and latched into two registers. The F-function
is then used to compute a pair of results, which are added to the content of
the registers. At the end of the round the result is latched into registers.
The outputs of the rounds are also swapped between the left and right side of
the circuit. After the 16 rounds, the four output whitening sub-keys are
computed and used two-by-two to whiten the output. Soon after the controller
goes back to the idle state, the ciphertext register is latched with the
output of the process. Note that sub-keys have to be generated by setting the
input of the F-function to values ranging from 0 to 39. Doing this directly in
the controller would be awkward, so the controller uses a counter to do the
job. This counter counts in the order in which the keys need to be generated
(0, 1, 2, 3, 8, 9, 10, ..., 39, 4, 5, 6, 7 for encryption). It can be disabled
so that it counts only when needed. For decryption, the counting is in the
reverse direction.
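The counting order described above can be illustrated with a short behavioural sketch in Python. This is not the VHDL of the design; the function name `subkey_counter_order` is hypothetical, and reading "reverse direction" as the exact reversal of the encryption list is an assumption.

```python
def subkey_counter_order(encrypt=True):
    """Return the order in which the 40 sub-key indices are requested."""
    # Input whitening keys 0..3, then round keys 8..39, then output whitening keys 4..7.
    order = [0, 1, 2, 3] + list(range(8, 40)) + [4, 5, 6, 7]
    # Assumption: decryption walks the same list backwards.
    return order if encrypt else order[::-1]


if __name__ == "__main__":
    print(subkey_counter_order(True))
    print(subkey_counter_order(False))
```

Every index from 0 to 39 appears exactly once in either direction, so a single counter is enough to drive the whole key schedule.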
Design 2 
The controller of Design 2 is very similar to that of Design 1. The only
difference is that the controller sends control signals so that 4 elements are
computed simultaneously at any one time. This can be observed in the figure
below. After loading the key, the controller computes the 4 input whitening
keys simultaneously before proceeding to the computation of the round values.
In each round, 4 values are computed, namely the even key, odd key, even
plaintext and odd plaintext. After 16 rounds, the controller computes the
output whitening keys. All 4 keys are computed simultaneously. Completion of
this loop indicates the end of the process. The controller must therefore be
designed very carefully so that the values are properly synchronized.
[State diagram: Idle; Load 128-bit Key (Load Key Signal, Start Signal); Compute Input Whitening Key 0 to 3; 16 Rounds: Compute Even Key, Compute Odd Key, Compute Even Plaintext, Compute Odd Plaintext; Compute Output Whitening Key 0 to 3]
Figure 20: The Controller of Design 2
4.2 PROGRAMMING STRATEGY 
I have decided to use VHDL to program the design. The simulation tool is Aldec
and the project manager is Xilinx WebPACK ISE 6.2i. With the design decisions
laid out, it is very important to plan the programming strategy. I have decided
to use the top-down method. Here I break the overall design into major modules.
The major modules are further broken into smaller modules until individual
discrete modules are obtained. This is a very structured programming style.
Besides that, breaking the design into smaller discrete units enables reuse of
the modules in other, bigger modules. This saves time and leads to a more
efficient programming style. Furthermore, discrete unit modules enable efficient unit testing
to be performed, followed by module testing, and finally integration testing. Overall 












Figure 21: Hierarchy of the Design Flow 
As can be seen above, the Twofish core is divided into 9 major sub-modules.
Each sub-module performs a specific task. The block diagrams of the main
modules are shown below. Differences between Design 1 and Design 2 will be
highlighted where necessary.
NOTE: Only major sub-modules are described.
4.2.1 Twofish Core 
This component is the topmost component in the hierarchy.
Figure 22: Block Diagram of the Twofish Core with I/O pins
Table 1: Twofish Core
INPORT[127:0]            Accepts 128 bits of plaintext from the user.
INKEY[127:0]             Accepts 128 bits of input key from the user.
CLK                      Clock
RESET                    Reset
USR_LD_KEY               Loads the key.
USR_START                Starts the whole process (encryption/decryption).
USR_ENCRYPT              States the process (encryption/decryption).
IDLE                     Shows whether the system is idle.
OUT_CIPHERTEXT[127:0]    Outputs the ciphertext.
For this component, the main operations are as follows:
Load Key
The following signals have to be high:
• INKEY = '1'
• CLK = '1'
• USR_LD_KEY = '1'
• RESET = '0'
• USR_START = '0'
• USR_ENCRYPT = '1' if encryption and '0' if decryption
Encryption
The following signals have to be high:
• INPORT = '1'
• CLK = '1'
• USR_START = '1'
• USR_ENCRYPT = '1'
• USR_LD_KEY = '1'
• RESET = '0'
Decryption
The following signals have to be high:
• INPORT = '1'
• CLK = '1'
• USR_START = '1'
• USR_ENCRYPT = '0'
• USR_LD_KEY = '1'
• RESET = '0'
The Twofish Core is basically the module that a given user can control from 
the top layer. By sending the appropriate signals a given user can control 
which process is to be executed i.e. load key, encrypt, or decrypt. 
4.2.2 Full Adder 





Figure 23: Block Diagram of the Full Adder 
Table 2: Full Adder
DATA_A[31:0]    Accepts a 32-bit input
DATA_B[31:0]    Accepts a 32-bit input
CIN             Accepts an input carry bit
COUT            Outputs the output carry bit




The full adder adds two 32-bit values and outputs the result together with
the output carry bit. The full adder is mostly used in the PHT.
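As an illustration only, the behaviour of the 32-bit adder and its use in the Pseudo-Hadamard Transform (one of the Twofish building blocks listed in the abstract) can be modelled in Python. The function names `full_adder32` and `pht` are hypothetical; the PHT equations a' = a + b and b' = a + 2b (mod 2^32) are those of the Twofish specification, and computing b' as a' + b is one possible way to reuse the same adder.

```python
MASK32 = 0xFFFFFFFF


def full_adder32(data_a, data_b, cin=0):
    """Add two 32-bit values plus a carry-in; return (sum, carry_out)."""
    total = (data_a & MASK32) + (data_b & MASK32) + (cin & 1)
    return total & MASK32, total >> 32


def pht(a, b):
    """Pseudo-Hadamard Transform: a' = a + b, b' = a + 2b (mod 2**32)."""
    a_out, _ = full_adder32(a, b)      # a' = a + b
    b_out, _ = full_adder32(a_out, b)  # b' = a' + b = a + 2b
    return a_out, b_out


if __name__ == "__main__":
    print(full_adder32(0xFFFFFFFF, 1))  # sum wraps to 0 with carry out 1
    print(pht(1, 2))
```

The carry out of the PHT additions is simply discarded, because the transform works modulo 2^32.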
4.2.3 Ciphertext 
Design 1 
















Figure 24: Block Diagram of the Ciphertext of Design 1
Table 3: Ciphertext - Design 1
K4[31:0]             Accepts K4 values.
K5[31:0]             Accepts K5 values.
K6[31:0]             Accepts K6 values.
K7[31:0]             Accepts K7 values.
L0[31:0]             Accepts L0 values.
L1[31:0]             Accepts L1 values.
L2[31:0]             Accepts L2 values.
L3[31:0]             Accepts L3 values.
CLK                  Clock signal
LATCH_MS             Most significant 64 bits of output signal.
LATCH_LS             Least significant 64 bits of output signal.
CIPHERTEXT[127:0]    Output ciphertext.
Most Significant 64 bits of output signal
The following signals have to be HIGH or available:
K6 [31:0] = K6
K7 [31:0] = K7
L2 [31:0] = available values
L3 [31:0] = available values
CLK = '1'
LATCH_MS = '1'
Least Significant 64 bits of output signal
The following signals have to be HIGH or available:
K4 [31:0] = K4
K5 [31:0] = K5
L0 [31:0] = available values
L1 [31:0] = available values
CLK = '1'
LATCH_LS = '1'
Ciphertext
This signal is available at any time based on any immediate input values.
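The two-step latching of the Design 1 ciphertext register can be sketched in Python as follows. This is a behavioural model under stated assumptions, not the author's VHDL: the class name `CiphertextRegister` is hypothetical, the whitening is taken to be a 32-bit XOR of each L word with its K word (as in the Twofish output whitening), and placing L0 in the least significant word of the 128-bit output is an assumption.

```python
MASK32 = 0xFFFFFFFF
LOW64 = (1 << 64) - 1


class CiphertextRegister:
    """Behavioural model of a 128-bit output register latched in two halves."""

    def __init__(self):
        self.ciphertext = 0

    def clock(self, l_words, k_words, latch_ms=False, latch_ls=False):
        """l_words = (L0, L1, L2, L3), k_words = (K4, K5, K6, K7)."""
        w = [(l_words[i] ^ k_words[i]) & MASK32 for i in range(4)]
        if latch_ls:
            # Lower half: L0 xor K4 and L1 xor K5.
            low = (w[1] << 32) | w[0]
            self.ciphertext = (self.ciphertext & ~LOW64) | low
        if latch_ms:
            # Upper half: L2 xor K6 and L3 xor K7.
            high = (w[3] << 32) | w[2]
            self.ciphertext = (self.ciphertext & LOW64) | (high << 64)
        return self.ciphertext


if __name__ == "__main__":
    reg = CiphertextRegister()
    reg.clock((1, 2, 3, 4), (0, 0, 0, 0), latch_ls=True)
    print(hex(reg.clock((1, 2, 3, 4), (0, 0, 0, 0), latch_ms=True)))
```

Latching in two halves matches a controller that only has two whitening keys available at a time.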














Figure 25: Block Diagram of the Ciphertext of Design 2 
Table 4: Ciphertext - Design 2
K4[31:0]             Accepts K4 values.
K5[31:0]             Accepts K5 values.
K6[31:0]             Accepts K6 values.
K7[31:0]             Accepts K7 values.
L0[31:0]             Accepts L0 values.
L1[31:0]             Accepts L1 values.
L2[31:0]             Accepts L2 values.
L3[31:0]             Accepts L3 values.
CLK                  Clock signal
LATCH_CIPHERTEXT     Latching signal.
CIPHERTEXT[127:0]    Output ciphertext.
Latching 128 bits of output signal
The following signals have to be HIGH or available:
K4 [31:0] = K4
K5 [31:0] = K5
K6 [31:0] = K6
K7 [31:0] = K7
L0 [31:0] = available values
L1 [31:0] = available values
L2 [31:0] = available values
L3 [31:0] = available values
CLK = '1'








[Block diagram of the Cleartext/Plaintext module. Inputs: RESET, LATCH_CLEARTEXT, K0[31:0], K1[31:0], K2[31:0], K3[31:0]. Outputs: P0 XOR K0 (LSB), P1 XOR K1, P2 XOR K2, P3 XOR K3.]
Table 5: Cleartext/Plaintext
INPUT[127:0]        Accepts 128 bits of input plaintext.
CLK                 Clock signal.
RESET               Reset signal.
P0 XOR K0[31:0]     P0 XOR K0 output.
P1 XOR K1[31:0]     P1 XOR K1 output.
P2 XOR K2[31:0]     P2 XOR K2 output.
P3 XOR K3[31:0]     P3 XOR K3 output.
P0 XOR K0 (LSB)
INPUT [31:0] = plaintext
CLK = '1'
RESET = '0'
K0 [31:0] = K0
LATCH_CLEARTEXT = '1'
P1 XOR K1
INPUT [63:32] = plaintext
CLK = '1'
RESET = '0'
K1 [31:0] = K1
LATCH_CLEARTEXT = '1'
P2 XOR K2
INPUT [95:64] = plaintext
CLK = '1'
RESET = '0'
K2 [31:0] = K2
LATCH_CLEARTEXT = '1'
P3 XOR K3
INPUT [127:96] = plaintext
CLK = '1'
RESET = '0'
K3 [31:0] = K3
LATCH_CLEARTEXT = '1'
Reset System
RESET = '1'
This part basically takes the input plaintext and XORs it with the 
corresponding input whitening key. 
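A minimal Python sketch of this input whitening step is given below, assuming the bit slices listed for this module (INPUT[31:0] with K0 up to INPUT[127:96] with K3). The function name `input_whitening` is hypothetical and the model ignores clocking and the latch signal.

```python
MASK32 = 0xFFFFFFFF


def input_whitening(plaintext, keys):
    """XOR each 32-bit slice of a 128-bit plaintext with K0..K3.

    plaintext is a 128-bit integer; keys is a sequence (K0, K1, K2, K3).
    Returns [P0 xor K0, P1 xor K1, P2 xor K2, P3 xor K3].
    """
    words = []
    for i in range(4):
        p_i = (plaintext >> (32 * i)) & MASK32  # slice INPUT[32*i+31 : 32*i]
        words.append(p_i ^ (keys[i] & MASK32))
    return words


if __name__ == "__main__":
    print(input_whitening(0, [1, 2, 3, 4]))
```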
4.2.5 Controller 
This is the most complex module of the whole design. It integrates all the
components by sending the appropriate signals. If the wrong signals are sent,
the integrated components will not work as expected. As a result, a complex




[State diagram: Idle; Load 128-bit Key; Compute Input Whitening Key 0 to 3; Compute Output Whitening Key 0 to 3]
Figure 27: Finite State Machine of the Design 1
[Block diagram: signals P0 XOR K0, P1 XOR K1, OUT_ROUND_EVEN, OUT_ROUND_ODD, SOURCE_PRE_ODD_REG, SOURCE_PRE_EVEN_REG, PRE_PARAM_EVEN; blocks: four MUX, Modified F Function, Controller, Operation]
Figure 29: Block Diagram of the Controller for Design 1
Table 6: Controller - Design 1
PRE_PARAM_EVEN[31:0]    Even 32-bit input values.
PRE_PARAM_ODD[31:0]     Odd 32-bit input values.
CLK                     Clock signal.
RESET                   Reset signal.
LD_KEY                  Load key signal.
START                   Start encryption/decryption process.
SEL_ENC                 Select encryption or decryption process.
INPUT_F[31:0]           Selected 32-bit input values.
LATCH_CIPHERTEXT        Signal to latch ciphertext (plaintext).
LATCH_KEY               Signal to latch key.
SEL_ODD                 Signal to select odd path (odd values).
SEL_K                   Signal to select key.
LOAD_REG_PRE_EVEN       Signal to load pre even register.
LOAD_REG_PRE_ODD        Signal to load pre odd register.
LOAD_REG_POST_EVEN      Signal to load post even register.
LOAD_REG_POST_ODD       Signal to load post odd register.
LATCH_CIPHER_MS         Signal to latch most significant 64-bit ciphertext.
LATCH_CIPHER_LS         Signal to latch least significant 64-bit ciphertext.
LOAD_REG_EVENK          Signal to load even key to register.
LOAD_REG_ODDK           Signal to load odd key to register.
LOAD_MODF_REG           Signal to load values into MODF register.
CTRL_ENCRYPT            Control encrypt signal.
IDLE                    Idle status signal.
As we can see in the Finite State Machine, the controller has to control both
the key calculation process and the encryption/decryption process. Within
these processes, some steps are even and some are odd. In general, we can
divide the whole process into 3 main processes, as follows:
Load Key
PRE_PARAM_EVEN [31:0] = doesn't matter
PRE_PARAM_ODD [31:0] = doesn't matter
CLK = '1'
RESET = '0'
LD_KEY = '1'
START = '0'
SEL_ENC = '1' for encryption and '0' for decryption
Even Key/Plaintext
PRE_PARAM_EVEN [31:0] = even key/plaintext
PRE_PARAM_ODD [31:0] = doesn't matter
CLK = '1'
RESET = '0'
LD_KEY = '0'
START = '1'
SEL_ENC = '1' for encryption and '0' for decryption
Odd Key/Plaintext
PRE_PARAM_EVEN [31:0] = doesn't matter
PRE_PARAM_ODD [31:0] = odd key/plaintext
CLK = '1'
RESET = '0'
LD_KEY = '0'
START = '1'
SEL_ENC = '1' for encryption and '0' for decryption
From here on, the process continues on its own, because a counter is embedded
in the controller itself. Once START goes high, the counter counts in a fixed
pattern that follows the Finite State Machine. It counts 0, 1, 2, 3, 8, 9, 10,
..., 39, 4, 5, 6, 7 for encryption. The whole counting sequence is reversed for
decryption. As the counting sequence runs, the appropriate states are selected
and the appropriate control signals are generated. The control signals
generated in each state are shown below. Synchronizing the whole system was a
very complex effort.
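To make the sequencing concrete, the walk through the Design 1 states for one encryption can be sketched in Python. The state labels below are hypothetical names modelled on the state names used for this controller (Compute K0 to K7, Compute K2r+8, Compute K2r+9, Compute even, Compute odd); the sketch assumes one state per clock cycle and leaves out the idle and load-key states.

```python
def design1_state_sequence():
    """List the controller states visited during one encryption (Design 1)."""
    # Four input whitening sub-keys.
    states = [f"compute_K{i}" for i in range(4)]
    # 16 rounds of four states each: even key, odd key, even result, odd result.
    for r in range(16):
        states += [
            f"compute_K{2 * r + 8}",
            f"compute_K{2 * r + 9}",
            "compute_even",
            "compute_odd",
        ]
    # Four output whitening sub-keys.
    states += [f"compute_K{i}" for i in range(4, 8)]
    return states


if __name__ == "__main__":
    seq = design1_state_sequence()
    print(len(seq), seq[:8])
```

Under these assumptions the sequence has 4 + 16 x 4 + 4 = 72 states, and the sub-key indices appear in the counter order 0, 1, 2, 3, 8, ..., 39, 4, 5, 6, 7.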
Control signals by state (columns K0 to odd are the states Compute K0 to Compute odd):
Output signal                 idle   LoadKeyA K0     K1     K2     K3     K4     K5     K6     K7     K2r+8  K2r+9  even   odd
input_f                       Some finite input value at input_f (KEY/DATA)
latch_key                     0      1        0      0      0      0      0      0      0      0      0      0      0      0
reset_cntr                    1      0        0      0      0      0      0      0      0      0      0      0      0      0
latch_cleartext               1      0        0      0      0      0      0      0      0      0      0      0      0      0
sel_odd                       0      0        0      1      0      1      0      1      0      1      0      1      0      1
sel_k                         0      0        1      1      1      1      1      1      1      1      1      1      0      0
param_source_is_result nextcc 0      0        0      0      0      0      1      1      1      1      1      1      1      1
load_reg_pre_even nextcc      0      0        0      1      0      0      0      0      0      0      0      0      0      1
load_reg_pre_odd nextcc       0      0        0      1      0      0      0      0      0      0      0      0      0      1
load_reg_post_even nextcc     0      0        0      0      0      1      0      0      0      0      0      0      0      1
load_reg_post_odd nextcc      0      0        0      0      0      1      0      0      0      0      0      0      0      1
latch_cipher_MS nextcc        0      0        0      0      0      0      0      0      0      1      0      0      0      0
latch_cipher_LS nextcc        0      0        0      0      0      0      0      1      0      0      0      0      0      0
load_reg_evenk nextcc         0      0        0      0      0      0      0      0      0      0      0      1      0      0
load_reg_oddk nextcc          0      0        0      0      0      0      0      0      0      0      0      1      0      0
load_modf_reg                 1      1        1      1      1      1      1      1      1      1      1      1      1      1
load_encrypt                  1      0        0      0      0      0      0      0      0      0      0      0      0      0
idle                          1      0        0      0      0      0      0      0      0      0      0      0      0      0
cnt_enbl                      -      0        1      1      1      1      1      1      1      1      1      1      0      0



















[State diagram: Load Key Signal, Start Signal; Compute Input Whitening Key 0 to 3; Compute Even Key, Compute Odd Key, Compute Even Plaintext, Compute Odd Plaintext; Compute Output Whitening Key 0 to 3]
Figure 30: Finite State Machine of the Design 2
[Block diagram: Ciphertext module; controller outputs PARAM_SOURCE_IS_RESULT, LOAD_REG_PRE_EVEN, LOAD_REG_PRE_ODD]
Figure 32: Block Diagram of the Controller for Design 2
Table 8: Controller - Design 2
PRE_PARAM_EVEN[31:0]      Even 32-bit input values.
PRE_PARAM_ODD[31:0]       Odd 32-bit input values.
CLK                       Clock signal.
RESET                     Reset signal.
LD_KEY                    Load key signal.
START                     Start encryption/decryption process.
SEL_ENC                   Select encryption or decryption process.
INPUT_EVEN_KEY            32-bit even key input
INPUT_ODD_KEY             32-bit odd key input
INPUT_EVEN_PLAINTEXT      32-bit even plaintext input
INPUT_ODD_PLAINTEXT       32-bit odd plaintext input
LATCH_CLEARTEXT           Signal to latch cleartext
LATCH_KEY                 Signal to latch key.
SEL_ODD                   Signal to select odd path (odd values).
SEL_K                     Signal to select key.
PARAM_SOURCE_IS_RESULT    Signal to indicate parameter source is result.
LOAD_REG_PRE_EVEN         Signal to load pre even register.
LOAD_REG_PRE_ODD          Signal to load pre odd register.
LOAD_REG_POST_EVEN        Signal to load post even register.
LOAD_REG_POST_ODD         Signal to load post odd register.
LATCH_CIPHERTEXT          Signal to latch ciphertext (plaintext).
LATCH_CIPHERTEXT          Signal to latch ciphertext.
CTRL_ENCRYPT              Control encrypt signal.
IDLE                      Idle status signal.
Once again, as we can see in the Finite State Machine, the controller has to
control both the key calculation process and the encryption/decryption
process. Within these processes, some steps are even and some are odd. In
general, we can divide the whole process into 3 main processes, as follows.
These processes look similar to those of Design 1, but there are some
internal differences:
Load Key
PRE_PARAM_EVEN [31:0] = doesn't matter
PRE_PARAM_ODD [31:0] = doesn't matter
CLK = '1'
RESET = '0'
LD_KEY = '1'
START = '0'
SEL_ENC = '1' for encryption and '0' for decryption
Even Key/Plaintext
PRE_PARAM_EVEN [31:0] = even key/plaintext
PRE_PARAM_ODD [31:0] = doesn't matter
CLK = '1'
RESET = '0'
LD_KEY = '0'
START = '1'
SEL_ENC = '1' for encryption and '0' for decryption
Odd Key/Plaintext
PRE_PARAM_EVEN [31:0] = doesn't matter
PRE_PARAM_ODD [31:0] = odd key/plaintext
CLK = '1'
RESET = '0'
LD_KEY = '0'
START = '1'
SEL_ENC = '1' for encryption and '0' for decryption
From here on, the process continues on its own, because a counter is embedded
in the controller itself. Once START goes high, the counter counts in a fixed
pattern that follows the Finite State Machine. It counts 0, 1, 2, 3, 8, 9, 10,
..., 39, 4, 5, 6, 7 for encryption. Remember that for Design 2 the counter
supplies values in blocks of 4, namely 0, 1, 2 and 3; the next block would be
8, 9, 10 and 11, and so on. The whole counting sequence is reversed for
decryption. With the counting sequence running, the appropriate states are
selected and thus the appropriate control signals
are generated. The necessary control signals generated for the appropriate state 











Figure 33: Block Diagram of the Keymodule 
This module basically latches the input key. Then it outputs 2 derived values 


















Table 9: Keymodule
128-bit input key.
Signal to latch input key.
Clock signal.
Reset signal.
Derived values using RS matrix (S0)
Derived values using RS matrix (S1)





Figure 34: Block Diagram of the Modified_F of Design 1
Modified_F is the most important component. It computes the even and odd
parts of the key and of the encrypted data blocks.
Table 10: Modified_F - Design 1
INPUT[31:0]       Input values to be computed - Key / data
IN_S0[31:0]       Derived values - S0
IN_S1[31:0]       Derived values - S1
CLK               Clock signal
RESET             Reset signal
SEL_ODD           Odd signal.
SEL_K             Key signal.
IN_LOAD_REG       Signal to load odd register.
OUT_EVEN[31:0]    Even output values.
OUT_ODD[31:0]     Odd output values.
Even Key
INPUT [31:0] = Even Key - 32 bits
IN_S0 [31:0] = Derived values - S0 - 32 bits
IN_S1 [31:0] = Derived values - S1 - 32 bits
CLK = '1'
RESET = '0'
SEL_ODD = '0'
SEL_K = '1'
IN_LOAD_REG = '0'
Odd Key
INPUT [31:0] = Odd Key - 32 bits
IN_S0 [31:0] = Derived values - S0 - 32 bits
IN_S1 [31:0] = Derived values - S1 - 32 bits
CLK = '1'
RESET = '0'
SEL_ODD = '1'
SEL_K = '1'
IN_LOAD_REG = '1'
Even Encrypted Data Blocks
INPUT [31:0] = Even data block - 32 bits
IN_S0 [31:0] = Derived values - S0 - 32 bits
IN_S1 [31:0] = Derived values - S1 - 32 bits
CLK = '1'
RESET = '0'
SEL_ODD = '0'
SEL_K = '0'
IN_LOAD_REG = '0'
Odd Encrypted Data Blocks
INPUT [31:0] = Odd data block - 32 bits
IN_S0 [31:0] = Derived values - S0 - 32 bits
IN_S1 [31:0] = Derived values - S1 - 32 bits
CLK = '1'
RESET = '0'
SEL_ODD = '1'
SEL_K = '0'
IN_LOAD_REG = '1'
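The four operating modes of the Design 1 Modified_F listed above are selected by the pair (SEL_ODD, SEL_K). A small Python lookup makes the decoding explicit. The names `MODIFIED_F_MODES`, `modified_f_mode` and `in_load_reg` are hypothetical, and the entry for the even key (SEL_ODD = '0', SEL_K = '1') is an assumption made by symmetry with the other three modes.

```python
# (SEL_ODD, SEL_K) -> what Modified_F computes in that mode.
MODIFIED_F_MODES = {
    (0, 1): "even key",         # assumed by symmetry with the other modes
    (1, 1): "odd key",
    (0, 0): "even data block",
    (1, 0): "odd data block",
}


def modified_f_mode(sel_odd, sel_k):
    """Decode the two select signals into the operation being performed."""
    return MODIFIED_F_MODES[(sel_odd, sel_k)]


def in_load_reg(sel_odd):
    """In all listed settings IN_LOAD_REG takes the same value as SEL_ODD."""
    return sel_odd


if __name__ == "__main__":
    for sel in sorted(MODIFIED_F_MODES):
        print(sel, modified_f_mode(*sel), in_load_reg(sel[0]))
```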
Design 2 
Modified_F is the most important component for Design 2 as well. For Design 2,
4 components make up the Modified F-Function, namely:
• EVENKEYGENERATOR
• ODDKEYGENERATOR
• EVENPLAINTEXTGENERATOR
• ODDPLAINTEXTGENERATOR
[Block diagram: EVENKEYGENERATOR with input RESET and output OUTEVENKEY[31:0]]
Figure 35: Block Diagram of Evenkeygenerator
Table 11: Evenkeygenerator
INPUT[31:0]    Input value - Even Key
IN_S0[31:0]    Derived values - S0
IN_S1[31:0]    Derived values - S1
CLK            Clock signal
RESET          Reset signal
OUTEVENKEY     Output value - Even Key
Even Key
INPUT [31:0] = Even Key - Input - 32 bits
IN_S0 [31:0] = Derived values - S0 - 32 bits
IN_S1 [31:0] = Derived values - S1 - 32 bits
CLK = '1'
RESET = '0'









Figure 36: Block Diagram of Oddkeygenerator 
Table 12: Oddkeygenerator
INPUT[31:0]    Input value - Odd Key
IN_S0[31:0]    Derived values - S0
IN_S1[31:0]    Derived values - S1
CLK            Clock signal
RESET          Reset signal
OUTODDKEY      Output value - Odd Key
Odd Key
INPUT [31:0] = Odd Key - Input - 32 bits
IN_S0 [31:0] = Derived values - S0 - 32 bits
IN_S1 [31:0] = Derived values - S1 - 32 bits
CLK = '1'
RESET = '0'









[Block diagram: EVENPLAINTEXTGENERATOR with input RESET and output OUTEVENPLAINTEXT[31:0]]
Figure 37: Block Diagram of Evenplaintextgenerator
Table 13: Evenplaintextgenerator
INPUT[31:0]         Input value - Even Plaintext/Even Key
IN_S0[31:0]         Derived values - S0
IN_S1[31:0]         Derived values - S1
CLK                 Clock signal
RESET               Reset signal
OUTEVENPLAINTEXT    Output value - Even Plaintext/Even Key
Even Plaintext / Even Key
INPUT [31:0] = Even Plaintext - Input - 32 bits
IN_S0 [31:0] = Derived values - S0 - 32 bits
IN_S1 [31:0] = Derived values - S1 - 32 bits
CLK = '1'
RESET = '0'












Figure 38: Block Diagram of Oddplaintextgenerator 
Table 14: Oddplaintextgenerator
INPUT[31:0]        Input value - Odd Plaintext/Odd Key
IN_S0[31:0]        Derived values - S0
IN_S1[31:0]        Derived values - S1
CLK                Clock signal
SEL_ODD            Selecting odd signal
SEL_K              Selecting key
RESET              Reset signal
OUTODDPLAINTEXT    Output value - Odd Plaintext/Odd Key
Odd Plaintext
INPUT [31:0] = Odd Plaintext - Input - 32 bits
IN_S0 [31:0] = Derived values - S0 - 32 bits
IN_S1 [31:0] = Derived values - S1 - 32 bits
CLK = '1'
SEL_ODD = '1'
SEL_K = '0'
RESET = '0'
OUTODDPLAINTEXT = Odd Plaintext - Output - 32 bits
Odd Key
INPUT [31:0] = Odd Key - Input - 32 bits
IN_S0 [31:0] = Derived values - S0 - 32 bits
IN_S1 [31:0] = Derived values - S1 - 32 bits
CLK = '1'
SEL_ODD = '1'
SEL_K = '1'
RESET = '0'
OUTODDPLAINTEXT = Odd Key - Output - 32 bits
NOTE: Evenplaintextgenerator and Oddplaintextgenerator can calculate










Figure 39: Block Diagram of the Opselect 
Table 15: Opselect
ENCRYPT          Signal to indicate encryption or decryption process.
F_FCT_A[31:0]    Even output from Function F - 32 bits
F_FCT_B[31:0]    Odd output from Function F - 32 bits
IN_A[31:0]       Input from post_param_even - 32 bits
IN_B[31:0]       Input from post_param_odd - 32 bits
OUT_A[31:0]      Even output - 32 bits
Encryption
ENCRYPT = '1'
F_FCT_A [31:0] = Even output from Function F
F_FCT_B [31:0] = Odd output from Function F
IN_A [31:0] = Input from post_param_even
IN_B [31:0] = Input from post_param_odd
Decryption
ENCRYPT = '0'
F_FCT_A [31:0] = Even output from Function F
F_FCT_B [31:0] = Odd output from Function F
IN_A [31:0] = Input from post_param_even
IN_B [31:0] = Input from post_param_odd
This part basically performs the necessary rotation based on the chosen 
process - either encryption or decryption. 
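The rotations selected by this module can be sketched in Python. The sketch follows the round structure of the Twofish specification (for encryption, one word is XORed with the F output and rotated right by 1 bit, the other is rotated left by 1 bit and then XORed; decryption applies the inverse). Mapping IN_A/F_FCT_A to the first word and IN_B/F_FCT_B to the second is an assumption, as is the function name `opselect`.

```python
MASK32 = 0xFFFFFFFF


def rol32(x, n=1):
    """Rotate a 32-bit word left by n bits."""
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32


def ror32(x, n=1):
    """Rotate a 32-bit word right by n bits."""
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


def opselect(encrypt, f_fct_a, f_fct_b, in_a, in_b):
    """Combine the F-function outputs with the post_param words."""
    if encrypt:
        out_a = ror32(in_a ^ f_fct_a)             # XOR, then rotate right
        out_b = rol32(in_b) ^ (f_fct_b & MASK32)  # rotate left, then XOR
    else:
        out_a = rol32(in_a) ^ (f_fct_a & MASK32)  # inverse of the encrypt A path
        out_b = ror32(in_b ^ f_fct_b)             # inverse of the encrypt B path
    return out_a, out_b


if __name__ == "__main__":
    a, b = opselect(True, 0x12345678, 0x9ABCDEF0, 0xDEADBEEF, 0x0BADF00D)
    print(hex(a), hex(b))
    print([hex(v) for v in opselect(False, 0x12345678, 0x9ABCDEF0, a, b)])
```

Applying the decryption branch to the encryption result with the same F outputs returns the original words, which is why a single module with one ENCRYPT select can serve both directions.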
4.2.9 32-bit Register
[Block diagram: 32-bit REGISTER with inputs DATA[31:0], CLK, CLR, ENABLE and output Q[31:0]]
Figure 40: Block Diagram of the 32-bit Register
Table 16: 32-bit Register
DATA[31:0]    Accepts a 32-bit value
CLK           Clock signal
CLR           Clear signal
ENABLE        Enable signal
Q[31:0]       Output value that has been latched - 32 bits
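A behavioural Python model of such a register is sketched below. It assumes that the register updates once per clock call, that CLR takes priority over ENABLE, and that the clear is synchronous; the source does not state these details, and the class name `Register32` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


class Register32:
    """32-bit register with clear and enable, updated once per clock call."""

    def __init__(self):
        self.q = 0

    def clock(self, data, enable, clr=False):
        """Model one rising clock edge and return the new Q value."""
        if clr:
            self.q = 0                # assumption: clear has priority
        elif enable:
            self.q = data & MASK32    # latch the input when enabled
        return self.q                 # otherwise hold the previous value


if __name__ == "__main__":
    reg = Register32()
    print(hex(reg.clock(0xCAFEBABE, enable=True)))
    print(hex(reg.clock(0x12345678, enable=False)))
```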

















Figure 41: Wrapper for Both Design 1 and Design 2 
Due to the limited number of I/O ports on the Spartan 2 FPGA XC2S200, which
has only 140 I/O ports, a wrapper is needed to download the design into the
FPGA. Providing 128 individual pins each for the plaintext, key and ciphertext
would need 384 pins, more than the FPGA provides, and I/O pins are needed for
the control signals too. Therefore, I have decided to build a RAM for storing
the plaintext, key and ciphertext values. The RAM that I built is shown below.
Table 17: RAM for Storing Data Blocks
ADDRESS    DATA BLOCK (32 bits each)
1011       CIPHERTEXT 3
1010       CIPHERTEXT 2
1001       CIPHERTEXT 1
1000       CIPHERTEXT 0
0111       KEY 3
0110       KEY 2
0101       KEY 1
0100       KEY 0
0011       PLAINTEXT 3
0010       PLAINTEXT 2
0001       PLAINTEXT 1
0000       PLAINTEXT 0
To input or output any data blocks, the necessary address and control signals 
have to be specified. This wrapper could be used for both Design 1 and Design 
2. 
Table 18: Wrapper
ADDRESS[3:0]        Signal to indicate the address location in the RAM
WRITE               Write signal
READ                Read signal
CLOCK               Clock signal
RESET               Reset signal
ENABLE              Enable signal
LOAD_KEY            Load key signal
START               Signal to indicate the start of a process (encryption/decryption)
ENCRYPT             Signal to indicate encryption/decryption
APPEND              Signal to append individual data blocks that have been extracted
INDATABUS[31:0]     32-bit input databus
IDLE                Signal to indicate idle
OUTDATABUS[31:0]    32-bit output databus
Writing Data into the Data Block 






INDATABUS[31:0] = Input data to be written
Reading Data from the Data Block 









NOTE: Appending data means joining 4 data blocks of 32 bits each to obtain a
larger data block of 128 bits. Everything is appended (plaintext, key and
ciphertext).
Example: KEY = KEY3 & KEY2 & KEY1 & KEY0 (128 bits)
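The RAM and the append operation can be illustrated with a small Python model. This is a sketch, not the VHDL wrapper: the class name `WrapperRam` and the table `BASE_ADDRESS` are hypothetical, and the base addresses assume that each 128-bit value occupies four consecutive 32-bit words starting at 0000 (plaintext), 0100 (key) and 1000 (ciphertext), with word 0 at the lowest address.

```python
MASK32 = 0xFFFFFFFF

# Assumed base address of word 0 of each 128-bit value.
BASE_ADDRESS = {"PLAINTEXT": 0b0000, "KEY": 0b0100, "CIPHERTEXT": 0b1000}


class WrapperRam:
    """Twelve 32-bit words addressed by a 4-bit ADDRESS signal."""

    def __init__(self):
        self.words = [0] * 12

    def write(self, address, data):
        """Store one 32-bit word (models WRITE with INDATABUS)."""
        self.words[address] = data & MASK32

    def read(self, address):
        """Return one 32-bit word (models READ with OUTDATABUS)."""
        return self.words[address]

    def append(self, name):
        """Join four words into 128 bits: WORD3 & WORD2 & WORD1 & WORD0."""
        base = BASE_ADDRESS[name]
        value = 0
        for i in reversed(range(4)):
            value = (value << 32) | self.words[base + i]
        return value


if __name__ == "__main__":
    ram = WrapperRam()
    for i, word in enumerate([0x11111111, 0x22222222, 0x33333333, 0x44444444]):
        ram.write(BASE_ADDRESS["KEY"] + i, word)
    print(hex(ram.append("KEY")))
```

With this layout only 32 data pins plus a few address and control pins are needed, instead of one pin per bit of plaintext, key and ciphertext.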
Load Key 
ENABLE= '1' 


























CHAPTER 5: RESULTS 
5.1 SIMULATION 
Two simulations each for Design 1 and Design 2 are performed to confirm the
validity of the results obtained. After implementing the Twofish encryption
algorithm in VHDL, the system is ready for simulation. The system that was
built is capable of performing both encryption and decryption. The official
test vectors given by the authors are used during simulation to verify that
the system works consistently. The whole process is explained below:
5.2 TEST VECTOR 1 - DESIGN 1
The official test vector from the authors is as follows:
KEY        : 00000000000000000000000000000000
PLAINTEXT  : 00000000000000000000000000000000
CIPHERTEXT : 9F589F5CF6122C32B6BFEC2F2AE8C35A (expected)
With this test vector, the simulation is carried out for both encryption and
decryption.
5.2.1 Encryption
The following are the main steps that need to be followed before beginning
the process.
• Load Key
• Start Encryption
5.2.1.1 Load Key
Before we start this process, the following signals need to be set first.
• INKEY = 00000000000000000000000000000000
• CLK = '1'
• USR_LD_KEY = '1'
• RESET = '0'
• USR_START = '0'
• USR_ENCRYPT = '1'
The clock speed used in the simulation is 10 MHz.




Figure 42: Idle - Test Vector 1
The values when idle.
[Simulation waveform: reset, usr_ld_key = 1, usr_start = 0, usr_encrypt, idle, outCiphertext undefined (U), state = load_keya]
Figure 43: Load Key - Test Vector 1
The key has been loaded. This is shown by latch_key = '1' after 50 ns. The
state signal also confirms that the controller is in the Load Key state. After
the key has been loaded, the state becomes idle, waiting for the encryption
process to start at 150 ns.
5.2.1.2 Start Encryption
Before we start this process, the following signals need to be set first.
• INPORT = 00000000000000000000000000000000 (Plaintext)
• CLK = '1'
• USR_LD_KEY = '0'
• RESET = '0'
• USR_START = '1'
• USR_ENCRYPT = '1'
[Simulation waveform: clk, reset, usr_ld_key, usr_start, usr_encrypt, idle, state]
Figure 44: Start Encryption - Test Vector 1
We can see that the encryption starts at 250 ns. This is proven by the change of






[Simulation waveform: usr_encrypt, idle, state]
Figure 45: End of Encryption - Test Vector 1
As we can see above, outCiphertext is obtained at 7550 ns. The encrypted data
obtained is 9F589F5CF6122C32B6BFEC2F2AE8C35A. The output ciphertext matches
the expected value given by the authors.
One observation made is as follows:
Encryption duration = (7550 - 250) ns = 7300 ns = 73 clock cycles
One clock cycle = 100 ns
Expected encryption duration = 7200 ns = 72 clock cycles
Observation: Delay of 100 ns
Justification
This delay can be justified. The whole encryption process was actually
completed at 7450 ns, which corresponds to the expected 72 clock cycles. The
additional delay of 100 ns comes from locking the encrypted data into the
output register. This locking takes one clock cycle and is done during the
idle state, as seen above.
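The cycle count above can be checked with a few lines of Python. The timestamps and the 100 ns clock period are the ones reported in this section; the breakdown of the expected 72 cycles into 4 input whitening states, 16 rounds of 4 states and 4 output whitening states is an interpretation based on the controller description, and the function name `cycles_between` is hypothetical.

```python
CLOCK_PERIOD_NS = 100  # 10 MHz simulation clock


def cycles_between(start_ns, end_ns, period_ns=CLOCK_PERIOD_NS):
    """Number of whole clock cycles between two simulation timestamps."""
    return (end_ns - start_ns) // period_ns


# Expected cycles: input whitening + 16 rounds of 4 states + output whitening.
expected_cycles = 4 + 16 * 4 + 4
observed_cycles = cycles_between(250, 7550)   # start of encryption to output
compute_cycles = cycles_between(250, 7450)    # start of encryption to last state

if __name__ == "__main__":
    print(expected_cycles, compute_cycles, observed_cycles)
```

The one-cycle difference between the observed and expected values is the cycle spent latching the result into the output register.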




The decryption process is exactly the opposite of the encryption process. Some
changes need to be made. The changes are as follows:
KEY        : 00000000000000000000000000000000 (unchanged)
PLAINTEXT  : 9F589F5CF6122C32B6BFEC2F2AE8C35A (encrypted data)
CIPHERTEXT : 00000000000000000000000000000000 (expected value)
The following are the main steps that need to be followed before beginning
the process.
• Load Key
• Start Decryption
5.2.2.1 Load Key
Before we start this process, the following signals need to be set first.
• INKEY = 00000000000000000000000000000000
• CLK = '1'
• USR_LD_KEY = '1'
• RESET = '0'
• USR_START = '0'
• USR_ENCRYPT = '0'
The clock speed used in the simulation is 10 MHz.
[Simulation waveform: reset, usr_ld_key, usr_start, usr_encrypt]
Figure 47: Decryption - Loading Key - Test Vector 1
The key has been loaded at 50 ns.
5.2.2.2 Start of Decryption 
Before we start this process, the following signals need to be set first.
• INPORT = 9F589F5CF6122C32B6BFEC2F2AE8C35A (Plaintext)
• CLK = '1'
• USR_LD_KEY = '0'
• RESET = '0'
• USR_START = '1'
• USR_ENCRYPT = '0'
[Simulation waveform: inport, inkey, clk, reset, usr_start, usr_encrypt, idle, outCiphertext]
Figure 48: Start of Decryption - Test Vector 1








Figure 49: End of Decryption - Test Vector 1
From the above figure, the decrypted data obtained is
00000000000000000000000000000000. This is the expected value. The decrypted
data was obtained at 7550 ns.
One observation made is as follows:
Decryption duration = (7550 - 250) ns = 7300 ns = 73 clock cycles
One clock cycle = 100 ns
Expected decryption duration = 7200 ns = 72 clock cycles
Observation: Delay of 100 ns
Justification
This delay can be justified. The whole decryption process was actually
completed at 7450 ns, which corresponds to the expected 72 clock cycles. The
additional delay of 100 ns comes from locking the decrypted data into the
output register. This locking takes one clock cycle and is done during the
idle state, as seen above.
Therefore it is only fair to say that the total decryption time taken is 73
clock cycles.
From the above simulation, we can make the following observations.
• The encryption cycle takes 73 clock cycles.
• The decryption cycle takes 73 clock cycles.
• Latency is 7300 ns.
5.3 TEST VECTOR 2- DESIGN 1 
KEY        : 137A24CA47CD12BE818DF4D2F4355960
PLAINTEXT  : BCA724A54533C6987E14AA827952F921
CIPHERTEXT : 6B459286F3FFD28D49F15B1581B08E42 (expected)




[Simulation waveform: usr_ld_key <= 0, usr_start <= 1, usr_encrypt <= 1, idle, state, outCiphertext undefined (U)]
Figure 50: Initialization of Input Values - Test Vector 2
[Simulation waveform: inport = BCA724A54533C6987E14AA827952F921, inkey = 137A24CA47CD12BE818DF4D2F4355960, outCiphertext = 6B459286F3FFD28D49F15B1581B08E42]
Figure 51: End of Encryption - Test Vector 2
As can be seen, the output encrypted value tallies with the expected value.
This shows that the system is behaving properly.
NOTE: Timing analysis was not considered for the second test vector.
Only the output was considered.
5.3.2 Decryption
Decryption is exactly the opposite of encryption. By using the same key and
feeding in the output value of the encryption process, the input of the
encryption process is obtained as the output of the decryption process.
KEY        : 137A24CA47CD12BE818DF4D2F4355960
PLAINTEXT  : 6B459286F3FFD28D49F15B1581B08E42
CIPHERTEXT : BCA724A54533C6987E14AA827952F921 (expected)
[Simulation waveform: inport = 6B459286F3FFD28D49F15B1581B08E42, inkey = 137A24CA47CD12BE818DF4D2F4355960, usr_ld_key = 1, state = idle, outCiphertext undefined (U)]
Figure 52: Initialization of Input Values - Test Vector 2
[Simulation waveform. Signals shown: inport, inkey, clk, reset, usr_ld_key, usr_start, idle, state, outCiphertext.]
Figure 53: End of Decryption - Test Vector 2
As can be observed, the output values of the decryption process tally with the
expected output values. One interesting point to note here is that, with a given key,
the decryption process is exactly the inverse of the encryption process.
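The inverse relationship noted above is a general property of Feistel networks: decryption reuses the same datapath with the subkeys applied in reverse order. The following is a minimal sketch of that property only. The round function and the subkey values are hypothetical stand-ins, and the whitening and one-bit rotations used by Twofish are deliberately left out, so this is not the Twofish cipher itself.

```python
MASK32 = 0xFFFFFFFF


def toy_round_function(half: int, subkey: int) -> int:
    """Stand-in round function; this is NOT the Twofish F-function."""
    return ((half * 0x9E3779B1) ^ subkey) & MASK32


def feistel(block, subkeys):
    """Run a balanced Feistel network over a (left, right) pair of 32-bit words."""
    left, right = block
    for subkey in subkeys:
        left, right = right, left ^ toy_round_function(right, subkey)
    return right, left  # undo the swap performed by the last round


def encrypt(block, subkeys):
    return feistel(block, subkeys)


def decrypt(block, subkeys):
    # Same datapath as encryption, with the subkeys in reverse order.
    return feistel(block, list(reversed(subkeys)))


if __name__ == "__main__":
    subkeys = [0x11111111, 0x22222222, 0x33333333, 0x44444444]  # hypothetical values
    plaintext = (0x01234567, 0x89ABCDEF)
    ciphertext = encrypt(plaintext, subkeys)
    assert decrypt(ciphertext, subkeys) == plaintext
    print("round trip ok")
```

Because only the subkey order changes, a hardware design can share one datapath between the two modes, which is consistent with a single mode-select input such as usr_encrypt.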
5.4 TEST VECTOR 1 - DESIGN 2
Design 2 is also capable of performing both encryption and decryption. The
process is explained below.
The official test vector from the authors of Twofish is as follows:
KEY : 00000000000000000000000000000000 
PLAINTEXT : 00000000000000000000000000000000 
CIPHERTEXT: 9F589F5CF6122C32B6BFEC2F2AE8C35A (expected) 
With this test vector, the simulation is carried out for both encryption and
decryption.
5.4.1 Encryption
The main steps that need to be followed are:
• Load Key
• Start Encryption
5.4.1.1 Load Key
Before we start this process, the following signals need to be set first.
• INKEY = 00000000000000000000000000000000
• CLK = '1'
• USR_LD_KEY = '1'
• RESET = '0'
• USR_START = '0'
• USR_ENCRYPT = '1'
The clock speed that is used in the simulation is 10 MHz.
[Simulation waveform at 0 ps. Signals shown: inport, inkey, clk, reset, usr_ld_key, usr_start, usr_encrypt, idle.]
Figure 54: Idle - Test Vector 1
The figure above shows the signal values when the design is idle.
[Simulation waveform. Signals shown: inport, inkey, clk, reset, usr_ld_key, usr_start, usr_encrypt, idle, outCiphertext (undefined), and the four state signals changing from my_idle to load_keya.]
Figure 55: Load Key - Test Vector 1
The key has been loaded. This is shown by the change of the 4 state signals
to load_keya at 50 ns.
5.4.1.2 Start Encryption 
Before we start this process, the following signals need to be set first. 
• INPORT = 00000000000000000000000000000000 (Plaintext)
• CLK = '1'
• USR_LD_KEY = '0'
• RESET = '0'
• USR_START = '1'
• USR_ENCRYPT = '1'
[Simulation waveform with the cursor at 250 ns. Signals shown: inport, inkey, reset, usr_ld_key, usr_start, usr_encrypt, idle, outCiphertext (undefined), and state0 to state3 changing from my_idle to compute_k0, compute_k1, compute_k2 and compute_k3.]
Figure 56: Start of Encryption - Test Vector 1
We can see that the encryption starts at 250 ns. This is shown by the change of
the 4 states to compute_k0, compute_k1, compute_k2 and compute_k3,
indicating that K0, K1, K2 and K3 are now being computed.
[Simulation waveform with the cursor at 2050 ns. Signals shown: inport, inkey, clk, reset, usr_ld_key, usr_start, outCiphertext, idle, state0 to state3.]
Figure 57: End of Encryption - Test Vector 1
As we can see above, outCiphertext is obtained at 2050 ns. The encrypted
data obtained is 9F589F5CF6122C32B6BFEC2F2AE8C35A, which matches the
expected value given by the authors.
One observation made is as follows:
Encryption duration = (2050 - 250) ns = 1800 ns = 18 clock cycles
One clock cycle = 100 ns
Expected encryption duration = 1800 ns = 18 clock cycles
Observation: Matches as expected
The measured encryption duration matches the expected duration exactly.
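The cycle count above follows directly from the two timestamps read off the waveform and the 10 MHz simulation clock. A small sketch of that arithmetic is given below; the function name is illustrative and not part of the design.

```python
def cycles_between(start_ns: int, end_ns: int, clock_hz: int) -> int:
    """Return the number of whole clock cycles between two simulation timestamps."""
    period_ns = 1_000_000_000 // clock_hz  # 10 MHz gives a 100 ns period
    elapsed_ns = end_ns - start_ns
    if elapsed_ns % period_ns != 0:
        raise ValueError("timestamps are not a whole number of clock periods apart")
    return elapsed_ns // period_ns


if __name__ == "__main__":
    # Encryption starts at 250 ns and the ciphertext appears at 2050 ns.
    print(cycles_between(250, 2050, 10_000_000))  # prints 18
```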
5.4.2 Decryption 
The decryption process is the inverse of the encryption process. Some
changes need to be made, as follows:
KEY : 00000000000000000000000000000000 (unchanged)
PLAINTEXT : 9F589F5CF6122C32B6BFEC2F2AE8C35A (encrypted data)
CIPHERTEXT: 00000000000000000000000000000000 (expected value)
The main steps that need to be followed are:
• Load Key
• Start Decryption
5.4.2.1 Load Key
Before we start this process, the following signals need to be set first.
• INKEY = 00000000000000000000000000000000
• CLK = '1'
• USR_LD_KEY = '1'
• RESET = '0'
• USR_START = '0'
• USR_ENCRYPT = '0'
The clock speed that is used in the simulation is 10 MHz.
[Simulation waveform. Signals shown: inport, inkey, clk, reset, usr_ld_key, usr_start, outCiphertext (undefined), and the state signals at my_idle.]
Figure 58: Initialization of Decryption - Test Vector 1
[Simulation waveform with the cursor at 50 ns. Signals shown include inport, clk, outCiphertext (undefined), and the state signals changing from my_idle to load_keya.]
Figure 59: Decryption - Loading Key - Test Vector 1
The key has been loaded at 50 ns.
5.4.2.2 Start of Decryption 
Before we start this process, the following signals need to be set first. 
• INPORT = 9F589F5CF6122C32B6BFEC2F2AE8C35A (Encrypted data)
• CLK = '1'
• USR_LD_KEY = '0'
• RESET = '0'
• USR_START = '1'
• USR_ENCRYPT = '0'
[Simulation waveform with the cursor at 250 ns. Signals shown include inport, inkey, clk, usr_ld_key, usr_start, and the state signals changing to the compute_k states.]
Figure 60: Start of Decryption - Test Vector 1
As can be seen above, the decryption process started at 250 ns.
[Simulation waveform. Signals shown: inport, inkey, clk, reset, usr_ld_key, usr_encrypt, idle, outCiphertext, state0 to state3 (my_idle).]
Figure 61: End of Decryption - Test Vector 1
From the above figure, the decrypted data obtained is
00000000000000000000000000000000. This is the expected value. The
decrypted data was obtained at 2050 ns.
One observation made is as follows:
Decryption duration = (2050 - 250) ns = 1800 ns = 18 clock cycles
One clock cycle = 100 ns
Expected decryption duration = 1800 ns = 18 clock cycles
Observation: Matches as expected
The measured decryption duration matches the expected duration exactly.
From the above simulation, we can make the following observations:
• The encryption cycle takes 18 clock cycles.
• The decryption cycle takes 18 clock cycles.
• Latency is 1800 ns.
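The observations above can be turned into a rough throughput figure. The sketch below assumes that one 128-bit block is processed at a time, with no overlap between consecutive blocks, and that the 18 cycles cover the whole operation. These assumptions are made for illustration only, and the figure is specific to the 10 MHz simulation clock.

```python
BLOCK_BITS = 128  # Twofish block size


def throughput_mbps(latency_cycles: int, clock_hz: int) -> float:
    """Estimate throughput in Mbit/s when blocks are processed one after another."""
    latency_s = latency_cycles / clock_hz
    return BLOCK_BITS / latency_s / 1e6


if __name__ == "__main__":
    # 18 cycles at 10 MHz is 1800 ns per block, roughly 71 Mbit/s.
    print(round(throughput_mbps(18, 10_000_000), 1))
```

A faster clock on the target device would scale this estimate proportionally, provided the cycle count stays at 18.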
5.5 TEST VECTOR 2 - DESIGN 2
KEY : 137A24CA47CD12BE818DF4D2F4355960
PLAINTEXT : BCA724A54533C6987E14AA827952F921
CIPHERTEXT: 6B459286F3FFD28D49515B1581B08E42 (expected)

Figure 62: Initialization of Input Values - Test Vector 2
[Simulation waveform. Signals shown: inport, inkey, clk, reset, usr_ld_key, usr_encrypt, idle, outCiphertext, state0 to state3.]
Figure 63: End of Encryption - Test Vector 2
As can be seen, the encrypted output value tallies with the expected value. This
indicates that the system is behaving correctly for this test vector.
NOTE: Timing analysis was not considered for the second test vector;
only the output was considered.
5.5.2 Decryption 
Decryption is the inverse of encryption. By using the same key
and feeding in the output value of the encryption process, the input of the
encryption process is obtained as the output of the decryption process.
KEY : 137A24CA47CD12BE818DF4D2F4355960
PLAINTEXT : 6B459286F3FFD28D49515B1581B08E42
CIPHERTEXT: BCA724A54533C6987E14AA827952F921 (expected)
[Simulation waveform. Signals shown include inport, inkey, clk, reset, usr_ld_key, idle, outCiphertext (undefined), and state0 to state3 changing from my_idle to load_keya.]
Figure 64: Initialization of Input Values - Test Vector 2
[Simulation waveform. Signals shown include inport, inkey, clk, reset, usr_ld_key, usr_start, usr_encrypt, idle, and the state signals changing from my_idle to the compute_k states.]
Figure 65: End of Decryption - Test Vector 2
As can be observed, the output values of the decryption process tally with the
expected output values. One interesting point to note here is that, with a given key,
the decryption process is exactly the inverse of the encryption process.
NOTE: Timing analysis was not considered for the second test vector;
only the output was considered.
In general, we can conclude that both designs produced the expected results for
the test vectors used. Furthermore, due to the heavy pipelining techniques adopted,
the F-functions of both Design 1 and Design 2 were never idle, which shows that
both designs are highly optimized. For both the encryption and decryption processes,
the output that was obtained matched the test vectors given by the
authors.
5.6 DESIGN IMPLEMENTATION 
Once the simulation is completed, the design is ready to be implemented. The
implementation process is basically divided into 3 parts, namely:
• Translate
This is the process of reading NGO files, reading component
libraries for design expansion, annotating constraints to design files, checking
timing specifications and constraints, and checking expanded designs.
• Map
This is the process of mapping a design to the available resources. It states which
available resources are taken up by a given design.
• Place and Route
This is the process of placing and routing a given design onto the FPGA.
A good practice during this process is to specify the pins of the FPGA for a
given function, e.g. Pin 77 for CLOCK, instead of letting the tools decide. This is
very useful in the later stages when the device is used for application or
testing purposes. The Spartan 2 XC2S200-5PQ208 FPGA chip found in the UTP lab
was assembled by Burch Electronics. This company attached the FPGA to a
custom-made motherboard that it sells under the name B3-Spartan 2+.
Therefore, it is very important for me to go through the datasheet before assigning the
pins for the inputs and outputs of my design. The pin assignments for my designs are as
follows:
Table 19: Pin Assignment to the Corresponding I/O
PIN          PIN NUMBER
Input[0]     187
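Pin assignments of this kind are commonly passed to the Xilinx tools as location constraints in a user constraints file (UCF). The sketch below generates such lines for the two pins mentioned in the text, assuming a UCF-based flow. The net names clock and input<0> are hypothetical and would have to match the port names of the actual top-level design.

```python
def ucf_line(net: str, pin: int) -> str:
    """Format one UCF location constraint for a PQ208 package pin."""
    return f'NET "{net}" LOC = "P{pin}";'


if __name__ == "__main__":
    # Hypothetical net names; pin numbers taken from the text and Table 19.
    assignments = {"clock": 77, "input<0>": 187}
    for net, pin in assignments.items():
        print(ucf_line(net, pin))
```

Keeping the assignments in one table makes it easier to check them against the board datasheet before the place and route step.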
Figure 66: B3-SPARTAN2+ Board 
5.7 GENERATE PROGRAMMING FILE & DOWNLOADING 
The final step was to generate the programming file which is the ASCII Configuration 
file that has an extension of .rbt. Once that was done, the downloading process was 
performed by running the BEDLOAD utility. After setting the necessary parallel port 
settings, the design was downloaded. 
CHAPTER 6: DISCUSSION

Figure 67: A View of a Complete Single Round F-Function of the Original Design as Proposed
by the Author
*Assumption: All keys must be calculated on the fly and not pre-computed.
The above design shows that, for a complete single round F-function, we need to
have the following:
• Even Key Generator
• Odd Key Generator
• Even Plaintext Generator
• Odd Plaintext Generator
The above design, when directly implemented in hardware, could have the
following sequence.
Table 20: Generated Sequence of Values Based on Design 1

Time Unit          Even Key Generator   Odd Key Generator    Even Plaintext Generator
(at the end of     (Generated value)    (Generated value)    (Generated value)
this time unit)
1                  K0                   K1
2                  K2                   K3
3                  K8                   K9                   P0 even
4                  K10                  K11                  P1 even
5                  K12                  K13                  P2 even
6                  K14                  K15                  P3 even
7                  K16                  K17                  P4 even
8                  K18                  K19                  P5 even
9                  K20                  K21                  P6 even
10                 K22                  K23                  P7 even
11                 K24                  K25                  P8 even
12                 K26                  K27                  P9 even
13                 K28                  K29                  P10 even
14                 K30                  K31                  P11 even
15                 K32                  K33                  P12 even
16                 K34                  K35                  P13 even
17                 K36                  K37                  P14 even
18                 K38                  K39                  P15 even
19                 K4                   K5
20                 K6                   K7
Estimated latency for processing a 128-bit block of data = 20 clock cycles
Estimated latency for processing a 128-bit block of data with wrapper = 20 clock cycles
NOTE: The wrapper does not add any time delay because it uses a pipelining
technique.
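The 20-cycle schedule of Table 20 has a regular structure: two time units for K0 to K3, sixteen time units that each produce one key pair together with one plaintext value, and two final time units for K4 to K7. The sketch below regenerates the rows of the table from that pattern; the function name is illustrative only.

```python
def original_design_schedule():
    """Rebuild the rows of Table 20 as (even key, odd key, even plaintext) tuples."""
    rows = [("K0", "K1", None), ("K2", "K3", None)]
    for r in range(16):  # one time unit per round
        rows.append((f"K{8 + 2 * r}", f"K{9 + 2 * r}", f"P{r} even"))
    rows += [("K4", "K5", None), ("K6", "K7", None)]
    return rows


if __name__ == "__main__":
    for unit, row in enumerate(original_design_schedule(), start=1):
        print(unit, *[value for value in row if value is not None])
```

The length of the returned list corresponds to the estimated latency in clock cycles.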
6.2 DESIGN 1 
Because the design goal was to minimize the hardware, the structure of the cipher
had to be modified so that the same h-function could be used for all computations.
The following diagram shows the modified F-function that was designed to meet this
requirement:
Figure 68: Generalized Modified F-Function
As can be observed, the input and a rotated version of the input are multiplexed so
that both can be presented, one after the other, at the input of a single h-function.
Moreover, in order to compute the result of the PHT, two registers had to be added to
store the value of the first input while the second was being computed. These two
registers thus act as pipeline registers. The other two multiplexers are used to choose
between the path used for computing a key and the path used to encrypt or decrypt
data. This modified F-Function is capable of performing the following:
• Even Key Generation
• Odd Key Generation
• Even Plaintext Generation
• Odd Plaintext Generation
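The time multiplexing described above can be sketched in software. The following is a minimal Python model, not the VHDL of this project. The name `h_unit` is a hypothetical stand-in for the shared h-function block, and `shared_f` is an illustrative name. The 8-bit left rotation of the second word and the PHT equations are taken from the published Twofish specification rather than from this chapter.

```python
MASK32 = 0xFFFFFFFF  # all arithmetic is modulo 2^32


def rol32(x, n):
    """Rotate a 32-bit word left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def pht(a, b):
    """Pseudo-Hadamard Transform: a' = a + b, b' = a + 2b (mod 2^32)."""
    return (a + b) & MASK32, (a + 2 * b) & MASK32


def shared_f(r0, r1, k_even, k_odd, h_unit):
    """Evaluate one F-function using a single h-function unit twice.

    h_unit is a placeholder callable (hypothetical); in hardware it would be
    the key-dependent h-function block.
    """
    reg_t0 = h_unit(r0)          # step 1: first input, result held in a register
    t1 = h_unit(rol32(r1, 8))    # step 2: rotated input through the same unit
    a, b = pht(reg_t0, t1)       # PHT needs both results, hence the register
    return (a + k_even) & MASK32, (b + k_odd) & MASK32


if __name__ == "__main__":
    # Toy run with an identity function in place of the real h-function.
    print(shared_f(1, 0x01000000, 0, 0, lambda x: x))
```

The register in step 1 plays the role of the pipeline registers mentioned above: without it, the first result would be lost while the shared unit processes the rotated input.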
This is a revolutionary design because, instead of using 4 separate parts, I have
combined them into a single module which can perform all 4 functions. When
implemented directly in hardware, the design has the following sequence.
Table 21: Generated Sequence of Values Based on Design 1
Time   Generated    Time   Generated    Time   Generated    Time   Generated
Unit   Value        Unit   Value        Unit   Value        Unit   Value
1      K0           19     P3 even      37     K24          55     P12 even
2      K1           20     P3 odd       38     K25          56     P12 odd
3      K2           21     K16          39     P8 even      57     K34
4      K3           22     K17          40     P8 odd       58     K35
5      K8           23     P4 even      41     K26          59     P13 even
6      K9           24     P4 odd       42     K27          60     P13 odd
7      P0 even      25     K18          43     P9 even      61     K36
8      P0 odd       26     K19          44     P9 odd       62     K37
9      K10          27     P5 even      45     K28          63     P14 even
10     K11          28     P5 odd       46     K29          64     P14 odd
11     P1 even      29     K20          47     P10 even     65     K38
12     P1 odd       30     K21          48     P10 odd      66     K39
13     K12          31     P6 even      49     K30          67     P15 even
14     K13          32     P6 odd       50     K31          68     P15 odd
15     P2 even      33     K22          51     P11 even     69     K4
16     P2 odd       34     K23          52     P11 odd      70     K5
17     K14          35     P7 even      53     K32          71     K6
18     K15          36     P7 odd       54     K33          72     K7
Estimated latency for processing a block of data of 128 bits without wrapper = 72 clock cycles
Estimated latency for processing a block of data of 128 bits with wrapper = 72 clock cycles
NOTE: The wrapper does not add any delay because it uses a pipelining technique.
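The 72-cycle figure follows from the order in which the single shared module produces its values. As a minimal sketch (the function name `design1_schedule` is illustrative and not part of the project's VHDL), the sequence of Table 21 can be regenerated and counted as follows:

```python
def design1_schedule(rounds=16):
    """Return the value produced at each time unit (index 0 is time unit 1)."""
    seq = ["K0", "K1", "K2", "K3"]            # first four subkeys
    for r in range(rounds):
        seq += [
            f"K{8 + 2 * r}",                  # even round subkey
            f"K{9 + 2 * r}",                  # odd round subkey
            f"P{r} even",                     # even plaintext value of round r
            f"P{r} odd",                      # odd plaintext value of round r
        ]
    seq += ["K4", "K5", "K6", "K7"]           # last four subkeys
    return seq


if __name__ == "__main__":
    schedule = design1_schedule()
    for t, value in enumerate(schedule, start=1):
        print(t, value)
    print("latency in clock cycles:", len(schedule))
```

Each round costs four time units because one module computes two subkeys and two plaintext values in turn, so the total is 4 + 16 x 4 + 4 = 72.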
When the above design was implemented on the Spartan 2 - XC2S200-5PQ208, the
following results were obtained:
Table 22: Output Performance of Design 1 on Spartan 2 - XC2S200-5PQ208
Estimated Frequency   19.638 MHz @ Period = 50.921 ns
Latency               3.6663e-6 s
Gate Count            26 318 gates
Throughput            34.9126 Mbits/s
Throughput/gate       1326.5667 bits/s per gate
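The figures in this table are related by simple arithmetic: latency is the cycle count times the clock period, throughput is the 128-bit block size divided by the latency, and throughput/gate is the throughput divided by the gate count. A minimal sketch, assuming these relations (the function name `metrics` is illustrative):

```python
def metrics(period_ns, cycles, gates, block_bits=128):
    """Return (latency in s, throughput in bits/s, throughput per gate)."""
    latency_s = cycles * period_ns * 1e-9
    throughput_bps = block_bits / latency_s
    return latency_s, throughput_bps, throughput_bps / gates


if __name__ == "__main__":
    # Design 1 on Spartan 2: 72 cycles at a 50.921 ns period, 26 318 gates.
    latency, throughput, per_gate = metrics(50.921, 72, 26318)
    print(f"latency         = {latency:.4e} s")
    print(f"throughput      = {throughput / 1e6:.4f} Mbits/s")
    print(f"throughput/gate = {per_gate:.1f} bits/s per gate")
```

Note that dividing a throughput of about 34.9 Mbits/s by 26 318 gates gives a quotient of about 1327 in bits per second per gate, not megabits.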
6.3 DESIGN 2 
Even Key Generator (Block 1)
Odd Key Generator (Block 2)
Even Modified F-Function (Block 3)
(Capable of performing Even Key Generation
and Even Plaintext Generation)
Odd Modified F-Function (Block 4)
(Capable of performing Odd Key Generation
and Odd Plaintext Generation)
Figure 69: A View of a Complete Single Round F-Function of Design 2
[Block diagram: Even Key Generator, Odd Key Generator, Even Plaintext Generator
(Block 3) and Odd Plaintext Generator (Block 4), each built around an h-function
block, with outputs including EVEN_KEY, EVEN_PLAINTEXT and ODD_PLAINTEXT.]
Figure 70: Generalized F-Function of Design 2
The design above shows that Design 2 requires the following:
• Even Key Generator
• Odd Key Generator
• Even Modified F-Function (capable of performing Even Key Generation
and Even Plaintext Generation)
• Odd Modified F-Function (capable of performing Odd Key Generation
and Odd Plaintext Generation)
This design looks quite similar to the original design, which is the design proposed by
the author. The difference is that, instead of dedicated Even and Odd Plaintext
Generators that can only calculate plaintext values, I decided to use Modified
F-Functions that can calculate even and odd keys as well as even and odd plaintext
values. This idea was borrowed from Design 1. Further modification was needed,
however, because the Modified F-Function in Design 1 calculates all even and odd
keys and all even and odd plaintext values, which would be inefficient and a waste of
hardware resources in Design 2. I therefore split the function of Design 1 into smaller
sub-modules that could be incorporated into this design. The result is an Even
Modified F-Function, which calculates even key and plaintext values, and an Odd
Modified F-Function, which calculates odd key and plaintext values.
This is a far more efficient design because it combines the best of the Original
Design (Author's Design) and Design 1. The major intentions are as follows:
• A more efficient design.
• A very small latency.
• A very high throughput.
• A very high throughput/gate.
• A very minimal increase in hardware resources.
When implemented directly in hardware, this design has the following sequence.
Table 23: Generated Sequence of Values Based on Design 2
Time Unit          Even Key Generator   Odd Key Generator
(at the end of
this time unit)
1                  K0                   K1
2                  K8                   K9
3                  K10                  K11
4                  K12                  K13
5                  K14                  K15
6                  K16                  K17
7                  K18                  K19
8                  K20                  K21
9                  K22                  K23
10                 K24                  K25
11                 K26                  K27
12                 K28                  K29
13                 K30                  K31
14                 K32                  K33
15                 K34                  K35
16                 K36                  K37
17                 K38                  K39
18                 K4                   K5
Estimated latency for processing a block of data of 128 bits without wrapper = 18 clock cycles
Estimated latency for processing a block of data of 128 bits with wrapper = 18 clock cycles






















NOTE: The wrapper does not add any delay because it uses a pipelining technique.
When the above design was implemented on the Spartan 2 - XC2S200-5PQ208, the
following results were obtained:
Table 24: Output Performance of Design 2 on Spartan 2 - XC2S200-5PQ208
Estimated Frequency   8.0107 MHz @ Period = 124.842 ns
Latency               2.2472e-6 s
Gate Count            32 616 gates
Throughput            56.9609 Mbits/s
Throughput/gate       1746.4097 bits/s per gate
NOTE: Spartan 2 - XC2S200-5PQ208 is available in UTP Compute System 
Research Laboratory. Design has been successfully downloaded into the board. 
6.4 PERFORMANCE DIFFERENCE BETWEEN DESIGN 1 AND 
DESIGN 2 ON SPARTAN 2- XC2S200-5PQ208 
Design 1 is taken as the base value.
Table 25: Performance Difference of Design 1 and Design 2 on Spartan 2 - XC2S200-5PQ208
                                    Design 1          Design 2          % Difference
Estimated Frequency                 19.638 MHz        8.0107 MHz        -59.2%
Period                              50.921 ns         124.842 ns        +145.2%
Latency                             3.6663e-6 s       2.2472e-6 s       -38.7%
Gate Count                          26 318 gates      32 616 gates      +23.9%
Throughput                          34.9126 Mbits/s   56.9609 Mbits/s   +63.2%
Throughput/gate (bits/s per gate)   1326.5667         1746.4097         +31.6%
From the table above, we can see that Design 2 offers a significant performance
improvement. This is because it keeps all modules busy rather than leaving some of
them idle. The most significant improvements are the throughput, which increased by
63.2%, and the latency, which fell by 38.7%. Throughput/gate also increased
significantly, to 1746.4097 bits/s per gate from 1326.5667 bits/s per gate.
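The percentage differences are computed relative to the base design. A minimal sketch of that calculation, using the Spartan 2 figures reported for the two designs (the helper name `pct_diff` and the dictionary layout are illustrative):

```python
def pct_diff(base, new):
    """Percentage change of `new` relative to `base`."""
    return (new - base) / base * 100.0


# Reported values on Spartan 2 - XC2S200-5PQ208 (Design 1 is the base).
design1 = {"frequency_mhz": 19.638, "period_ns": 50.921, "latency_us": 3.6663,
           "gates": 26318, "throughput_mbps": 34.9126, "tp_per_gate": 1326.5667}
design2 = {"frequency_mhz": 8.0107, "period_ns": 124.842, "latency_us": 2.2472,
           "gates": 32616, "throughput_mbps": 56.9609, "tp_per_gate": 1746.4097}

if __name__ == "__main__":
    for name in design1:
        print(f"{name:16s} {pct_diff(design1[name], design2[name]):+.1f}%")
```

The same helper applies unchanged to the Spartan 2 versus Spartan 3 comparisons, with the Spartan 2 figures as the base.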
6.5 PERFORMANCE OF DESIGN 1 AND DESIGN 2 WHEN 
IMPLEMENTED ON SPARTAN 3 -XC3S400-5FG456 
NOTE: The Spartan 3 - XC3S400-5FG456 is a newer-generation FPGA of the Spartan
family. The designs were implemented on this FPGA for benchmarking purposes. This
FPGA is not available in UTP.
The results obtained are as follows: 
Table 26: Performance Difference of Design 1 and Design 2 on Spartan 3 - XC3S400-5FG456
                                    Design 1           Design 2           % Difference
Estimated Frequency                 42.739 MHz         16.217 MHz         -62.06%
Period                              23.398 ns          61.663 ns          +163.5%
Latency                             1.684656e-6 s      1.1099e-6 s        -34.1%
Gate Count                          27 497 gates       33 044 gates       +20.17%
Throughput                          75.9842 Mbits/s    115.3257 Mbits/s   +51.8%
Throughput/gate (bits/s per gate)   2763.3644          3490.0649          +26.3%
Overall, we can see that both Design 1 and Design 2 performed considerably better
when implemented on the Spartan 3 - XC3S400-5FG456. The comparison between the
Spartan 2 and Spartan 3 results is discussed below.
6.6 PERFORMANCE COMPARISON OF DESIGN 1 WHEN 
IMPLEMENTED ON SPARTAN 2- XC2S200-5PQ208 & SPARTAN 
3 -XC3S400-5FG456 
The Spartan 2 implementation is taken as the base.
Table 27: Performance Difference of Design 1 on Spartan 2 - XC2S200-5PQ208 & Spartan 3 -
XC3S400-5FG456
                                    Spartan 2         Spartan 3         % Difference
Estimated Frequency                 19.638 MHz        42.739 MHz        +117.6%
Period                              50.921 ns         23.398 ns         -54.1%
Latency                             3.6663e-6 s       1.684656e-6 s     -54.1%
Gate Count                          26 318 gates      27 497 gates      +4.5%
Throughput                          34.9126 Mbits/s   75.9842 Mbits/s   +117.6%
Throughput/gate (bits/s per gate)   1326.5667         2763.3644         +108.3%
Significant performance improvements were achieved in frequency, latency,
throughput and throughput/gate. At a glance, performance roughly doubled.
Frequency increased by 117.6% to 42.739 MHz, and throughput more than doubled,
to 75.9842 Mbits/s from 34.9126 Mbits/s.
6.7 PERFORMANCE COMPARISON OF DESIGN 2 WHEN
IMPLEMENTED ON SPARTAN 2 - XC2S200-5PQ208 & SPARTAN
3 - XC3S400-5FG456
The Spartan 2 implementation is taken as the base.
Table 28: Performance Difference of Design 2 on Spartan 2 - XC2S200-5PQ208 & Spartan 3 -
XC3S400-5FG456
                                    Spartan 2         Spartan 3          % Difference
Estimated Frequency                 8.0107 MHz        16.217 MHz         +102.4%
Period                              124.842 ns        61.663 ns          -50.6%
Latency                             2.2472e-6 s       1.1099e-6 s        -50.6%
Gate Count                          32 616 gates      33 044 gates       +1.3%
Throughput                          56.9609 Mbits/s   115.3257 Mbits/s   +102.5%
Throughput/gate (bits/s per gate)   1746.4097         3490.0649          +99.8%
Design 2 also showed an impressive performance improvement. Frequency more
than doubled, to 16.217 MHz from 8.0107 MHz. Throughput and throughput/gate
also improved by approximately 100%.
6.8 FACTORS CAUSING DIFFERENCES BETWEEN ONE 
IMPLEMENTATION AND ANOTHER IMPLEMENTATION OF 
FPGA 
There are many factors that contribute to differences in performance between one
FPGA implementation and another. Among them are the following:
• Design
A design built carefully with all aspects of the target device in mind
generally performs better, because the designer has selected a particular
target device and tailored the design to the architecture of that FPGA.
This includes considering the number of slices, flip-flops, latches, lookup
tables, etc. A general design would usually not perform as well as a design
that is tailored to a specific target device. In addition, the smaller the
number of input/output blocks used, the better the performance.
• Target Device
The target device is also very important. As shown above, the Spartan 3
performed about 2 times better than the Spartan 2. Greater hardware
resources allow the design to be optimized better. Usually, once a design
takes up about 80% of the hardware resources of a device, optimization
becomes hard and performance degrades. This is evident in the Spartan 2
implementation above, where 99% of the slices were used; as a result,
optimization was limited and performance degraded. In the Spartan 3
implementation, 70% of the slices were used, so the Spartan 3 had enough
resources left to optimize the design further, especially during place
and route.
• Speed Grade
A higher speed grade gives better performance. That means speed grade 6
is better than speed grade 5, and so on. In principle, a higher speed grade
makes the FPGA better suited to high-speed applications.
• Architecture of FPGA
The architecture of the FPGA is also very important. Some architectures
are meant for low-speed applications, others for high-speed processing,
etc. Choosing the appropriate architecture and family of FPGA is very
important.
• Synthesizer
Synthesizers also play a very important role. In this project, I used XST
(Xilinx Synthesis Technology). The best-known synthesis tool currently
recommended is Synplify, which is said to be capable of optimizing 30%
more than most tools on the market.
Besides that, there is a wide understanding among FPGA programmers that Xilinx
FPGAs can be run at frequencies about 20% higher than those reported by
synthesizers. That means about 20% more performance could be obtained than is
actually reported.
6.9 PERFORMANCE COMPARISON 
One interesting point to consider is the performance of my designs as compared to 
other designs [5, 8, 10]. The table below shows the ranking of my designs. 
Table 29: Performance Comparison With Other Implementations
Method                Designer         Device Technology          Gate Count   Throughput       Throughput per Gate
Hardware Evaluation   NSA              0.5 um                     945993       2.27 Gbps        2403.32
Hardware Evaluation   Mitsubishi       0.35 um                    431857       394.08 Mbps      912.52
FPGA                  Kris Gaj         Xilinx XC4028XL            24800        90.9 Mbps        3665
FPGA                  -                Xilinx XCV1000             857560       1.59 Gbps        1854.1
FPGA                  Viktor Fischer   Altera Flex10K             41093.75     80.3 Mbps        1954
FPGA (Design 1)       Anand A Raja     Spartan 2 - XC2S200-5PQ208 26 318       34.9126 Mbps     1326.5667
FPGA (Design 2)       Anand A Raja     Spartan 2 - XC2S200-5PQ208 32 616       56.9609 Mbps     1746.4097
*FPGA (Design 1)      Anand A Raja     Spartan 3 - XC3S400-5FG456 27 497       75.9842 Mbps     2763.3644
*FPGA (Design 2)      Anand A Raja     Spartan 3 - XC3S400-5FG456 33 044       115.3257 Mbps    3490.0649








Note: Design requirements differ slightly between one implementation and another, e.g. encryptor/decryptor, full
keying/zero keying, ASIC/FPGA, etc.
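As a rough illustration of the ranking, the entries can be sorted by their listed throughput per gate. This is a minimal sketch: the labels are shorthand for the table rows (the unnamed Xilinx XCV1000 entry is labelled by its device), and, as the note above says, the design requirements differ, so the order is indicative only.

```python
# (label, listed throughput per gate) taken from the comparison table.
entries = [
    ("NSA, 0.5 um", 2403.32),
    ("Mitsubishi, 0.35 um", 912.52),
    ("Kris Gaj, XC4028XL", 3665.0),
    ("Xilinx XCV1000", 1854.1),
    ("Viktor Fischer, Flex10K", 1954.0),
    ("Design 1, Spartan 2", 1326.5667),
    ("Design 2, Spartan 2", 1746.4097),
    ("Design 1, Spartan 3", 2763.3644),
    ("Design 2, Spartan 3", 3490.0649),
]

# Highest throughput per gate first.
ranked = sorted(entries, key=lambda e: e[1], reverse=True)

if __name__ == "__main__":
    for rank, (label, per_gate) in enumerate(ranked, start=1):
        print(f"{rank}. {label:26s} {per_gate:10.2f}")
```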
From the table above, the performance of both Design 1 and Design 2 as
implemented on the Spartan 2 - XC2S200-5PQ208 is comparable with that of other
researchers. Both of my designs were implemented with zero keying, unlike some of
the other designs. Furthermore, Design 1 and Design 2 can perform both encryption
and decryption, unlike some of the designs above. The table is therefore for
performance benchmarking only; it would be incorrect to compare the entries directly.
It is also important to note that a far higher performance could be achieved if both
Design 1 and Design 2 were implemented with the following:
• Higher resource and speed grade FPGA or ASIC
• Better synthesizing tool
• Latest - ISE6.3
CHAPTER 7: CONCLUSION & 
RECOMMENDATIONS 
7.1 CONCLUSION 
Twofish is a 128-bit block cipher. It can work with variable key lengths: 128, 192 or
256 bits. In this report, only the 128-bit key length was discussed. Twofish has 6 main
building blocks: Feistel networks, whitening, S-boxes, MDS matrices, Pseudo-Hadamard
Transforms and the key schedule. Twofish is a 16-round Feistel network with a
bijective F function, which corresponds to 8 cycles. The whitening technique employed
substantially increases the difficulty of keysearch attacks against the remainder of the
cipher. Twofish uses 4 different, bijective, key-dependent, 8-by-8-bit S-boxes. Twofish
uses a single 4-by-4 MDS matrix over GF(2^8). This is one of the 2 main diffusion
elements of Twofish. (There is also a Reed-Solomon code with the MDS property used
in the key schedule; this doesn't add diffusion to the cipher but does add diffusion to
the key schedule.) Besides that, Twofish also uses a 32-bit Pseudo-Hadamard
Transform to mix the outputs from its 2 parallel 32-bit g functions. Finally, Twofish
needs a lot of key material and has a complicated key schedule. To facilitate analysis,
the key schedule uses the same primitives as the round function. Except for 2
additional rotations, each pair of expanded key words is constructed by applying the
Twofish round function (with key-dependent S-boxes).
In this project, 2 different designs were implemented. The first design (Design
1) was implemented with minimum hardware resource usage, using a single modified
F-Function, and was optimized for reasonable latency, throughput and throughput per
gate. The second design (Design 2) was implemented with reasonably minimal
hardware resources, using 4 units of the modified F-Function of Design 1, and
achieved very small latency, very high throughput and very high throughput per gate.
Furthermore, both Design 1 and Design 2 were implemented with zero keying and
function as encryptor/decryptor. Both designs were written in VHDL, simulated using
ALDEC, synthesized using the XILINX synthesis tools, implemented using the XILINX
ISE6.2i implementation tools and downloaded onto the Spartan 2 FPGA board using
the BEDLOAD utility program.
In conclusion, this Final Year Project was successful because all the
objectives were met.
7.2 RECOMMENDATION 
Since the implementation of this project has been very successful, it would be
interesting to extend the project in other areas. Possible recommendations are as
follows:
• Perform a fast implementation of the Twofish Encryption Algorithm using mixed
inner- and outer-round pipelining.
• Implement this algorithm for mobile communications.
• Implement this algorithm for embedded systems.
• Study the propagation of faults and their detection in a hardware implementation
of the Twofish Algorithm.




[1] B. Schneier, "Applied Cryptography, Second Edition", John Wiley & Sons, 1996.
[2] B. Schneier, J. Kelsey, D. Whiting, D. Wagner, C. Hall, N. Ferguson, "Twofish: a
128-bit block cipher", www.counterpane.com/twofish-paper.html
[3] B. Schneier, J. Kelsey, D. Whiting, D. Wagner, C. Hall, N. Ferguson, "The Twofish
Encryption Algorithm", Wiley, 1999.
[4] P. Chodowiec, K. Gaj, "Implementation of the Twofish Cipher Using FPGA
Devices", www.counterpane.com/twofishtpga.html
[5] Y.-K. Lai, L.-G. Chen, J.-Y. Lai, T.-M. Parng, "VLSI Architecture Design and
Implementation for Twofish Block Cipher", Proceedings of the IEEE International
Symposium on Circuits & Systems (ISCAS'02), USA, May 26-29, 2002.
[6] M. De Clercq, V. Levesque, "A VHDL Implementation of the Twofish Block
Cipher", McGill University.
[7] A.J. Elbirt, W. Yip, B. Chetwynd, C. Paar, "An FPGA-based performance
evaluation of the AES block cipher candidate algorithm finalists", IEEE Transactions
on Very Large Scale Integration (VLSI) Systems, vol. 9, pp. 545-557, Aug. 2001.
[8] T. Ichikawa, T. Kasuya, M. Matsui, "Hardware Evaluation of AES Finalists",
in The Third AES Candidate Conference, Gaithersburg, MD, pp. 279-285, April 13-14,
2000.
[9] The National Institute of Standards and Technology (NIST), "Advanced
Encryption Standard (AES)", Federal Information Processing Standards Publication
197, November 26, 2001.
[10] Viktor Fischer, "Realization of the Round 2 AES Candidates Using Altera



















j I ~ ] ] iii ~ ! ~ ] ~ ~ ' ' ~ i ' 0 • ~ 
"' 
; 
~ ' ~ ~ ] " § ' " ~ 
.;,: 
' 
1 ~ ~ "' ~ 0 ] ~ il'~ l ~ £ c ] ~ ·' ,. I i -, l '"' "' :~ .. "' -" ' "' ] ..: ~ " • ~




t .. • ., .. 
' 




<lltll "' wawo 
X X 
t-.oon..-oNONO:"<tO:: 




~ T"<0<fl0MN00 !JJU) U)O) (/) 0 C:•N 0: wawo 
X X 
. O<n.,-oN.-0 .-O:NO:: 
. .. •• wowo X X 
~ - ..- ... o.-o o a::r 0: (/)(/) U)U) Ul Q.W Q 
X X 
2 - 1:Jf;10~ 0 D::,N 0:: tllooco 
X X 










X X X X X 
f1l[!i1iif5~gj(Jigj~gjl8gj:Ji0 
X X X X X X 
OQ:.-!l:MO: .... O:I()O:WQ:r-a 
wo"'o"'o"'owoV>ow 
X X X X X X 
OQ:NO:MO:..,.O:LI' 
momomoooow 








:;; .-r-«>Ln..,.ON~o {/)<J)(J}[/) (/}!I}<J) 
m ,...~~(Jj13°1ii&J 
~ .... \2un;lr;Jag 
~ O..,.o"JNr"O !l}t/)<J)<J) 
~ .... (Zf;j/ijf]l 






. • w 
X 
·; 
t i: 'o i 
• ,§ 
" 
OQ:.-O:M!l: .... O:"'!l:«'!l:""O 
moV>omomomo"'o"' 






.... 0:"'0:"' womom -:;; 
X X 
"'0: .... 0:"'0:"-womomom 
X X X 
lljgj~gj(Jigjtllgj&i . 
X X X X 
Olf5f;jf513f5~f5illf5&i 





X X X X X 
oo:Na:<'>O:'<TO:"' 
wowowo<J)ow 
X X X X 
.-o:Ntro"JO:V 
"'o<JJo<J)o"' 
X X X 
o.:r.-O:NO:M 
"'o<J)o<J)ooo 














>< >< X X X 
oo:.-o:"'O:"'O:"" 
wowoV>omo"' 
X X X X 
.-No:«>oo:.-o:r-VloV>VloVlo<ll 
X X X 
~NO:<')O!l:.-O:NO:<DO:r­VlQVl<llQ({)Q<IlQ<IlQ<Il 
X X X X X 
DO:NO:'Il!l:<DO:r-<I/Q(/)QUlQ<IlQVl 
X X X X 
DO:.-O:NO:MO: .... O:LnO 
"'o"'omoooowow 
>< X X X X 
r«>o:.,.oo:.-o:<"O:<'>O:..,. 
mommomomomow 
X X X X X 
00:r'O:N0:<'>0:..,. 
ooomowomow 
X X X X 
,.....,.{riD.-O:NO:o"J{r .... 
w 0 wwo<J)o"'ow 
X X X X 
oa:.-o:<'<a:.,.trr-llloUlo<J)o<llolll 











wo"'owooo 0 worn 
X X X X X 
,.....,.O:IilOO::~O:NO::MO:VO:Ln 
w 0 ww 0 womow 0 Ulom X X X X X X 
~~a:NOO:toO:,._ 
tllotlltllo<llotll 
X X X 
.-.-a:Noa:~a:"'a:<Pa:"­
tlloU)tllQU)ot/'JoU)oU) 
X X X X X 
Ui~<Ji~lll~1ll 
X X X 
~~a:N<>O:<'>ll:"ll:"' 
w 0 wwowoU)oU) 
X X X X 
oa:.-a:NO:<'>a:..,. Wo<lloU)oU)oU) 
X X X X 
.-.-a:Noa:ra:N Wo"'"'O"'OU) 
X X X 








' " " " " " " 
XOR $3 
" " " " " 
XOR XO< XOR 
0 0 
" " " " " " " " " " " xoe xoe xoe xoe xoe xoe xoe XOR 














" " " " " 0 
" " " " " " " " xoe XOR XOR XOR XOR XOR XOR 
" " " " " " " 
"" 
XOR XOR 









XOR (S1 XOR S7) 
" " " " " XOR XOR XOR XOR XO< 
" " " " " 
FINAL ANSWER 
" " " " " " " " 
m 134 
2) S multiply with 55 Primitive 
' 













" " " " " " 
" " " " " " " " " " " " " 
"" 
"R <OR XOR XOR <OR XOR "R XOR 
" " " " " " " " " 
' " " " " " " " " 










" " " " " " 
" " " " " " " " 
XOR XOR XOR 
0 0 0 0 0 0 0 0 
" " " 









0 0 0 0 0 0 0 0 XOR S6 
" " " " " 
" " " " " " " " " " " " " " " " " " " " 0 0 0 0 0 0 0 0 <OR XOR <OR XOR 
"" 
<OR XOR <OR 
" " " " " " " " " " " " " " " " 0 0 0 0 0 0 0 0 OR XOR <OR <OR XOR <OR 
Answer 
" " " " " " " " " " " " " " 
0 
" " " " 
~ 
" <OR XOR XOR XOR XOR XOR <OR XOR XOR XOR <OR XOR "R XOR 








" " " " 
~ 
" XOR S5 " " " " " 









X X X 
.-L00::"-<"0::<">0::..,.0::"'0::"-0 
tilQtiltilotilQtilQtilQtil 




til til (J} 0 til 
X 
~ .... ~m~oooo~~~~~m~~o ~~~~m~~ 
)( )( )( )( )( )( 




X X X 
" " " " " 
XOR XOR XOR XOR 
'OR WR 
" " " " 
" " 
XOR XOR XOR XOR 
WR XOR 










' " " XOR (53 XOR 55 
" " " " " XOR 57) XOR WR XOR XOR XOR 
XOR 
" 








' XOR XOR WR XOR XOR XOR (51 XOR 
" " " " " 
" " " " " 
S3XOR 55 XOR WR XOR XOR XOR 
" " " " " " " " " 
XOR 57) 
" " " " " XOR XOR XOR XOR XOR WR WR XOR XOR XOR XOR 
" " " " " " " " " " " XOR XOR XOR XOR XOR XOR XOR XOR XOR 
" " " " " " " " " XOR XOR XOR FINAL ANSWER '0 
" " " " " " " 
" " " 
(REDUCED) XOR XOR XOR XOR XOR XOR 
XOR WR 
" " " " " " 
" " 
WR XOR XOR WR 
XOR 
" " " " 
" 








' " " " " XOR (S2XOR 
" " " " " 54 XOR 57) XOR XOR XOR XOR XOR 
XOR WR 
" " 
" " " " " XOR XOR XOR WR XOR 
" " " " " 
" " " " " " " " XOR XOR XOR XOR XOR 
" " " " " 
141 142 
4) S multiply with SA XOR 
" Commentslbits 0 
' 
2 









" " " " " 
" " " " " " " " " " " 
' " " " " " " " " 








" " " " " " " " 0 0 0 0 0 0 0 0 XOR WR XOR WR XOR XOR 
" " " " " " " " " " " " " " 0 0 0 0 0 0 0 0 WR XOR OCR XOR 
" " " " " " " " " " " " 
" " " " " " " " 
XOR 
0 0 0 0 0 0 0 0 
" 









0 0 0 0 0 0 0 0 XORSS 
" " " " " Answer 
" " " " " " " " " " " " " " " " " " " " " " " WR XOR XOR XOR XOR XOR XOR XOR XOR XOR 'OR XOR XOR XOR XOR '0R XOR 
" " " " " " " " " " " " " " " " " XOR XOR XOR XOR XOR XOR XOR WR XOR XOR XOR XOR XOR 
" " " " " " " " " " " " " XOR XOR XOR XOR XOR XOR XOR 








' XOR (54 XOR 57) 









" " " " " 
" " " " " " " " " " " " XOR XOR XOR XOR XOR XOR XOR XOR 
" " " " " " " " XOR XOR XOR XOR XOR XOR XOR 
" " " " " 
" " " " " " " " " " " " " " " " XOR XOR XOR WR XOR WR XOR XOR XOR XOR XOR 
" " " " " " " " " " " 
~Olct<!>Oct<'lct<O 
wo<nwowow 




" 0 X 
X X X X 
!!!t3 












X X X 
.-<rlOil::Nctr<lctO()ctf!>D 
wwowocnowow 
X X X X 
~"'~O:Nil::<llO:t­
UlWQWQWQ(J) 
X X X 
amtlo~a!llgtlgm 
X X 









~ ~ 0 0 0~ ~Olij OctNO:<'> woUJocn 
X X 
~ 0 0 0 ON~ 0 ~"" •• •o• 
X 
~ ~ 0 0 o;;; ~ o"~ •o• 
X 
~ ~ 0 0 0~ ~ 
X X X 
<r0::<01l:" 
'""Z\gjl7l .... O::..,.Il:t- .-oo:..-o:r-
" 
wowow wawo<IJ mowow 
X X X X X X X 
Zl'6lll'6l7JO O<>:N<t:<'>O::U"Jil:<O .-NO::U"J Dll:<'>ll:<D mo(f)oWo(/)ow •o• womow 
X X X X X X X X X 
... ..-ll:Nil:"'"ll:"' .-O::Nil:'<tQ::tn 
•o• "'owo!l)o"' mo"'owow X X X X X X X 
"ll:'"'CI::,_ r<'>Q::<D 0 <>: ..... cr::"' ll:..,. a:,_ 0 ooc..-<>:"'O::" 
womo"' •o• wamowowaw wowomow 
X X X X X X X X X X 
,...,.,1!:"' DO::<'>Il:ll>O(<D ,...,...a:..., OO(Nil:<'>ll:<D 
•o• "'o<nowo"' •o• wo<no"'o"' 
X X X X X X X X 
.. , ..-No:"' "'U:"O:"'a:r-
•o• •o• wawowom 
X X X X X 
.-<'>o:«> 
•o• ... •o • ... •o• 
X X X 
..-NO::U"J 
•o• """ •o• X X 
~ iii 






X X X X 
" 
' 
~~ r i;; 
0 ~m ""' •o• .. , •o• 
X X 
" 
owo ~gjlll ~~~gjlll •• X X 
" 
~0~<7j '<til:"'<>:"" O!i'6l2 w 0 w 0 w X X X 
e ~l]lDO!jzj t:Jg;,;gmg~ """ •o•X X X X 
OlJl'&jDzj&j Nll:0'11l:<flll:<O wowxwo"' ~ ,_ N <>: <'> (/)"'o"' 
X X X 
l;jD'&j<I)O&jU) ..-o:No:.,.<t:tnOC,._ ,...,_ .... "'"' 
"'o(/)o<nowo<n CIJWoW 
X X X X X 
:;; .-t;lJlD<I)zjOUJfil fil'6UJgjt:lgj<li'61llgjti 0 o •• •o• X X X X X X 
~ ..-<OU"JO<'>NOO DO::Nil:<'>Q:<l>ll:<O ,-,._Oil:N (/)(/) "'"' (/) wo(/)o"'o"'o"' "'"'o"' X X X X X 
~ O<t>'<tONrO ..-<t:NO:"<tll:tn """ "'"' (/)(/) "'o"'o"'o(/) •o•X X X X 
~ .... ..,.,.,o .... o !/)!Il r.fJ(/) oo.-0::"'1!:" wxwomow OOr •x• 
X X 
& .... /B~og Oil: NO:<'> 0"" mo"'o"' •o• 
X X X 
N ON..-0 




•o• o •• •o• 
X X 
• 
g . g g 
Q 
• ·; 
~ .;: ·~ ! :;; ·o 






" " " " '' '' '' '' 
ss S< 
'' 
XOR (53 XOR S4 
" " " " " ,0, ,0, ,0, 'oe ,0, ,0, we ,0, ,0, XOR 55) ,0, we ,0, we we 
" 
ss S< ss ss 
" " '' " 
S< S< S< S< 




s; s; ss s; 





" " ss so ss 
" " 
,0, WR WR WR 'oe ,0, ,0, XO< we ,0, ,0, ,0, we 
" " " " " " " " " 
" " " " '0' w' ,0, ,0, ,0, ,0, we 'oe we ,0, 
" " " " " " " 
S< 
" " 











S< S< S< 
" " XOR (S4 XOR 55 S< 
" 
S< S< S< ,0, ,0, w' xo' xo' ,0, 
XOR 56) we we we w' ,0, s; s; s; ss ss 
" ss ss s; 
" " 
,0, w' xo' ,0, 
we we 'oe w' 'oe 
" " " " 















" " " " " " " " " 
S< XOR (52 XOR 53 
" " " " " xo' xo' we we we we ,0, w' XOR 54 XOR 57) w' xo' xo' w' we 
" " " " " " 
S< ss 
" " " " " we w' XO< XO< we w' w' w' xo' XO< XO< 




S< S< S< 
" 
S< 










" " " " " 
XO< XO< XO< xo' xo' xo' xoe ,0, ,0, we 
" " 
ss 
" " " " " 
" " 








' " " " " " " " " xo' XO< XO< xo' xoe xoe xoe XO< 
m 154 




XO< XO< xo' xo' xo' xo' xo' 
7) S multiply with 9E 
S< ss S< ss 














" " " 
S< s; 
" " XOR (51 XOR 52 
" " " " " 
x9E 0 
' ' ' ' 
0 0 
' XOR 53 XOR 56 xo' ,0, XO< XO< xo' 0 0 0 0 0 0 0 0 
XOR 57) 
" " " " " xo' XO< XO< xo' xo' 
" " " " " 
so 
" " " 
S< s; 
" " so 
" " " 
S< 
" " " so 
" " " 
S4 
" " " XO' we xoe we xo' 
" " 
ss ss ss 
xo' XO< XO< xo' xo' 
so 
" " " 
S< 
" " " 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 
" " " " " 
so 
" " " " " 
so 
" F1nal Answer so so S< so so 
" 
so so 
(Recovered) ,0, we XO< xo' XO< w' we xo' 
" " 
s; 
" " " " " ,0, we ,0, XO< ,0, xo' ,0, xo' 
" " " " " " 
S< 
" ,0, we we XO< ,0, ,0, XO< xo' 
Answer so so so so 
" " 
so 
" " " " " 
so 
" ,0, >0, xo' xo' xo' w' w' w' ,0, ,0, 
" " " " " " 
S< 
" " " >0, xo' xo' xo' w' w' w' ,0, 




" xo' xo' xo' xo' w' w' 
" " " " " 





" " WR XO' we WR >0, >a< we 
" 
S< ss S< s; 












' XOR 57 
" " " " " so so so so 
" " 
so 




xo' ,0, ,0, ,0, ,0, xo' xo' ,0, >a< xo' ,0, 
" " " " " " 
S< s; so 
" " we ,0, ,0, w' 'OR xo' xo' xo' 
" " " 
S< S< s; so 
" 
'55 156 
" " " 
<S 








" 'ox 'ox 'ox 'ox 'ox 
>S •s •s •s >S 
'ox 'ox 'ox 'ox 'ox 
<S <S <S <S <S 
'ox 'ox 'ox 'ox 'ox (9S BOX I>'S ~OX 









's OS OS 's 
'ox 'ox 'ox 'OX 
" 
OS 's OS OS OS OS OS 






'ox 'ox 'ox 'ox 'ox ,ox 'OX 'OX 
" " " 
's •s •s 
" 
's 
'ox 'ox 'ox 'ox 'ox 'ox 'ox 'OX 
<S 
" " 
•s OS OS OS OS 
" 
's 's 's 's 
'ox 'ox 'ox 'ox 'OX 
OS OS OS OS OS 
" " " 
'ox 'ox 'ox 'ox 'OX 
'ox 'ox 'ox •s •s •s •s •s 
" " " " " " " 
'ox 'ox 'ox 'ox 'OX (LS BOX SS BOX 
'ox 'ox 'ox 'ox 'ox 'ox 'ox 's 's 's 's 'S \>S IOOX ZS) l:JOX 




















'ox 'ox 'ox 'ox 's 
'S 
" 
's 's OS OS •s OS 'ox 
'OX 'ox 'ox 'ox 'ox 'ox 'ox 'ox OS OS 
os >S •s OS 
" 
's 's <S 'ox 'ox 





•s •s •s 'ox 'ox 'ox 
'OX 'ox 'ox 'ox 'ox 'ox 'ox 'ox 's 
"' "' " "' " 
<S 's 's 
'S 
" 
OS 's <S OS OS OS OS 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 
OS OS OS 
"' 
OS 
"' " " " 
'S <S 
" 
•s •s •s 
'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 
OS 
" 
OS OS OS 
" 
<S 's •s OS 
" 
<S OS OS OS OS 
'ox 'ox 'ox 'ox 'ox 9Sl:JOX 's 
" 
's 's 
" <S <S <S 
" 










OS OS OS OS (LS !:lOX SS) <:lOX 








'ox 'ox 's OS 
OS OS 'ox 'ox 
'ox 'ox OS OS •s <S 
'S 's 's 's <S •s 'ox 'ox 'ox 'ox 
'OX 'ox 'ox 'ox 'ox 'ox 's 's 
" 
OS >S •s 's 's 's 
"' 
os OS •s OS <S 's 's 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 
'ox '0X 'ox 'ox 'ox 'ox 'ox 'ox 
" " " 
OS 
"' 
<S <S 's •s •s •s 
OS >S •s <S <S 's •s •s •s 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 'ox 











<S OS OS OS OS 
"' "' "' "' 
OS 9Sl:IOX 








'ox 'OX 'ox 'ox 'ox OS 's 
" "' 
OS OS OS 'ox 'ox 
'ox 'OX 'ox 'ox 'ox (LS l:JOX OS OS OS 
" " >S 
"' 
•s •s •s 9S l:JOX I>'S) !;lOX 'ox 'ox 'ox 'ox 'ox 
8) S multiply with 56 xoe 













' XOR 56 
" " " " " 
" " " " " " " " " " " xoe we xoe xoe xoe xoe xoe xoe s " 












" " " " " " " " 0 0 0 0 0 0 0 0 xoe xoe xoe xoe xoe xoe 
" " " " " " " 
s' 
" " "' " " " so 
" " " " " " " 
xoe xoe xoe xoe 
0 0 0 0 0 0 0 0 
" " " " so 
" " " "' " " " 
xoe 
0 0 0 0 0 0 0 0 
" so 








' 0 0 0 0 0 0 0 0 XOR55 
" " " " " Answer 
" 
so 
" " " 
so 
" " " " " " " " " " " " " " " " "' xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe 
" " " " " " " " " " " " " " " " 
S< 
" xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe 
" " " " " " " " " " " " xoe xoe xoe xoe xoe xoe xoe 























' ' xoe xoe xoe xoe xoe xoe xoe xoe xoe XOR 54 
" " " " " 
" " " " " " "' " " " " " " " 
so S' s' so xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe xoe 
" " " " " " 
s' s' S2 so 
" 
so so s' xoe we xoe xoe xoe xoe xoe xoe xoe 







xoe xoe xoe xoe 
" " " " 
9) S multiply with 82 
xoe 
" 





















0 0 0 0 0 
' 
" " " " " 




" " " " " " " " " " " " xoe xoe xoe xoe xoe xoe xoe xoe 0 0 0 0 0 0 0 0 
" " " " " " " " 
0 0 0 0 0 0 0 0 
xoe WR XOR XOR XOR XOR 0 0 0 0 0 0 0 0 
" " " " " " 
0 0 0 0 0 0 0 0 
XOR XOR XOR XOR 0 0 0 0 0 0 0 0 
" " " " " " 
S2 
" " " " " xoe XOR Answer so 
" 
[Bit-level XOR tables for the preceding constant multiplications: not recoverable from the extracted text. Surviving labels: Answer, Final Answer (Recovered), and XOR terms over the input bits S0 to S7.]
13) S multiply with 68
[Table not recoverable from the extracted text. Surviving labels: Final Answer (Recovered), with each result bit given as an XOR of input bits S0 to S7.]
14) S multiply with E5
[Table not recoverable from the extracted text. Surviving labels: Comments\bits, S, and XOR terms over the input bits S0 to S7. The headings of the tables between 14) and 17) did not survive extraction.]
17) S multiply with FC
[Table not recoverable from the extracted text. Surviving labels: Comments\bits, S, xFC, Answer, Primitive, Final Answer (Recovered). The heading of the table between 17) and 19) did not survive extraction.]
19) S multiply with 47
[Table not recoverable from the extracted text. Surviving labels: Comments\bits, S, x47, Primitive, Final Answer (Recovered). The heading of the table between 19) and 21) did not survive extraction.]
21) S multiply with 3D
[Table not recoverable from the extracted text; the page was scanned upside down. Surviving labels: Comments\bits, Final Answer (Recovered).]
22) S multiply with 19
[Table not recoverable from the extracted text. Surviving labels: Comments\bits, S, Final Answer (Recovered).]
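The tables in this appendix express each constant multiplication (S times 68, E5, FC, 47, 3D, 03 and so on) as XORs of the input bits S0 to S7. The sketch below shows the same operation in behavioural form. It assumes that the constants are entries of the Twofish RS matrix and that the reduction ("Primitive") polynomial is x^8 + x^6 + x^3 + x^2 + 1 (0x14D). The names gf_mul_tb, gf_mul and POLY_LOW are hypothetical and are not part of the project source.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity gf_mul_tb is
end gf_mul_tb;

architecture sim of gf_mul_tb is
  -- Assumed reduction polynomial x^8 + x^6 + x^3 + x^2 + 1 (0x14D);
  -- only its low byte is needed once the shifted-out bit is dropped.
  constant POLY_LOW : std_logic_vector(7 downto 0) := x"4D";

  -- Hypothetical helper: shift-and-XOR multiplication of s by c in GF(2^8).
  function gf_mul (s, c : std_logic_vector(7 downto 0))
    return std_logic_vector is
    variable a   : std_logic_vector(7 downto 0);
    variable acc : std_logic_vector(7 downto 0);
    variable msb : std_logic;
  begin
    a   := s;
    acc := (others => '0');
    for i in 0 to 7 loop
      if c(i) = '1' then
        acc := acc xor a;            -- add s * x^i when bit i of c is set
      end if;
      msb := a(7);
      a   := a(6 downto 0) & '0';    -- multiply the running term by x
      if msb = '1' then
        a := a xor POLY_LOW;         -- reduce modulo the polynomial
      end if;
    end loop;
    return acc;
  end gf_mul;
begin
  check : process
    variable s, y : std_logic_vector(7 downto 0);
  begin
    assert gf_mul(x"01", x"68") = x"68"
      report "multiplying 01 by a constant must return the constant" severity failure;
    assert gf_mul(x"80", x"02") = x"4D"
      report "reduction step failed" severity failure;
    -- S times 03 is (S times 02) xor S, so under the assumed polynomial
    -- bit 0 is S0 xor S7 and bit 7 is S7 xor S6.
    s := x"A5";
    y := gf_mul(s, x"03");
    assert y = x"A2" report "S times 03 failed" severity failure;
    assert y(0) = (s(0) xor s(7)) report "bit 0 of S times 03 failed" severity failure;
    assert y(7) = (s(7) xor s(6)) report "bit 7 of S times 03 failed" severity failure;
    report "gf_mul checks passed";
    wait;
  end process;
end sim;
```

A loop like this is convenient for cross-checking a hand-derived XOR table in simulation; the XOR-only form used in the tables is the one that maps directly onto FPGA logic.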
23) S multiply with 03
[Table not recoverable from the extracted text. Surviving labels: Comments\bits, S, x03, Answer.]
APPENDIX C: SOURCE CODE FOR DESIGN 2 
1. ADDER01
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is a 1-bit adder.
library ieee;
use ieee.std_logic_1164.all;

entity adder01 is
  port (dataa, datab, cin : in  std_logic;
        result, cout      : out std_logic);
end adder01;

architecture dataflow of adder01 is
begin
  -- Sum bit and carry-out of a full adder
  result <= dataa xor datab xor cin;
  cout   <= (dataa and datab) or (cin and (dataa or datab));
end dataflow;
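The two equations of the 1-bit adder can be checked exhaustively. The following sketch copies them into a small self-checking testbench and compares the weighted output (result + 2 x cout) with the number of ones on the inputs. The names adder01_eq_tb, sl_pair and BITVAL are hypothetical; the testbench evaluates the equations directly and does not instantiate adder01.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity adder01_eq_tb is
end adder01_eq_tb;

architecture sim of adder01_eq_tb is
  type sl_pair is array (0 to 1) of std_logic;
  constant BITVAL : sl_pair := ('0', '1');
begin
  check : process
    variable a, b, c : std_logic;
    variable s, co   : std_logic;
    variable got     : integer;
  begin
    for i in 0 to 1 loop
      for j in 0 to 1 loop
        for k in 0 to 1 loop
          a := BITVAL(i);
          b := BITVAL(j);
          c := BITVAL(k);
          -- The two equations of the adder01 architecture
          s  := a xor b xor c;
          co := (a and b) or (c and (a or b));
          -- result has weight 1, cout has weight 2
          got := 0;
          if s = '1' then
            got := got + 1;
          end if;
          if co = '1' then
            got := got + 2;
          end if;
          assert got = i + j + k
            report "full-adder mismatch" severity failure;
        end loop;
      end loop;
    end loop;
    report "all 8 input combinations match";
    wait;
  end process;
end sim;
```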
2. ADDER32
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is a 32-bit adder.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;

entity adder32 is
  port (dataa, datab : in  std_logic_vector(31 downto 0);
        cout         : out std_logic;
        cin          : in  std_logic;
        result       : out std_logic_vector(31 downto 0));
end adder32;

architecture behavior of adder32 is
  component adder01
    port (dataa  : in  std_logic;
          datab  : in  std_logic;
          cin    : in  std_logic;
          result : out std_logic;
          cout   : out std_logic);
  end component;
  signal c : std_logic_vector(31 downto 1);
begin
  -- Ripple-carry chain: bit i consumes carry c(i) and produces c(i+1)
  g : for i in 0 to 31 generate
    g0 : if (i = 0) generate
      x0 : adder01 port map (dataa(0), datab(0), cin, result(0), c(1));
    end generate g0;
    g1 : if ((i > 0) and (i < 31)) generate
      x0 : adder01 port map (dataa(i), datab(i), c(i), result(i), c(i+1));
    end generate g1;
    g2 : if (i = 31) generate
      x0 : adder01 port map (dataa(31), datab(31), c(31), result(31), cout);
    end generate g2;
  end generate g;
end behavior;
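For simulation it can help to compare the ripple-carry structure against a one-line behavioural reference. The sketch below is such a reference with a small self-check. It is an assumption-laden illustration, not part of the project source: it uses ieee.numeric_std rather than the std_logic_arith package used in the listings, and the name adder32_ref_tb is hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity adder32_ref_tb is
end adder32_ref_tb;

architecture sim of adder32_ref_tb is
  signal dataa, datab, result : std_logic_vector(31 downto 0) := (others => '0');
  signal cin, cout            : std_logic := '0';
begin
  -- Behavioural reference: one 33-bit addition instead of 32 chained cells
  model : process (dataa, datab, cin)
    variable sum : unsigned(32 downto 0);
  begin
    sum := ('0' & unsigned(dataa)) + ('0' & unsigned(datab));
    if cin = '1' then
      sum := sum + 1;
    end if;
    result <= std_logic_vector(sum(31 downto 0));
    cout   <= sum(32);              -- bit 32 holds the carry out
  end process;

  stim : process
  begin
    -- Wrap-around case: the sum is 2^32, so result is zero and cout is set
    dataa <= x"FFFFFFFF";
    datab <= x"00000001";
    cin   <= '0';
    wait for 1 ns;
    assert result = x"00000000" and cout = '1'
      report "wrap-around case failed" severity failure;
    -- Ordinary case with carry in
    dataa <= x"12345678";
    datab <= x"11111111";
    cin   <= '1';
    wait for 1 ns;
    assert result = x"2345678A" and cout = '0'
      report "carry-in case failed" severity failure;
    report "adder32 reference checks passed";
    wait;
  end process;
end sim;
```

Driving adder32 with the same stimulus and comparing its result and cout against this model is one way to confirm the generate statement wires the carry chain correctly.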
3. CIPHERTEXT
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the ciphertext block, the
--             output part of the Twofish enc/dec system:
--             output whitening and result latching
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
-- Ciphertext: output part of the Twofish enc/dec system: output
-- whitening and result latching
entity ciphertext is
  port (K4, K5, K6, K7 : in std_logic_vector(31 downto 0); -- inputs for K-subkeys
        line0, line1, line2, line3 : in std_logic_vector(31 downto 0); -- inputs derived from 16 iterations in core component
        clk : in std_logic; -- clock signal
        reset : in std_logic;
        latch_ciphertext : in std_logic; -- signal enabling latching of results into result register for MS 64 bits
        ciphertext : out std_logic_vector(127 downto 0)); -- enc/dec result
end ciphertext;
-- Start of ciphertext structure description
architecture mixed of ciphertext is
  component xor_2x32
    port (A      : in  std_logic_vector(31 downto 0);
          B      : in  std_logic_vector(31 downto 0);
          Result : out std_logic_vector(31 downto 0));
  end component;
  component reg32
    port (clock, clr, enable : in std_logic;
          data : in  std_logic_vector(31 downto 0);
          q    : out std_logic_vector(31 downto 0));
  end component;
  signal C0, C1, C2, C3 : std_logic_vector(31 downto 0); -- register output signals
  signal K4xorL0, K5xorL1, K6xorL2, K7xorL3 : std_logic_vector(31 downto 0); -- resulting signals for XOR of inputs with different K-subkeys
begin
  -- Writing result of enc/dec to output register for LS 32-bit vector
  C0_reg : reg32
    port map (data   => K4xorL0,
              clock  => clk,
              clr    => reset,
              enable => latch_ciphertext,
              q      => C0);
  -- Writing result of enc/dec to output register for second LS 32-bit vector
  C1_reg : reg32
    port map (data   => K5xorL1,
              clock  => clk,
              clr    => reset,
              enable => latch_ciphertext,
              q      => C1);
  -- Writing result of enc/dec to output register for second MS 32-bit vector
  C2_reg : reg32
    port map (data   => K6xorL2,
              clock  => clk,
              clr    => reset,
              enable => latch_ciphertext,
              q      => C2);
  -- Writing result of enc/dec to output register for MS 32-bit vector
  C3_reg : reg32
    port map (data   => K7xorL3,
              clock  => clk,
              clr    => reset,
              enable => latch_ciphertext,
              q      => C3);
  -- Output whitening
  -- XOR LS 32-bit vector with K-subkey K4
  L0K4_xor : xor_2x32
    port map (A      => line0,
              B      => K4,
              Result => K4xorL0);
  -- XOR second LS 32-bit vector with K-subkey K5
  L1K5_xor : xor_2x32
    port map (A      => line1,
              B      => K5,
              Result => K5xorL1);
  -- XOR second MS 32-bit vector with K-subkey K6
  L2K6_xor : xor_2x32
    port map (A      => line2,
              B      => K6,
              Result => K6xorL2);
  -- XOR MS 32-bit vector with K-subkey K7
  L3K7_xor : xor_2x32
    port map (A      => line3,
              B      => K7,
              Result => K7xorL3);
  -- Concatenating and assigning output its value from signal
  output : process (C3, C2, C1, C0)
  begin
    ciphertext <= C3 & C2 & C1 & C0;
  end process output;
end mixed;
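The structural description above (four XOR blocks feeding four 32-bit registers) reduces to one registered expression. The sketch below shows that behaviour in a self-checking testbench. The definitions of xor_2x32 and reg32 are not shown in this part of the listing, so the register behaviour here (asynchronous clear, clock enable) is an assumption, and the names whitening_sketch_tb and latch are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity whitening_sketch_tb is
end whitening_sketch_tb;

architecture sim of whitening_sketch_tb is
  signal K4, K5, K6, K7             : std_logic_vector(31 downto 0) := (others => '0');
  signal line0, line1, line2, line3 : std_logic_vector(31 downto 0) := (others => '0');
  signal clk, reset, latch          : std_logic := '0';
  signal ciphertext                 : std_logic_vector(127 downto 0);
begin
  -- Output whitening followed by result latching.
  -- Assumed register behaviour: asynchronous clear, clock enable.
  whiten : process (clk, reset)
  begin
    if reset = '1' then
      ciphertext <= (others => '0');
    elsif rising_edge(clk) then
      if latch = '1' then
        ciphertext <= (line3 xor K7) & (line2 xor K6) &
                      (line1 xor K5) & (line0 xor K4);
      end if;
    end if;
  end process;

  stim : process
  begin
    line0 <= x"00000000"; K4 <= x"DEADBEEF";
    line1 <= x"FFFFFFFF"; K5 <= x"0F0F0F0F";
    line2 <= x"12345678"; K6 <= x"12345678";
    line3 <= x"AAAAAAAA"; K7 <= x"55555555";
    latch <= '1';
    wait for 5 ns;
    clk <= '1';                     -- rising edge latches the whitened words
    wait for 5 ns;
    clk <= '0';
    assert ciphertext = x"FFFFFFFF00000000F0F0F0F0DEADBEEF"
      report "output whitening mismatch" severity failure;
    report "output whitening check passed";
    wait;
  end process;
end sim;
```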
4. CLEARTEXT
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the cleartext block.
--             Input whitening
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
-- Building block of cleartext
entity cleartext is
  port (input : in std_logic_vector(127 downto 0); -- data to encrypt input
        clk : in std_logic; -- clock signal
        reset : in std_logic;
        latch_cleartext : in std_logic; -- enable input register latching
        K0, K1, K2, K3 : in std_logic_vector(31 downto 0); -- K-subkeys (declaration inferred from the port maps below)
        P0xorK0 : out std_logic_vector(31 downto 0); -- LS 32-bit vector whitening
        P1xorK1 : out std_logic_vector(31 downto 0); -- second LS 32-bit vector whitening
        P2xorK2 : out std_logic_vector(31 downto 0); -- second to MS 32-bit vector whitening
        P3xorK3 : out std_logic_vector(31 downto 0)); -- MS 32-bit vector whitening
end cleartext;
-- Start of cleartext structure description
architecture mixed of cleartext is
  component xor_2x32
    port (A      : in  std_logic_vector(31 downto 0);
          B      : in  std_logic_vector(31 downto 0);
          Result : out std_logic_vector(31 downto 0));
  end component;
  component reg128
    port (clock, clr, enable : in std_logic;
          data : in  std_logic_vector(127 downto 0);
          q    : out std_logic_vector(127 downto 0));
  end component;
  signal cleartext_out : std_logic_vector(127 downto 0); -- output signal
  -- Cleartext: input to enc/dec unit: latching input into register,
  -- input whitening and S0 and S1 sub-key generation
begin
  -- Latching input into register
  cleartext_input : reg128
    port map (data   => input,
              clock  => clk,
              clr    => reset,
              enable => latch_cleartext,
              q      => cleartext_out);
  -- Input whitening by XORing input with different K-subkeys
  P0K0_xor : xor_2x32
    port map (A      => cleartext_out(31 downto 0),
              B      => K0,
              Result => P0xorK0);
  P1K1_xor : xor_2x32
    port map (A      => cleartext_out(63 downto 32),
              B      => K1,
              Result => P1xorK1);
  P2K2_xor : xor_2x32
    port map (A      => cleartext_out(95 downto 64),
              B      => K2,
              Result => P2xorK2);
  P3K3_xor : xor_2x32
    port map (A      => cleartext_out(127 downto 96),
              B      => K3,
              Result => P3xorK3);
end mixed;
5. CNTR
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is a counter.
--             It drives through the sequence
--             of encryption/decryption
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
entity cntr is
  port (clk     : in std_logic;
        enable  : in std_logic;
        clr     : in std_logic;
        encrypt : in std_logic;
        out_cntr0 : out integer range 0 to 39;
        out_cntr1 : out integer range 0 to 39;
        out_cntr2 : out integer range 0 to 39;
        out_cntr3 : out integer range 0 to 39);
end cntr;
architecture behavior of cntr is
  signal cnt_sig0,
         cnt_sig1,
         cnt_sig2,
         cnt_sig3 : integer range 0 to 39; -- last name and type inferred from the output ports
begin
  process (clk) -- process header inferred; the original line is missing
  begin
    if (clk'EVENT AND clk = '1') then
      if clr = '1' then
        if encrypt = '1' then
          cnt_sig0 <= 0;
          cnt_sig1 <= 1;
          cnt_sig2 <= 2;
          cnt_sig3 <= 3;
        else
          cnt_sig0 <= 4;
          cnt_sig1 <= 5;
          cnt_sig2 <= 6;
          cnt_sig3 <= 7;
        end if;
      elsif enable = '1' then
        if encrypt = '1' then
          if cnt_sig0 = 0 and cnt_sig1 = 1 and cnt_sig2 = 2
             and cnt_sig3 = 3 then
            cnt_sig0 <= 8;
            cnt_sig1 <= 9;
            cnt_sig2 <= 0;
            cnt_sig3 <= 0;
          elsif cnt_sig0 = 38 and cnt_sig1 = 39 and cnt_sig2 = 0
                and cnt_sig3 = 0 then
            -- Branch body missing in the source; inferred as the
            -- output-whitening indices 4 to 7
            cnt_sig0 <= 4;
            cnt_sig1 <= 5;
            cnt_sig2 <= 6;
            cnt_sig3 <= 7;
          elsif cnt_sig0 = 4 and cnt_sig1 = 5 and cnt_sig2 = 6
                and cnt_sig3 = 7 then
            cnt_sig0 <= 0;
            cnt_sig1 <= 1;
            cnt_sig2 <= 2;
            cnt_sig3 <= 3;
          else
            cnt_sig0 <= cnt_sig0 + 2;
            cnt_sig1 <= cnt_sig1 + 2;
            cnt_sig2 <= 0;
            cnt_sig3 <= 0;
          end if;
        else
          if cnt_sig0 = 4 and cnt_sig1 = 5 and cnt_sig2 = 6
             and cnt_sig3 = 7 then
            cnt_sig0 <= 38;
            cnt_sig1 <= 39;
            cnt_sig2 <= 0;
            cnt_sig3 <= 0;
          elsif cnt_sig0 = 8 and cnt_sig1 = 9 and cnt_sig2 = 0
                and cnt_sig3 = 0 then
            cnt_sig0 <= 0;
            cnt_sig1 <= 1;
            cnt_sig2 <= 2;
            cnt_sig3 <= 3;
          elsif cnt_sig0 = 0 and cnt_sig1 = 1 and cnt_sig2 = 2
                and cnt_sig3 = 3 then
            cnt_sig0 <= 4;
            cnt_sig1 <= 5;
            cnt_sig2 <= 6;
            cnt_sig3 <= 7;
          else
            -- Step size missing in the source; "- 2" inferred as the
            -- mirror of the encryption step
            cnt_sig0 <= cnt_sig0 - 2;
            cnt_sig1 <= cnt_sig1 - 2;
            cnt_sig2 <= 0;
            cnt_sig3 <= 0;
          end if;
        end if;
      end if;
    end if;
  end process;
  out_cntr0 <= cnt_sig0;
  out_cntr1 <= cnt_sig1;
  out_cntr2 <= cnt_sig2;
  out_cntr3 <= cnt_sig3;
end behavior;
6. CONTROL
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the controller.
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
-- Control: controller module used to synchronize the different enc/dec
-- steps
entity control is
  port (pre_param_even : in std_logic_vector(31 downto 0);
        pre_param_odd  : in std_logic_vector(31 downto 0);
        clk     : in std_logic;
        reset   : in std_logic;
        ld_key  : in std_logic;
        start   : in std_logic;
        sel_enc : in std_logic;
        input_even_key : out std_logic_vector(31 downto 0); -- need to expand to 4 values
        input_odd_key  : out std_logic_vector(31 downto 0); -- need to expand to 4 values
        input_even_plaintext : out std_logic_vector(31 downto 0); -- need to expand to 4 values
        input_odd_plaintext  : out std_logic_vector(31 downto 0); -- need to expand to 4 values
        latch_cleartext : out std_logic;
        latch_key : out std_logic;
        sel_odd  : out std_logic;
        sel_k    : out std_logic;
        sel_test : out std_logic;
        param_source_is_result : out std_logic;
        load_reg_pre_even  : out std_logic;
        load_reg_pre_odd   : out std_logic;
        load_reg_post_even : out std_logic;
        load_reg_post_odd  : out std_logic;
        latch_ciphertext : out std_logic;
        ctrl_encrypt : out std_logic;
        idle : out std_logic);
end control;
--Building block of controller 




  component cntr
    port (clk     : in std_logic;
          enable  : in std_logic;
          clr     : in std_logic;
          encrypt : in std_logic;
          out_cntr0 : out integer range 0 to 39;
          out_cntr1 : out integer range 0 to 39;
          out_cntr2 : out integer range 0 to 39;
          out_cntr3 : out integer range 0 to 39);
  end component;
  -- Type declaration
  type state_type0 is (my_idle, load_keyA, compute_K0, compute_K4,
                       compute_K2r_8);
  type state_type1 is (my_idle, load_keyA, compute_K1, compute_K5,
                       compute_K2r_9);
  type state_type2 is (my_idle, load_keyA, compute_K2, compute_K6,
                       compute_even);
  type state_type3 is (my_idle, load_keyA, compute_K3, compute_K7,
                       compute_odd);

















         out_count3 : std_logic_vector(7 downto 0);
         out_count_int0,
         out_count_int1,
         out_count_int2,





         ctrl_enc_sig : std_logic;
-- Start of controller structure description
begin
  ctrl_encrypt <= ctrl_enc_sig;
  cntrl : cntr
    port map (clk       => clk,
              enable    => cnt_enbl,
              clr       => reset_cntr,
              encrypt   => sel_enc,
              out_cntr0 => out_count_int0,
              out_cntr1 => out_count_int1,
              out_cntr2 => out_count_int2,
              out_cntr3 => out_count_int3);
  -- EVEN KEY
  process (clk, reset)
  begin
    if reset = '1' then
      state0 <= my_idle;
    elsif clk'EVENT AND clk = '1' then
      case state0 is
        when my_idle =>
          if ld_key = '1' then
            -- 'LoadKey' signal makes the controller latch the key
            -- material from the 128-bit input port
            state0 <= load_keyA;
          elsif start = '1' then
            -- 'start' signal puts the controller in the encryption or
            -- decryption mode
            state0 <= compute_K0;
          else
            state0 <= my_idle;
          end if;
        when compute_K0 =>
          state0 <= compute_K2r_8;
        when compute_K2r_8 =>
          if (((out_count0 = "00100110") AND (ctrl_enc_sig = '1')) -- for encryption
              OR ((out_count0 = "00001000") AND (ctrl_enc_sig = '0'))) then -- for decryption
            state0 <= compute_K4;
          else
            state0 <= compute_K2r_8;
          end if;
        when load_keyA =>
          state0 <= my_idle;
        when compute_K4 =>
          state0 <= my_idle;
      end case;
    end if;
  end process;
  -- ODD KEY
  process (clk, reset)
  begin
    if reset = '1' then
      state1 <= my_idle;
    elsif clk'EVENT AND clk = '1' then
      case state1 is
        when my_idle =>
          if ld_key = '1' then
            -- 'LoadKey' signal makes the controller latch the key
            -- material from the 128-bit input port
            state1 <= load_keyA;
          elsif start = '1' then
            -- 'start' signal puts the controller in the encryption or
            -- decryption mode
            state1 <= compute_K1;
          else
            state1 <= my_idle;
          end if;
        when compute_K1 =>
          state1 <= compute_K2r_9;
        when compute_K2r_9 =>
          if (((out_count0 = "00100110") AND (ctrl_enc_sig = '1')) -- for encryption
              OR ((out_count0 = "00001000") AND (ctrl_enc_sig = '0'))) then -- for decryption
            state1 <= compute_K5;
          else
            state1 <= compute_K2r_9;
          end if;
        when load_keyA =>
          state1 <= my_idle;
        when compute_K5 =>
          state1 <= my_idle;
      end case;
    end if;
  end process;
  -- EVEN PLAINTEXT
  process (clk, reset)
  begin
    if reset = '1' then
      state2 <= my_idle;
    elsif clk'EVENT AND clk = '1' then
      case state2 is
        when my_idle =>
          if ld_key = '1' then
            -- 'LoadKey' signal makes the controller latch the key
            -- material from the 128-bit input port
            state2 <= load_keyA;
          elsif start = '1' then
            -- 'start' signal puts the controller in the encryption or
            -- decryption mode
            state2 <= compute_K2;
          else
            state2 <= my_idle;
          end if;
        when compute_K2 =>
          state2 <= compute_even;
        when compute_even =>
          if (((out_count0 = "00100110") AND (ctrl_enc_sig = '1')) -- for encryption
              OR ((out_count0 = "00001000") AND (ctrl_enc_sig = '0'))) then -- for decryption
            state2 <= compute_K6;
          else
            state2 <= compute_even;
          end if;
        when load_keyA =>
          state2 <= my_idle;
        when compute_K6 =>
          state2 <= my_idle;
      end case;
    end if;
  end process;
  -- ODD PLAINTEXT
  process (clk, reset)
  begin
    if reset = '1' then
      state3 <= my_idle;
    elsif clk'EVENT AND clk = '1' then
      case state3 is
        when my_idle =>
          if ld_key = '1' then
            -- 'LoadKey' signal makes the controller latch the key
            -- material from the 128-bit input port
            state3 <= load_keyA;
          elsif start = '1' then
            -- 'start' signal puts the controller in the encryption or
            -- decryption mode
            state3 <= compute_K3;
          else
            state3 <= my_idle;
          end if;
        when compute_K3 =>
          state3 <= compute_odd;
        when compute_odd =>
          if (((out_count0 = "00100110") AND (ctrl_enc_sig = '1')) -- for encryption
              OR ((out_count0 = "00001000") AND (ctrl_enc_sig = '0'))) then -- for decryption
            state3 <= compute_K7;
          else
            state3 <= compute_odd;
          end if;
        when load_keyA =>
          state3 <= my_idle;
        when compute_K7 =>
          state3 <= my_idle;
      end case;
    end if;
  end process;
  -- EVEN KEY -- INPUT_EVEN_KEY
  with state0 select input_even_key <=
    (out_count0 & out_count0 & out_count0 & out_count0) when my_idle,
    (out_count0 & out_count0 & out_count0 & out_count0) when load_keyA,
    (out_count0 & out_count0 & out_count0 & out_count0) when compute_K0,
    (out_count0 & out_count0 & out_count0 & out_count0) when compute_K4,
    (out_count0 & out_count0 & out_count0 & out_count0) when compute_K2r_8;
  -- ODD KEY -- INPUT_ODD_KEY
  with state1 select input_odd_key <=
    (out_count1 & out_count1 & out_count1 & out_count1) when my_idle,
    (out_count1 & out_count1 & out_count1 & out_count1) when load_keyA,
    (out_count1 & out_count1 & out_count1 & out_count1) when compute_K1,
    (out_count1 & out_count1 & out_count1 & out_count1) when compute_K5,
    (out_count1 & out_count1 & out_count1 & out_count1) when compute_K2r_9;
  -- EVEN PLAINTEXT -- INPUT_EVEN_PLAINTEXT
  with state2 select input_even_plaintext <=
    (out_count2 & out_count2 & out_count2 & out_count2) when my_idle,
    (out_count2 & out_count2 & out_count2 & out_count2) when load_keyA,
    (out_count2 & out_count2 & out_count2 & out_count2) when compute_K2,
    (out_count2 & out_count2 & out_count2 & out_count2) when compute_K6,
    pre_param_even when compute_even;
  -- ODD PLAINTEXT -- INPUT_ODD_PLAINTEXT
  with state3 select input_odd_plaintext <=
    (out_count3 & out_count3 & out_count3 & out_count3) when my_idle,
    (out_count3 & out_count3 & out_count3 & out_count3) when load_keyA,
    (out_count3 & out_count3 & out_count3 & out_count3) when compute_K3,
    (out_count3 & out_count3 & out_count3 & out_count3) when compute_K7,
    pre_param_odd when compute_odd;
  -- EVEN KEY -- LATCH_KEY
  with state0 select latch_key <= '0' when my_idle,
                                  '1' when load_keyA,
                                  '0' when compute_K0,
                                  '0' when compute_K4,
                                  '0' when compute_K2r_8;
  -- EVEN KEY -- RESET_CNTR
  with state0 select reset_cntr <= '1' when my_idle,
                                   '0' when load_keyA,
                                   '0' when compute_K0,
                                   '0' when compute_K4,
                                   '0' when compute_K2r_8;
  -- EVEN KEY -- LATCH_CLEARTEXT
  with state0 select latch_cleartext <= '1' when my_idle,
                                        '0' when load_keyA,
                                        '0' when compute_K0,
                                        '0' when compute_K4,
                                        '0' when compute_K2r_8;
  -- ODD PLAINTEXT -- SEL_ODD
  with state3 select sel_odd <= '0' when my_idle,
                                '0' when load_keyA,
                                '1' when compute_K3,
                                '1' when compute_K7,
                                '1' when compute_odd;
  -- ODD PLAINTEXT -- SEL_K
  with state3 select sel_k <= '0' when my_idle,
                              '0' when load_keyA,
                              '1' when compute_K3,
                              '1' when compute_K7,
                              '0' when compute_odd;
  -- ODD PLAINTEXT -- SEL_TEST
  with state3 select sel_test <= '0' when my_idle,
                                 '0' when load_keyA,
                                 '1' when compute_K3,
                                 '1' when compute_K7,
                                 '0' when compute_odd;
--EVEN KEY -- PARAM_SOURCE_IS_RESULT
with state0 select param_source_is_result <= '0' when my_idle,
'0' when load_KeyA,
'0' when compute_K0,
'1' when compute_K4,
'1' when compute_K2r_8;
--EVEN KEY -- LOAD_REG_PRE_EVEN
with state0 select load_reg_pre_even <=
'0' when my_idle,
'0' when load_KeyA,
'1' when compute_K0,
'0' when compute_K4,
'1' when compute_K2r_8;
-- ODD KEY -- LOAD_REG_PRE_ODD
with state1 select load_reg_pre_odd <=
'0' when my_idle,
'0' when load_KeyA,
'1' when compute_K1,
'0' when compute_K5,
'1' when compute_K2r_9;
--EVEN PLAINTEXT -- LOAD_REG_POST
with state2 select load_reg_post_even <= '0' when my_idle,
'0' when load_KeyA,
'1' when compute_K2,
'0' when compute_K6,
'1' when compute_even;
--ODD PLAINTEXT -- LOAD_REG_POST_ODD
with state3 select load_reg_post_odd <= '0' when my_idle,
'0' when load_KeyA,
'1' when compute_K3,
'0' when compute_K7,
'1' when compute_odd;
--EVEN PLAINTEXT -- LATCH_CIPHER_MS NEXTCC
with state2 select latch_ciphertext <= '0' when my_idle,
'0' when load_KeyA,
'0' when compute_K2,
'1' when compute_K6,
'0' when compute_even;
--EVEN KEY -- LOAD_ENCRYPT
with state0 select load_encrypt <= '1' when my_idle,
'0' when load_KeyA,
'0' when compute_K0,
'0' when compute_K4,
'0' when compute_K2r_8;
--EVEN KEY -- IDLE
with state0 select idle <= '1' when my_idle,
'0' when load_KeyA,
'0' when compute_K0,
'0' when compute_K4,
'0' when compute_K2r_8;
--EVEN KEY -- CNT ENABLE
with state0 select cnt_enbl <= '0' when my_idle,
'0' when load_KeyA,
'1' when compute_K0,
'1' when compute_K4,
'1' when compute_K2r_8;
sel_encrypt_reg: process(load_encrypt)
begin
if (load_encrypt = '1') then
ctrl_enc_sig <= sel_enc;
end if;
end process;
out_count0 <=
out_count1 <=
out_count2 <=
out_count3 <=
end mix;
7. CORE









Component : This component is the main core.
Version : 1.0 - Beta
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
-- External interface of twofish cipher implementation
entity core is
port(inport : in std_logic_vector(127 downto 0); --input from cleartext entity
inkey : in std_logic_vector(127 downto 0); --input from keymodule entity
clk : in std_logic; --Clock signal
reset : in std_logic;
usr_ld_key : in std_logic; --Usr requests load key
usr_start : in std_logic; --Usr requests start
usr_encrypt : in std_logic; --Usr requests encrypt
idle : out std_logic; --Device is idle
outCiphertext : out std_logic_vector(127 downto 0)); --output after one iteration through enc/dec
end core;
architecture mixed of core is
component little_endian_converter
port (big_endian : in std_logic_vector(127 downto 0);
little_endian : out std_logic_vector(127 downto 0));
end component;
component control
port( pre_param_even : in std_logic_vector(31 downto 0);
pre_param_odd : in std_logic_vector(31 downto 0);
clk : in std_logic;
reset : in std_logic;
ld_key : in std_logic;
start : in std_logic;
sel_enc : in std_logic;
input_even_key : out std_logic_vector(31 downto 0); --need to expand to 4 values
input_odd_key : out std_logic_vector(31 downto 0); --need to expand to 4 values
input_even_plaintext : out std_logic_vector(31 downto 0); --need to expand to 4 values
input_odd_plaintext : out std_logic_vector(31 downto 0); --need to expand to 4 values
latch_cleartext : out std_logic;
latch_key : out std_logic;
sel_odd : out std_logic;
sel_k : out std_logic;
sel_test : out std_logic;
param_source_is_result : out std_logic;
load_reg_pre_even : out std_logic;
load_reg_pre_odd : out std_logic;
load_reg_post_even : out std_logic;
load_reg_post_odd : out std_logic;
latch_ciphertext : out std_logic;
ctrl_encrypt : out std_logic;
idle : out std_logic);
end component;
component keymodule
port (key : in std_logic_vector(127 downto 0);
latch_key : in std_logic;
clk : in std_logic;
reset : in std_logic;
S0,S1 : out std_logic_vector(31 downto 0);
keyout : out std_logic_vector(127 downto 0));
end component;
component evenkeygenerator
port (input : in std_logic_vector(31 downto 0);
in_S0 : in std_logic_vector(31 downto 0); --S0 for S-boxes in h_fctn module
in_S1 : in std_logic_vector(31 downto 0); --S1 for S-boxes in h_fctn module
clk : in std_logic;
reset : in std_logic;
outevenkey : out std_logic_vector(31 downto 0)); --output of main on odd path
end component;
component oddkeygenerator
port (input : in std_logic_vector(31 downto 0);
in_S0 : in std_logic_vector(31 downto 0); --S0 for S-boxes in h_fctn module
in_S1 : in std_logic_vector(31 downto 0); --S1 for S-boxes in h_fctn module
clk : in std_logic;
reset : in std_logic;
outoddkey : out std_logic_vector(31 downto 0)); --output of main odd key
end component;
component evenplaintextgenerator
port (input : in std_logic_vector(31 downto 0);
in_S0 : in std_logic_vector(31 downto 0); --S0 for S-boxes in h_fctn module
in_S1 : in std_logic_vector(31 downto 0); --S1 for S-boxes in h_fctn module
clk : in std_logic;
reset : in std_logic;
outevenplaintext : out std_logic_vector(31 downto 0)); --output of main on odd path
end component;
component oddplaintextgenerator
port (input : in std_logic_vector(31 downto 0);
in_S0 : in std_logic_vector(31 downto 0); --S0 for S-boxes in h_fctn module
in_S1 : in std_logic_vector(31 downto 0); --S1 for S-boxes in h_fctn module
clk : in std_logic;
reset : in std_logic;
sel_odd : in std_logic; --input selecting btw even path or odd path
sel_k : in std_logic; --input selecting btw calculating a key or an encryption/decryption
outoddplaintext : out std_logic_vector(31 downto 0)); --output of main on odd path
end component;
component rol9_32
port (data : in std_logic_vector(31 downto 0);
q : out std_logic_vector(31 downto 0));
end component;
--rotate left 8 bit
component rol8_32
port (data : in std_logic_vector(31 downto 0);
q : out std_logic_vector(31 downto 0));
end component;
component mux_2x32
port(sel : in std_logic;
in_0 : in std_logic_vector(31 downto 0);
in_1 : in std_logic_vector(31 downto 0);
output : out std_logic_vector(31 downto 0));
end component;
component pht
port (A_in : in std_logic_vector(31 downto 0);
B_in : in std_logic_vector(31 downto 0);
A_out : out std_logic_vector(31 downto 0);
B_out : out std_logic_vector(31 downto 0));
end component;
component adder32
port (dataa, datab : in std_logic_vector(31 downto 0);
cout : out std_logic;
cin : in std_logic := '0';
result : out std_logic_vector(31 downto 0));
end component;
component reg32
port(clock,clr,enable : in std_logic;
data : in std_logic_vector(31 downto 0);
q : out std_logic_vector(31 downto 0));
end component;
-- selection of mode: encrypt/decrypt
component opselect
port( encrypt : in std_logic;
F_fct_A : in std_logic_vector(31 downto 0);
F_fct_B : in std_logic_vector(31 downto 0);
in_A : in std_logic_vector(31 downto 0);
in_B : in std_logic_vector(31 downto 0);
out_A : out std_logic_vector(31 downto 0);
out_B : out std_logic_vector(31 downto 0));
end component;
--cleartext module
component cleartext
port (input : in std_logic_vector(127 downto 0);
clk : in std_logic;
reset : in std_logic;
latch_cleartext : in std_logic;
K0,K1,K2,K3 : in std_logic_vector(31 downto 0);
P0xorK0 : out std_logic_vector(31 downto 0);
P1xorK1 : out std_logic_vector(31 downto 0);
P2xorK2 : out std_logic_vector(31 downto 0);




port (K4,K5,K6,K7 : in std_logic_vector(31 downto 0);
line0,line1,line2,line3 : in std_logic_vector(31 downto 0);
clk : in std_logic;
reset : in std_logic;
latch_ciphertext : in std_logic;
















output_even_key_pht,
output_odd_key_pht,






output_odd_key_pht_temp,
rotated_output_odd_key_pht_temp,
even_key_temp,
odd_key_temp,
output_even_plaintext_pht,
output_odd_plaintext_pht,
rotated_output_odd_plaintext_pht,









post_param_odd,
input_f, out_round_even,
out_round_odd, source_pre_even_reg,
source_pre_odd_reg,
source_post_even_reg, source_post_odd_reg, even_S0,
even_S1, odd_S0, odd_S1,
f_in_S0, f_in_S1
: std_logic_vector(31 downto 0);
signal
key, input, output, outkey : std_logic_vector(127 downto 0);
signal
latch_cleartext, latch_key,






std_logic;
begin
--little endian conversion
little_endian_input: little_endian_converter
port map ( big_endian => inport,
little_endian => input);
little_endian_key: little_endian_converter
port map ( big_endian => inkey,
little_endian => key);
little_endian_output: little_endian_converter
port map ( big_endian => output,
little_endian => outCiphertext);
twofishController: control
port map ( pre_param_even => pre_param_even,
pre_param_odd => pre_param_odd,
clk => clk,
reset => reset,
ld_key => usr_ld_key,
start => usr_start,
sel_enc => usr_encrypt,
input_even_key => input_even_key,
input_odd_key => input_odd_key,
input_even_plaintext => input_even_plaintext,
input_odd_plaintext => input_odd_plaintext,
latch_cleartext => latch_cleartext,
latch_key => latch_key,
sel_odd => sel_odd,
sel_k => sel_k,
sel_test => sel_test,
param_source_is_result => param_source_is_result,
load_reg_pre_even => load_reg_pre_even,
load_reg_pre_odd => load_reg_pre_odd,
load_reg_post_even => load_reg_post_even,
load_reg_post_odd => load_reg_post_odd,
latch_ciphertext => latch_ciphertext,
ctrl_encrypt => ctrl_encrypt,
idle => idle);
-- This module holds the cleartext input
-- and prepares the parameters when given correct K values.
cleartext_module: cleartext
port map (input => input,
clk => clk,
reset => reset,
latch_cleartext => latch_cleartext,
K0 => even_key,
K1 => odd_key,
K2 => even_plaintext,
K3 => odd_plaintext,
P0xorK0 => P0xorK0,
P1xorK1 => P1xorK1,
P2xorK2 => P2xorK2,
P3xorK3 => P3xorK3);
-- select the correct S input to modified F:
-- S if computing a result,
-- M if computing a k
selectSorM : process (sel_k,sel_odd,key,S0,S1)
begin
if (sel_k = '1') THEN
  if (sel_odd = '1') THEN
    -- computing a K
    -- S0 = M3, S1 = M1
    f_in_S0 <= outkey(127 downto 96);
    f_in_S1 <= outkey(63 downto 32);
  else
    -- computing a K
    -- S0 = M2, S1 = M0
    f_in_S0 <= outkey(95 downto 64);
    f_in_S1 <= outkey(31 downto 0);
  end if;
else
  -- computing a result
  -- S0 = S0, S1 = S1
  f_in_S0 <= S0;
  f_in_S1 <= S1;
end if;
end process;
inputevenkeygenerator <= input_even_key;
inputoddkeygenerator <= input_odd_key;
inputevenplaintextgenerator <= input_even_plaintext;
inputoddplaintextgenerator <= input_odd_plaintext;
evenkeygen: evenkeygenerator
port map (input => inputevenkeygenerator,
in_S0 => outkey(95 downto 64), ---must take into account for key
in_S1 => outkey(31 downto 0), ---must take into account for key
clk => clk,
reset => reset,
outevenkey => outputevenkeygenerator);
oddkeygen: oddkeygenerator
port map (input => inputoddkeygenerator,
in_S0 => outkey(127 downto 96), ---must take into account for key
in_S1 => outkey(63 downto 32), ---must take into account for key
clk => clk,
reset => reset,
outoddkey => outputoddkeygenerator);
input_even_key_pht <= outputevenkeygenerator;




--Pseudo Hadamard Transformation of signals for keys
pht
port map (A_in => input_even_key_pht,
B_in => input_odd_key_pht,
A_out => output_even_key_pht,
B_out => output_odd_key_pht);
--Rotating the output of the PHT on the odd path left nine times
rol9_out_pht_odd_key: rol9_32
port map (data => output_odd_key_pht,
q => rotated_output_odd_key_pht);
--KEY VALUES
even_key <= output_even_key_pht;
odd_key <= rotated_output_odd_key_pht;
mux_sel_evenS0 : mux_2x32
port map(sel => sel_test,
in_1 => outkey(95 downto 64),
in_0 => f_in_S0,
output => even_S0);
mux_sel_oddS0 : mux_2x32
port map(sel => sel_test,
in_1 => outkey(127 downto 96),
in_0 => f_in_S0,
output => odd_S0);
mux_sel_evenS1 : mux_2x32
port map(sel => sel_test,
in_1 => outkey(31 downto 0),
in_0 => f_in_S1,
output => even_S1);
mux_sel_oddS1 : mux_2x32
port map(sel => sel_test,
in_1 => outkey(63 downto 32),
in_0 => f_in_S1,
output => odd_S1);
evenplaintextgen: evenkeygenerator
port map(input => inputevenplaintextgenerator,
in_S0 => even_S0, -- must take into account for plaintext
in_S1 => even_S1, -- must take into account for plaintext
clk => clk,
reset => reset,
outevenkey => outputevenplaintextgenerator);
oddplaintextgen: oddplaintextgenerator
port map(input => inputoddplaintextgenerator,
in_S0 => odd_S0, -- must take into account for plaintext
in_S1 => odd_S1, -- must take into account for plaintext
clk => clk,
reset => reset,
sel_odd => sel_odd,
sel_k => sel_k,
outoddplaintext => outputoddplaintextgenerator);
--Pseudo Hadamard Transformation of signals for plaintext
pht_plaintext : pht
port map (A_in => outputevenplaintextgenerator,
B_in => outputoddplaintextgenerator,
A_out => output_even_plaintext_pht,
B_out => output_odd_plaintext_pht);
--Rotating the output of the PHT on the odd path left nine times
rol9_out_pht_odd: rol9_32
port map (data => output_odd_plaintext_pht,
q => rotated_output_odd_plaintext_pht);
sel_odd_k <= sel_odd AND sel_k; -- to choose between computation of a key and plaintext
--Multiplex, on the odd side, btw. the output of the pht and the
--output of the pht rotated left by 9
mux_out : mux_2x32
port map(sel => sel_odd_k,
in_1 => rotated_output_odd_plaintext_pht,
in_0 => output_odd_plaintext_pht,
output => output_odd_plaintext_mux);
even_plaintext <= output_even_plaintext_pht;
odd_plaintext <= output_odd_plaintext_mux;
-- 32-bit adder: even K + even result of h
add_even : adder32
port map (dataa => even_plaintext,
datab => even_key,
result => f_out_even);
-- 32-bit adder: odd K + odd result of h
add_odd : adder32
port map (dataa => odd_plaintext,
datab => odd_key,
result => f_out_odd);
-- final operation of round: encrypt or decrypt?
opsel : opselect
port map(encrypt => ctrl_encrypt,
F_fct_A => f_out_even,
F_fct_B => f_out_odd,
in_A => post_param_even,
in_B => post_param_odd,
out_A => out_round_even,
out_B => out_round_odd);
-- choose the source of next parameters.
choose_in_param_pre_even: process
(param_source_is_result, out_round_even,
out_round_odd, pre_param_even, pre_param_odd,
P0xorK0, P1xorK1, P2xorK2, P3xorK3)
begin
if (param_source_is_result = '1') THEN
source_pre_even_reg <= out_round_even;
source_pre_odd_reg <= out_round_odd;
source_post_even_reg <= pre_param_even;
source_post_odd_reg <= pre_param_odd;
else
source_pre_even_reg <= P0xorK0;
source_pre_odd_reg <= P1xorK1;
source_post_even_reg <= P2xorK2;
source_post_odd_reg <= P3xorK3;
end if;
end process;
-- line 0: the parameter is used on even
-- values at the beginning of the round
-- (K0 in first round)
reg_pre_even_param_reg: reg32
port map (data => source_pre_even_reg,
clock => clk,
clr => reset,
enable => load_reg_pre_even, -- helps to latch P0xorK0 or out_round_even
q => pre_param_even);
-- line 1: the parameter is used on odd
-- values at the beginning of the round
-- (K1 in first round)
reg_pre_odd_param_reg: reg32
port map (data => source_pre_odd_reg,
clock => clk,
clr => reset,
enable => load_reg_pre_odd, -- helps to latch P1xorK1 or out_round_odd
q => pre_param_odd);
-- line 2: the parameter is used on even
-- values at the end of the round
-- (K2 in first round)
reg_post_even_param_reg: reg32
port map (data => source_post_even_reg,
clock => clk,
clr => reset,
enable => load_reg_post_even, -- helps to latch P2xorK2 or pre_param_even
q => post_param_even);
-- line 3: the parameter is used on odd
-- values at the end of the round
-- (K3 in first round)
reg_post_odd_param_reg: reg32
port map (data => source_post_odd_reg,
clock => clk,
clr => reset,
enable => load_reg_post_odd, -- helps to latch P3xorK3 or pre_param_odd
q => post_param_odd);
-- handle the output
--HAS TO UNDO THE LAST SWAP!!!
ciphertext_module: ciphertext
port map (K4 => even_key,
K5 => odd_key,
K6 => even_plaintext,
K7 => odd_plaintext,
line0 => post_param_even,
line1 => post_param_odd,
line2 => pre_param_even,
line3 => pre_param_odd,
clk => clk,
reset => reset,





-- register holds the 128-bit key ...
keymodule
port map (key => key,
latch_key => latch_key,
clk => clk,
reset => reset,
S0 => S0,
S1 => S1,
keyout => outkey);
end mixed;
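As a software cross-check on the key path wired up in the core listing, the following Python sketch models the combination step: the odd h-function output is rotated left by 8 bits (as done inside oddkeygenerator), both words pass through the Pseudo-Hadamard Transform, and the odd PHT output is rotated left by 9 bits. The PHT arithmetic (a + b and a + 2b modulo 2^32) follows the published Twofish specification, since the pht component body is not reproduced in this appendix. The names `rol32`, `pht` and `combine_subkey_pair` are illustrative only and are not part of the VHDL design.

```python
MASK32 = 0xFFFFFFFF  # all arithmetic is on 32-bit words


def rol32(x, n):
    """Rotate a 32-bit word left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def pht(a, b):
    """Pseudo-Hadamard Transform: (a + b, a + 2b) modulo 2^32."""
    return (a + b) & MASK32, (a + 2 * b) & MASK32


def combine_subkey_pair(h_even, h_odd):
    """Model of the key path: rol8 on the odd h output, PHT, rol9 on the odd result.

    h_even and h_odd stand for the raw h-function outputs (assumed inputs).
    """
    b = rol32(h_odd, 8)            # rotation performed in oddkeygenerator
    a_out, b_out = pht(h_even, b)  # pht component
    return a_out, rol32(b_out, 9)  # rol9_32 on the odd path


if __name__ == "__main__":
    even_key, odd_key = combine_subkey_pair(0x00000001, 0x01000000)
    print(hex(even_key), hex(odd_key))
```

A model of this kind can be used to generate expected values for a simulation testbench of the key path, provided the h-function outputs are captured from the same simulation.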
8. EVENKEYGENERATOR
Author : Ananda Raja A/L Dore Raja
ID : 1669
Component : This component is the Evenkeygenerator
Version : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
--Main: used to compute K-subkeys and as main building block of enc/dec
entity evenkeygenerator is
port (input : in std_logic_vector(31 downto 0);
in_S0 : in std_logic_vector(31 downto 0); --S0 for S-boxes in h_fctn module
in_S1 : in std_logic_vector(31 downto 0); --S1 for S-boxes in h_fctn module
clk : in std_logic;
reset : in std_logic;
outevenkey : out std_logic_vector(31 downto 0)); --output of main on odd path
end evenkeygenerator;
--Building block of main 




port (input : in std_logic_vector(31 downto 0);
S0 : in std_logic_vector(31 downto 0);
S1 : in std_logic_vector(31 downto 0);
output : out std_logic_vector(31 downto 0));
end component;
begin
--h-function transformation
h_block : h_fctn
port map (input => input,
S0 => in_S0,
S1 => in_S1,
output => outevenkey);
end mixed;
9. EVENPLAINTEXTGENERATOR
Author : Ananda Raja A/L Dore Raja
ID : 1669
Component : This component is the Evenplaintextgenerator
Version : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
--Main: used to compute K-subkeys and as main building block of enc/dec
entity evenplaintextgenerator is
port (input : in std_logic_vector(31 downto 0);
in_S0 : in std_logic_vector(31 downto 0); --S0 for S-boxes in h_fctn module
in_S1 : in std_logic_vector(31 downto 0); --S1 for S-boxes in h_fctn module
clk : in std_logic;
reset : in std_logic;
outevenplaintext : out std_logic_vector(31 downto 0)); --output of main on odd path
end evenplaintextgenerator;
--Building block of main
architecture mixed OF evenplaintextgenerator is
--component DECLARATION
--h-function modules
component h_fctn
port (input : in std_logic_vector(31 downto 0);
S0 : in std_logic_vector(31 downto 0);
S1 : in std_logic_vector(31 downto 0);
output : out std_logic_vector(31 downto 0));
end component;
--Start Main's structure description
begin
--h-function transformation
h_block : h_fctn
port map (input => input,
S0 => in_S0,




Author : Ananda Raja A/L Dore Raja
ID : 1669
Component : This component is h-function.
Version : 1.0 - Beta





--h-function combines s-boxes and maximum distance separable matrix
entity h_fctn is
port( input : in std_logic_vector(31 downto 0);
S0 : in std_logic_vector(31 downto 0);
S1 : in std_logic_vector(31 downto 0);
output : out std_logic_vector(31 downto 0));
end h_fctn;
--Building block of the h_fctn
architecture struct OF h_fctn is
component S_boxes
port( income : in std_logic_vector(31 downto 0);
S0 : in std_logic_vector(31 downto 0);
S1 : in std_logic_vector(31 downto 0);
outcome : out std_logic_vector(31 downto 0));
end component;
--maximum distance separable matrix
component mds
port( y0 : in std_logic_vector(7 downto 0);
y1 : in std_logic_vector(7 downto 0);
y2 : in std_logic_vector(7 downto 0);
y3 : in std_logic_vector(7 downto 0);
z0 : out std_logic_vector(7 downto 0);
z1 : out std_logic_vector(7 downto 0);
z2 : out std_logic_vector(7 downto 0);
z3 : out std_logic_vector(7 downto 0));
end component;
Signal
out_mds, --output of the MDS
out_S_box : std_logic_vector(31 downto 0); --output of the S-boxes
--Start of the h-function structure
begin
--S-box that permutes the input and XORs it with S0 and S1
S_box1 : S_boxes
port map( income => input,
S0 => S0,




port map( y0 => out_S_box(7 downto 0),
y1 => out_S_box(15 downto 8),
y2 => out_S_box(23 downto 16),
y3 => out_S_box(31 downto 24),
z0 => out_mds(7 downto 0),
z1 => out_mds(15 downto 8),
z2 => out_mds(23 downto 16),
z3 => out_mds(31 downto 24));
--Assigning output its value from signal
output <= out_mds;
end struct;
11. KEYMODULE
Author : Ananda Raja A/L Dore Raja
ID : 1669
Component : This component is the keymodule
























port (key : in std_logic_vector(127 downto 0);
latch_key : in std_logic;






S0,S1 : out std_logic_vector(31 downto 0);
keyout : out std_logic_vector(127 downto 0));
end keymodule;
architecture mixed of keymodule is
component rsmatrix
port (m0,m1,m2,m3,m4,m5,m6,m7 : in std_logic_vector(7 downto 0);
s0,s1,s2,s3 : out std_logic_vector(7 downto 0));
end component;
component reg128
port(clock,clr,enable : in std_logic;
data : in std_logic_vector(127 downto 0);
q : out std_logic_vector(127 downto 0));
end component;
component reg32
port(clock,clr,enable : in std_logic;
data : in std_logic_vector(31 downto 0);
q : out std_logic_vector(31 downto 0));
end component;
component mux_2x64
port (sel : in std_logic;
in_0 : in std_logic_vector(63 downto 0);
in_1 : in std_logic_vector(63 downto 0);
output : out std_logic_vector(63 downto 0));
end component;
signal RS_input : std_logic_vector(63 downto 0);
signal RS_output : std_logic_vector(31 downto 0);
signal regkey_output : std_logic_vector(127 downto 0);
begin
keyreg : reg128
port map(data => key,
clock => clk,
clr => reset,
enable => latch_key,
q => regkey_output);
keyout <= regkey_output;
-- choose input of RS Matrix
RSinputChoice: mux_2x64
port map(sel => latch_key,
in_0 => regkey_output(127 downto 64),











m1 => RS_input(15 downto 8),
m2 => RS_input(23 downto 16),
m3 => RS_input(31 downto 24),
m4 => RS_input(39 downto 32),
m5 => RS_input(47 downto 40),
m6 => RS_input(55 downto 48),
m7 => RS_input(63 downto 56),
s0 => RS_output(7 downto 0),
s1 => RS_output(15 downto 8),
s2 => RS_output(23 downto 16),
s3 => RS_output(31 downto 24));
-- store S0 if it's being computed
S0reg : reg32
port map(data => RS_output,
clock => clk,
clr => reset,
enable => latch_key,
q => S0);
-- latch S1 to the output. Will be correct
-- soon after latch_key drops to 0.




Author : Ananda Raja A/L Dore Raja
ID : 1669
Component : This component is the little endian converter.
Version : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
entity little_endian_converter is
port (big_endian : in std_logic_vector(127 downto 0);
little_endian : out std_logic_vector(127 downto 0));
end little_endian_converter;
architecture behavior of little_endian_converter is
signal
le0,le1,le2,le3,le4,le5,le6,le7,le8,le9,le10,le11,le12,le13,le14,le15




--little endian conversion
le0 <= big_endian(7 downto 0);
le1 <= big_endian(15 downto 8);
le2 <= big_endian(23 downto 16);
le3 <= big_endian(31 downto 24);
le4 <= big_endian(39 downto 32);
le5 <= big_endian(47 downto 40);
le6 <= big_endian(55 downto 48);
le7 <= big_endian(63 downto 56);
le8 <= big_endian(71 downto 64);
le9 <= big_endian(79 downto 72);
le10 <= big_endian(87 downto 80);
le11 <= big_endian(95 downto 88);
le12 <= big_endian(103 downto 96);
le13 <= big_endian(111 downto 104);
le14 <= big_endian(119 downto 112);
le15 <= big_endian(127 downto 120);
end process;
little_endian <= le0 & le1 & le2 & le3 & le4 & le5 & le6 & le7 & le8
& le9 & le10 & le11 & le12 & le13 & le14 & le15;
end behavior;
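The converter above takes the sixteen bytes of the 128-bit input and emits them in reverse order, so the least significant input byte becomes the most significant output byte. A minimal Python sketch of the same byte reversal is given below; it treats the 128-bit vector as an integer, and the function name `to_little_endian` is an illustrative choice rather than a name from the VHDL design.

```python
def to_little_endian(word128):
    """Reverse the byte order of a 128-bit value.

    Byte 0 (bits 7..0) of the input ends up in bits 127..120 of the output,
    matching the le0 & le1 & ... & le15 concatenation in the VHDL.
    """
    if not 0 <= word128 < (1 << 128):
        raise ValueError("expected a 128-bit unsigned value")
    return int.from_bytes(word128.to_bytes(16, "big"), "little")


if __name__ == "__main__":
    value = 0x00112233445566778899AABBCCDDEEFF
    print(f"{to_little_endian(value):032x}")
```

Because the operation is a pure byte reversal, applying it twice returns the original value, which is why the same component can be reused on the input, key and output paths of the core.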
13. MDS
Author : Ananda Raja A/L Dore Raja
ID : 1669
Component : This component is the MDS.
Version : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
-- Maximum Distance Separable matrix
-- Each entry is 8 bits wide. The following operation is
-- performed in the Galois field GF(2^8).
-- |z0|   |01 EF 5B 5B|   |y0|
-- |z1| = |5B EF EF 01| . |y1|
-- |z2|   |EF 5B 01 EF|   |y2|
-- |z3|   |EF 01 EF 5B|   |y3|
entity mds is
port (y0 : in std_logic_vector(7 downto 0);
y1 : in std_logic_vector(7 downto 0);
y2 : in std_logic_vector(7 downto 0);
y3 : in std_logic_vector(7 downto 0);
z0 : out std_logic_vector(7 downto 0);
z1 : out std_logic_vector(7 downto 0);
z2 : out std_logic_vector(7 downto 0);
z3 : out std_logic_vector(7 downto 0));
end mds;
architecture struct of mds is
-- multiplication by 5B
component multgf_5B
port (x : in std_logic_vector(7 downto 0);
y : out std_logic_vector(7 downto 0));
end component;
-- multiplication by EF
component multgf_EF
port (x : in std_logic_vector(7 downto 0);
y : out std_logic_vector(7 downto 0));
end component;
-- bit-wise xor of 4 8-bit vectors
component xor_4x8
port (x : in std_logic_vector(7 DOWNTO 0);
y : in std_logic_vector(7 DOWNTO 0);
z : in std_logic_vector(7 DOWNTO 0);
w : in std_logic_vector(7 DOWNTO 0);
result : out std_logic_vector(7 DOWNTO 0));
end component;
-- result of multiplications
signal y0xEF : std_logic_vector(7 downto 0);
signal y0x5B : std_logic_vector(7 downto 0);
signal y1xEF : std_logic_vector(7 downto 0);
signal y1x5B : std_logic_vector(7 downto 0);
signal y2xEF : std_logic_vector(7 downto 0);
signal y2x5B : std_logic_vector(7 downto 0);
signal y3xEF : std_logic_vector(7 downto 0);
signal y3x5B : std_logic_vector(7 downto 0);
begin
-- Do multiplications
y0xEF_comp: multgf_EF
port map ( x => y0, y => y0xEF);
y0x5B_comp: multgf_5B
port map ( x => y0, y => y0x5B);
y1xEF_comp: multgf_EF
port map ( x => y1, y => y1xEF);
y1x5B_comp: multgf_5B
port map ( x => y1, y => y1x5B);
y2xEF_comp: multgf_EF
port map ( x => y2, y => y2xEF);
y2x5B_comp: multgf_5B
port map ( x => y2, y => y2x5B);
y3xEF_comp: multgf_EF
port map ( x => y3, y => y3xEF);
y3x5B_comp: multgf_5B
port map ( x => y3, y => y3x5B);
-- do the additions (row by row)
-- (An addition in GF corresponds to an XOR.)
-- z0 = y0x01 + y1xEF + y2x5B + y3x5B
z0_xor: xor_4x8
port map (x => y0,
y => y1xEF,
z => y2x5B,
w => y3x5B,
result => z0);
-- z1 = y0x5B + y1xEF + y2xEF + y3x01
z1_xor: xor_4x8
port map (x => y0x5B,
y => y1xEF,
z => y2xEF,
w => y3,
result => z1);
-- z2 = y0xEF + y1x5B + y2x01 + y3xEF
z2_xor: xor_4x8
port map (x => y0xEF,
y => y1x5B,
z => y2,
w => y3xEF,
result => z2);
-- z3 = y0xEF + y1x01 + y2xEF + y3x5B
z3_xor: xor_4x8
port map (x => y0xEF,
y => y1,
z => y2xEF,
w => y3x5B,





-- Multiplication of an 8-bit vector by 0h5B in
-- Galois field GF(2^8).
-- x: input vector
-- y: output vector





x : in std_logic_vector(7 downto 0);
y : out std_logic_vector(7 downto 0));
end multgf_5B;
architecture struct of multgf_5B is
begin
y(0) <= x(2) XOR x(0);
y(1) <= x(3) XOR x(1) XOR x(0);
y(2) <= x(4) XOR x(2) XOR x(1);
y(3) <= x(5) XOR x(3) XOR x(0);
y(4) <= x(6) XOR x(4) XOR x(1) XOR x(0);
y(5) <= x(7) XOR x(5) XOR x(1);
y(6) <= x(6) XOR x(0);
y(7) <= x(7) XOR x(1);
end struct;
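The XOR equations of multgf_5B can be checked exhaustively against a generic GF(2^8) multiplication. The Python sketch below does this under one stated assumption: the reduction polynomial is x^8 + x^6 + x^5 + x^3 + 1 (0x169), the MDS field polynomial given in the Twofish specification, which this listing does not state explicitly. The helper names `gf_mul`, `bit` and `mul_5b_network` are illustrative and do not appear in the VHDL.

```python
MDS_POLY = 0x169  # x^8 + x^6 + x^5 + x^3 + 1 (assumed from the Twofish specification)


def gf_mul(a, b, poly=MDS_POLY):
    """Multiply two bytes in GF(2^8) by shift-and-add, reducing modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def bit(x, i):
    """Return bit i of x as 0 or 1."""
    return (x >> i) & 1


def mul_5b_network(x):
    """Bit-level XOR network transcribed from the multgf_5B architecture."""
    y = [
        bit(x, 2) ^ bit(x, 0),
        bit(x, 3) ^ bit(x, 1) ^ bit(x, 0),
        bit(x, 4) ^ bit(x, 2) ^ bit(x, 1),
        bit(x, 5) ^ bit(x, 3) ^ bit(x, 0),
        bit(x, 6) ^ bit(x, 4) ^ bit(x, 1) ^ bit(x, 0),
        bit(x, 7) ^ bit(x, 5) ^ bit(x, 1),
        bit(x, 6) ^ bit(x, 0),
        bit(x, 7) ^ bit(x, 1),
    ]
    return sum(b << i for i, b in enumerate(y))


if __name__ == "__main__":
    mismatches = [x for x in range(256) if mul_5b_network(x) != gf_mul(x, 0x5B)]
    print("mismatching inputs:", mismatches)
```

Since the field has only 256 elements, comparing all inputs is cheap and gives complete coverage of the combinational block.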
library ieee;
use ieee.std_logic_1164.all;
-- Multiplication of an 8-bit vector by 0hEF in
-- Galois field GF(2^8). See section 4.2 of
-- "Twofish: A 128-Bit Block Cipher" for
-- details.
-- x: input vector
-- y: output vector





x : in std_logic_vector(7 downto 0);
y : out std_logic_vector(7 downto 0));
end multgf_EF;
architecture struct of multgf_EF is
begin
y(0) <= x(2) XOR x(1) XOR x(0);
y(1) <= x(3) XOR x(2) XOR x(1) XOR x(0);
y(2) <= x(4) XOR x(3) XOR x(2) XOR x(1) XOR x(0);
y(3) <= x(5) XOR x(4) XOR x(3) XOR x(0);
y(4) <= x(6) XOR x(5) XOR x(4) XOR x(1);
y(5) <= x(7) XOR x(6) XOR x(5) XOR x(1) XOR x(0);
y(6) <= x(7) XOR x(6) XOR x(0);
y(7) <= x(7) XOR x(1) XOR x(0);
end struct;
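The MDS component combines the two constant multipliers with XORs to form a 4 by 4 matrix-vector product over GF(2^8). The following Python sketch models that product and also checks the multgf_EF equations exhaustively. It assumes the reduction polynomial 0x169 from the Twofish specification (not stated in this listing); `gf_mul`, `mul_ef_network` and `mds_multiply` are illustrative names, not identifiers from the VHDL.

```python
from functools import reduce

MDS_POLY = 0x169  # x^8 + x^6 + x^5 + x^3 + 1 (assumed from the Twofish specification)

# Matrix as given in the comment block of the mds entity.
MDS = [
    [0x01, 0xEF, 0x5B, 0x5B],
    [0x5B, 0xEF, 0xEF, 0x01],
    [0xEF, 0x5B, 0x01, 0xEF],
    [0xEF, 0x01, 0xEF, 0x5B],
]


def gf_mul(a, b, poly=MDS_POLY):
    """Multiply two bytes in GF(2^8) by shift-and-add, reducing modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def mul_ef_network(x):
    """Bit-level XOR network transcribed from the multgf_EF architecture."""
    b = [(x >> i) & 1 for i in range(8)]
    y = [
        b[2] ^ b[1] ^ b[0],
        b[3] ^ b[2] ^ b[1] ^ b[0],
        b[4] ^ b[3] ^ b[2] ^ b[1] ^ b[0],
        b[5] ^ b[4] ^ b[3] ^ b[0],
        b[6] ^ b[5] ^ b[4] ^ b[1],
        b[7] ^ b[6] ^ b[5] ^ b[1] ^ b[0],
        b[7] ^ b[6] ^ b[0],
        b[7] ^ b[1] ^ b[0],
    ]
    return sum(v << i for i, v in enumerate(y))


def mds_multiply(y):
    """Return [z0, z1, z2, z3] = MDS * [y0, y1, y2, y3]; addition in GF(2^8) is XOR."""
    return [
        reduce(lambda acc, term: acc ^ term,
               (gf_mul(coeff, y_c) for coeff, y_c in zip(row, y)))
        for row in MDS
    ]


if __name__ == "__main__":
    assert all(mul_ef_network(x) == gf_mul(x, 0xEF) for x in range(256))
    print([hex(z) for z in mds_multiply([0x01, 0x00, 0x00, 0x00])])
```

Feeding a unit vector into `mds_multiply` returns the corresponding matrix column, which gives a quick sanity check on the row wiring of the four xor_4x8 instances.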
14. MUX_2X32
Author : Ananda Raja A/L Dore Raja
ID : 1669
Component : This component is a 2:1 multiplexor. (32bit)
Version : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
-- Multiplexer that selects between in_0 and in_1.
-- in_0 is returned if sel is 0, in_1 is returned
-- otherwise.
entity mux_2x32 is
port (sel : in std_logic;
in_0 : in std_logic_vector(31 downto 0);
in_1 : in std_logic_vector(31 downto 0);
output : out std_logic_vector(31 downto 0));
end mux_2x32;
architecture behavior of mux_2x32 is
begin
mux : process (sel, in_0, in_1)
begin
if sel = '0' then
output <= in_0;
else




15. MUX_2X64
Author : Ananda Raja A/L Dore Raja
ID : 1669
Component : This component is a 2:1 multiplexor. (64bit)
Version : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
-- Multiplexer that selects between in_0 and in_1.
-- in_0 is returned if sel is 0, in_1 is returned
-- otherwise.
entity mux_2x64 is
port (sel : in std_logic;
in_0 : in std_logic_vector(63 downto 0);
in_1 : in std_logic_vector(63 downto 0);
output : out std_logic_vector(63 downto 0));
end mux_2x64;




if sel = '0' then
output <= in_0;
else




16. ODDKEYGENERATOR
Author : Ananda Raja A/L Dore Raja
ID : 1669
Component : This component is the Oddkeygenerator
Version : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
--Main: used to compute K-subkeys and as main building block of enc/dec
entity oddkeygenerator is
port (input : in std_logic_vector(31 downto 0);
in_S0 : in std_logic_vector(31 downto 0); --S0 for S-boxes in h_fctn module
in_S1 : in std_logic_vector(31 downto 0); --S1 for S-boxes in h_fctn module
clk : in std_logic;
reset : in std_logic;
outoddkey : out std_logic_vector(31 downto 0)); --output of main odd key
end oddkeygenerator;
--Building block of main
architecture mixed OF oddkeygenerator is
--component DECLARATION
--h-function modules
component h_fctn
port (input : in std_logic_vector(31 downto 0);
S0 : in std_logic_vector(31 downto 0);
S1 : in std_logic_vector(31 downto 0);
output : out std_logic_vector(31 downto 0));
end component;
--rotate left 8 bit
component rol8_32
port (data : in std_logic_vector(31 downto 0);
q : out std_logic_vector(31 downto 0));
end component;
--SIGNAL DECLARATION
Signal sig_out_h : std_logic_vector(31 downto 0); --output of h function
--Start Main's structure description
begin
--h-function transformation
h_block : h_fctn
port map (input => input,
S0 => in_S0,
S1 => in_S1,
output => sig_out_h);
--rotate to the left the output of the h function 8 times
rol8_sig_out_h: rol8_32
port map (data => sig_out_h,
q => outoddkey);
end mixed;
17. ODDPLAINTEXTGENERATOR
Author : Ananda Raja A/L Dore Raja
ID : 1669
Component : This component is the Oddplaintextgenerator
Version : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
--Main: used to compute K-subkeys and as main building block of enc/dec
entity oddplaintextgenerator is
port (input : in std_logic_vector(31 downto 0);




in_S1 : in std_logic_vector(31 downto 0); --S1 for S-boxes in h_fctn module
clk : in std_logic;
reset : in std_logic;
sel_odd : in std_logic; --input selecting btw even path or odd path
sel_k : in std_logic; --input selecting btw calculating a key or an encryption/decryption
outoddplaintext : out std_logic_vector(31 downto 0)); --output of main on odd path
end oddplaintextgenerator;
--Building block of main
architecture mixed OF oddplaintextgenerator is
--component DECLARATION
--h-function modules
component h_fctn
port (input : in std_logic_vector(31 downto 0);
S0 : in std_logic_vector(31 downto 0);
S1 : in std_logic_vector(31 downto 0);
output : out std_logic_vector(31 downto 0));
end component;
--Multiplexer modules
component mux_2x32
port(sel : in std_logic;
in_0 : in std_logic_vector(31 downto 0);
in_1 : in std_logic_vector(31 downto 0);
output : out std_logic_vector(31 downto 0));
end component;
--register
component reg32
port(clock,clr,enable : in std_logic;
data : in std_logic_vector(31 downto 0);
q : out std_logic_vector(31 downto 0));
end component;
--rotate left 8 bit
component rol8_32
port (data : in std_logic_vector(31 downto 0);





sel_odd_k,
sel_odd_result : std_logic;
signal
rotated_input, --P1 xor K1 rotated left 8 times
sig_in_h, --input to h_function
sig_out_h, --output of h_function
rotated_out_h --output of the h_fctn rol 8 times
: std_logic_vector(31 downto 0); --output of the main on the odd path
--Start Main's structure description
begin
-- control
sel_odd_k <= sel_odd AND sel_k;
sel_odd_result <= sel_odd AND (NOT sel_k);
--Rotate input to the left 8 times
rol8_input: rol8_32
port map (data => input,
q => rotated_input);
--Choose input of h-function: rotate only if we
--are computing an odd result (and not a k!)
mux_choose_hin : mux_2x32
port map(sel => sel_odd_result, -- sel_odd_result = sel_odd && (not sel_k)
in_0 => input, -- for encryption, sel_k will be 0, so if sel_odd is equal to 0, sel will choose even path; else if sel_odd equals 1, sel will choose odd path.
in_1 => rotated_input,
output => sig_in_h);









h_block : h_fctn
port map (input => sig_in_h,
S0 => S0,
S1 => S1,
output => sig_out_h);
--rotate to the left the output of the h function 8 times
rol8_sig_out_h : rol8_32
port map (data => sig_out_h,
q => rotated_out_h);
--Multiplex between out_h and rotated_out_h
mux_out_h_rotated_out_h : mux_2x32
port map(sel => sel_odd_k, -- sel_odd_k = sel_odd && sel_k
in_1 => rotated_out_h, -- rotated_out_h will only be selected if it is on the odd path
-- and the current process is key scheduling
in_0 => sig_out_h, -- otherwise the output would be sig_out_h
output => outoddplaintext);
-- Sel_odd_k -- Sel_odd -- Sel_k
--     0    <==    0    --   0
--     0    <==    0    --   1
--     0    <==    1    --   0
--     1    <==    1    --   1
end mixed;
18. OPSELECT 
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the operation selector.
-- Version   : 1.0 - Beta
-- This module performs the operation
-- This module includes all parts of the
-- cipher that differ if it is in encryption
-- or decryption mode.
-- Encryption:
--   out_A = (in_A xor F-fct_A) rotated right   (xor, then ror)
--   out_B = (in_B rotated left) xor F-fct_B    (rol, then xor)
-- Decryption:
--   out_A = (in_A rotated left) xor F-fct_A    (rol, then xor)
--   out_B = (in_B xor F-fct_B) rotated right   (xor, then ror)
-- encrypt: encryption mode / not decryption mode
-- F-fct_A: even output of F-function (top)
-- F-fct_B: odd output of F-function (bottom)
-- in_A: third input to the round
-- in_B: fourth input to the round
-- out_A: third output of the round
-- out_B: fourth output of the round
entity opselect is
port (encrypt : in std_logic;
F_fct_A : in std_logic_vector(31 downto 0);
F_fct_B : in std_logic_vector(31 downto 0);
in_A : in std_logic_vector(31 downto 0);
in_B : in std_logic_vector(31 downto 0);
out_A : out std_logic_vector(31 downto 0);
out_B : out std_logic_vector(31 downto 0));
end opselect;
architecture struct of opselect is
-- subcomponent
component sub_opselect
port (rotate_before : in std_logic;
F_fct : in std_logic_vector(31 downto 0);
input : in std_logic_vector(31 downto 0);
output : out std_logic_vector(31 downto 0));
end component;
signal decrypt : std_logic;
begin
-- left subcomponent:
-- do: rol, xor for decryption
--     xor, ror for encryption
-- => "rotate_before" when "encrypt" is 0
leftsub : sub_opselect
port map (rotate_before => decrypt,
F_fct => F_fct_A,
input => in_A,
output => out_A);
-- right subcomponent:
-- do: rol, xor for encryption
--     xor, ror for decryption
-- => "rotate_before" when "encrypt" is 1
rightsub : sub_opselect
port map (rotate_before => encrypt,
F_fct => F_fct_B,
input => in_B,
output => out_B);
decrypt <= not encrypt;
end struct; 
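The rotate-before or rotate-after choice described in the comments of the operation selector can be sketched behaviorally. The entity below is an illustration only and is not the author's sub_opselect: the name sub_opselect_sketch is hypothetical, and the rotation amount of one bit is an assumption taken from the Twofish round structure and from the rol1_32 and ror1_32 components listed in this appendix.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical behavioral model of one sub_opselect instance
-- (assumption: rotations are by exactly one bit).
entity sub_opselect_sketch is
  port (rotate_before : in  std_logic;
        F_fct         : in  std_logic_vector(31 downto 0);
        input         : in  std_logic_vector(31 downto 0);
        output        : out std_logic_vector(31 downto 0));
end sub_opselect_sketch;

architecture behavior of sub_opselect_sketch is
begin
  process (rotate_before, F_fct, input)
    variable x : unsigned(31 downto 0);
  begin
    x := unsigned(input);
    if rotate_before = '1' then
      -- rol, then xor
      x := rotate_left(x, 1) xor unsigned(F_fct);
    else
      -- xor, then ror
      x := rotate_right(x xor unsigned(F_fct), 1);
    end if;
    output <= std_logic_vector(x);
  end process;
end behavior;
```

With this model, the left instance of opselect would be driven with rotate_before = not encrypt and the right instance with rotate_before = encrypt, matching the port maps in the listing.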
19. PERM_Q0
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the q0_permutation.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
--Perm_q0: Block used for q0-type of permutation
entity Perm_q0 is
port( input : in std_logic_vector(7 downto 0);
output : out std_logic_vector(7 downto 0));
end Perm_q0;
architecture struct of Perm_q0 is
component xor_2x4
port(a : in std_logic_vector(3 downto 0);
b : in std_logic_vector(3 downto 0);
result : out std_logic_vector(3 downto 0));
end component;
component xor_3x4
port(dataa : in std_logic_vector(3 downto 0);
datab : in std_logic_vector(3 downto 0);
datac : in std_logic_vector(3 downto 0);
result : out std_logic_vector(3 downto 0));
end component;
signal
a_xor_b, --a XOR b
rotated_b, --b rotated right one time
a_zero, --(a(0),0,0,0) vector
befor_t1, --signal before lookup table t1
a_prime, --a prime (output of lookup table t0)
b_prime, --b prime (output of lookup table t1)
ap_xor_bp, --a prime XOR b prime
rotated_bp, --b prime rotated right one time
ap_zero, --(a'(0),0,0,0) vector
befor_t3 : std_logic_vector(3 downto 0); --input signal to lookup table t3
signal
permuted_input : std_logic_vector(7 downto 0); --output of the q0 block
--Start of q0 permutation structure
begin
-- rotate b right once
-- b(3 downto 0) is input(3 downto 0)
rotated_b(3) <= input(0);
rotated_b(2) <= input(3);
rotated_b(1) <= input(2);
rotated_b(0) <= input(1);
--build a vector with (a(0),0,0,0)
-- a(3 downto 0) is input(7 downto 4)
a_zero(3) <= input(4);
a_zero(2) <= '0';
a_zero(1) <= '0';
a_zero(0) <= '0';
--rotate b_prime right once
rotated_bp(3) <= b_prime(0);
rotated_bp(2) <= b_prime(3);
rotated_bp(1) <= b_prime(2);
rotated_bp(0) <= b_prime(1);
--build a vector with (ap(0),0,0,0)
ap_zero(3) <= a_prime(0);
ap_zero(2) <= '0';
ap_zero(1) <= '0';
ap_zero(0) <= '0';
-- Compute a xor b
--xor a and b
xor1 : xor_2x4
port map( a => input(7 downto 4),
b => input(3 downto 0),
result => a_xor_b);
-- Compute a xor (b >>> 1) xor (a(0),0,0,0)
--xor a, rotated b and a_zero vector
xor2 : xor_3x4
port map( dataa => input(7 downto 4),
datab => rotated_b,
datac => a_zero,
result => befor_t1);
-- Compute a' xor b'
--xor a_prime and b_prime
xor3 : xor_2x4
port map( a => a_prime,
b => b_prime,
result => ap_xor_bp);
-- Compute a' xor (b' >>> 1) xor (a'(0),0,0,0)
--xor rotated b_prime, a_prime and ap_zero vector
xor4 : xor_3x4
port map( dataa => a_prime,
datab => rotated_bp,
datac => ap_zero,
result => befor_t3);
-- lookup table t0
q0_t0 : process (a_xor_b, a_prime)
begin
if a_xor_b = "0000" then a_prime <= "1000"; -- 8
elsif a_xor_b = "0001" then a_prime <= "0001"; -- 1
elsif a_xor_b = "0010" then a_prime <= "0111"; -- 7
elsif a_xor_b = "0011" then a_prime <= "1101"; -- D
elsif a_xor_b = "0100" then a_prime <= "0110"; -- 6
elsif a_xor_b = "0101" then a_prime <= "1111"; -- F
elsif a_xor_b = "0110" then a_prime <= "0011"; -- 3
elsif a_xor_b = "0111" then a_prime <= "0010"; -- 2
elsif a_xor_b = "1000" then a_prime <= "0000"; -- 0
elsif a_xor_b = "1001" then a_prime <= "1011"; -- B
elsif a_xor_b = "1010" then a_prime <= "0101"; -- 5
elsif a_xor_b = "1011" then a_prime <= "1001"; -- 9
elsif a_xor_b = "1100" then a_prime <= "1110"; -- E
elsif a_xor_b = "1101" then a_prime <= "1100"; -- C
elsif a_xor_b = "1110" then a_prime <= "1010"; -- A
else a_prime <= "0100"; -- 4
end if;
end process;
-- lookup table t1
q0_t1 : process (befor_t1, b_prime)
begin
if befor_t1 = "0000" then b_prime <= "1110"; -- E
elsif befor_t1 = "0001" then b_prime <= "1100"; -- C
elsif befor_t1 = "0010" then b_prime <= "1011"; -- B
elsif befor_t1 = "0011" then b_prime <= "1000"; -- 8
elsif befor_t1 = "0100" then b_prime <= "0001"; -- 1
elsif befor_t1 = "0101" then b_prime <= "0010"; -- 2
elsif befor_t1 = "0110" then b_prime <= "0011"; -- 3
elsif befor_t1 = "0111" then b_prime <= "0101"; -- 5
elsif befor_t1 = "1000" then b_prime <= "1111"; -- F
elsif befor_t1 = "1001" then b_prime <= "0100"; -- 4
elsif befor_t1 = "1010" then b_prime <= "1010"; -- A
elsif befor_t1 = "1011" then b_prime <= "0110"; -- 6
elsif befor_t1 = "1100" then b_prime <= "0111"; -- 7
elsif befor_t1 = "1101" then b_prime <= "0000"; -- 0
elsif befor_t1 = "1110" then b_prime <= "1001"; -- 9
else b_prime <= "1101"; -- D
end if;
end process;
-- lookup table t2
q0_t2 : process (ap_xor_bp, permuted_input)
begin
if ap_xor_bp = "0000" then permuted_input(7 downto 4) <= "1011"; -- B
elsif ap_xor_bp = "0001" then permuted_input(7 downto 4) <= "1010"; -- A
elsif ap_xor_bp = "0010" then permuted_input(7 downto 4) <= "0101"; -- 5
elsif ap_xor_bp = "0011" then permuted_input(7 downto 4) <= "1110"; -- E
elsif ap_xor_bp = "0100" then permuted_input(7 downto 4) <= "0110"; -- 6
elsif ap_xor_bp = "0101" then permuted_input(7 downto 4) <= "1101"; -- D
elsif ap_xor_bp = "0110" then permuted_input(7 downto 4) <= "1001"; -- 9
elsif ap_xor_bp = "0111" then permuted_input(7 downto 4) <= "0000"; -- 0
elsif ap_xor_bp = "1000" then permuted_input(7 downto 4) <= "1100"; -- C
elsif ap_xor_bp = "1001" then permuted_input(7 downto 4) <= "1000"; -- 8
elsif ap_xor_bp = "1010" then permuted_input(7 downto 4) <= "1111"; -- F
elsif ap_xor_bp = "1011" then permuted_input(7 downto 4) <= "0011"; -- 3
elsif ap_xor_bp = "1100" then permuted_input(7 downto 4) <= "0010"; -- 2
elsif ap_xor_bp = "1101" then permuted_input(7 downto 4) <= "0100"; -- 4
elsif ap_xor_bp = "1110" then permuted_input(7 downto 4) <= "0111"; -- 7
else permuted_input(7 downto 4) <= "0001"; -- 1
end if;
end process q0_t2;
--lookup table t3
q0_t3 : process (befor_t3, permuted_input)
begin
if befor_t3 = "0000" then permuted_input(3 downto 0) <= "1101"; -- D
elsif befor_t3 = "0001" then permuted_input(3 downto 0) <= "0111"; -- 7
elsif befor_t3 = "0010" then permuted_input(3 downto 0) <= "1111"; -- F
elsif befor_t3 = "0011" then permuted_input(3 downto 0) <= "0100"; -- 4
elsif befor_t3 = "0100" then permuted_input(3 downto 0) <= "0001"; -- 1
elsif befor_t3 = "0101" then permuted_input(3 downto 0) <= "0010"; -- 2
elsif befor_t3 = "0110" then permuted_input(3 downto 0) <= "0110"; -- 6
elsif befor_t3 = "0111" then permuted_input(3 downto 0) <= "1110"; -- E
elsif befor_t3 = "1000" then permuted_input(3 downto 0) <= "1001"; -- 9
elsif befor_t3 = "1001" then permuted_input(3 downto 0) <= "1011"; -- B
elsif befor_t3 = "1010" then permuted_input(3 downto 0) <= "0011"; -- 3
elsif befor_t3 = "1011" then permuted_input(3 downto 0) <= "0000"; -- 0
elsif befor_t3 = "1100" then permuted_input(3 downto 0) <= "1000"; -- 8
elsif befor_t3 = "1101" then permuted_input(3 downto 0) <= "0101"; -- 5
elsif befor_t3 = "1110" then permuted_input(3 downto 0) <= "1100"; -- C
else permuted_input(3 downto 0) <= "1010"; -- A
end if;
end process q0_t3;
--assigning value to the output from signal
-- output = b&a (see page 10 of twofish)
output <= permuted_input(3 downto 0) & permuted_input(7 downto 4);
end struct; 
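The structure of the q0 permutation (two rounds of xor, rotate and 4-bit table lookup) can also be written compactly as a function, which is convenient for checking the structural description in simulation. The following is a minimal sketch under stated assumptions, not the author's implementation: the testbench name q0_sketch_tb, the helper names ror4 and low_bit_to_top, and the table constants T0 to T3 are hypothetical names, and the table contents are the four q0 lookup tables of the listing. The two expected values are the first two entries of the q0 table, worked out by hand from those lookup tables.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity q0_sketch_tb is
end q0_sketch_tb;

architecture sim of q0_sketch_tb is
  type nibble_table is array (0 to 15) of unsigned(3 downto 0);
  -- q0 lookup tables t0..t3, same contents as the if/elsif chains
  constant T0 : nibble_table := (x"8", x"1", x"7", x"D", x"6", x"F", x"3", x"2",
                                 x"0", x"B", x"5", x"9", x"E", x"C", x"A", x"4");
  constant T1 : nibble_table := (x"E", x"C", x"B", x"8", x"1", x"2", x"3", x"5",
                                 x"F", x"4", x"A", x"6", x"7", x"0", x"9", x"D");
  constant T2 : nibble_table := (x"B", x"A", x"5", x"E", x"6", x"D", x"9", x"0",
                                 x"C", x"8", x"F", x"3", x"2", x"4", x"7", x"1");
  constant T3 : nibble_table := (x"D", x"7", x"F", x"4", x"1", x"2", x"6", x"E",
                                 x"9", x"B", x"3", x"0", x"8", x"5", x"C", x"A");

  -- 4-bit rotate right by one
  function ror4(x : unsigned(3 downto 0)) return unsigned is
  begin
    return rotate_right(x, 1);
  end function;

  -- the (x(0),0,0,0) vector of the listing
  function low_bit_to_top(x : unsigned(3 downto 0)) return unsigned is
    variable r : unsigned(3 downto 0) := "0000";
  begin
    r(3) := x(0);
    return r;
  end function;

  function q0(x : unsigned(7 downto 0)) return unsigned is
    variable a, b, t, a_p, b_p : unsigned(3 downto 0);
  begin
    a   := x(7 downto 4);
    b   := x(3 downto 0);
    t   := a xor ror4(b) xor low_bit_to_top(a);  -- befor_t1
    a   := T0(to_integer(a xor b));              -- a_prime
    b   := T1(to_integer(t));                    -- b_prime
    t   := a xor ror4(b) xor low_bit_to_top(a);  -- befor_t3
    a_p := T2(to_integer(a xor b));
    b_p := T3(to_integer(t));
    return b_p & a_p;                            -- output = b & a
  end function;
begin
  check : process
  begin
    assert q0(x"00") = x"A9" report "q0(00) mismatch" severity failure;
    assert q0(x"01") = x"67" report "q0(01) mismatch" severity failure;
    wait;
  end process;
end sim;
```

Swapping in the four q1 tables would give the corresponding model for Perm_q1, since both permutations share the same structure.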
20. PERM_Q1
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the q1_permutation.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
--Perm_q1: Block used for q1-type of permutation
entity Perm_q1 is
port( input : in std_logic_vector(7 downto 0);
output : out std_logic_vector(7 downto 0));
end Perm_q1;
architecture struct of Perm_q1 is
component xor_2x4
port(a : in std_logic_vector(3 downto 0);
b : in std_logic_vector(3 downto 0);
result : out std_logic_vector(3 downto 0));
end component;
component xor_3x4
port(dataa : in std_logic_vector(3 downto 0);
datab : in std_logic_vector(3 downto 0);
datac : in std_logic_vector(3 downto 0);
result : out std_logic_vector(3 downto 0));
end component;
signal
a_xor_b, --a XOR b
rotated_b, --b rotated right one time
a_zero, --(a(0),0,0,0) vector
befor_t1, --signal before lookup table t1
a_prime, --a prime (output of lookup table t0)
b_prime, --b prime (output of lookup table t1)
ap_xor_bp, --a prime XOR b prime
rotated_bp, --b prime rotated right one time
ap_zero, --(a'(0),0,0,0) vector
befor_t3 : std_logic_vector(3 downto 0); --input signal to lookup table t3
signal
permuted_input : std_logic_vector(7 downto 0); --output of the q1 block
--Start of q1 permutation structure
begin
-- input = a&b
-- rotate b right once
-- b(3 downto 0) is input(3 downto 0)
rotated_b(3) <= input(0);
rotated_b(2) <= input(3);
rotated_b(1) <= input(2);
rotated_b(0) <= input(1);
--build a vector with (a(0),0,0,0)
-- a(3 downto 0) is input(7 downto 4)
a_zero(3) <= input(4);
a_zero(2) <= '0';
a_zero(1) <= '0';
a_zero(0) <= '0';
--rotate b_prime right once
rotated_bp(3) <= b_prime(0);
rotated_bp(2) <= b_prime(3);
rotated_bp(1) <= b_prime(2);
rotated_bp(0) <= b_prime(1);
--build a vector with (ap(0),0,0,0)
ap_zero(3) <= a_prime(0);
ap_zero(2) <= '0';
ap_zero(1) <= '0';
ap_zero(0) <= '0';
-- Compute a xor b
--xor a and b
xor1 : xor_2x4
port map( a => input(7 downto 4),
b => input(3 downto 0),
result => a_xor_b);
-- Compute a xor (b >>> 1) xor (a(0),0,0,0)
--xor a, rotated b and a_zero vector
xor2 : xor_3x4
port map( dataa => input(7 downto 4),
datab => rotated_b,
datac => a_zero,
result => befor_t1);
--xor a_prime and b_prime
xor3 : xor_2x4
port map( a => a_prime,
b => b_prime,
result => ap_xor_bp);
-- Compute a' xor (b' >>> 1) xor (a'(0),0,0,0)
--xor rotated b_prime, a_prime and ap_zero vector
xor4 : xor_3x4
port map( dataa => a_prime,
datab => rotated_bp,
datac => ap_zero,
result => befor_t3);
-- lookup table t0
q1_t0 : process (a_xor_b, a_prime)
begin
if a_xor_b = "0000" then a_prime <= "0010"; -- 2
elsif a_xor_b = "0001" then a_prime <= "1000"; -- 8
elsif a_xor_b = "0010" then a_prime <= "1011"; -- B
elsif a_xor_b = "0011" then a_prime <= "1101"; -- D
elsif a_xor_b = "0100" then a_prime <= "1111"; -- F
elsif a_xor_b = "0101" then a_prime <= "0111"; -- 7
elsif a_xor_b = "0110" then a_prime <= "0110"; -- 6
elsif a_xor_b = "0111" then a_prime <= "1110"; -- E
elsif a_xor_b = "1000" then a_prime <= "0011"; -- 3
elsif a_xor_b = "1001" then a_prime <= "0001"; -- 1
elsif a_xor_b = "1010" then a_prime <= "1001"; -- 9
elsif a_xor_b = "1011" then a_prime <= "0100"; -- 4
elsif a_xor_b = "1100" then a_prime <= "0000"; -- 0
elsif a_xor_b = "1101" then a_prime <= "1010"; -- A
elsif a_xor_b = "1110" then a_prime <= "1100"; -- C
else a_prime <= "0101"; -- 5
end if;
end process;
--lookup table t1
q1_t1 : process (befor_t1, b_prime)
begin
if befor_t1 = "0000" then b_prime <= "0001"; -- 1
elsif befor_t1 = "0001" then b_prime <= "1110"; -- E
elsif befor_t1 = "0010" then b_prime <= "0010"; -- 2
elsif befor_t1 = "0011" then b_prime <= "1011"; -- B
elsif befor_t1 = "0100" then b_prime <= "0100"; -- 4
elsif befor_t1 = "0101" then b_prime <= "1100"; -- C
elsif befor_t1 = "0110" then b_prime <= "0011"; -- 3
elsif befor_t1 = "0111" then b_prime <= "0111"; -- 7
elsif befor_t1 = "1000" then b_prime <= "0110"; -- 6
elsif befor_t1 = "1001" then b_prime <= "1101"; -- D
elsif befor_t1 = "1010" then b_prime <= "1010"; -- A
elsif befor_t1 = "1011" then b_prime <= "0101"; -- 5
elsif befor_t1 = "1100" then b_prime <= "1111"; -- F
elsif befor_t1 = "1101" then b_prime <= "1001"; -- 9
elsif befor_t1 = "1110" then b_prime <= "0000"; -- 0
else b_prime <= "1000"; -- 8
end if;
end process;
--lookup table t2
q1_t2 : process (ap_xor_bp, permuted_input)
begin
if ap_xor_bp = "0000" then permuted_input(7 downto 4) <= "0100"; -- 4
elsif ap_xor_bp = "0001" then permuted_input(7 downto 4) <= "1100"; -- C
elsif ap_xor_bp = "0010" then permuted_input(7 downto 4) <= "0111"; -- 7
elsif ap_xor_bp = "0011" then permuted_input(7 downto 4) <= "0101"; -- 5
elsif ap_xor_bp = "0100" then permuted_input(7 downto 4) <= "0001"; -- 1
elsif ap_xor_bp = "0101" then permuted_input(7 downto 4) <= "0110"; -- 6
elsif ap_xor_bp = "0110" then permuted_input(7 downto 4) <= "1001"; -- 9
elsif ap_xor_bp = "0111" then permuted_input(7 downto 4) <= "1010"; -- A
elsif ap_xor_bp = "1000" then permuted_input(7 downto 4) <= "0000"; -- 0
elsif ap_xor_bp = "1001" then permuted_input(7 downto 4) <= "1110"; -- E
elsif ap_xor_bp = "1010" then permuted_input(7 downto 4) <= "1101"; -- D
elsif ap_xor_bp = "1011" then permuted_input(7 downto 4) <= "1000"; -- 8
elsif ap_xor_bp = "1100" then permuted_input(7 downto 4) <= "0010"; -- 2
elsif ap_xor_bp = "1101" then permuted_input(7 downto 4) <= "1011"; -- B
elsif ap_xor_bp = "1110" then permuted_input(7 downto 4) <= "0011"; -- 3
else permuted_input(7 downto 4) <= "1111"; -- F
end if;
end process;
--lookup table t3
q1_t3 : process (befor_t3, permuted_input)
begin
if befor_t3 = "0000" then permuted_input(3 downto 0) <= "1011"; -- B
elsif befor_t3 = "0001" then permuted_input(3 downto 0) <= "1001"; -- 9
elsif befor_t3 = "0010" then permuted_input(3 downto 0) <= "0101"; -- 5
elsif befor_t3 = "0011" then permuted_input(3 downto 0) <= "0001"; -- 1
elsif befor_t3 = "0100" then permuted_input(3 downto 0) <= "1100"; -- C
elsif befor_t3 = "0101" then permuted_input(3 downto 0) <= "0011"; -- 3
elsif befor_t3 = "0110" then permuted_input(3 downto 0) <= "1101"; -- D
elsif befor_t3 = "0111" then permuted_input(3 downto 0) <= "1110"; -- E
elsif befor_t3 = "1000" then permuted_input(3 downto 0) <= "0110"; -- 6
elsif befor_t3 = "1001" then permuted_input(3 downto 0) <= "0100"; -- 4
elsif befor_t3 = "1010" then permuted_input(3 downto 0) <= "0111"; -- 7
elsif befor_t3 = "1011" then permuted_input(3 downto 0) <= "1111"; -- F
elsif befor_t3 = "1100" then permuted_input(3 downto 0) <= "0010"; -- 2
elsif befor_t3 = "1101" then permuted_input(3 downto 0) <= "0000"; -- 0
elsif befor_t3 = "1110" then permuted_input(3 downto 0) <= "1000"; -- 8
else permuted_input(3 downto 0) <= "1010"; -- A
end if;
end process;
--assigning value to the output
-- output = b&a (see page 10 of twofish)
output <= permuted_input(3 downto 0) & permuted_input(7 downto 4); -- MAKE SURE THIS WORKS WELL!
end struct; 
21. PHT
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the PHT.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
-- Pseudo-Hadamard Transform:
--   A_out = A_in + B_in
--   B_out = A_in + 2B_in
-- All inputs and outputs are 32 bits wide. Additions
-- are done modulo 2^32.
entity pht is
port (A_in : in std_logic_vector(31 downto 0);
B_in : in std_logic_vector(31 downto 0);
A_out : out std_logic_vector(31 downto 0);
B_out : out std_logic_vector(31 downto 0));
end pht;
architecture struct of pht is
component adder32
port( dataa, datab : in std_logic_vector(31 downto 0);
cout : out std_logic;
cin : in std_logic := '0';
result : out std_logic_vector(31 downto 0));
end component;
signal B_in_x2 : std_logic_vector(31 downto 0);
begin
-- top adder: A_out = A_in + B_in
top_adder : adder32
port map (dataa => A_in,
datab => B_in,
result => A_out);
-- bottom adder: B_out = A_in + 2B_in
bottom_adder : adder32
port map (dataa => A_in,
datab => B_in_x2,
result => B_out);
-- Compute 2B_in by shifting B_in left by one.
two_times_2B_in : process (B_in, B_in_x2)
begin
for i in 31 downto 1 loop
B_in_x2(i) <= B_in(i-1);
end loop;
B_in_x2(0) <= '0';
end process two_times_2B_in;
end struct;
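For comparison, the same transform can be sketched without the adder32 component by using the arithmetic of ieee.numeric_std. This is an illustrative alternative under stated assumptions and not the structure used in the project: the entity name pht_sketch is hypothetical, and the sketch relies on numeric_std addition keeping the operand width, which gives the modulo 2^32 behavior described in the comments.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical behavioral equivalent of the PHT block.
entity pht_sketch is
  port (A_in, B_in   : in  std_logic_vector(31 downto 0);
        A_out, B_out : out std_logic_vector(31 downto 0));
end pht_sketch;

architecture behavior of pht_sketch is
begin
  -- unsigned "+" returns a result as wide as its operands, so the carry
  -- out of bit 31 is dropped and the sums are taken modulo 2^32
  A_out <= std_logic_vector(unsigned(A_in) + unsigned(B_in));
  -- shift_left by one computes 2*B_in modulo 2^32
  B_out <= std_logic_vector(unsigned(A_in) + shift_left(unsigned(B_in), 1));
end behavior;
```

As a worked case, A_in = 00000001 and B_in = 80000000 (hexadecimal) give A_out = 80000001 and B_out = 00000001, because 2B_in wraps to zero modulo 2^32.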
22. REG01
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the 1bit register.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity reg01 is
port (clock, clr, data, enable : in std_logic;
q : out std_logic);
end reg01;
architecture behavior of reg01 is
begin
process (clock, clr)
begin
if (clr = '1') then
q <= '0';
elsif (clock'event and clock = '1') then
if (enable = '1') then
q <= data;
end if;
end if;
end process;
end behavior;
23. REG16
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the 16bit register.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity reg16 is
port(data : in std_logic_vector (15 downto 0);
clock,clr,enable : in std_logic;
q : out std_logic_vector (15 downto 0));
end reg16;
architecture behavior of reg16 is
component reg01
port(clock,clr,enable,data : in std_logic;
q : out std_logic);
end component;
begin
aa: for i in 0 to 15 generate
reg_bit : reg01
port map (clock => clock, clr => clr, enable => enable, data => data(i), q => q(i));
end generate aa;
end behavior;
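24. REG32
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the 32bit register.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity reg32 is
port(data : in std_logic_vector (31 downto 0);
clock,clr,enable : in std_logic;
q : out std_logic_vector (31 downto 0));
end reg32;
architecture behavior of reg32 is
component reg01
port(clock,clr,enable,data : in std_logic;
q : out std_logic);
end component;
begin
aa: for i in 0 to 31 generate
reg_bit : reg01
port map (clock => clock, clr => clr, enable => enable, data => data(i), q => q(i));
end generate aa;
end behavior;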
25. REG128
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the 128bit register.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity reg128 is
port(clock,clr,enable : in std_logic;
data : in std_logic_vector (127 downto 0);
q : out std_logic_vector (127 downto 0));
end reg128;
architecture behavior of reg128 is
component reg01
port(clock,clr,enable,data : in std_logic;
q : out std_logic);
end component;
begin
aa: for i in 0 to 127 generate
reg_bit : reg01
port map (clock => clock, clr => clr, enable => enable, data => data(i), q => q(i));
end generate aa;
end behavior;
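26. ROL1_32
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component rotates to the left by 1 bit (32bit).
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
entity rol1_32 is
port (data : in std_logic_vector(31 downto 0);
q : out std_logic_vector(31 downto 0));
end rol1_32;
architecture behavior of rol1_32 is
begin
process (data)
begin
q <= data(30 downto 0) & data(31); --rotate left 1 bit
end process;
end behavior;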
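27. ROL8_32
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component rotates to the left by 8 bits (32bit).
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
entity rol8_32 is
port (data : in std_logic_vector(31 downto 0);
q : out std_logic_vector(31 downto 0));
end rol8_32;
architecture behavior of rol8_32 is
begin
process (data)
begin
q <= data(23 downto 0) & data(31 downto 24); --rotate left 8 bits
end process;
end behavior;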
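28. ROL9_32
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component rotates to the left by 9 bits (32bit).
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
entity rol9_32 is
port (data : in std_logic_vector(31 downto 0);
q : out std_logic_vector(31 downto 0));
end rol9_32;
architecture behavior of rol9_32 is
begin
process (data)
begin
q <= data(22 downto 0) & data(31 downto 23); --rotate left 9 bits
end process;
end behavior;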
29. ROR1_32
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component rotates to the right by 1 bit (32bit).
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
entity ror1_32 is
port (data : in std_logic_vector(31 downto 0);
q : out std_logic_vector(31 downto 0));
end ror1_32;
architecture behavior of ror1_32 is
begin
process (data)
begin
q <= data(0) & data(31 downto 1); --rotate right 1 bit
end process;
end behavior; 
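The fixed rotations in this appendix are all instances of one slicing pattern, which can be sketched once with a generic rotation amount. The entity below is a hypothetical illustration (the name rol_n_32_sketch and the generic N are assumptions, not part of the project); setting N to 1, 8 or 9 would correspond to rol1_32, rol8_32 and rol9_32.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical generic rotate-left block for 32-bit words.
entity rol_n_32_sketch is
  generic (N : natural range 1 to 31 := 8);
  port (data : in  std_logic_vector(31 downto 0);
        q    : out std_logic_vector(31 downto 0));
end rol_n_32_sketch;

architecture behavior of rol_n_32_sketch is
begin
  -- the lower 32-N bits move up by N places and
  -- the top N bits wrap around to the bottom
  q <= data(31 - N downto 0) & data(31 downto 32 - N);
end behavior;
```

A rotation is pure wiring in hardware, so a generic version of this kind costs no logic; separate fixed entities, as used in the project, are simply more explicit to instantiate.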
30. RSMATRIX
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the RS Matrix.
-- Version   : 1.0 - Beta
-- | s0 |   | 01 A4 55 87 5A 58 DB 9E |   | m0 |
-- | s1 | = | A4 56 82 F3 1E C6 68 E5 | . | m1 |
-- | s2 |   | 02 A1 FC C1 47 AE 3D 19 |   | m2 |
-- | s3 |   | A4 55 87 5A 58 DB 9E 03 |   | m3 |
--                                        | m4 |
--                                        | m5 |
--                                        | m6 |
--                                        | m7 |
-- In Galois field GF(2^8) with primitive polynomial
-- x^8 + x^6 + x^3 + x^2 + 1.
library ieee;
use ieee.std_logic_1164.all;
entity rsmatrix is
port(m0 : in std_logic_vector(7 downto 0);
m1 : in std_logic_vector(7 downto 0);
m2 : in std_logic_vector(7 downto 0);
m3 : in std_logic_vector(7 downto 0);
m4 : in std_logic_vector(7 downto 0);
m5 : in std_logic_vector(7 downto 0);
m6 : in std_logic_vector(7 downto 0);
m7 : in std_logic_vector(7 downto 0);
s0 : out std_logic_vector(7 downto 0);
s1 : out std_logic_vector(7 downto 0);
s2 : out std_logic_vector(7 downto 0);
s3 : out std_logic_vector(7 downto 0));
end rsmatrix;
architecture struct of rsmatrix is
component multgf_A4
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_55
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_87
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_5A
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_58
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_DB
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_9E
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_56
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_82
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_F3
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_1E
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_C6
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_68
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_E5
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_02
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_A1
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_FC
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_C1
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_47
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_AE
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_3D
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_19
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component multgf_03
port(inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end component;
component xor_8x8
port(A : in std_logic_vector(7 downto 0);
B : in std_logic_vector(7 downto 0);
C : in std_logic_vector(7 downto 0);
D : in std_logic_vector(7 downto 0);
E : in std_logic_vector(7 downto 0);
F : in std_logic_vector(7 downto 0);
G : in std_logic_vector(7 downto 0);
H : in std_logic_vector(7 downto 0);
result : out std_logic_vector(7 downto 0));
end component;
signal m0x01, m1xA4, m2x55, m3x87, m4x5A, m5x58, m6xDB, m7x9E,
m0xA4, m1x56, m2x82, m3xF3, m4x1E, m5xC6, m6x68, m7xE5,
m0x02, m1xA1, m2xFC, m3xC1, m4x47, m5xAE, m6x3D, m7x19,
m1x55, m2x87, m3x5A, m4x58, m5xDB, m6x9E, m7x03
: std_logic_vector(7 downto 0);
begin
-- compute first row
m0x01 <= m0;
m1xA4_comp: multgf_A4 port map (inbyte => m1, outbyte => m1xA4);
m2x55_comp: multgf_55 port map (inbyte => m2, outbyte => m2x55);
m3x87_comp: multgf_87 port map (inbyte => m3, outbyte => m3x87);
m4x5A_comp: multgf_5A port map (inbyte => m4, outbyte => m4x5A);
m5x58_comp: multgf_58 port map (inbyte => m5, outbyte => m5x58);
m6xDB_comp: multgf_DB port map (inbyte => m6, outbyte => m6xDB);
m7x9E_comp: multgf_9E port map (inbyte => m7, outbyte => m7x9E);
-- compute second row
m0xA4_comp: multgf_A4 port map (inbyte => m0, outbyte => m0xA4);
m1x56_comp: multgf_56 port map (inbyte => m1, outbyte => m1x56);
m2x82_comp: multgf_82 port map (inbyte => m2, outbyte => m2x82);
m3xF3_comp: multgf_F3 port map (inbyte => m3, outbyte => m3xF3);
m4x1E_comp: multgf_1E port map (inbyte => m4, outbyte => m4x1E);
m5xC6_comp: multgf_C6 port map (inbyte => m5, outbyte => m5xC6);
m6x68_comp: multgf_68 port map (inbyte => m6, outbyte => m6x68);
m7xE5_comp: multgf_E5 port map (inbyte => m7, outbyte => m7xE5);
-- compute third row
m0x02_comp: multgf_02 port map (inbyte => m0, outbyte => m0x02);
m1xA1_comp: multgf_A1 port map (inbyte => m1, outbyte => m1xA1);
m2xFC_comp: multgf_FC port map (inbyte => m2, outbyte => m2xFC);
m3xC1_comp: multgf_C1 port map (inbyte => m3, outbyte => m3xC1);
m4x47_comp: multgf_47 port map (inbyte => m4, outbyte => m4x47);
m5xAE_comp: multgf_AE port map (inbyte => m5, outbyte => m5xAE);
m6x3D_comp: multgf_3D port map (inbyte => m6, outbyte => m6x3D);
m7x19_comp: multgf_19 port map (inbyte => m7, outbyte => m7x19);
-- compute fourth row
-- m0xA4 is already done
m1x55_comp: multgf_55 port map (inbyte => m1, outbyte => m1x55);
m2x87_comp: multgf_87 port map (inbyte => m2, outbyte => m2x87);
m3x5A_comp: multgf_5A port map (inbyte => m3, outbyte => m3x5A);
m4x58_comp: multgf_58 port map (inbyte => m4, outbyte => m4x58);
m5xDB_comp: multgf_DB port map (inbyte => m5, outbyte => m5xDB);
m6x9E_comp: multgf_9E port map (inbyte => m6, outbyte => m6x9E);
m7x03_comp: multgf_03 port map (inbyte => m7, outbyte => m7x03);
-- add elements of row 1
row1_xor: xor_8x8
port map ( A => m0x01,
B => m1xA4,
C => m2x55,
D => m3x87,
E => m4x5A,
F => m5x58,
G => m6xDB,
H => m7x9E,
result => s0 );
-- add elements of row 2
row2_xor: xor_8x8
port map ( A => m0xA4,
B => m1x56,
C => m2x82,
D => m3xF3,
E => m4x1E,
F => m5xC6,
G => m6x68,
H => m7xE5,
result => s1 );
-- add elements of row 3
row3_xor: xor_8x8
port map ( A => m0x02,
B => m1xA1,
C => m2xFC,
D => m3xC1,
E => m4x47,
F => m5xAE,
G => m6x3D,
H => m7x19,
result => s2 );
-- add elements of row 4
row4_xor: xor_8x8
port map ( A => m0xA4,
B => m1x55,
C => m2x87,
D => m3x5A,
E => m4x58,
F => m5xDB,
G => m6x9E,
H => m7x03,
result => s3 );
end struct;
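Each multgf_XX component used by the RS matrix is a fixed XOR network for multiplication by one constant in GF(2^8). A general shift-and-add multiplier is a convenient reference for checking those networks in simulation. The following is a minimal sketch under stated assumptions: the names gf_mult_sketch_tb, xtime, gf_mult and POLY_LOW are hypothetical, and the reduction constant 4D (hexadecimal) is the low byte of the polynomial x^8 + x^6 + x^3 + x^2 + 1 given in the RS matrix header. The expected values in the assertions were worked out by hand for the constant A4.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity gf_mult_sketch_tb is
end gf_mult_sketch_tb;

architecture sim of gf_mult_sketch_tb is
  subtype byte is std_logic_vector(7 downto 0);
  -- low eight bits of x^8 + x^6 + x^3 + x^2 + 1
  constant POLY_LOW : byte := x"4D";

  -- multiply by x and reduce modulo the field polynomial
  function xtime(v : byte) return byte is
    variable r : byte;
  begin
    r := v(6 downto 0) & '0';
    if v(7) = '1' then
      r := r xor POLY_LOW;
    end if;
    return r;
  end function;

  -- shift-and-add product: xor together c*x^k for every set bit k of inbyte
  function gf_mult(c, inbyte : byte) return byte is
    variable acc : byte := (others => '0');
    variable p   : byte := c;
  begin
    for k in 0 to 7 loop
      if inbyte(k) = '1' then
        acc := acc xor p;
      end if;
      p := xtime(p);
    end loop;
    return acc;
  end function;
begin
  check : process
  begin
    assert gf_mult(x"A4", x"01") = x"A4" report "A4*01 mismatch" severity failure;
    assert gf_mult(x"A4", x"02") = x"05" report "A4*02 mismatch" severity failure;
    assert gf_mult(x"A4", x"04") = x"0A" report "A4*04 mismatch" severity failure;
    assert gf_mult(x"A4", x"80") = x"0D" report "A4*80 mismatch" severity failure;
    wait;
  end process;
end sim;
```

The link to the fixed networks is direct: output bit j of a multgf_XX block is the XOR of every inbyte(k) for which bit j of c*x^k is set. For A4, only A4*x^2 = 0A has bit 1 set, which is why outbyte(1) of multgf_A4 reduces to inbyte(2).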
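-- The following multiplications are carried out
-- in the Galois field GF(2^8) with primitive polynomial
-- x^8 + x^6 + x^3 + x^2 + 1.
-- The number is in hexadecimal. inbyte is the input and outbyte is the output.
library ieee;
use ieee.std_logic_1164.all;
entity multgf_A4 is
port( inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end multgf_A4;
architecture struct of multgf_A4 is
begin
outbyte(7) <= inbyte(6) XOR inbyte(0);
outbyte(6) <= inbyte(5);
outbyte(5) <= inbyte(6) XOR inbyte(4) XOR inbyte(0);
outbyte(4) <= inbyte(5) XOR inbyte(3);
outbyte(3) <= inbyte(7) XOR inbyte(4) XOR inbyte(2);
outbyte(2) <= inbyte(7) XOR inbyte(3) XOR inbyte(1) XOR inbyte(0);
outbyte(1) <= inbyte(2);
outbyte(0) <= inbyte(7) XOR inbyte(1);
end struct;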




library ieee;
use ieee.std_logic_1164.all;
entity multgf_55 is
port( inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end multgf_55;
architecture struct of multgf_55 is
begin
outbyte(7) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(1);
outbyte(6) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(0);
outbyte(5) <= inbyte(7) XOR inbyte(4) XOR inbyte(3) XOR inbyte(1);
outbyte(4) <= inbyte(7) XOR inbyte(6) XOR inbyte(3) XOR inbyte(2) XOR inbyte(0);
outbyte(3) <= inbyte(6) XOR inbyte(5) XOR inbyte(2) XOR inbyte(1);
outbyte(2) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(0);
outbyte(1) <= inbyte(7) XOR inbyte(3) XOR inbyte(1);
outbyte(0) <= inbyte(7) XOR inbyte(6) XOR inbyte(2) XOR inbyte(0);
end struct;




library ieee;
use ieee.std_logic_1164.all;
entity multgf_87 is
port( inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end multgf_87;
architecture struct of multgf_87 is
begin
outbyte(7) <= inbyte(6) XOR inbyte(4) XOR inbyte(2) XOR inbyte(0);
outbyte(6) <= inbyte(7) XOR inbyte(5) XOR inbyte(3) XOR inbyte(1);
outbyte(5) <= inbyte(7);
outbyte(4) <= inbyte(6);
outbyte(3) <= inbyte(7) XOR inbyte(5);
outbyte(2) <= inbyte(2) XOR inbyte(0);
outbyte(1) <= inbyte(6) XOR inbyte(4) XOR inbyte(2) XOR inbyte(1) XOR inbyte(0);
outbyte(0) <= inbyte(7) XOR inbyte(5) XOR inbyte(3) XOR inbyte(1) XOR inbyte(0);
end struct;






library ieee;
use ieee.std_logic_1164.all;
entity multgf_5A is
port( inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end multgf_5A;
architecture struct of multgf_5A is
begin
outbyte(7) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(1);
outbyte(6) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(3) XOR inbyte(0);
outbyte(5) <= inbyte(5) XOR inbyte(2) XOR inbyte(1);
outbyte(4) <= inbyte(7) XOR inbyte(4) XOR inbyte(1) XOR inbyte(0);
outbyte(3) <= inbyte(7) XOR inbyte(6) XOR inbyte(3) XOR inbyte(0);
outbyte(2) <= inbyte(5) XOR inbyte(4) XOR inbyte(2) XOR inbyte(1);
outbyte(1) <= inbyte(6) XOR inbyte(3) XOR inbyte(0);
outbyte(0) <= inbyte(7) XOR inbyte(5) XOR inbyte(2);
end struct;





library ieee;
use ieee.std_logic_1164.all;
entity multgf_58 is
port( inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end multgf_58;
architecture struct of multgf_58 is
begin
outbyte(7) <= inbyte(7) XOR inbyte(4) XOR inbyte(1);
outbyte(6) <= inbyte(6) XOR inbyte(3) XOR inbyte(0);
outbyte(5) <= inbyte(5) XOR inbyte(4) XOR inbyte(2) XOR inbyte(1);
outbyte(4) <= inbyte(7) XOR inbyte(4) XOR inbyte(3) XOR inbyte(1) XOR inbyte(0);
outbyte(3) <= inbyte(6) XOR inbyte(3) XOR inbyte(2) XOR inbyte(0);
outbyte(2) <= inbyte(7) XOR inbyte(5) XOR inbyte(4) XOR inbyte(2);
outbyte(1) <= inbyte(6) XOR inbyte(3);
outbyte(0) <= inbyte(5) XOR inbyte(2);
end struct;




library ieee;
use ieee.std_logic_1164.all;
entity multgf_DB is
port( inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end multgf_DB;
architecture struct of multgf_DB is
begin
outbyte(7) <= inbyte(6) XOR inbyte(5) XOR inbyte(2) XOR inbyte(1) XOR inbyte(0);
outbyte(6) <= inbyte(7) XOR inbyte(5) XOR inbyte(4) XOR inbyte(1) XOR inbyte(0);
outbyte(5) <= inbyte(7) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR inbyte(2) XOR inbyte(1);
outbyte(4) <= inbyte(6) XOR inbyte(4) XOR inbyte(3) XOR inbyte(2) XOR inbyte(1) XOR inbyte(0);
outbyte(3) <= inbyte(5) XOR inbyte(3) XOR inbyte(2) XOR inbyte(1) XOR inbyte(0);
outbyte(2) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(4);
outbyte(1) <= inbyte(7) XOR inbyte(4) XOR inbyte(3) XOR inbyte(2) XOR inbyte(1) XOR inbyte(0);
outbyte(0) <= inbyte(7) XOR inbyte(6) XOR inbyte(3) XOR inbyte(2) XOR inbyte(1) XOR inbyte(0);
end struct;





library ieee;
use ieee.std_logic_1164.all;
entity multgf_9E is
port( inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end multgf_9E;
architecture struct of multgf_9E is
begin
outbyte(7) <= inbyte(5) XOR inbyte(3) XOR inbyte(2) XOR inbyte(0);
outbyte(6) <= inbyte(7) XOR inbyte(4) XOR inbyte(2) XOR inbyte(1);
outbyte(5) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(2) XOR inbyte(1);
outbyte(4) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(1) XOR inbyte(0);
outbyte(3) <= inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR inbyte(0);
outbyte(2) <= inbyte(5) XOR inbyte(4) XOR inbyte(0);
outbyte(1) <= inbyte(7) XOR inbyte(5) XOR inbyte(4) XOR inbyte(2) XOR inbyte(0);
outbyte(0) <= inbyte(6) XOR inbyte(4) XOR inbyte(3) XOR inbyte(1);
end struct;





library ieee;
use ieee.std_logic_1164.all;
entity multgf_56 is
port( inbyte : in std_logic_vector(7 downto 0);
outbyte : out std_logic_vector(7 downto 0));
end multgf_56;
architecture struct of multgf_56 is
begin
outbyte(7) <= inbyte(5) XOR inbyte(1);
outbyte(6) <= inbyte(4) XOR inbyte(0);
outbyte(5) <= inbyte(7) XOR inbyte(5) XOR inbyte(3) XOR inbyte(1);
outbyte(4) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(2) XOR inbyte(0);
outbyte(3) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(3) XOR inbyte(1);
outbyte(2) <= inbyte(6) XOR inbyte(4) XOR inbyte(2) XOR inbyte(1) XOR inbyte(0);
outbyte(1) <= inbyte(7) XOR inbyte(3) XOR inbyte(0);
outbyte(0) <= inbyte(6) XOR inbyte(2);
end struct;
library ieee;
use ieee.std_logic_1164.all;
entity multgf_82 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_82;
architecture struct of multgf_82 is
begin
  outbyte(7) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(2) XOR
                inbyte(0);
  outbyte(6) <= inbyte(7) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(1);
  outbyte(5) <= inbyte(7) XOR inbyte(5) XOR inbyte(3);
  outbyte(4) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(2);
  outbyte(3) <= inbyte(6) XOR inbyte(5) XOR inbyte(3) XOR inbyte(1);
  outbyte(2) <= inbyte(6);
  outbyte(1) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(2) XOR
                inbyte(0);
  outbyte(0) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(3) XOR
                inbyte(1);
end struct;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_F3 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_F3;
architecture struct of multgf_F3 is
begin
  outbyte(7) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(1) XOR
                inbyte(0);
  outbyte(6) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(0);
  outbyte(5) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(1) XOR inbyte(0);
  outbyte(4) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(3) XOR
                inbyte(2) XOR inbyte(0);
  outbyte(3) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(2) XOR
                inbyte(1);
  outbyte(2) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(3);
  outbyte(1) <= inbyte(7) XOR inbyte(3) XOR inbyte(2) XOR inbyte(1) XOR
                inbyte(0);
  outbyte(0) <= inbyte(7) XOR inbyte(6) XOR inbyte(2) XOR inbyte(1) XOR
                inbyte(0);
end struct;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_1E is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_1E;
architecture struct of multgf_1E is
begin
  outbyte(7) <= inbyte(4) XOR inbyte(3);
  outbyte(6) <= inbyte(7) XOR inbyte(3) XOR inbyte(2);
  outbyte(5) <= inbyte(6) XOR inbyte(4) XOR inbyte(3) XOR inbyte(2) XOR
                inbyte(1);
  outbyte(4) <= inbyte(7) XOR inbyte(5) XOR inbyte(3) XOR inbyte(2) XOR
                inbyte(1) XOR inbyte(0);
  outbyte(3) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(2) XOR
                inbyte(1) XOR inbyte(0);
  outbyte(2) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR
                inbyte(1) XOR inbyte(0);
  outbyte(1) <= inbyte(6) XOR inbyte(5) XOR inbyte(0);
  outbyte(0) <= inbyte(5) XOR inbyte(4);
end struct;





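library ieee;
use ieee.std_logic_1164.all;
entity multgf_C6 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_C6;
architecture struct of multgf_C6 is
begin
  outbyte(7) <= inbyte(7) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(2) XOR inbyte(1) XOR inbyte(0);
  outbyte(6) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(2) XOR inbyte(1) XOR inbyte(0);
  outbyte(5) <= inbyte(7) XOR inbyte(6) XOR inbyte(4);
  outbyte(4) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(3);
  outbyte(3) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(2);
  outbyte(2) <= inbyte(7) XOR inbyte(2) XOR inbyte(0);
  outbyte(1) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR
                inbyte(3) XOR inbyte(2) XOR inbyte(0);
  outbyte(0) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(2) XOR inbyte(1);
end struct;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_68 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_68;
architecture struct of multgf_68 is
begin
  outbyte(7) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(3) XOR
                inbyte(2) XOR inbyte(1);
  outbyte(6) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR
                inbyte(2) XOR inbyte(1) XOR inbyte(0);
  outbyte(5) <= inbyte(4) XOR inbyte(2) XOR inbyte(0);
  outbyte(4) <= inbyte(3) XOR inbyte(1);
  outbyte(3) <= inbyte(7) XOR inbyte(2) XOR inbyte(0);
  outbyte(2) <= inbyte(7) XOR inbyte(5) XOR inbyte(3) XOR inbyte(2);
  outbyte(1) <= inbyte(7) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3);
  outbyte(0) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(2);
end struct;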






library ieee;
use ieee.std_logic_1164.all;
entity multgf_E5 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_E5;
architecture struct of multgf_E5 is
begin
  outbyte(7) <= inbyte(7) XOR inbyte(5) XOR inbyte(3) XOR inbyte(1) XOR
                inbyte(0);
  outbyte(6) <= inbyte(6) XOR inbyte(4) XOR inbyte(2) XOR inbyte(0);
  outbyte(5) <= inbyte(0);
  outbyte(4) <= inbyte(7);
  outbyte(3) <= inbyte(6);
  outbyte(2) <= inbyte(3) XOR inbyte(1) XOR inbyte(0);
  outbyte(1) <= inbyte(7) XOR inbyte(5) XOR inbyte(3) XOR inbyte(2) XOR
                inbyte(1);
  outbyte(0) <= inbyte(6) XOR inbyte(4) XOR inbyte(2) XOR inbyte(1) XOR
                inbyte(0);
end struct;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_02 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_02;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_A1 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_A1;
architecture struct of multgf_A1 is
begin
  outbyte(7) <= inbyte(6) XOR inbyte(5) XOR inbyte(0);
  outbyte(6) <= inbyte(5) XOR inbyte(4);
  outbyte(5) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(0);
  outbyte(4) <= inbyte(7) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(2);
  outbyte(3) <= inbyte(6) XOR inbyte(4) XOR inbyte(3) XOR inbyte(2) XOR
                inbyte(1);
  outbyte(2) <= inbyte(7) XOR inbyte(6) XOR inbyte(3) XOR inbyte(2) XOR
                inbyte(1);
  outbyte(1) <= inbyte(7) XOR inbyte(2) XOR inbyte(1);
  outbyte(0) <= inbyte(7) XOR inbyte(6) XOR inbyte(1) XOR inbyte(0);
end struct;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_FC is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_FC;
architecture struct of multgf_FC is
begin
  outbyte(7) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(1) XOR
                inbyte(0);
  outbyte(6) <= inbyte(6) XOR inbyte(5) XOR inbyte(3) XOR inbyte(0);
  outbyte(5) <= inbyte(6) XOR inbyte(5) XOR inbyte(2) XOR inbyte(1) XOR
                inbyte(0);
  outbyte(4) <= inbyte(7) XOR inbyte(5) XOR inbyte(4) XOR inbyte(1) XOR
                inbyte(0);
  outbyte(3) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(0);
  outbyte(2) <= inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR inbyte(2) XOR
                inbyte(1) XOR inbyte(0);
  outbyte(1) <= inbyte(6) XOR inbyte(3) XOR inbyte(2);
  outbyte(0) <= inbyte(7) XOR inbyte(5) XOR inbyte(2) XOR inbyte(1);
end struct;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_C1 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_C1;
architecture struct of multgf_C1 is
begin
  outbyte(7) <= inbyte(7) XOR inbyte(6) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(2) XOR inbyte(1) XOR inbyte(0);
  outbyte(6) <= inbyte(6) XOR inbyte(5) XOR inbyte(3) XOR inbyte(2) XOR
                inbyte(1) XOR inbyte(0);
  outbyte(5) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(3);
  outbyte(4) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(2);
  outbyte(3) <= inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR inbyte(1);
  outbyte(2) <= inbyte(6) XOR inbyte(1);
  outbyte(1) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(2) XOR inbyte(1);
  outbyte(0) <= inbyte(7) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(2) XOR inbyte(1) XOR inbyte(0);
end struct;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_47 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_47;
architecture struct of multgf_47 is
begin
  outbyte(7) <= inbyte(3) XOR inbyte(1);
  outbyte(6) <= inbyte(7) XOR inbyte(2) XOR inbyte(0);
  outbyte(5) <= inbyte(6) XOR inbyte(3);
  outbyte(4) <= inbyte(5) XOR inbyte(2);
  outbyte(3) <= inbyte(7) XOR inbyte(4) XOR inbyte(1);
  outbyte(2) <= inbyte(6) XOR inbyte(1) XOR inbyte(0);
  outbyte(1) <= inbyte(5) XOR inbyte(3) XOR inbyte(1) XOR inbyte(0);
  outbyte(0) <= inbyte(4) XOR inbyte(2) XOR inbyte(0);
end struct;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_AE is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_AE;
architecture struct of multgf_AE is
begin
  outbyte(7) <= inbyte(6) XOR inbyte(4) XOR inbyte(0);
  outbyte(6) <= inbyte(5) XOR inbyte(3);
  outbyte(5) <= inbyte(7) XOR inbyte(6) XOR inbyte(2) XOR inbyte(0);
  outbyte(4) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(1);
  outbyte(3) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR
                inbyte(0);
  outbyte(2) <= inbyte(5) XOR inbyte(3) XOR inbyte(0);
  outbyte(1) <= inbyte(6) XOR inbyte(2) XOR inbyte(0);
  outbyte(0) <= inbyte(7) XOR inbyte(5) XOR inbyte(1);
end struct;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_3D is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_3D;
architecture struct of multgf_3D is
begin
  outbyte(7) <= inbyte(3) XOR inbyte(2);
  outbyte(6) <= inbyte(2) XOR inbyte(1);
  outbyte(5) <= inbyte(7) XOR inbyte(3) XOR inbyte(2) XOR inbyte(1) XOR
                inbyte(0);
  outbyte(4) <= inbyte(7) XOR inbyte(6) XOR inbyte(2) XOR inbyte(1) XOR
                inbyte(0);
  outbyte(3) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(1) XOR
                inbyte(0);
  outbyte(2) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR
                inbyte(2) XOR inbyte(0);
  outbyte(1) <= inbyte(5) XOR inbyte(4) XOR inbyte(1);
  outbyte(0) <= inbyte(4) XOR inbyte(3) XOR inbyte(0);
end struct;





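library ieee;
use ieee.std_logic_1164.all;
entity multgf_19 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_19;
architecture struct of multgf_19 is
begin
  outbyte(7) <= inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR inbyte(3);
  outbyte(6) <= inbyte(5) XOR inbyte(4) XOR inbyte(3) XOR inbyte(2);
  outbyte(5) <= inbyte(6) XOR inbyte(5) XOR inbyte(2) XOR inbyte(1);
  outbyte(4) <= inbyte(5) XOR inbyte(4) XOR inbyte(1) XOR inbyte(0);
  outbyte(3) <= inbyte(7) XOR inbyte(4) XOR inbyte(3) XOR inbyte(0);
  outbyte(2) <= inbyte(5) XOR inbyte(4) XOR inbyte(2);
  outbyte(1) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(1);
  outbyte(0) <= inbyte(7) XOR inbyte(6) XOR inbyte(5) XOR inbyte(4) XOR
                inbyte(0);
end struct;

library ieee;
use ieee.std_logic_1164.all;
entity multgf_03 is
  PORT( inbyte  : in  std_logic_vector(7 downto 0);
        outbyte : out std_logic_vector(7 downto 0));
end multgf_03;
architecture struct of multgf_03 is
begin
  outbyte(7) <= inbyte(7) XOR inbyte(6);
  outbyte(6) <= inbyte(7) XOR inbyte(6) XOR inbyte(5);
  outbyte(5) <= inbyte(5) XOR inbyte(4);
  outbyte(4) <= inbyte(4) XOR inbyte(3);
  outbyte(3) <= inbyte(7) XOR inbyte(3) XOR inbyte(2);
  outbyte(2) <= inbyte(7) XOR inbyte(2) XOR inbyte(1);
  outbyte(1) <= inbyte(1) XOR inbyte(0);
  outbyte(0) <= inbyte(7) XOR inbyte(0);
end struct;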
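
The multgf_xx entities hard-wire multiplication by a fixed constant in GF(2^8) as XOR networks. As a cross-check, the same products can be computed with a generic shift-and-reduce function. The sketch below is not part of the original design: the package name gf256_sketch, the function gf_mult and the constant RS_POLY are hypothetical, and the reduction polynomial x^8 + x^6 + x^3 + x^2 + 1 is an assumption, chosen because it reproduces the multgf_1E equations listed in this appendix.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package (illustration only, not from the original listing).
package gf256_sketch is
  -- Low byte of the assumed reduction polynomial x^8 + x^6 + x^3 + x^2 + 1 (0x14D).
  constant RS_POLY : std_logic_vector(7 downto 0) := x"4D";
  -- Returns a * c in GF(2^8) modulo the assumed polynomial.
  function gf_mult(a, c : std_logic_vector(7 downto 0)) return std_logic_vector;
end package gf256_sketch;

package body gf256_sketch is
  function gf_mult(a, c : std_logic_vector(7 downto 0)) return std_logic_vector is
    variable acc   : std_logic_vector(7 downto 0) := (others => '0');
    variable shift : std_logic_vector(7 downto 0) := c;  -- holds c * x^i
  begin
    for i in 0 to 7 loop
      -- Add c * x^i whenever bit i of a is set.
      if a(i) = '1' then
        acc := acc xor shift;
      end if;
      -- Multiply shift by x; reduce when an x^8 term would appear.
      if shift(7) = '1' then
        shift := (shift(6 downto 0) & '0') xor RS_POLY;
      else
        shift := shift(6 downto 0) & '0';
      end if;
    end loop;
    return acc;
  end function gf_mult;
end package body gf256_sketch;
```

In a testbench, an assertion such as gf_mult(inbyte, x"1E") = outbyte could be compared against the output of multgf_1E for all 256 input values. For instance, gf_mult(x"80", x"1E") evaluates to x"5C", which agrees with inbyte(7) appearing only in outbyte(6), outbyte(4), outbyte(3) and outbyte(2) of multgf_1E.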














-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the S-boxes.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;











--S-boxes used for multiple permutation
entity S_boxes is
  port( income      : in  std_logic_vector(31 downto 0);
        S0          : in  std_logic_vector(31 downto 0);
        S1          : in  std_logic_vector(31 downto 0);
        outcome     : out std_logic_vector(31 downto 0);
        out_bank1_x,
        in_bank2_x,
        out_bank2_x,
        in_bank3_x  : out std_logic_vector(31 downto 0));
end S_boxes;
--Building block of the S-boxes
architecture struct of S_boxes is
-- component declaration
--q1 permutation modules
component Perm_q1
  port( input  : in  std_logic_vector(7 downto 0);
        output : out std_logic_vector(7 downto 0));
end component;
--q0 permutation modules
component Perm_q0
  port( input  : in  std_logic_vector(7 downto 0);
        output : out std_logic_vector(7 downto 0));
end component;
--XOR modules
component xor_2x32
  port( a      : in  std_logic_vector(31 downto 0);
        b      : in  std_logic_vector(31 downto 0);
        result : out std_logic_vector(31 downto 0));
end component;
signal out_bank1,  --Output of the first bank of q-permutations
       in_bank2,   --Input to the second bank of q-permutations (after XORing with S0)
       out_bank2,  --Output of the second bank of q-permutations
       in_bank3,   --Input to the third bank of q-permutations (after XORing with S1)
       out_bank3 : std_logic_vector(31 downto 0);  --Output of the third bank of q-permutations
--start of S-boxes structure
begin
  out_bank1_x <= out_bank1;
  in_bank2_x  <= in_bank2;
  out_bank2_x <= out_bank2;
  in_bank3_x  <= in_bank3;
  -- BANK 1
  --First permutation in S-box 0
  S_box0_bank1 : Perm_q0
    port map( input  => income(7 downto 0),
              output => out_bank1(7 downto 0));
  --First permutation in S-box 1
  S_box1_bank1 : Perm_q1
    port map( input  => income(15 downto 8),
              output => out_bank1(15 downto 8));
  --First permutation in S-box 2
  S_box2_bank1 : Perm_q0
    port map( input  => income(23 downto 16),
              output => out_bank1(23 downto 16));
  --First permutation in S-box 3
  S_box3_bank1 : Perm_q1
    port map( input  => income(31 downto 24),
              output => out_bank1(31 downto 24));
  --XOR the output of the first bank of permutations and S0
  XOR_S0 : xor_2x32
    port map( a      => out_bank1,
              b      => S0,
              result => in_bank2);
  -- BANK 2
  --Second permutation in S-box 0
  S_box0_bank2 : Perm_q0
    port map( input  => in_bank2(7 downto 0),
              output => out_bank2(7 downto 0));
  --Second permutation in S-box 1
  S_box1_bank2 : Perm_q0
    port map( input  => in_bank2(15 downto 8),
              output => out_bank2(15 downto 8));
  --Second permutation in S-box 2
  S_box2_bank2 : Perm_q1
    port map( input  => in_bank2(23 downto 16),
              output => out_bank2(23 downto 16));
  --Second permutation in S-box 3
  S_box3_bank2 : Perm_q1
    port map( input  => in_bank2(31 downto 24),
              output => out_bank2(31 downto 24));
  -- BANK 2 XOR S1
  XOR_S1 : xor_2x32
    port map( a      => out_bank2,
              b      => S1,
              result => in_bank3);
  -- BANK 3
  --Third permutation in S-box 0
  S_box0_bank3 : Perm_q1
    port map( input  => in_bank3(7 downto 0),
              output => out_bank3(7 downto 0));
  --Third permutation in S-box 1
  S_box1_bank3 : Perm_q0
    port map( input  => in_bank3(15 downto 8),
              output => out_bank3(15 downto 8));
  --Third permutation in S-box 2
  S_box2_bank3 : Perm_q1
    port map( input  => in_bank3(23 downto 16),
              output => out_bank3(23 downto 16));
  --Third permutation in S-box 3
  S_box3_bank3 : Perm_q0
    port map( input  => in_bank3(31 downto 24),
              output => out_bank3(31 downto 24));
  -- Final result
  --Assigning output from signal
  outcome <= out_bank3;
end struct;
32. SUB_OPSELECT
-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the suboperation selector.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity sub_opselect is
  port (rotate_before : in  std_logic;
        F_fct         : in  std_logic_vector(31 downto 0);
        input         : in  std_logic_vector(31 downto 0);
        output        : out std_logic_vector(31 downto 0));
end sub_opselect;
architecture struct of sub_opselect is
component mux_2x32
  port (sel    : in  std_logic;
        in_0   : in  std_logic_vector(31 downto 0);
        in_1   : in  std_logic_vector(31 downto 0);
        output : out std_logic_vector(31 downto 0));
end component;
component xor_2x32
  port (a      : in  std_logic_vector(31 downto 0);
        b      : in  std_logic_vector(31 downto 0);
        result : out std_logic_vector(31 downto 0));
end component;
component rol1_32
  port (data : in  std_logic_vector(31 downto 0);
        q    : out std_logic_vector(31 downto 0));
end component;
component ror1_32
  port (data : in  std_logic_vector(31 downto 0);
        q    : out std_logic_vector(31 downto 0));
end component;
signal rolinput, outmux1, outmux1_xor_f, ror_outmux1_xor_f :
       std_logic_vector(31 downto 0);
begin
rotleft1: rol1_32
  port map (data => input,
            q    => rolinput);
sel1: mux_2x32
  port map (sel    => rotate_before,
            in_0   => input,
            in_1   => rolinput,
            output => outmux1);
xor1: xor_2x32
  port map (a      => F_fct,
            b      => outmux1,
            result => outmux1_xor_f);
rotright1: ror1_32
  port map (data => outmux1_xor_f,
            q    => ror_outmux1_xor_f);
sel2: mux_2x32
  port map (sel  => rotate_before,
            in_0 => ror_outmux1_xor_f,




-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component is the top-level wrapper.
-- Version   : 1.0 - Beta
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity wrapper is
  Port ( address    : in  std_logic_vector(4 downto 0);    -- address
         write      : in  std_logic;                       -- write
         read       : in  std_logic;                       -- read
         clock      : in  std_logic;                       -- clock
         reset      : in  std_logic;                       -- reset
         enable     : in  std_logic;                       -- enable
         load_key   : in  std_logic;                       -- load key
         start      : in  std_logic;                       -- start process
         encrypt    : in  std_logic;                       -- encrypt or decrypt
         append     : in  std_logic;                       -- encrypt or decrypt
         indatabus  : in  std_logic_vector(31 downto 0);   -- input databus
         idle       : out std_logic;                       -- idle
         outdatabus : out std_logic_vector(31 downto 0));  -- databus
end wrapper;
architecture Behavioral of wrapper is
component core
  port (inport        : in  std_logic_vector(127 downto 0);  --input from cleartext entity
        inkey         : in  std_logic_vector(127 downto 0);  --input from keymodule entity
        clk           : in  std_logic;                       --Clock signal
        reset         : in  std_logic;
        usr_ld_key    : in  std_logic;                       --Usr requests load key
        usr_start     : in  std_logic;                       --Usr requests start
        usr_encrypt   : in  std_logic;                       --Usr requests encrypt
        idle          : out std_logic;                       --Device is idle
        outCiphertext : out std_logic_vector(127 downto 0));
end component;
component reg32
  port(clock, clr, enable : in  std_logic;
       data               : in  std_logic_vector(31 downto 0);
       q                  : out std_logic_vector(31 downto 0));




lowerkey0, lowerkey1, lowerkey2, lowerkey3,
lowkey0, lowkey1, lowkey2, lowkey3,
lowerciphertext0, lowerciphertext1, lowerciphertext2, lowerciphertext3 :
std_logic_vector(31 downto 0);
signal writesignal,
       readsignal : std_logic;
signal inputplaintext,
       outCiphertext,




if    (address = "00000") then lowerplaintext0  <= indatabus;
elsif (address = "00001") then lowerplaintext1  <= indatabus;
elsif (address = "00010") then lowerplaintext2  <= indatabus;
elsif (address = "00011") then lowerplaintext3  <= indatabus;
elsif (address = "00100") then lowerkey0        <= indatabus;
elsif (address = "00101") then lowerkey1        <= indatabus;
elsif (address = "00110") then lowerkey2        <= indatabus;
elsif (address = "00111") then lowerkey3        <= indatabus;
elsif (address = "01000") then lowerciphertext0 <= outCiphertext(31 downto 0);
elsif (address = "01001") then lowerciphertext1 <= outCiphertext(63 downto 32);
elsif (address = "01010") then lowerciphertext2 <= outCiphertext(95 downto 64);









if (write = '1' and read = '0' and enable = '1') then writesignal <= '1';

if (write = '0' and read = '1' and enable = '1') then readsignal <= '1';

memblock0: reg32
  port map (data   => lowerplaintext0,
            clock  => clock,
            clr    => reset,
            enable => writesignal,
            q      => lowplain0);
memblock1: reg32
  port map (data   => lowerplaintext1,
            clock  => clock,
            clr    => reset,
            enable => writesignal,
            q      => lowplain1);
memblock2: reg32
  port map (data   => lowerplaintext2,
            clock  => clock,
            clr    => reset,
            enable => writesignal,
            q      => lowplain2);
memblock3: reg32
  port map (data   => lowerplaintext3,
            clock  => clock,
            clr    => reset,
            enable => writesignal,
            q      => lowplain3);
memblock4: reg32
  port map (data   => lowerkey0,
            clock  => clock,
            clr    => reset,
            enable => writesignal,
            q      => lowkey0);
memblock5: reg32
  port map (data   => lowerkey1,
            clock  => clock,
            clr    => reset,
            enable => writesignal,
            q      => lowkey1);
memblock6: reg32
  port map (data   => lowerkey2,
            clock  => clock,
            clr    => reset,
            enable => writesignal,
            q      => lowkey2);
memblock7: reg32
  port map (data   => lowerkey3,
            clock  => clock,
            clr    => reset,
            enable => writesignal,
            q      => lowkey3);
process(clock)
begin




inputplaintext <= lowplain3 & lowplain2 & lowplain1 &







if    (address = "01000" and readsignal = '1') then outdatabus <= lowerciphertext0;
elsif (address = "01001" and readsignal = '1') then outdatabus <= lowerciphertext1;
elsif (address = "01010" and readsignal = '1') then outdatabus <= lowerciphertext2;








(inport      => inputplaintext,
 inkey       => inputkey,
 clk         => clock,
 reset       => reset,
 usr_ld_key  => load_key,
 usr_start   => start,
 usr_encrypt => encrypt,




-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component XORs 2 single bit values.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity xor_2x1 is
  port(a, b   : in  std_logic;
       result : out std_logic);
end xor_2x1;








-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component XORs 2 -> 4 bit values.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity xor_2x4 is
  PORT(a      : in  std_logic_vector(3 downto 0);
       b      : in  std_logic_vector(3 downto 0);
       result : out std_logic_vector(3 downto 0));
end xor_2x4;
architecture behavior of xor_2x4 is
component xor_2x1
  PORT(a, b   : in  std_logic;
       result : out std_logic);
END component;
begin
aa: for i in 0 to 3 generate




-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component XORs 2 -> 32 bit values.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity xor_2x32 is
  port(a, b   : in  std_logic_vector(31 downto 0);
       result : out std_logic_vector(31 downto 0));
end xor_2x32;
architecture behavior of xor_2x32 is
component xor_2x1
  PORT(a, b   : in  std_logic;
       result : out std_logic);
END component;
begin
aa: for i in 0 to 31 generate





-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component XORs 3 single bit values.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity xor_3x1 is
  port(dataa, datab, datac : in  std_logic;
       result              : out std_logic);
end xor_3x1;









-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component XORs 3 -> 4 bit values.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity xor_3x4 is
  PORT( dataa  : in  std_logic_vector(3 downto 0);
        datab  : in  std_logic_vector(3 downto 0);
        datac  : in  std_logic_vector(3 downto 0);
        result : out std_logic_vector(3 downto 0));
end xor_3x4;
architecture behavior of xor_3x4 is
component xor_3x1
  PORT(dataa, datab, datac : in  std_logic;
       result              : out std_logic);
END component;
begin
aa: for i in 0 to 3 generate




-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component XORs 4 single bit values.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity xor_4x1 is
  port (x, y, z, w : in  std_logic;
        result     : out std_logic);
end xor_4x1;








-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component XORs 4 -> 8 bit values.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity xor_4x8 is
  port(x      : in  std_logic_vector(7 downto 0);
       y      : in  std_logic_vector(7 downto 0);
       z      : in  std_logic_vector(7 downto 0);
       w      : in  std_logic_vector(7 downto 0);
       result : out std_logic_vector(7 downto 0));
END xor_4x8;
architecture behavior of xor_4x8 is







aa: for i in 0 to 7 generate 





-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component XORs 8 single bit values.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity xor_8x1 is
  port(a, b, c, d, e, f, g, h : in  std_logic;
       result                 : out std_logic);
end xor_8x1;
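
The listing gives only the entity of xor_8x1, which xor_8x8 instantiates once per bit. A minimal sketch of a matching architecture, assuming plain combinational logic, could look as follows. The architecture name sketch is hypothetical, the entity is repeated only to keep the example self-contained, and the original body may differ.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity xor_8x1 is
  port (a, b, c, d, e, f, g, h : in  std_logic;
        result                 : out std_logic);
end xor_8x1;

-- Hypothetical architecture (illustration only).
architecture sketch of xor_8x1 is
begin
  -- A chain of the same associative operator needs no parentheses in VHDL.
  result <= a xor b xor c xor d xor e xor f xor g xor h;
end sketch;
```

With this body, result is '1' exactly when an odd number of the eight inputs are '1'.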








-- Author    : Ananda Raja A/L Dore Raja
-- ID        : 1669
-- Component : This component XORs 8 -> 8 bit values.
-- Version   : 1.0 - Beta
library ieee;
use ieee.std_logic_1164.all;
entity xor_8x8 is
  port (a      : in  std_logic_vector(7 downto 0);
        b      : in  std_logic_vector(7 downto 0);
        c      : in  std_logic_vector(7 downto 0);
        d      : in  std_logic_vector(7 downto 0);
        e      : in  std_logic_vector(7 downto 0);
        f      : in  std_logic_vector(7 downto 0);
        g      : in  std_logic_vector(7 downto 0);
        h      : in  std_logic_vector(7 downto 0);
        result : out std_logic_vector(7 downto 0));
end xor_8x8;
architecture behavior of xor_8x8 is
component xor_8x1
  PORT(A, B, C, D, E, F, G, H : in  std_logic;
       result                 : out std_logic);
END component;
begin
aa: for i in 0 to 7 generate
  bb: xor_8x1 port map(a(i), b(i), c(i), d(i), e(i), f(i), g(i), h(i), result(i));
end generate;
end behavior;
