An embedded system for networking security applying cryptographic acceleration in field programmable gate array hardware by Paramasivam, Vishnu
TABLE OF CONTENTS
CHAPTER TITLE PAGE
DECLARATION ii
DEDICATION iii
ACKNOWLEDGEMENT iv
ABSTRACT v
ABSTRAK vi
TABLE OF CONTENTS x
LIST OF TABLES xii
LIST OF FIGURES xv
LIST OF ABBREVIATIONS xvi
LIST OF APPENDICES xviii
1 INTRODUCTION 1
1.1 Background 1
1.2 Problem Statement 3
1.3 Objectives 4
1.4 Scope of Work 5
1.5 Methodology 6
1.6 Research Contribution 8
1.7 Thesis Organization 8
2 LITERATURE REVIEW AND BACKGROUND 10
2.1 Previous Work 10
2.2 Nios2-Linux 12
2.3 The SSL/TLS Protocol 14
2.3.1 OpenSSL 17
2.4 Certificates 17
2.5 Cryptography: The AES Symmetric Block Cipher 19
2.6 Cryptography: Hashing 21
2.7 Public Key Cryptography 22
2.8 Random Number Generation 23
3 THE EMBEDDED CRYPTOSYSTEM - SYSTEM
OVERVIEW AND NIOS2-LINUX 25
3.1 Introduction 25
3.2 System Layers 26
3.2.1 The Application Layer 27
3.2.2 The Operating System Layer 28
3.2.3 The Hardware Layer 30
3.3 Nios2-Linux: Setup and Implementation 32
3.3.1 Compiling the Nios2-Linux Kernel 34
3.3.2 Nios2-Linux Root Filesystem 35
3.3.3 Nios2-Linux Boot Sequence 35
3.3.4 Cross-Compiling Programs for Nios2-Linux 36
3.3.5 IO Programming in User Space 37
3.3.6 Customizing the Kernel and Bundled Applications 38
4 HARDWARE DESIGN OF THE EMBEDDED CRYPTOSYSTEM 39
4.1 Overview 39
4.2 Nios II Processor and the Avalon Interface 39
4.2.1 Avalon Interface Specifications 41
4.2.2 Designing the Avalon Interface Unit 43
4.2.3 Device Driver Design 48
4.3 Design of the AES Cryptographic Core 50
4.3.1 The Key Expansion Block 53
4.3.2 The AES Core Avalon Interface and Device
Driver 58
4.4 SHA-1 Cryptographic Core 60
4.4.1 Device Driver Design for the SHA-1 Core 62
4.5 SHA-2 Cryptographic Core 65
4.5.1 Device Driver Design for the SHA-2 Core 65
4.6 RSA Cryptographic Core 67
4.6.1 Device Driver Design for the RSA Core 69
4.7 RNG Hardware Core 71
4.7.1 Device Driver Design for the RNG Core 72
4.8 USB Chip Controller 73
4.9 Custom Clock Counter Unit 76
4.10 Ethernet and Memory Device Controllers 77
5 SOFTWARE SUBSYSTEM DESIGN AND OPENSSL INTEGRATION 78
5.1 Introduction 78
5.2 Configuring the OpenSSL Library for the Target
Environment 80
5.3 Modification of the Cryptographic Functions in the
OpenSSL Library 81
5.3.1 Modification of the AES Function 86
5.3.2 Modification of the SHA-1 Function 88
5.3.3 Modification of the SHA-2 Function 89
5.3.4 Modification of the RSA Function 91
5.3.5 Calculating the Montgomery Value 96
5.4 Certificate Generation 98
6 DESIGN VERIFICATION AND SYSTEM VALIDATION -
RESULTS AND DISCUSSION 101
6.1 Verification of the Hardware Cryptographic Cores 101
6.1.1 Verification of the AES Core 102
6.1.2 Verification of the SHA-1 Core 104
6.1.3 Verification of the SHA-2 Core 105
6.1.4 Verification of the RSA Core 107
6.1.5 Verification of the RNG Core 109
6.2 Performance Test and Results 110
6.2.1 Performance of the AES Core 110
6.2.2 Performance of the Hardware Accelerated
OpenSSL Cryptographic Functions 112
6.3 System Validation with Application Prototypes 114
6.3.1 Complete Cryptographic Test Program 114
6.3.2 USB File Encryption and Decryption Program 116
6.3.3 Secure Bank Check Transfer Program 117
7 CONCLUSIONS 122
7.1 Concluding Remarks 122
7.2 Future Work 124
REFERENCES 125
Appendices A – G 128 – 165
LIST OF TABLES
TABLE NO. TITLE PAGE
2.1 Time taken for several network functions on the RCM2100
embedded system [2]. 11
4.1 Avalon interface register table for the AES core. 45
4.2 Example of cached and uncached addresses for hardware cores. 50
4.3 Avalon interface register table for the SHA-1 core. 63
4.4 Avalon interface register table for the SHA-2 core. 66
4.5 Avalon interface register table for the RSA core. 70
4.6 Avalon interface register table for the clock counter unit. 76
5.1 Example 8-bit memory address pointers and their equivalent 32-bit
memory address pointers when converted. 85
6.1 AES key expansion execution time comparison in clock cycles. 111
6.2 AES core statistics and performance. 111
6.3 Comparison of AES core latency with other designs in clock cycles. 111
6.4 AES core performance in the Nios2-Linux environment. 112
6.5 Execution time in clock cycles for the AES algorithm. 112
6.6 Performance results for the Rijndael algorithm on the
TMS320C6201 [4]. 113
6.7 Execution time in clock cycles for the SHA-1/SHA-2 hash
algorithm. 113
6.8 Execution time in clock cycles for the RSA private key encryption. 113
6.9 Execution time in clock cycles for the RSA public key encryption. 114
6.10 Execution time in seconds for the complete cryptographic test
program. 115
6.11 File encryption and decryption in AES for a large file. 117
6.12 File encryption and decryption in AES for a small file. 117
6.13 Execution time in seconds for the secure bank check transfer
application. 121
7.1 Performance improvement for the OpenSSL cryptographic library. 122
A.1 Fundamental signals for the Avalon-MM interface. 129
LIST OF FIGURES
FIGURE NO. TITLE PAGE
1.1 Illustration of a man in the middle attack. 2
1.2 Layers of the proposed embedded cryptosystem. 5
1.3 Secure communication between the embedded system and a PC. 6
1.4 Methodology used to achieve research objectives. 7
2.1 Diagram of the connection between Nios2-Linux, FPGA, Nios
II CPU and the external hardware. 13
2.2 The SSL protocol [16]. 15
2.3 The AES encryption and decryption process [20]. 20
2.4 One iteration in a SHA-2 family compression function [21]. 21
2.5 Illustration of how communication with public key cryptography
works. 23
3.1 Secure communication between the embedded system and a PC. 26
3.2 Layers of the proposed embedded cryptosystem and its relation
with the prototyping hardware. 27
3.3 Nios2-Linux RTOS functional block diagram. 28
3.4 Functional block diagram of the hardware layer. 30
3.5 Nios2-Linux hello world program. 36
3.6 Nios2-Linux memory pointers to hardware. 37
3.7 Nios2-Linux version of the IORD and IOWR macros. 37
4.1 Avalon slave interface to a PIO with only output signals [27]. 42
4.2 Slave read and write transfers with fixed wait-states [27]. 42
4.3 The AES core connected to an Avalon interface. 44
4.4 Verilog design of the Avalon Interface (main part). 46
4.5 Sending the input data to the AES core using C-language macros. 49
4.6 Reading the output data from the AES core using C-language
macros. 49
4.7 The AES encryption and decryption process. 51
4.8 AES top level FSM flowchart. 52
4.9 AES cryptographic core functional block diagram. 53
4.10 The Rijndael key schedule. 53
4.11 Rcon lookup table in Verilog. 55
4.12 Multiplexed inputs for S-box with rotated output. 57
4.13 Multiplexing the first S-box with rotation in Verilog. 57
4.14 Key expansion functional block diagram. 58
4.15 Key expansion IO block diagram. 58
4.16 AES core block diagram. 59
4.17 The AES core device driver. 60
4.18 The SHA-1 cryptographic core with Avalon interfacing. 61
4.19 The modified SHA-1 core. 62
4.20 The SHA-1 core driver. 64
4.21 The SHA-2 cryptographic core with Avalon interfacing. 65
4.22 The SHA-2 core driver. 67
4.23 The RSA cryptographic core with Avalon interfacing. 68
4.24 The RNG cryptographic core with Avalon interfacing. 71
4.25 The RNG core driver. 72
4.26 Microtronix USB-Ethernet connections to Santa Cruz Header. 73
4.27 IO for Avalon interface and the physical connections. 74
4.28 Declaring registers and assigning the Avalon signals to the
corresponding ISP1161A pins. 74
4.29 Pin assignments with respect to the status of the reset signal. 75
4.30 Functional block diagram of the clock counter unit. 76
4.31 The custom clock counter device driver. 77
5.1 The OpenSSL library and its connections to the other layers. 79
5.2 The linuxsystem.h common header file. 83
5.3 Illustration of byte endianness. 84
5.4 Memory copying function to convert an 8-bit pointer into a 32-bit
pointer to avoid memory address misalignment. 86
5.5 Using OpenSSL to do an AES encryption operation. 87
5.6 Using OpenSSL to do a SHA-1 hashing operation. 88
5.7 Using OpenSSL to do a SHA512 hashing operation. 90
5.8 Using the OpenSSL command line tool to generate RSA key pairs. 93
5.9 Using OpenSSL to do RSA2048 encryption and decryption. 94
5.10 The Montgomery value calculation pseudocode. 97
5.11 The Montgomery value calculation function. 98
6.1 Print screen of the AES core test program showing the results. 103
6.2 Print screen of the SHA-1 core test program showing the results. 105
6.3 Print screen of the SHA-2 core test program showing the results. 106
6.4 Print screen of the RSA core test program showing the results. 108
6.5 Print screen of the RNG core test program showing the results. 110
6.6 The USB file encryption and decryption program. 116
6.7 The secure bank check transfer program. 118
6.8 Generating a plaintext electronic check. 118
6.9 Hashing and signing an electronic check. 119
6.10 Signature verification of the electronic check. 119
6.11 Signature verification successful notification on bank server. 120
6.12 Signature verification failure notification on bank server. 120
A.1 Slave read and write transfers with waitrequest. 129
G.1 Lookup table for the Rcon function. 165
LIST OF ABBREVIATIONS
3DES – Triple Data Encryption Algorithm
AES – Advanced Encryption Standard
API – Application Programming Interface
ASIC – Application Specific Integrated Circuit
BN – Big Number
CA – Certificate Authority
CPU – Central Processing Unit
DES – Data Encryption Standard
ECC – Elliptic Curve Cryptography
FIFO – First In First Out
FIPS – Federal Information Processing Standard
FPGA – Field Programmable Gate Array
FSM – Finite State Machine
GPU – Graphics Processing Unit
GUI – Graphical User Interface
HAL – Hardware Abstraction Layer
HDL – Hardware Description Language
IC – Integrated Circuit
I/O – Input/Output
IP – Intellectual Property
LE – Logic Elements
Mbit – Megabit
MD5 – Message Digest Algorithm (Fifth Series)
MHz – Megahertz
ms – millisecond
PC – Personal Computer
PCI – Peripheral Component Interconnect
PIO – Parallel Input Output
PKI – Public Key Infrastructure
RISC – Reduced Instruction Set Computer
RNG – Random Number Generator
ROM – Read Only Memory
RSA – Rivest-Shamir-Adleman
RTL – Register Transfer Level
SDRAM – Synchronous Dynamic Random Access Memory
SHA-1 – Secure Hash Algorithm - 1
SHA-2 – Secure Hash Algorithm - 2
SoC – System-on-Chip
SOPC – System-on-Programmable-Chip
SSL – Secure Sockets Layer
TLS – Transport Layer Security
UART – Universal Asynchronous Receiver Transmitter
UNIX – Uniplexed Information and Computing Service (originally
spelled UNICS)
USB – Universal Serial Bus
VHDL – Very High Speed Integrated Circuit Hardware Description
Language
VLSI – Very Large Scale Integration
VoIP – Voice over Internet Protocol
LIST OF APPENDICES
APPENDIX TITLE PAGE
A AVALON-MM INTERFACE SPECIFICATION 128
B AES CORE SOURCE CODE 131
C AVALON INTERFACE SOURCE CODES 143
D PORTING THE OPENSSL LIBRARY 149
E OPENSSL CRYPTOGRAPHIC LIBRARY MODIFICATIONS 152
F TEST APPLICATION SOURCE CODES AND RSA TEST
VECTORS 158
G MISCELLANEOUS HARDWARE SOURCE CODES 165
