ABSTRACT
INTRODUCTION
Many encryption algorithms have been developed to provide information confidentiality, authenticity, integrity, non-repudiation and access control, such as DES, TDES, the Advanced Encryption Standard (AES) and Blowfish [4, 10]. This research work analyzes the merits and demerits of Blowfish compared to the TDES algorithm in terms of operation, propagation delay, memory utilization and throughput. Brief information about these algorithms is given below:
DATA ENCRYPTION STANDARD (DES)
The Data Encryption Standard encrypts a 64-bit block of plain text with a 56-bit key. It is a Feistel network. After the initial permutation, the block undergoes 16 rounds of processing. It can operate in Cipher Block Chaining (CBC), Electronic Code Book (ECB), Cipher Feedback (CFB) and Output Feedback (OFB) modes [10] [12]. It is prone to brute-force attack, in which an attacker attempts to recover the key by trying all possible key values. It was the most popular and widely used algorithm before the TDES, AES and Blowfish (BF) algorithms. It is now considered insecure [23] [10].
TRIPLE DATA ENCRYPTION STANDARD (TDES)
It is also known as the Triple Data Encryption Algorithm (TDEA or 3DES), in which the Data Encryption Standard is applied three times to every 64-bit data block. It came into existence to overcome the brute-force attacks to which the DES algorithm is vulnerable. It has 48 rounds of operation. There are three keying options:
• Three keys k1, k2 and k3 are independent.
• Keys k1 and k2 are independent and k3 = k1.
• All three keys are equal, i.e., k1 = k2 = k3.
Thus option 1 (three independent keys) is the strongest of the three. It has 168 independent key bits, whereas option 2 has 112 key bits and is moderately secure compared to option 1. The last option has only 56 key bits, the same as DES; because all three keys are equal, the same key is simply used three times and the key is as easy to predict as a DES key. TDES is a symmetric-key block cipher [16]. It is less secure than AES.
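The effect of the three keying options can be illustrated with a short Python sketch. It assumes the standard TDEA encrypt-decrypt-encrypt (EDE) composition, which is not spelled out above, and it uses a hypothetical toy cipher (`toy_encrypt`, `toy_decrypt`) as a stand-in for DES, so it shows only the structure and not real DES behaviour.

```python
MASK64 = (1 << 64) - 1

def toy_encrypt(block, key):
    # Stand-in for single DES encryption (NOT a real cipher): XOR, then rotate left by 7.
    x = (block ^ key) & MASK64
    return ((x << 7) | (x >> 57)) & MASK64

def toy_decrypt(block, key):
    # Inverse of toy_encrypt: rotate right by 7, then XOR.
    x = ((block >> 7) | (block << 57)) & MASK64
    return (x ^ key) & MASK64

def tdes_encrypt(block, k1, k2, k3):
    # Assumed EDE composition: encrypt with k1, decrypt with k2, encrypt with k3.
    return toy_encrypt(toy_decrypt(toy_encrypt(block, k1), k2), k3)

def nominal_key_bits(option):
    # Independent key bits for each keying option, at 56 bits per DES key.
    return {1: 3 * 56, 2: 2 * 56, 3: 56}[option]

if __name__ == "__main__":
    block, k = 0x0123456789ABCDEF, 0x0F1E2D3C4B5A6978
    # With k1 = k2 = k3 the middle decryption cancels the first encryption.
    print(tdes_encrypt(block, k, k, k) == toy_encrypt(block, k))
    print([nominal_key_bits(o) for o in (1, 2, 3)])
```

With all three keys equal, the decrypt stage undoes the first encrypt stage, which is why the third option offers no more key strength than single DES.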
ADVANCED ENCRYPTION STANDARD (AES)
AES is a block cipher with variable key length. The block length is 128 bits and the key length may be 128, 192 or 256 bits, with 10, 12 or 14 rounds respectively. Each processing round consists of four steps, i.e., substitute bytes, shift rows, mix columns and add round key (the final round omits the mix columns step). AES encryption is flexible, secure and fast [14] [20] [16]. It is a popular encryption algorithm in industry and more secure than DES [9] [12], but its implementations can be prone to side-channel attacks.
BLOW-FISH (BF)
Blowfish is a symmetric block cipher with variable key length. The plain text is processed in 64-bit blocks and the key length varies from 32 to 448 bits. Data encryption takes place through a 16-round Feistel network. Each round consists of plain-text and key-dependent operations such as XOR, addition and substitution. It is reported to be faster than TDES and AES [15] [17], and it was designed as a replacement for the DES algorithm [16] [4] [10]. Blowfish is preferred over AES in some applications because of its large key length and high security. It provides high throughput compared to the other algorithms considered in this research work [18, 21].
RELATED RESEARCH REVIEW
The literature review reveals that the Blowfish algorithm implemented using Verilog HDL gives better results in terms of reduced delay and increased throughput. The gist of a few of the papers referred to is given below:
The performance of the Blowfish algorithm on a field programmable gate array (FPGA) is analyzed in terms of speed, data encryption rate and power. Results indicate that the proposed Blowfish implementation reduces delay and increases throughput with low power consumption compared to AES. That paper focused on small, high-speed security architectures and systems with low power consumption for mobile devices [1].
An amalgamated algorithm combining Blowfish and Rivest Cipher 6 (RC6) is proposed to solve security problems while maintaining efficiency. It provides faster data transfer and high security, both of which are very important for Wi-Fi applications. The collision attack problem is eliminated using an S-box overlapping process and the brute-force attack is eliminated using the sub-key generation process. It decreases delay time and frequency [2].
Cloud computing needs secure, fast and area-efficient cryptographic techniques. The Blowfish cryptosystem is one of the strong and fast algorithms used for cryptography. A hybrid algorithm consisting of the RSA and Blowfish algorithms is implemented using VHDL. It has both symmetric and asymmetric properties and is therefore more useful for cloud computing applications [3].
A wide range of applications of the Blowfish algorithm can be implemented for encrypting data sent from an Internet of Things physical network that carries IP-based data. Performance metrics such as security, complexity, propagation delay and throughput of the Blowfish algorithm are analyzed. A hardware implementation of the Blowfish algorithm on FPGA using VHDL yielded reduced propagation delay and enhanced throughput [6, 18].
The conjugate-structure algebraic CELP coding method is used in speech encryption with the Blowfish algorithm. A new method for generating the S-boxes and P-arrays, which are the main building blocks of the Blowfish algorithm, is proposed; it reduces time and complexity and provides more security [8].
The effect of symmetric encryption algorithms on power consumption in wireless devices is studied and analyzed with the aim of reducing battery power consumption. The algorithms considered are DES, 3DES, AES, Blowfish, Rivest Cipher 2 (RC2) and RC6. Energy efficiency is the main focus of this design [15, 19].
Blowfish is reported to perform better than other commonly used encryption algorithms and can be considered a better standard encryption algorithm than AES. AES requires more processing power and more processing time than the Blowfish algorithm [20].
A performance analysis of DES and Blowfish is carried out for wireless networks to provide information security. It covers security, speed and power consumption. Results confirm that the Blowfish algorithm runs faster than DES, while the power consumption is almost the same even though Blowfish has a 448-bit key length and a larger number of iterations/operations [23].
THEORETICAL ANALYSIS OF BLOW-FISH ALGORITHM
Blowfish is a block cipher; encryption and decryption are performed on 64-bit blocks. It is a 16-round Feistel network and a symmetric algorithm. The 64-bit plain text is separated into two halves of 32 bits each (LE and RE).
As shown in Fig. 2.0, the Blowfish algorithm [5, 13] is divided into two parts: the Encryption and Decryption unit for data processing, and the Sub-key generation unit, which generates the sub-keys used in each round of operation. In the data Encryption and Decryption block, the input 64-bit data block is divided into two halves, the 32-bit Left Encryption (LE) half and the 32-bit Right Encryption (RE) half. In each round of operation, the algorithm performs the RE and LE operations given by equation (1) for encryption and equation (2) for decryption; the encryption process is also shown in Fig. 1.0. The Feistel function (F) in each round consists of a combination of substitution, modulo addition, XOR and modulo addition operations. The algorithm repeats this procedure for 16 rounds. RE16 and LE16 are then XORed with P17 and P18 respectively to generate RE17 and LE17. The reverse operation is performed for decryption [7] [11] [22].
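The round structure described above can be sketched in Python as follows. This is a minimal model that follows the commonly published Blowfish round structure, not the hardware design of this work. The P-array and S-boxes are filled with random placeholder values by a hypothetical helper `make_tables` (real Blowfish initialises them from fixed constants and the key), and all function names are illustrative.

```python
import random

MASK32 = 0xFFFFFFFF

def make_tables(seed=0):
    # Placeholder tables only: 18 sub-keys and four 256-entry S-boxes of 32-bit words.
    rng = random.Random(seed)
    p = [rng.getrandbits(32) for _ in range(18)]
    s = [[rng.getrandbits(32) for _ in range(256)] for _ in range(4)]
    return p, s

def f(x, s):
    # Feistel function: substitution, modulo-2^32 addition, XOR, modulo-2^32 addition.
    a, b, c, d = (x >> 24) & 0xFF, (x >> 16) & 0xFF, (x >> 8) & 0xFF, x & 0xFF
    h = (s[0][a] + s[1][b]) & MASK32
    h ^= s[2][c]
    return (h + s[3][d]) & MASK32

def encrypt_block(le, re, p, s):
    for i in range(16):
        le ^= p[i]          # mix in sub-key P(i+1)
        re ^= f(le, s)      # apply the Feistel function
        le, re = re, le     # swap halves
    le, re = re, le         # undo the final swap
    re ^= p[16]             # RE16 xor P17
    le ^= p[17]             # LE16 xor P18
    return le, re

def decrypt_block(le, re, p, s):
    # Same structure with the sub-keys applied in reverse order.
    for i in range(17, 1, -1):
        le ^= p[i]
        re ^= f(le, s)
        le, re = re, le
    le, re = re, le
    re ^= p[1]
    le ^= p[0]
    return le, re

if __name__ == "__main__":
    p, s = make_tables(seed=1)
    ct = encrypt_block(0x01234567, 0x89ABCDEF, p, s)
    print(decrypt_block(ct[0], ct[1], p, s) == (0x01234567, 0x89ABCDEF))
```

Because decryption reuses the same datapath with the P-array reversed, a single hardware round unit can serve both directions.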
ARCHITECTURAL DESIGN OF PROPOSED BLOWFISH ALGORITHM
The sub-key generation unit generates 18 sub-keys (the P-array) from the 448-bit input key. The K-array holds 14 input sub-keys of 32 bits each, which are used to generate the initial sub-keys P1 to P18 of the P-array as shown in Fig. 2.0. Each sub-key is 32 bits wide and is updated as per equation (3):
P1 = P1 ^ K1, P2 = P2 ^ K2, …, P14 = P14 ^ K14, P15 = P15 ^ K1, P16 = P16 ^ K2, P17 = P17 ^ K3, P18 = P18 ^ K4 (3)
where K1 to K14 (32 bits each) are generated from the 448-bit input key.
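The XOR step of equation (3) can be sketched in Python as below. The helper names `split_key` and `mix_key_into_p` and the big-endian word order are assumptions made for illustration. The full Blowfish key schedule also goes on to replace the P-array and S-box entries by repeated encryption, which is not shown here.

```python
def split_key(key_bytes):
    # Split a 448-bit (56-byte) key into 14 words of 32 bits each: K1..K14.
    # Big-endian word order is an assumption of this sketch.
    if len(key_bytes) != 56:
        raise ValueError("expected a 448-bit (56-byte) key")
    return [int.from_bytes(key_bytes[i:i + 4], "big") for i in range(0, 56, 4)]

def mix_key_into_p(p_init, k):
    # Equation (3): P(i) = P(i) ^ K(i), with the key words reused cyclically,
    # so P15..P18 are XORed with K1..K4.
    return [p_init[i] ^ k[i % len(k)] for i in range(18)]

if __name__ == "__main__":
    k = split_key(bytes(range(56)))
    p = mix_key_into_p([0] * 18, k)   # all-zero initial P-array, for illustration only
    print(hex(p[0]), hex(p[14]))      # P1 and P15 both use K1
```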
MODULO-M-BIT ADDER:
In the encryption or decryption operation, the modulo-addition operation [24], with and without WDDL logic, is shown in Fig. 3.0. To increase speed, the series adders in this figure can be operated in parallel. One adder adds two h-bit residues, X and Y, to form their sum S1 + 2^h Cout1. The other is a 3-operand adder that computes "X+Y+m". Note that if m = 2^n + 1, we have h = n + 1. It has been reported that if either Cout1 or Cout2 of this addition is '1', then the output is X+Y+m instead of X+Y. However, in the following we illustrate that it is sufficient to select "X+Y+m" as the final output only when its own carry is '1'.
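The selection rule can be checked with a small Python model. It rests on an assumption that the text does not state: that the operand "m" in the 3-operand adder is the h-bit two's complement of the modulus, 2^h - m, so that the second adder effectively computes X+Y-m. The function name `mod_add` is illustrative.

```python
def mod_add(x, y, m):
    # x and y are residues in the range 0..m-1.
    h = (m - 1).bit_length()          # width of a residue; h = n + 1 when m = 2^n + 1
    mask = (1 << h) - 1
    s1 = x + y                        # first adder: S1 + 2^h * Cout1
    s2 = x + y + ((1 << h) - m)       # second adder (assumed correction term 2^h - m)
    cout2 = s2 >> h                   # carry out of the second adder
    # Only Cout2 is examined: it is 1 exactly when x + y >= m.
    return (s2 & mask) if cout2 else (s1 & mask)

if __name__ == "__main__":
    m = 2 ** 4 + 1                    # m = 2^n + 1 with n = 4, so h = 5
    ok = all(mod_add(x, y, m) == (x + y) % m for x in range(m) for y in range(m))
    print(ok)
```

For the modulus 2^32 used in the Blowfish F function the correction term is zero, so the model reduces to discarding the carry out of a 32-bit addition.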
CONSTANT DELAY N-BIT ADDER:
The constant delay n-bit adder is used to perform addition of two arrays, as shown in Fig. 4.0. Its main advantage is that the delay is constant irrespective of the input, hence the name. The constant delay n-bit adder consists of three n-bit XOR gates, two n-bit AND gates and one n-bit OR gate. The input arrays are XORed in the first n-bit XOR gate and ANDed in the first n-bit AND gate. The output of the first AND gate is shifted left, and the shifted value is both XORed with the output of the first XOR gate (in the second XOR gate) and ANDed with it (in the second AND gate). The output of the second AND gate is again shifted left and XORed with the output of the second XOR gate. The output of this third XOR gate is taken as the sum of the constant delay n-bit adder. The MSBs of the two AND gate outputs, taken before the shift, are ORed in the OR gate, and the result is taken as the carry bit output.
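A bit-level Python model of this datapath, following the description literally, is sketched below. The function name `constant_delay_add` is illustrative. Note that with only two AND/shift stages the datapath as described resolves carry chains of at most two bit positions, so the model reproduces ordinary addition for such inputs and is not a general-purpose adder.

```python
def constant_delay_add(x, y, n):
    mask = (1 << n) - 1
    msb = 1 << (n - 1)
    s0 = (x ^ y) & mask        # first XOR gate: partial sum
    c0 = (x & y) & mask        # first AND gate: generated carries
    sh0 = (c0 << 1) & mask     # left shift of the first AND output
    s1 = s0 ^ sh0              # second XOR gate
    c1 = s0 & sh0              # second AND gate
    sh1 = (c1 << 1) & mask     # left shift of the second AND output
    total = s1 ^ sh1           # third XOR gate: sum output
    carry = 1 if ((c0 | c1) & msb) else 0   # OR of the AND-gate MSBs before the shift
    return total, carry

if __name__ == "__main__":
    for x, y in [(5, 2), (3, 1), (8, 8), (7, 1)]:
        total, carry = constant_delay_add(x, y, 4)
        # The last pair has a three-position carry chain and does not match x + y.
        print(x, y, total, carry, (carry << 4) + total == x + y)
```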
RESULTS AND DISCUSSION
The Blowfish algorithm is implemented in three ways, which are compared with the Triple DES algorithm. The implementation of the design followed a bottom-up approach. A test bench is written in Verilog HDL for every module of the design to provide 100% code coverage. The top-level Test Bench (TB) instantiates the top module of the design, which in turn instantiates all the sub-modules. Test cases are generated and applied to the Design Under Test (DUT), and the results are used for verification of functionality, delay estimation, frequency of the design and throughput calculation. Mentor Graphics ModelSim-Altera 6.3g_p1 (Quartus II 8.1) is used for simulation, and Xilinx ISE Design Suite 14.2 is used to synthesize and implement the design. The synthesis tool generated the RTL circuit, the memory utilization, the propagation delay and the percentage of area utilized by the design. Table 1 compares the delay, frequency, memory utilization and throughput of the four implementations.
In the delay comparison shown in Fig. 5.0, Blowfish with constant delay n-bit adder and WDDL logic (BFCDNBA) produces a constant delay irrespective of the number of adder stages in the parallel adder design. The constant delay adder makes a large difference in the hardware implementation compared to the modulo adder with and without WDDL gates. Hence, it resulted in the lowest delay (76.337 ns) of the implementations considered. Because the delay is lower for BFCDNBA, owing to parallelism in the hardware design, its frequency is higher (13.09 MHz) than that of the TDES and Blowfish with modulo adder implementations, as shown in Fig. 6.0. A design with a higher frequency can convert plaintext to ciphertext at a faster rate.
The Triple DES implementation and the Blowfish with modulo adder implementations, with and without WDDL logic, are largely sequential and are therefore slow. In the constant delay n-bit adder approach, the critical path delay is reduced and the frequency is thus improved.
As shown in Fig. 7.0, the memory utilization of BFCDNBA is higher (520.584 Mb) than that of the other implementations, because the more parallel hardware design must store data for a larger number of operations and iterations in order to achieve high-speed encryption and decryption. The S-boxes, implemented as Look Up Tables (LUTs), contain a large number of data items that must be stored for later substitutions. The intermediate P-array keys also require more memory to generate the sub-keys for every round of encryption and decryption.
Throughput is defined as the ratio of the number of bits encrypted/decrypted to the time taken by the algorithm. As per the results shown in Fig. 8.0, the BFCDNBA implementation yielded the best throughput (840 Mbps) of the implementations considered. As explained with respect to Fig. 5.0, the delay of the BFCDNBA implementation is much lower than that of the other implementations considered in this research work, and hence its throughput is the highest. Fig. 9.0 shows the same relationship: the delay of the Blowfish with constant delay n-bit adder and WDDL logic implementation is lower, and its throughput is therefore higher, than for the other implementations. Even as the number of bits of the adder increases, the delay remains constant, and thus the throughput is increased with this approach.
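The relationship between the reported delay, frequency and throughput can be checked with a short calculation. The sketch below assumes that one 64-bit block is processed per critical-path delay, which the text does not state explicitly; the function names are illustrative.

```python
def frequency_mhz(delay_ns):
    # Maximum clock frequency for a given critical-path delay.
    return 1000.0 / delay_ns

def throughput_mbps(block_bits, delay_ns):
    # Bits processed per unit time; bits per nanosecond times 1000 gives Mbps.
    return block_bits / delay_ns * 1000.0

if __name__ == "__main__":
    delay_ns = 76.337   # reported BFCDNBA delay
    print(round(frequency_mhz(delay_ns), 2))        # close to the reported 13.09 MHz
    print(round(throughput_mbps(64, delay_ns), 1))  # close to the reported 840 Mbps
```

Under this assumption the delay of 76.337 ns gives about 13.10 MHz and about 838 Mbps, in line with the reported figures of 13.09 MHz and 840 Mbps.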
CONCLUSIONS
As discussed in the results and discussion section, the BFCDNBA implementation gave better results than the other implementations. The constant delay n-bit adder circuit used in the Blowfish algorithm reduced the delay to 76.337 ns, increased the frequency to 13.09 MHz and thus increased the throughput to 840 Mbps compared to the BFMAWDDL, BFMA and Triple DES implementations. It provides more security because of the 448-bit key length and the incorporation of WDDL logic in the encryption and decryption process of the crypto-processor digital design flow. However, the memory utilization of the BFCDNBA implementation is higher (520.584 Mb) than that of the other implementations considered in this research paper because of its complexity, larger number of iterations/operations and longer key length, which together provide the utmost security to the plaintext. The Blowfish algorithm was developed in Verilog HDL and implemented using ModelSim-Altera 6.3g_p1 (Quartus II 8.1) Web Edition and Xilinx ISE Design Suite 14.2. The tools were run on a Windows 7 Home Basic (64-bit) operating system with an Intel® Core(TM) i3-2350M processor at a 2.30 GHz clock rate, 4 GB of internal memory and a 500 GB hard disk.
The future scope of this research work is to decrease the delay, improve the frequency and achieve better throughput than the BFCDNBA implementation. This research work is also expected to extend its analysis to compare the area utilization of the crypto algorithms considered in this design.
