Abstract-The demand for data security continues to grow, and cryptographic algorithms play an important role in protecting data from unauthorized use. In this paper, we present a crypto processor based on the Advanced Encryption Standard (AES). The AES module is integrated with a 32-bit general-purpose, 5-stage pipelined MIPS processor. The integrated AES module is fully pipelined and follows an inner-round and outer-round pipeline design. The results show that the presented pipelined version of the AES algorithm, together with the MIPS processor, outperforms previously reported implementations in throughput and power consumption. The proposed design achieves a throughput of 58 Gbps at the maximum operating frequency of 553 MHz, a minimum latency of 240 ns, and a minimum power consumption of 76 mW (observed at 50 MHz).
I. INTRODUCTION
The common goal of cryptographic algorithms is to provide security. Encryption is a transformation that changes one form of data, called plain text, into an unreadable form, called cipher text. For many years, the Data Encryption Standard (DES) was used as a cryptographic algorithm [1] . Because of its short key length, DES was replaced by the Rijndael algorithm, which became the standard in the cryptography domain known as the Advanced Encryption Standard (AES) [2] . A hardware implementation of a cryptographic algorithm, together with its keys, is by nature very secure and cannot easily be modified from outside [3] . Since the Rijndael algorithm became a standard encryption algorithm, many hardware implementations based on FPGA and ASIC have been proposed [3] [4] [5] [6] [7] [8] [9] . An ASIC provides a low-power design, but its design time and time to market are very long, and it lacks the flexibility to change design parameters. An FPGA is a well-suited platform for the hardware implementation of cryptographic algorithms because it is re-programmable and reconfigurable.
This paper proposes an approach that combines a general-purpose processor with a crypto co-processor. In this work, the general-purpose processor is a MIPS-32 [10] with five pipeline stages, and AES-128 serves as the crypto algorithm of the crypto processor. We call the resulting system, the integration of the MIPS and AES crypto processor, MAC. MAC can run at different frequencies, which lets the end user adjust the system to the required throughput, latency, and power consumption.
Our design is implemented so that crypto instructions do not block the instruction fetch cycle of the processor, even while the crypto co-processor is running. By default, each instruction is fetched from the instruction memory unit and, if it is a MIPS instruction, completes all of its cycles on the MIPS processor. If the fetched instruction is not a MIPS instruction, it is sent to the crypto co-processor in the clock cycle following the decode stage. The crypto co-processor is thus driven by the MIPS processor without disturbing its pipeline stages. The main contributions of this paper are as follows:
• A pipelined version of AES is implemented, obtaining high throughput, low latency, and low power consumption.
• An integration of AES and MIPS is presented that can run at different frequencies.
• The implemented AES acts as a crypto processor controlled by MIPS instructions without disturbing the pipeline stages of the MIPS processor.
This paper is organized as follows: Section II presents the FPGA implementation of the AES pipelined architecture. Section III describes the MIPS-based AES crypto processor (MAC). Section IV discusses the experimental results, and Section V concludes the paper.
II. FPGA IMPLEMENTATION OF AES PIPELINED ARCHITECTURE
The efficient implementation of the AES algorithm on FPGA has been under discussion for several years in terms of throughput, minimum area requirement, and high speed [11] [12] [13] [14] [15] [16] . The main reason to choose an FPGA for implementing cryptographic algorithms is that the design can be changed at no additional time cost, and the design cycle is very short. FPGA-based AES implementations are presented in [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] . Fig. 1 shows the implementation technique of pipelined AES, in which several procedures run concurrently. Pipelining is applied to AES to obtain high throughput: speed increases because multiple AES rounds are handled concurrently. However, pipelining tends to consume a large area [15] [16] [17] [18] [19] [20] [21] .
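The effect of pipelining on latency and throughput can be illustrated with a small idealized cycle model. The following Python sketch is not the authors' design: the class name `PipelineModel` and the parameter `stages_per_round` (the inner-round pipeline depth, which the paper does not state) are assumptions introduced for illustration, and the model assumes one block is accepted per clock cycle with no stalls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineModel:
    """Idealized cycle model of a fully pipelined AES-128 core (illustrative)."""

    rounds: int = 10           # AES-128 uses 10 rounds
    stages_per_round: int = 1  # ASSUMPTION: inner-round depth, not given in the paper
    block_bits: int = 128      # AES block size in bits

    @property
    def depth(self) -> int:
        # Outer-round pipelining unrolls the rounds; inner-round pipelining
        # splits each round into stages_per_round register stages.
        return self.rounds * self.stages_per_round

    def completion_cycle(self, block_index: int) -> int:
        # Block k (0-based) enters at cycle k and leaves `depth` cycles later.
        return block_index + self.depth

    def total_cycles(self, n_blocks: int) -> int:
        # Cycle at which the last of n_blocks leaves the pipeline.
        return self.depth + n_blocks - 1 if n_blocks > 0 else 0

    def ideal_throughput_bps(self, clock_hz: float) -> float:
        # Upper bound: one block completes per clock once the pipeline is full.
        return self.block_bits * clock_hz

    def latency_ns(self, clock_hz: float) -> float:
        # Time for a single block to traverse all pipeline stages.
        return self.depth / clock_hz * 1e9


if __name__ == "__main__":
    model = PipelineModel(stages_per_round=3)  # hypothetical depth of 30 stages
    print(model.depth, model.total_cycles(1000))
    print(model.ideal_throughput_bps(100e6), model.latency_ns(100e6))
```

In this model a deeper pipeline lengthens the latency of a single block but leaves the steady-state throughput bound unchanged, while the additional registers account for the area cost noted above.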
III. MIPS BASED AES CRYPTO (MAC)
In our proposed design, shown in Fig. 2 , we integrate a crypto co-processor based on the AES algorithm with the MIPS processor so that AES is executed on the crypto co-processor. We call this method MAC (MIPS-AES Crypto). A hardware implementation on FPGA provides significant performance gains over a software implementation on a general-purpose processor (microprocessor) in terms of parallel processing, pipelining, word size, and speed [22] . Throughput of up to several Gbps can be achieved with an FPGA, and crypto algorithm agility is also possible [9] .
One of the contributions of this paper is to extend the general-purpose processor with the AES crypto co-processor. The MIPS pipeline has five stages, named IF, ID, EX, MEM, and WB. If the instruction fetched from the instruction memory is a MIPS instruction, it runs on the MIPS processor and completes all of its cycles in these pipeline stages. If the fetched instruction is a crypto instruction, it is decoded during the decode stage and forwarded from there to the pipeline stages of the crypto co-processor, as shown in Fig. 2 , without affecting the pipeline stages of the MIPS processor. The proposed design can also run at different frequencies to achieve different throughputs and power consumptions.
IV. EXPERIMENTAL RESULTS
The design was implemented on a Virtex-6 ML605 using Xilinx ISE 14.4. Power was measured with the Xilinx XPower tool.
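The decode-stage routing described in Section III can be sketched in a few lines of Python. This is a behavioural illustration only, not the authors' hardware description. The paper does not give the encoding of its crypto instructions, so the sketch assumes they use the MIPS32 COP2 major opcode (0b010010); the names `CRYPTO_OPCODE`, `dispatch`, and `route_program` and the example crypto word 0x48000000 are likewise hypothetical.

```python
from enum import Enum

# ASSUMPTION: crypto instructions carry the MIPS32 COP2 major opcode.
# The actual encoding used by MAC is not specified in the paper.
CRYPTO_OPCODE = 0b010010


class Target(Enum):
    MIPS_PIPELINE = "mips"
    CRYPTO_COPROCESSOR = "crypto"


def opcode_of(instruction: int) -> int:
    """Return the 6-bit major opcode held in bits 31..26 of a 32-bit word."""
    return (instruction >> 26) & 0x3F


def dispatch(instruction: int) -> Target:
    """Decode-stage routing: crypto instructions leave the main pipeline."""
    if opcode_of(instruction) == CRYPTO_OPCODE:
        return Target.CRYPTO_COPROCESSOR
    return Target.MIPS_PIPELINE


def route_program(program):
    """Split a program into the instruction streams seen by each unit."""
    mips, crypto = [], []
    for word in program:
        if dispatch(word) is Target.CRYPTO_COPROCESSOR:
            crypto.append(word)
        else:
            mips.append(word)
    return mips, crypto


if __name__ == "__main__":
    program = [
        0x012A4020,  # add $t0, $t1, $t2  (ordinary MIPS instruction)
        0x48000000,  # hypothetical crypto instruction (COP2 opcode)
        0x8FA80000,  # lw $t0, 0($sp)     (ordinary MIPS instruction)
    ]
    mips, crypto = route_program(program)
    print(len(mips), len(crypto))
```

Because the routing decision depends only on the opcode field, it can be taken in the decode stage while the fetch stage continues with the next instruction, which is the non-blocking behaviour the design aims for.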
Table I shows hardware and timing statistics. MAC runs at different frequencies, and the throughput measured at each frequency is illustrated in Fig. 3 . The best throughput is 58 Gbps, obtained at the maximum frequency of 553 MHz. The minimum latency of the design is 240 ns, as shown in Fig. 4 . As depicted in Fig. 5 , the minimum power consumption is 76 mW, observed at 50 MHz, and the maximum power consumption is 813 mW, observed at 553 MHz. The hardware resource utilization for MIPS and AES is shown in Fig. 6 . According to this figure, the occupied area for AES and MIPS is 662 and 1885, respectively. We have compared the proposed approach with other implementations reported in the literature in terms of power consumption and throughput. Results are shown in Fig. 7 and Fig. 8 . [...] watt at the frequencies of 142.8 and 125.3 MHz, respectively. The power consumption of MAC is 0.813 watt at the maximum operating frequency of 553 MHz. The maximum throughput of MAC is 58 Gbps, which is higher than the throughput reported for each of the approaches in [3] and [21] . Similarly, the power consumption of MAC is lower than that of the designs presented in [3] and [21] . Table II summarizes all the comparisons and results.
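As a rough cross-check of the figures above, the energy per encrypted bit and the effective number of bits produced per clock cycle at the maximum frequency can be derived from the reported values. The Python sketch below uses only numbers stated in the text (553 MHz, 58 Gbps, 0.813 W). The function names are illustrative, and since the paper does not state how throughput was measured, the results should be read as derived ratios rather than measured quantities.

```python
# Figures reported in the text for the maximum operating frequency.
CLOCK_HZ = 553e6        # 553 MHz
THROUGHPUT_BPS = 58e9   # 58 Gbps
POWER_W = 0.813         # 813 mW


def energy_per_bit_pj(power_w: float, throughput_bps: float) -> float:
    """Energy spent per processed bit, in picojoules (power / throughput)."""
    return power_w / throughput_bps * 1e12


def bits_per_cycle(throughput_bps: float, clock_hz: float) -> float:
    """Average number of bits delivered per clock cycle."""
    return throughput_bps / clock_hz


if __name__ == "__main__":
    print(f"{energy_per_bit_pj(POWER_W, THROUGHPUT_BPS):.2f} pJ/bit")  # 14.02 pJ/bit
    print(f"{bits_per_cycle(THROUGHPUT_BPS, CLOCK_HZ):.1f} bits/cycle")  # 104.9 bits/cycle
```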
V. CONCLUSION
Encryption algorithms have been used by military and government organizations over the last couple of decades for secure communication. The main purpose of encryption is to protect data from unauthorized use. In this paper, we proposed a method for integrating a crypto processor with a general-purpose processor. To this end, we presented a pipelined version of the AES algorithm for encrypting data. The high performance and high configurability of the combined general-purpose processor and crypto processor make it suitable for various security applications. The proposed design, MAC, can run at different frequencies, which gives the user the flexibility to adjust the frequency to meet throughput, latency, and power consumption requirements.
