16 research outputs found

    Optimization of Advanced Encryption Standard (AES) Using Vivado High Level Synthesis (HLS)

    Get PDF
    Advanced Encryption Standard (AES) represents a fundamental building module of many network security protocols to ensure data confidentiality in various applications ranging from data servers to low-power hardware embedded systems. In order to optimize such hardware implementations, High-Level Synthesis (HLS) provides exibility in designing and rapid optimization of dedicated hardware to meet the design constraints. In this paper, we present the implementation of AES encryption processor on FPGA using Xilinx Vivado HLS. The AES architecture was analyzed and designed by loop unrolling, and inner-round and outer-round pipelining techniques to achieve a maximum throughput of the AES algorithm up to 1290 Mbps (Mega bit per second) with very significant low resources of 3.24% slices of the FPGA, achieving 3 Mbps per slice area

    Parallel Multiplier Designs for the Galois/Counter Mode of Operation

    Get PDF
    The Galois/Counter Mode of Operation (GCM), recently standardized by NIST, simultaneously authenticates and encrypts data at speeds not previously possible for both software and hardware implementations. In GCM, data integrity is achieved by chaining Galois field multiplication operations while a symmetric key block cipher such as the Advanced Encryption Standard (AES), is used to meet goals of confidentiality. Area optimization in a number of proposed high throughput GCM designs have been approached through implementing efficient composite Sboxes for AES. Not as much work has been done in reducing area requirements of the Galois multiplication operation in the GCM which consists of up to 30% of the overall area using a bruteforce approach. Current pipelined implementations of GCM also have large key change latencies which potentially reduce the average throughput expected under traditional internet traffic conditions. This thesis aims to address these issues by presenting area efficient parallel multiplier designs for the GCM and provide an approach for achieving low latency key changes. The widely known Karatsuba parallel multiplier (KA) and the recently proposed Fan-Hasan multiplier (FH) were designed for the GCM and implemented on ASIC and FPGA architectures. This is the first time these multipliers have been compared with a practical implementation, and the FH multiplier showed note worthy improvements over the KA multiplier in terms of delay with similar area requirements. Using the composite Sbox, ASIC designs of GCM implemented with subquadratic multipliers are shown to have an area savings of up to 18%, without affecting the throughput, against designs using the brute force Mastrovito multiplier. For low delay LUT Sbox designs in GCM, although the subquadratic multipliers are a part of the critical path, implementations with the FH multiplier showed the highest efficiency in terms of area resources and throughput over all other designs. FPGA results similarly showed a significant reduction in the number of slices using subquadratic multipliers, and the highest throughput to date for FPGA implementations of GCM was also achieved. The proposed reduced latency key change design, which supports all key types of AES, showed a 20% improvement in average throughput over other GCM designs that do not use the same techniques. The GCM implementations provided in this thesis provide some of the most area efficient, yet high throughput designs to date

    A High Throughput/Gate AES Hardware Architecture by Compressing Encryption and Decryption Datapaths --- Toward Efficient CBC-Mode Implementation

    Get PDF
    This paper proposes a highly efficient AES hardware architecture that supports both encryption and decryption for the CBC mode. Some conventional AES architectures employ pipelining techniques to enhance the throughput and efficiency. However, such pipelined architectures are frequently unfit because many practical cryptographic applications work in the CBC mode, where block-wise parallelism is not available for encryption. In this paper, we present an efficient AES encryption/decryption hardware design suitable for such block-chaining modes. In particular, new operation-reordering and register-retiming techniques allow us to unify the inversion circuits for encryption and decryption (i.e., SubBytes and InvSubBytes) without any delay overhead. A new unification technique for linear mappings further reduces both the area and critical delay in total. Our design employs a common loop architecture and can therefore efficiently perform even in the CBC mode. We also present a shared key scheduling datapath that can work on-the-fly in the proposed architecture. To the best of our knowledge, the proposed architecture has the shortest critical path delay and the most efficient in terms of throughput per area among conventional AES encryption/decryption architectures with tower-field S-boxes. We evaluate the performance of the proposed and some conventional datapaths by logic synthesis results with the TSMC 65-nm standard-cell library and NanGate 45- and 15-nm open-cell libraries. As a result, we confirm that our proposed architecture achieves approximately 53--72% higher efficiency (i.e., a higher bps/GE) than any other conventional counterpart

    VLSI ENHANCEMENT OF AREA OPTIMIZED FLEXIBLE ARCHITECTURE SUPPORTING SYMMETRIC CRYPTOGRAPHY

    Get PDF
    Data encryption (cryptography) is utilized in various applications and environments. The specific utilization of encryption and the implementation of the AES will be based on many factors particular to the computer system and its associated components. Communication security provides protection to data by enciphering it at the transmitting point and deciphering it at the receiving point. File security provides protection to data by enciphering it when it is recorded on a storage medium and deciphering it when it is read back from the storage medium. In the proposed design the security method uses symmetric Cryptography technique which provides same keys to sender and receiver to transfer the information for reducing the design complexity. The transferred information is stored in the memory unit for further using it or for further processing using low cost buffer element. Transmission cables are used to provide communication between the connected devices. Control logic is used to provide the patters for transmission through crypto unit

    The Algorithm of AAES

    Get PDF
    The Advanced Encryption Standard (AES) was specified in 2001 by the National Institute of Standards and Technology. This paper expand the method and make it possible to realize a new AES-like algorithm that has 256 bits fixed block size, which is named AAES algorithm. And we use Verilog to simulate the arithmetic and use Lattice Diamond to simulate the hardware property and action. We get the conclusion that the algorithm can be easily used on indestury and it is more robustness and safety than AES. And they are on the same order of magnitude in hardware implementation

    Design and analysis of an FPGA-based, multi-processor HW-SW system for SCC applications

    Get PDF
    The last 30 years have seen an increase in the complexity of embedded systems from a collection of simple circuits to systems consisting of multiple processors managing a wide variety of devices. This ever increasing complexity frequently requires that high assurance, fail-safe and secure design techniques be applied to protect against possible failures and breaches. To facilitate the implementation of these embedded systems in an efficient way, the FPGA industry recently created new families of devices. New features added to these devices include anti-tamper monitoring, bit stream encryption, and optimized routing architectures for physical and functional logic partition isolation. These devices have high capacities and are capable of implementing processors using their reprogrammable logic structures. This allows for an unprecedented level of hardware and software interaction within a single FPGA chip. High assurance and fail-safe systems can now be implemented within the reconfigurable hardware fabric of an FPGA, enabling these systems to maintain flexibility and achieve high performance while providing a high level of data security. The objective of this thesis was to design and analyze an FPGA-based system containing two isolated, softcore Nios processors that share data through two crypto-engines. FPGA-based single-chip cryptographic (SCC) techniques were employed to ensure proper component isolation when the design is placed on a device supporting the appropriate security primitives. Each crypto-engine is an implementation of the Advanced Encryption Standard (AES), operating in Galois/Counter Mode (GCM) for both encryption and authentication. The features of the microprocessors and architectures of the AES crypto-engines were varied with the goal of determining combinations which best target high performance, minimal hardware usage, or a combination of the two
    corecore