561 research outputs found

    Design and analysis of an FPGA-based, multi-processor HW-SW system for SCC applications

    Get PDF
    The last 30 years have seen an increase in the complexity of embedded systems from a collection of simple circuits to systems consisting of multiple processors managing a wide variety of devices. This ever increasing complexity frequently requires that high assurance, fail-safe and secure design techniques be applied to protect against possible failures and breaches. To facilitate the implementation of these embedded systems in an efficient way, the FPGA industry recently created new families of devices. New features added to these devices include anti-tamper monitoring, bit stream encryption, and optimized routing architectures for physical and functional logic partition isolation. These devices have high capacities and are capable of implementing processors using their reprogrammable logic structures. This allows for an unprecedented level of hardware and software interaction within a single FPGA chip. High assurance and fail-safe systems can now be implemented within the reconfigurable hardware fabric of an FPGA, enabling these systems to maintain flexibility and achieve high performance while providing a high level of data security. The objective of this thesis was to design and analyze an FPGA-based system containing two isolated, softcore Nios processors that share data through two crypto-engines. FPGA-based single-chip cryptographic (SCC) techniques were employed to ensure proper component isolation when the design is placed on a device supporting the appropriate security primitives. Each crypto-engine is an implementation of the Advanced Encryption Standard (AES), operating in Galois/Counter Mode (GCM) for both encryption and authentication. The features of the microprocessors and architectures of the AES crypto-engines were varied with the goal of determining combinations which best target high performance, minimal hardware usage, or a combination of the two

    Area and Energy Optimizations in ASIC Implementations of AES and PRESENT Block Ciphers

    Get PDF
    When small, modern-day devices surface with neoteric features and promise benefits like streamlined business processes, cashierless stores, and autonomous driving, they are all too often accompanied by security risks due to a weak or absent security component. In particular, the lack of data privacy protection is a common concern that can be remedied by implementing encryption. This ensures that data remains undisclosed to unauthorized parties. While having a cryptographic module is often a goal, it is sometimes forfeited because a device's resources do not allow for the conventional cryptographic solutions. Thus, smaller, lower-energy security modules are in demand. Implementing a cipher in hardware as an application-specific integrated circuit (ASIC) will usually achieve better efficiency than alternatives like FPGAs or software, and can help towards goals such as extended battery life and smaller area footprint. The Advanced Encryption Standard (AES) is a block cipher established by the National Institute of Standards and Technology (NIST) in 2001. It has since become the most widely adopted block cipher and is applied in a variety of applications ranging from smartphones to passive RFID tags to high performance microprocessors. PRESENT, published in 2007, is a smaller lightweight block cipher designed for low-power applications. In this study, low-area and low-energy optimizations in ASICs are addressed for AES and PRESENT. In the low-area work, three existing AES encryption cores are implemented, analyzed, and benchmarked using a common fabrication technology (STM 65 nm). The analysis includes an examination of various implementations of internal AES operations and their suitability for different architectural choices. Using our taxonomy of design choices, we designed Quark-AES, a novel 8-bit AES architecture. At 1960 GE, it features a 13% improvement in area and 9% improvement in throughput/areaĀ² over the prior smallest design. To illustrate the extent of the variations due to the use of different ASIC libraries, Quark-AES and the three analyzed designs are also synthesized using three additional technologies. Even for the same transistor size, different ASIC libraries produce significantly different area results. To accommodate a variety of applications that seek different levels of tradeoffs in area and throughput, we extend all four designs to 16-bit and 32-bit datawidths. In the low-energy work, round unrolling and glitch filtering are applied together to achieve energy savings. Round unrolling, which applies multiple block cipher rounds in a combinational path, reduces the energy due to registers but increases the glitching energy. Glitch filtering complements round unrolling by reducing the amount of glitches and their associated energy consumption. For unrolled designs of PRESENT and AES, two glitch filtering schemes are assessed. One method uses AND-gates in between combinational rounds while the other used latches. Both methods work by allowing the propagation of signals only after they have stabilized. The experiments assess how energy consumption changes with respect to the degree of unrolling, the glitch filtering scheme, the degree of pipelining, the spacing between glitch filters, and the location of glitch filters when only a limited number of them can be applied due to area constraints. While in PRESENT, the optimal configuration depends on all the variables, in a larger cipher such as AES, the latch-based method consistently offers the most energy savings

    Performance Evaluation, Comparison and Improvement of the Hardware Implementations of the Advanced Encryption Standard S-box

    Get PDF
    The Advanced Encryption Standard (AES) is the most popular algorithm used in symmetric key cryptography. The eļ¬ƒcient computation of AES is essential for many computing platforms. The S-box is the only nonlinear transformation step of the AES algorithm. Eļ¬ƒcient implementation of the AES S-box is very crucial for AES hardware. The AES S-box could be implemented by using look-up table method or by using ļ¬nite ļ¬eld arithmetic. The ļ¬nite ļ¬eld arithmetic design approach to implement the AES S-box is superior in saving the hardware resources. The main objective of this thesis is to evaluate, compare and improve the hardware implementations of the forward, inverse and combined AES S-box in terms of area and/or delay. Both the composite ļ¬eld GF((2^4)^2) and the tower ļ¬eld GF(((2^2)^2)^2) are considered. Our ļ¬rst improvement is the optimization of the input and output linear mappings of the S- box in order to design a more compact circuit. Our second improvement aims at modifying the architecture of the S-box to achieve a higher speed. We used multiplication of the S-box input by an 8-bit binary ļ¬eld element to optimize the input and output transformation matrices of the S-box. A MatlabĀ® search is then conducted to ļ¬nd more compact linear mappings for the S-box. We also modified the fast S-box architecture, in addition to optimizing and searching the extended linear input mappings to improve the speed of Reyhani et al. fast S-box. The improved fast S-box, Fast 3, is the fastest and most eļ¬ƒcient (measured by area Ɨ delay) AES S-box available in the literature, up to our knowledge. We also improved the area and delay of the inversion circuit of the lightweight and fast S-boxes in [69], by slightly modifying the exponentiation block and designing a new subļ¬eld inverter block. The improved inversion circuit leads to a more compact and a faster lightweight S-box and it yields a lower area fast S-box. Moreover, we show that the ā€œtech. XORsā€ concept proposed by Maximov et al. [54] to estimate the delay of the S-box is not accurate. We show how to use the logical eļ¬€ort method [74] instead to estimate and compare the delay of previous and improved S-boxes, regardless of the CMOS technology library used for the implementation. We veriļ¬ed all the codes at the RTL level using Mentor Graphics ModelsimĀ®, by comparing against the legitimate S-box outputs. We synthesized the designs using STM 65nm CMOS standard cell library and we used VHDL coding as the design entry method to Synopsys Design CompilerĀ®. The synthesis results conļ¬rm the lower area and / or delay of the improved S-box designs and match our space and timing analyses

    An 8-Bit ROM-Free AES Design for Low-Cost Applications

    Get PDF
    We have presented a memory-less design of the advanced encryption standard (AES) with 8-bit data path for applications of wireless communications. The design uses the minimal 160 clock cycles to process a 128-bit data block. For achieving the requirements of low area cost and high performance, new design methods are used to optimize the MixColumns (MC) and Inverse MixColumns (IMC) and ShiftRows (SR) and Inverse ShiftRows (ISR) transformations. Our methods can efficiently reduce the required clock cycles, critical path delays, and area costs of these transformations compared with previous designs. In chip realization, our design with both encryption and decryption abilities has a 29% area increase but achieves 4.85 times improvement in throughput/area compared with the best 8-bit AES design reported before. For encryption only, our AES occupies 3.5ā€‰k gates with the critical delay of 12.5ā€‰ns and achieves a throughput of 64ā€‰Mbps which is the best design compared with previous encryption-only designs

    Studies on high-speed hardware implementation of cryptographic algorithms

    Get PDF
    Cryptographic algorithms are ubiquitous in modern communication systems where they have a central role in ensuring information security. This thesis studies efficient implementation of certain widely-used cryptographic algorithms. Cryptographic algorithms are computationally demanding and software-based implementations are often too slow or power consuming which yields a need for hardware implementation. Field Programmable Gate Arrays (FPGAs) are programmable logic devices which have proven to be highly feasible implementation platforms for cryptographic algorithms because they provide both speed and programmability. Hence, the use of FPGAs for cryptography has been intensively studied in the research community and FPGAs are also the primary implementation platforms in this thesis. This thesis presents techniques allowing faster implementations than existing ones. Such techniques are necessary in order to use high-security cryptographic algorithms in applications requiring high data rates, for example, in heavily loaded network servers. The focus is on Advanced Encryption Standard (AES), the most commonly used secret-key cryptographic algorithm, and Elliptic Curve Cryptography (ECC), public-key cryptographic algorithms which have gained popularity in the recent years and are replacing traditional public-key cryptosystems, such as RSA. Because these algorithms are well-defined and widely-used, the results of this thesis can be directly applied in practice. The contributions of this thesis include improvements to both algorithms and techniques for implementing them. Algorithms are modified in order to make them more suitable for hardware implementation, especially, focusing on increasing parallelism. Several FPGA implementations exploiting these modifications are presented in the thesis including some of the fastest implementations available in the literature. The most important contributions of this thesis relate to ECC and, specifically, to a family of elliptic curves providing faster computations called Koblitz curves. The results of this thesis can, in their part, enable increasing use of cryptographic algorithms in various practical applications where high computation speed is an issue

    Fault-Resilient Lightweight Cryptographic Block Ciphers for Secure Embedded Systems

    Get PDF
    The development of extremely-constrained environments having sensitive nodes such as RFID tags and nano-sensors necessitates the use of lightweight block ciphers. Indeed, lightweight block ciphers are essential for providing low-cost confidentiality to such applications. Nevertheless, providing the required security properties does not guarantee their reliability and hardware assurance when the architectures are prone to natural and malicious faults. In this thesis, considering false-alarm resistivity, error detection schemes for the lightweight block ciphers are proposed with the case study of XTEA (eXtended TEA). We note that lightweight block ciphers might be better suited for low-resource environments compared to the Advanced Encryption Standard, providing low complexity and power consumption. To the best of the author\u27s knowledge, there has been no error detection scheme presented in the literature for the XTEA to date. Three different error detection approaches are presented and according to our fault-injection simulations for benchmarking the effectiveness of the proposed schemes, high error coverage is derived. Finally, field-programmable gate array (FPGA) implementations of these proposed error detection structures are presented to assess their efficiency and overhead. The proposed error detection architectures are capable of increasing the reliability of the implementations of this lightweight block cipher. The schemes presented can also be applied to lightweight hash functions with similar structures, making the presented schemes suitable for providing reliability to their lightweight security-constrained hardware implementations

    A High Performance Advanced Encryption Standard (AES) Encrypted On-Chip Bus Architecture for Internet-of-Things (IoT) System-on-Chips (SoC)

    Get PDF
    With industry expectations of billions of Internet-connected things, commonly referred to as the IoT, we see a growing demand for high-performance on-chip bus architectures with the following attributes: small scale, low energy, high security, and highly configurable structures for integration, verification, and performance estimation. Our research thus mainly focuses on addressing these key problems and finding the balance among all these requirements that often work against each other. First of all, we proposed a low-cost and low-power System-on-Chips (SoCs) architecture (IBUS) that can frame data transfers differently. The IBUS protocol provides two novel transfer modes ā€“ the block and state modes, and is also backward compatible with the conventional linear mode. In order to evaluate the bus performance automatically and accurately, we also proposed an evaluation methodology based on the standard circuit design flow. Experimental results show that the IBUS based design uses the least hardware resource and reduces energy consumption to a half of an AMBA Advanced High-Performance Bus (AHB) and Advanced eXensible Interface (AXI). Additionally, the valid bandwidth of the IBUS based design is 2.3 and 1.6 times, respectively, compared with the AHB and AXI based implementations. As IoT advances, privacy and security issues become top tier concerns in addition to the high performance requirement of embedded chips. To leverage limited resources for tiny size chips and overhead cost for complex security mechanisms, we further proposed an advanced IBUS architecture to provide a structural support for the block-based AES algorithm. Our results show that the IBUS based AES-encrypted design costs less in terms of hardware resource and dynamic energy (60.2%), and achieves higher throughput (x1.6) compared with AXI. Effectively dealing with the automation in design and verification for mixed-signal integrated circuits is a critical problem, particularly when the bus architecture is new. Therefore, we further proposed a configurable and synthesizable IBUS design methodology. The flexible structure, together with bus wrappers, direct memory access (DMA), AES engine, memory controller, several mixed-signal verification intellectual properties (VIPs), and bus performance models (BPMs), forms the basic for integrated circuit design, allowing engineers to integrate application-specific modules and other peripherals to create complex SoCs

    Cryptography for Ultra-Low Power Devices

    Get PDF
    Ubiquitous computing describes the notion that computing devices will be everywhere: clothing, walls and floors of buildings, cars, forests, deserts, etc. Ubiquitous computing is becoming a reality: RFIDs are currently being introduced into the supply chain. Wireless distributed sensor networks (WSN) are already being used to monitor wildlife and to track military targets. Many more applications are being envisioned. For most of these applications some level of security is of utmost importance. Common to WSN and RFIDs are their severely limited power resources, which classify them as ultra-low power devices. Early sensor nodes used simple 8-bit microprocessors to implement basic communication, sensing and computing services. Security was an afterthought. The main power consumer is the RF-transceiver, or radio for short. In the past years specialized hardware for low-data rate and low-power radios has been developed. The new bottleneck are security services which employ computationally intensive cryptographic operations. Customized hardware implementations hold the promise of enabling security for severely power constrained devices. Most research groups are concerned with developing secure wireless communication protocols, others with designing efficient software implementations of cryptographic algorithms. There has not been a comprehensive study on hardware implementations of cryptographic algorithms tailored for ultra-low power applications. The goal of this dissertation is to develop a suite of cryptographic functions for authentication, encryption and integrity that is specifically fashioned to the needs of ultra-low power devices. This dissertation gives an introduction to the specific problems that security engineers face when they try to solve the seemingly contradictory challenge of providing lightweight cryptographic services that can perform on ultra-low power devices and shows an overview of our current work and its future direction
    • ā€¦
    corecore