405 research outputs found

    Efficient hardware implementations of high throughput SHA-3 candidates keccak, luffa and blue midnight wish for single- and multi-message hashing

    Get PDF
    In November 2007 NIST announced that it would organize the SHA-3 competition to select a new cryptographic hash function family by 2012. In the selection process, hardware performances of the candidates will play an important role. Our analysis of previously proposed hardware implementations shows that three SHA-3 candidate algorithms can provide superior performance in hardware: Keccak, Luffa and Blue Midnight Wish (BMW). In this paper, we provide efficient and fast hardware implementations of these three algorithms. Considering both single- and multi-message hashing applications with an emphasis on both speed and efficiency, our work presents more comprehensive analysis of their hardware performances by providing different performance figures for different target devices. To our best knowledge, this is the first work that provides a comparative analysis of SHA-3 candidates in multi-message applications. We discover that BMW algorithm can provide much higher throughput than previously reported if used in multi-message hashing. We also show that better utilization of resources can increase speed via different configurations. We implement our designs using Verilog HDL, and map to both ASIC and FPGA devices (Spartan3, Virtex2, and Virtex 4) to give a better comparison with those in the literature. We report total area, maximum frequency, maximum throughput and throughput/area of the designs for all target devices. Given that the selection process for SHA3 is still open; our results will be instrumental to evaluate the hardware performance of the candidates

    High-performance FPGA implementation of the secure hash algorithm 3 for single and multi-message processing

    Get PDF
    The secure hash function has become the default choice for information security, especially in applications that require data storing or manipulation. Consequently, optimized implementations of these functions in terms of Throughput or Area are in high demand. In this work we propose a new conception of the secure hash algorithm 3 (SHA-3), which aim to increase the performance of this function by using pipelining, four types of pipelining are proposed two, three, four, and six pipelining stages. This approach allows us to design data paths of SHA-3 with higher Throughput and higher clock frequencies. The design reaches a maximum Throughput of 102.98 Gbps on Virtex 5 and 115.124 Gbps on Virtex 6 in the case of the 6 stages, for 512 bits output length. Although the utilization of the resource increase with the increase of the number of the cores used in each one of the cases. The proposed designs are coded in very high-speed integrated circuits program (VHSIC) hardware description language (VHDL) and implemented in Xilinx Virtex-5 and Virtex-6 A field-programmable gate array (FPGA) devices and compared to existing FPGA implementations

    Hardware Architectures for Post-Quantum Cryptography

    Get PDF
    The rapid development of quantum computers poses severe threats to many commonly-used cryptographic algorithms that are embedded in different hardware devices to ensure the security and privacy of data and communication. Seeking for new solutions that are potentially resistant against attacks from quantum computers, a new research field called Post-Quantum Cryptography (PQC) has emerged, that is, cryptosystems deployed in classical computers conjectured to be secure against attacks utilizing large-scale quantum computers. In order to secure data during storage or communication, and many other applications in the future, this dissertation focuses on the design, implementation, and evaluation of efficient PQC schemes in hardware. Four PQC algorithms, each from a different family, are studied in this dissertation. The first hardware architecture presented in this dissertation is focused on the code-based scheme Classic McEliece. The research presented in this dissertation is the first that builds the hardware architecture for the Classic McEliece cryptosystem. This research successfully demonstrated that complex code-based PQC algorithm can be run efficiently on hardware. Furthermore, this dissertation shows that implementation of this scheme on hardware can be easily tuned to different configurations by implementing support for flexible choices of security parameters as well as configurable hardware performance parameters. The successful prototype of the Classic McEliece scheme on hardware increased confidence in this scheme, and helped Classic McEliece to get recognized as one of seven finalists in the third round of the NIST PQC standardization process. While Classic McEliece serves as a ready-to-use candidate for many high-end applications, PQC solutions are also needed for low-end embedded devices. Embedded devices play an important role in our daily life. Despite their typically constrained resources, these devices require strong security measures to protect them against cyber attacks. Towards securing this type of devices, the second research presented in this dissertation focuses on the hash-based digital signature scheme XMSS. This research is the first that explores and presents practical hardware based XMSS solution for low-end embedded devices. In the design of XMSS hardware, a heterogenous software-hardware co-design approach was adopted, which combined the flexibility of the soft core with the acceleration from the hard core. The practicability and efficiency of the XMSS software-hardware co-design is further demonstrated by providing a hardware prototype on an open-source RISC-V based System-on-a-Chip (SoC) platform. The third research direction covered in this dissertation focuses on lattice-based cryptography, which represents one of the most promising and popular alternatives to today\u27s widely adopted public key solutions. Prior research has presented hardware designs targeting the computing blocks that are necessary for the implementation of lattice-based systems. However, a recurrent issue in most existing designs is that these hardware designs are not fully scalable or parameterized, hence limited to specific cryptographic primitives and security parameter sets. The research presented in this dissertation is the first that develops hardware accelerators that are designed to be fully parameterized to support different lattice-based schemes and parameters. Further, these accelerators are utilized to realize the first software-harware co-design of provably-secure instances of qTESLA, which is a lattice-based digital signature scheme. This dissertation demonstrates that even demanding, provably-secure schemes can be realized efficiently with proper use of software-hardware co-design. The final research presented in this dissertation is focused on the isogeny-based scheme SIKE, which recently made it to the final round of the PQC standardization process. This research shows that hardware accelerators can be designed to offload compute-intensive elliptic curve and isogeny computations to hardware in a versatile fashion. These hardware accelerators are designed to be fully parameterized to support different security parameter sets of SIKE as well as flexible hardware configurations targeting different user applications. This research is the first that presents versatile hardware accelerators for SIKE that can be mapped efficiently to both FPGA and ASIC platforms. Based on these accelerators, an efficient software-hardwareco-design is constructed for speeding up SIKE. In the end, this dissertation demonstrates that, despite being embedded with expensive arithmetic, the isogeny-based SIKE scheme can be run efficiently by exploiting specialized hardware. These four research directions combined demonstrate the practicability of building efficient hardware architectures for complex PQC algorithms. The exploration of efficient PQC solutions for different hardware platforms will eventually help migrate high-end servers and low-end embedded devices towards the post-quantum era

    Versatile FPGA architecture for skein hashing algorithm

    Get PDF
    Digital communications and data storage are expanding at fast rates, increasing the need for advanced cryptographic standards to validate and provide privacy for that data. One of the basic components commonly used in information security systems is cryptographic hashing. Cryptographic hashing involves the compression of an arbitrary block of data into a fixed-size string of bits known as the hash value. These functions are designed such that it is computationally infeasible to determine a message that results in a given hash value. It should also be infeasible to find two messages with the same hash value and to change a message without its hash value being changed. Some of the most common uses of these algorithms are digital signatures, message authentication codes, file identification, and data integrity. Due to developments in attacks on the Secure Hash Standard (SHS), which includes SHA-1 and SHA-2 (SHA-224, SHA-256, SHA-384, SHA-512), the National Institute of Standards and Technology (NIST) will be selecting a new hashing algorithm to replace the current standards. In 2008, 64 algorithms were entered into the NIST competition and in December 2010, five finalists were chosen. The final candidates are BLAKE, Keccak, Gr{o}stl, JH, and Skein. In 2012, one of these algorithms will be selected for the Secure Hash Algorithm 3 (SHA-3). This thesis focuses on the development of a versatile hardware architecture for Skein that provides both sequential and tree hashing functions of Skein. The performance optimizations rely heavily on pipelined and unrolled architectures to allow for simultaneous hashing of multiple unique messages and reduced area tree hashing implementations. Additional result of this thesis is a comprehensive overview of the newly developed architectures and an analysis of their performance in comparison with other software and hardware implementations

    Security of the SHA-3 candidates Keccak and Blue Midnight Wish: Zero-sum property

    Get PDF
    The SHA-3 competition for the new cryptographic standard was initiated by National Institute of Standards and Technology (NIST) in 2007. In the following years, the event grew to one of the top areas currently being researched by the CS and cryptographic communities. The first objective of this thesis is to overview, analyse, and critique the SHA-3 competition. The second one is to perform an in-depth study of the security of two candidate hash functions, the finalist Keccak and the second round candidate Blue Midnight Wish. The study shall primarily focus on zero-sum distinguishers. First we attempt to attack reduced versions of these hash functions and see if any vulnerabilities can be detected. This is followed by attacks on their full versions. In the process, a novel approach is utilized in the search of zero-sum distinguishers by employing SAT solvers. We conclude that while such complex attacks can theoretically uncover undesired properties of the two hash functions presented, such attacks are still far from being fully realized due to current limitations in computing power

    On the Exploitation of a High-throughput SHA-256 FPGA Design for HMAC

    Get PDF
    High-throughput and area-efficient designs of hash functions and corresponding mechanisms for Message Authentication Codes (MACs) are in high demand due to new security protocols that have arisen and call for security services in every transmitted data packet. For instance, IPv6 incorporates the IPSec protocol for secure data transmission. However, the IPSec's performance bottleneck is the HMAC mechanism which is responsible for authenticating the transmitted data. HMAC's performance bottleneck in its turn is the underlying hash function. In this article a high-throughput and small-size SHA-256 hash function FPGA design and the corresponding HMAC FPGA design is presented. Advanced optimization techniques have been deployed leading to a SHA-256 hashing core which performs more than 30% better, compared to the next better design. This improvement is achieved both in terms of throughput as well as in terms of throughput/area cost factor. It is the first reported SHA-256 hashing core that exceeds 11Gbps (after place and route in Xilinx Virtex 6 board)

    A comparative study of hash algorithms with the prospect of developing a CAN bus authentication technique

    Get PDF
    In this paper, the performances of SHA-3 final round candidates along with new versions of other hash algorithms are analyzed and compared. An ARM-Cortex A9 microcontroller and a Spartan -3 FPGA circuit are involved in the study, with emphasis placed on the number of cycles and the authentication speed. These hash functions are implemented and tested resulting in a set of ranked algorithms in terms of the specified metrics. Taking into account the performances of the most efficient algorithms and the proposed hardware platform components, an authentication technique can be developed as a possible solution to the limitations and weaknesses of automotive CAN (Controlled Area Network) bus – based embedded systems in terms of security, privacy and integrity. From there, the main elements of such a potential structure are set forth

    Comparison of power consumption in pipelined implementations of the BLAKE3 cipher in FPGA devices

    Get PDF
    This article analyzes the dynamic power losses generated by various hardware implementations of the BLAKE3 hash function. Estimations of the parameters were based on the results of post-route simulations of designs implemented in Xilinx Spartan-7 FPGAs. The algorithm was tested in various hardware organizations: based on a standard iterative architecture with one round instance in the programmable array, various derived versions with pipeline processing were elaborated, which ultimately led to a set of 6 architectural variants of the cipher, from the iterative case (without pipeline) to one with maximum of 6 pipeline stages. Moreover, the results obtained for the iterative architecture were compared with analogous implementations of the BLAKE2 (direct predecessor) and Keccak (the foundation of the current SHA-3 standard) algorithms. This case study illustrates the differences (or lack thereof) in the power requirements of these three hash functions when they are implemented on an FPGA platform, and illustrate the significant savings that can be achieved by introducing pipeline to the processing of the BLAKE round

    FPGA design and performance analysis of SHA-512, whirlpool and PHASH hashing functions

    Get PDF
    Hashing functions play a fundamental role in modern cryptography. Such functions process data of finite length and produce a small fixed size output referred to as a digest or hash. Typical applications of these functions include data integrity verification and message authentication schemes. With the ever increasing amounts of data that needs to be hashed, the throughput of hashing functions becomes an important factor. This work presents and compares high performance FPGA implementations of SHA- 512, Whirlpool and a recently proposed parallelizable hash function, PHASH. The novelty of PHASH is that it is able to process multiple data blocks at once, making it suitable for achieving ultra high-performance. It utilizes the W cipher, as described in Whirlpool, at its core. The SHA (SHA-0, SHA-1, SHA-2) family of functions is one of the first widely accepted standards for hashing. According to currently published literature, the fastest SHA-512 (a variant of SHA-2) implementation achieves a throughput of 1550 Mbps. A recently introduced hashing function, Whirlpool, provides comparable security to SHA- 512 and is able to achieve much better performance. According to currently published literature, the fastest Whirlpool implementation achieves a throughput of 4896 Mbps. The proposed PHASH hash function greatly outperforms both SHA-512 and Whirlpool. All implementations are targeted for the state-of-the-art Xilinx Virtex-5 LX330 FPGA. The SHA-512 implementation attains a throughput of 1828 Mbps, and Whirlpool attains 7687 Mbps. PHASH achieves a throughput over 15 Gbps using a singleWcipher instance. Using 8 W cipher instances a throughput over 100 Gbps is achieved and 16 instances provide a throughput over 182 Gbps
    corecore