18 research outputs found

    Performance analysis of a scalable hardware FPGA Skein implementation

    Hashing functions are a key cryptographic primitive used in many everyday applications, such as authentication, data integrity checks, and digital signatures. The current hashing standard is defined by the National Institute of Standards and Technology (NIST) as the Secure Hash Standard (SHS), and includes SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512. SHS's level of security is waning as technology and analysis techniques continue to develop over time. As a result, after the 2005 Cryptographic Hash Workshop, NIST called for the creation of a new cryptographic hash algorithm to replace SHS. The new candidate algorithms were submitted on October 31st, 2008, and fourteen of them have advanced to round two of the competition. The competition is expected to produce a final replacement for the SHS standard by 2012. Multi-core processors and parallel programming are the dominant forces in computing, and some of the new hashing algorithms attempt to take advantage of these resources by offering parallel tree-hashing variants. Tree-hashing allows multiple parts of the data on the same level of a tree to be operated on simultaneously, with the potential to reduce the execution time complexity of hashing from O(n) to O(log n). Designs for tree-hashing require that the scalability and parallelism of the algorithms be researched on all platforms, including multi-core processors (CPUs), graphics processors (GPUs), and custom hardware (ASICs and FPGAs). Skein, the hashing function that this work focuses on, offers a tree-hashing mode with configurable maximum tree height, leaf node size, and node fan-out. This research focuses on creating and analyzing the performance of scalable hardware designs for Skein's tree-hashing mode. 
Different ideas and approaches for modifying sequential hashing cores and creating scalable control logic to provide high-speed, low-area parallel hashing hardware are presented and analyzed. Equations were created to help understand the expected performance and potential bottlenecks of Skein in FPGAs. The equations are intended to assist the decision-making process during the design phase, and may also provide insight into design considerations for other tree-hashing schemes in FPGAs. The results are also compared to current sequential designs of Skein, providing a complete analysis of the performance of Skein in an FPGA
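
The tree-hashing idea in this abstract can be illustrated with a minimal Python sketch. It is not Skein's tree mode: SHA-256 from the standard library stands in for the compression function, and the names `tree_hash`, `leaf_size`, and `fan_out`, as well as the one-byte leaf/node prefixes, are assumptions made for illustration only. The sketch shows how the leaf size and fan-out determine the number of tree levels, which is the number of sequential steps left when every node on a level is processed in parallel.

```python
import hashlib


def tree_hash(data: bytes, leaf_size: int = 64, fan_out: int = 2):
    """Hash `data` as a tree and return (root_digest, number_of_levels).

    Illustrative only: SHA-256 is a stand-in for the real compression
    function, and the 0x00/0x01 prefixes are a hypothetical way to keep
    leaf hashes and inner-node hashes distinct.
    """
    if leaf_size < 1 or fan_out < 2:
        raise ValueError("leaf_size must be >= 1 and fan_out must be >= 2")

    # Split the message into leaves; an empty message still gets one leaf.
    leaves = [data[i:i + leaf_size] for i in range(0, len(data), leaf_size)] or [b""]

    # Level 1: every leaf is independent, so hardware could hash them all at once.
    level = [hashlib.sha256(b"\x00" + leaf).digest() for leaf in leaves]
    levels = 1

    # Each further level combines `fan_out` children into one parent node.
    while len(level) > 1:
        level = [
            hashlib.sha256(b"\x01" + b"".join(level[i:i + fan_out])).digest()
            for i in range(0, len(level), fan_out)
        ]
        levels += 1

    return level[0], levels


if __name__ == "__main__":
    message = bytes(1024)  # 16 leaves of 64 bytes
    for f in (2, 4):
        root, levels = tree_hash(message, leaf_size=64, fan_out=f)
        print(f"fan_out={f}: {levels} levels, root={root.hex()[:16]}")
```

The code itself runs sequentially; the point is that the loop body for each level has no dependencies between nodes, so with enough parallel cores the runtime is governed by the level count rather than the message length.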

    Compact Hardware Implementations of ChaCha, BLAKE, Threefish, and Skein on FPGA

    The cryptographic hash functions BLAKE and Skein are built from the ChaCha stream cipher and the tweakable Threefish block cipher, respectively. Interestingly enough, they are based on the same arithmetic operations, and the same design philosophy allows one to design lightweight coprocessors for hashing and encryption. The key element of our approach is to take advantage of the parallelism of the algorithms to deeply pipeline our Arithmetic and Logic Units, and to avoid data dependencies by interleaving independent tasks. We show for instance that a fully autonomous implementation of BLAKE and ChaCha on a Xilinx Virtex-6 device occupies 144 slices and three memory blocks, and achieves competitive throughputs. In order to offer the same features, a coprocessor implementing Skein and Threefish requires a substantially higher slice count

    Hardware Implementation of the SHA-3 Candidate Skein

    Skein is a submission to the NIST SHA-3 hash function competition which has been optimized towards implementation in modern 64-bit processor architectures. This paper investigates the performance characteristics of a high-speed hardware implementation of Skein with a 0.18 µm standard-cell library and on different modern FPGAs. The results allow a first comparison of the hardware performance figures of full Skein with other SHA-3 candidates

    Design and implementation of Threefish cipher algorithm in PNG file

    This paper presents the design and implementation of the Threefish block cipher on grayscale images. Although Threefish is one of the most secure block ciphers, most studies concerning it have focused on hardware implementation, and it has rarely been applied to image encryption because of the large amount of data involved. The main contribution here is to reduce the time and the amount of data to be encrypted while maintaining encryption performance. This objective was achieved by encrypting only the most significant bits of the image pixels. The 256-bit plaintext blocks of Threefish were constructed from the 2^n most significant bits of the pixels, where 0<n<3. Furthermore, Threefish was applied with n=3 to analyze the impact of excluding some bits from the encryption process on encryption performance. The results indicated that the encryption achieved good encryption quality when n=1, but it might cause some loss in decryption. In contrast, the encryption achieved high encryption quality when n=2, almost as good as encrypting all of the pixel bits. Furthermore, the encryption time and the amount of data to be encrypted decreased by 50% each time n decreased by 1. It was concluded that encrypting half of the pixel bits reduces both time and data while largely preserving the encryption quality. Finally, although the proposed method passed the statistical analysis, further work is needed to find a method resistant to differential analysis
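
The block construction described in this abstract can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes 8-bit grayscale pixels, reads the number of retained bits as 2^n (the reading consistent with the reported 50% reduction per step of n), and zero-pads the last block. The function name `pack_msbs` and the bit ordering are hypothetical, and the Threefish encryption step itself is omitted.

```python
def pack_msbs(pixels, n, block_bits=256):
    """Pack the top 2**n bits of each 8-bit pixel into fixed-size blocks.

    Assumptions (not from the source): pixels are ints in 0..255, bits are
    packed most-significant-first, and the final block is zero-padded.
    Returns a list of `block_bits`-bit blocks as bytes objects.
    """
    if n not in (1, 2, 3):
        raise ValueError("n must be 1, 2, or 3 for 8-bit pixels")
    keep = 2 ** n  # 2, 4, or 8 bits kept per pixel

    bits = []
    for p in pixels:
        top = p >> (8 - keep)  # drop the least significant bits
        bits.extend((top >> i) & 1 for i in range(keep - 1, -1, -1))

    # Zero-pad so the bit stream fills a whole number of blocks.
    bits.extend([0] * (-len(bits) % block_bits))

    blocks = []
    for start in range(0, len(bits), block_bits):
        value = 0
        for b in bits[start:start + block_bits]:
            value = (value << 1) | b
        blocks.append(value.to_bytes(block_bits // 8, "big"))
    return blocks


if __name__ == "__main__":
    image = [0b11010010] * 128  # 128 identical pixels as a toy "image"
    for n in (3, 2, 1):
        print(f"n={n}: {len(pack_msbs(image, n))} block(s) to encrypt")
```

For this toy input the number of 256-bit blocks halves with each decrement of n, which mirrors the data reduction the abstract reports.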

    Threefish-256 algorithm implementation on reconfigurable hardware

    This article presents both the description and the results of a hardware implementation of the Threefish cryptographic algorithm for the encryption process. The algorithm was implemented using an iterative round architecture on the Virtex-5 FPGA (Field Programmable Gate Array) of the XUPV5-LX110T development system. Place and route results show that the iterative round Threefish-256 design has a throughput of 551 Mbps

    A Standalone FPGA-based Miner for Lyra2REv2 Cryptocurrencies

    Lyra2REv2 is a hashing algorithm that consists of a chain of individual hashing algorithms, and it is used as a proof-of-work function in several cryptocurrencies. The most crucial and exotic hashing algorithm in the Lyra2REv2 chain is a specific instance of the general Lyra2 algorithm. This work presents the first hardware implementation of the specific instance of Lyra2 that is used in Lyra2REv2. Several properties of the aforementioned algorithm are exploited in order to optimize the design. In addition, an FPGA-based hardware implementation of a standalone miner for Lyra2REv2 on a Xilinx Multi-Processor System on Chip is presented. The proposed Lyra2REv2 miner is shown to be significantly more energy efficient than both a GPU and a commercially available FPGA-based miner. Finally, we also explain how the simplified Lyra2 and Lyra2REv2 architectures can be modified with minimal effort to also support the recent Lyra2REv3 chained hashing algorithm. Comment: 13 pages, accepted for publication in IEEE Trans. Circuits Syst. I. arXiv admin note: substantial text overlap with arXiv:1807.0576

    A Low-Area Unified Hardware Architecture for the AES and the Cryptographic Hash Function Grøstl

    This article describes the design of an 8-bit coprocessor for the AES (encryption, decryption, and key expansion) and the cryptographic hash function Grøstl on several Xilinx FPGAs. Our Arithmetic and Logic Unit performs a single instruction that allows for implementing AES encryption, AES decryption, AES key expansion, and Grøstl at all levels of security. Thanks to a careful organization of AES and Grøstl internal states in the register file, we manage to generate all read and write addresses by means of a modulo-128 counter and a modulo-256 counter. A fully autonomous implementation of Grøstl and AES on a Virtex-6 FPGA requires 169 slices and a single 36k memory block, and achieves a competitive throughput. Assuming that the security guarantees of Grøstl are at least as good as the ones of the other SHA-3 finalists, our results show that Grøstl is the best candidate for low-area cryptographic coprocessors

    Symmetric Encryption Algorithms: Review and Evaluation Study

    The increased exchange of data over the Internet in the past two decades has brought data security and confidentiality to the forefront. Information security can be achieved by implementing encryption and decryption algorithms to ensure data remains secure and confidential, especially when transmitted over an insecure communication channel. Encryption is the method of coding information to prevent unauthorized access and ensure data integrity and confidentiality, whereas the reverse process is known as decryption. All encryption algorithms aim to secure data; however, their performance varies according to several factors such as file size, type, complexity, and platform used. Furthermore, while some encryption algorithms outperform others, they have been proven to be vulnerable to certain attacks. In this paper, we present a general overview of common encryption algorithms and explain their inner workings. Additionally, we select ten different symmetric encryption algorithms and conduct a simulation in Java to test their performance. The algorithms we compare are: AES, BLOWFISH, RC2, RC4, RC6, DES, DESede, SEED, XTEA, and IDEA. We present the results of our simulation in terms of encryption speed, throughput, and CPU utilization rate for various file sizes ranging from 1MB to 1GB. We further analyze our results for all measures that have been tested, taking into account the level of security they provide

    A Hardware Perspective on the ChaCha Ciphers: Scalable Chacha8/12/20 Implementations Ranging from 476 Slices to Bitrates of 175 Gbit/s

    AES (Advanced Encryption Standard) accelerators are commonly used in high-throughput applications, but they have notable resource requirements. We investigate replacing the AES cipher with ChaCha ciphers and propose the first ChaCha FPGA implementations optimized for data throughput. To this end, we compare implementations of three different system architectures and analyze which aspects dominate their performance. Our experimental results indicate that a bandwidth of 175 Gbit/s can be reached with as few as 2982 slices, whereas comparable state-of-the-art AES accelerators require 10 times as many slices. Taking advantage of the flexibility inherent in the ChaCha cipher, we also demonstrate how our implementation scales to even higher throughputs or lower resource usage (down to 476 slices), benefiting applications which previously could not employ cryptography because of resource limitations