81 research outputs found

    A Standalone FPGA-based Miner for Lyra2REv2 Cryptocurrencies

    Lyra2REv2 is a hashing algorithm that consists of a chain of individual hashing algorithms, and it is used as a proof-of-work function in several cryptocurrencies. The most crucial and exotic hashing algorithm in the Lyra2REv2 chain is a specific instance of the general Lyra2 algorithm. This work presents the first hardware implementation of the specific instance of Lyra2 that is used in Lyra2REv2. Several properties of the aforementioned algorithm are exploited in order to optimize the design. In addition, an FPGA-based hardware implementation of a standalone miner for Lyra2REv2 on a Xilinx Multi-Processor System on Chip is presented. The proposed Lyra2REv2 miner is shown to be significantly more energy efficient than both a GPU and a commercially available FPGA-based miner. Finally, we also explain how the simplified Lyra2 and Lyra2REv2 architectures can be modified with minimal effort to also support the recent Lyra2REv3 chained hashing algorithm.
    Comment: 13 pages, accepted for publication in IEEE Trans. Circuits Syst. I. arXiv admin note: substantial text overlap with arXiv:1807.0576

    Efficient computation of hashes

    The sequential computation of hashes, which lies at the core of many distributed storage systems and is found, for example, in grid services, can hinder efficiency in service quality and even pose security challenges that can only be addressed by parallel hash tree modes. The main contributions of this paper are, first, the identification of several efficiency and security challenges posed by sequential hash computation based on the Merkle-Damgård engine. In addition, alternatives for the parallel computation of hash trees are discussed, and a prototype for a new parallel implementation of the Keccak function, the SHA-3 winner, is introduced.
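    The contrast between sequential hashing and a hash tree can be illustrated with a minimal sketch. It is not the prototype described in the abstract and does not follow any standardized tree mode: the leaf size `CHUNK_SIZE`, the one-byte leaf/node prefixes, the rule for promoting an unpaired node, and the function names are all assumptions made for illustration. Only the standard-library `hashlib.sha3_256` and `ThreadPoolExecutor` are used.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 64 * 1024  # hypothetical leaf size in bytes (an assumption)


def leaf_hash(chunk: bytes) -> bytes:
    # A 0x00 prefix marks a leaf so that leaves and inner nodes hash differently.
    return hashlib.sha3_256(b"\x00" + chunk).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # A 0x01 prefix marks an inner node built from two child digests.
    return hashlib.sha3_256(b"\x01" + left + right).digest()


def tree_hash(data: bytes, workers: int = 4) -> bytes:
    # Split the message into independent leaves; an empty message has one empty leaf.
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)] or [b""]

    # The leaves do not depend on each other, so they can be hashed concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        level = list(pool.map(leaf_hash, chunks))

    # Combine digests pairwise until a single root digest remains.
    while len(level) > 1:
        nxt = [node_hash(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # promote the unpaired node unchanged (illustrative choice)
        level = nxt
    return level[0]
```

    The root digest does not depend on the number of workers, only on the data and the tree layout. A sequential Merkle-Damgård style computation has no such independent units, because each block depends on the state left by the previous one.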

    A Lyra2 FPGA Core for Lyra2REv2-Based Cryptocurrencies

    Lyra2REv2 is a hashing algorithm that consists of a chain of individual hashing algorithms and it is used as a proof-of-work function in several cryptocurrencies that aim to be ASIC-resistant. The most crucial hashing algorithm in the Lyra2REv2 chain is a specific instance of the general Lyra2 algorithm. In this work we present the first FPGA implementation of the aforementioned instance of Lyra2 and we explain how several properties of the algorithm can be exploited in order to optimize the design.
    Comment: 5 pages, to be presented at the IEEE International Symposium on Circuits and Systems (ISCAS) 201

    Cryptographic algorithm acceleration using CUDA enabled GPUs in typical system configurations

    Encrypting data is becoming increasingly necessary. As the size of datasets continues to grow, the speed of encryption must increase to keep up or it will become a bottleneck. CUDA GPUs have been shown to offer performance improvements versus conventional CPUs for some data-intensive problems. This thesis evaluates the applicability of CUDA GPUs in accelerating the execution of cryptographic algorithms, which are increasingly used for growing amounts of data and thus will require significantly faster encryption and hashing throughput. Specifically, the CUDA environment was used to implement and experiment with three distinct cryptographic algorithms -- AES, SHA-2, and Keccak -- in order to show the applicability for various cryptographic algorithm classes. They were implemented in a system that emulates the conditions present in a real-world environment, and the effects of offloading these tasks from the CPU to the GPU were assessed. Speedups of up to 2.6x relative to the CPU were seen for single-kernel AES, but SHA-2 and Keccak did not perform as well on the GPU as on the CPU. Multi-kernel AES saw speedups over single-kernel AES of up to 1.4x, 1.65x, and 1.8x for two, three, and four kernels, respectively. This translates to speedups between 3.6x and 4.7x over CPU implementations of AES. Introducing a CPU load had a minimal effect on throughput, whereas a GPU load was seen to decrease throughput by as much as 4%. Overall, CUDA GPUs appear to have potential for improving encryption throughput if a parallelizable algorithm is selected.
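    The 3.6x to 4.7x range quoted above follows from multiplying the single-kernel speedup by each multi-kernel speedup. The short sketch below only reproduces that arithmetic from the figures in the abstract; it assumes the best-case ("up to") values compose multiplicatively, and the variable names are illustrative.

```python
# Figures taken from the abstract; combining them multiplicatively is an assumption.
single_kernel_vs_cpu = 2.6                        # single-kernel AES speedup over the CPU
multi_vs_single = {2: 1.4, 3: 1.65, 4: 1.8}       # multi-kernel speedup over single-kernel AES

# Combined speedup over the CPU for each kernel count.
combined = {
    kernels: round(single_kernel_vs_cpu * factor, 2)
    for kernels, factor in multi_vs_single.items()
}

for kernels, speedup in combined.items():
    print(f"{kernels} kernels: {speedup}x over CPU")
```

    The two-kernel and four-kernel products, 3.64 and 4.68, round to the 3.6x and 4.7x bounds stated in the abstract.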

    Analysis of KECCAK Tree Hashing on GPU Architectures

    In an effort to provide security and data integrity, hashing algorithms have been designed to consume an input of any length to produce a fixed-length output. KECCAK was selected by NIST to become the next Secure Hashing Algorithm (SHA-3) after nearly five years of competition. In addition to providing a sequential operating mode, there is also a tree mode that allows large input messages to be hashed in parallel. This thesis focuses on the exploration and analysis of the KECCAK tree hashing mode on a GPU platform. Based on the implementation, there are core features of the GPU that could be used to accelerate the time it takes to complete a hash due to the massively parallel architecture of the device. In addition to analyzing the speed of the algorithm, the underlying hardware is profiled to identify the bottlenecks that limited the speed. The results of this work show that tree hashing can hash data at rates of up to 3 GB/s for the fixed size tree mode. On a 3.40 GHz CPU, this is the equivalent of 1.03 cycles per byte, more than six times faster than a sequential implementation for a very large input. For the variable size tree mode, the throughput was 500 MB/s. Based on the performance analysis, modification of the input rate of the KECCAK sponge resulted in a negligible change to the overall speed. As a result of the hardware profiling, the register and L1 cache usage in the GPU was found to be a major bottleneck to the overall throughput. In a simulated GPU environment, it was shown that increasing the L1 cache by 25 percent could increase the throughput by up to 30 percent for a small tree and 15 percent for a tree that will achieve the greatest throughput on a real GPU. When this modification is combined with an increase of the L2 cache, performance can be improved by up to 20 percent.

    Comparative Study of Keccak SHA-3 Implementations

    This paper conducts an extensive comparative study of state-of-the-art solutions for implementing the SHA-3 hash function. SHA-3, a pivotal component in modern cryptography, has spawned numerous implementations across diverse platforms and technologies. This research aims to provide valuable insights into selecting and optimizing Keccak SHA-3 implementations. Our study encompasses an in-depth analysis of hardware, software, and software–hardware (hybrid) solutions. We assess the strengths, weaknesses, and performance metrics of each approach. Critical factors, including computational efficiency, scalability, and flexibility, are evaluated across different use cases. We investigate how each implementation performs in terms of speed and resource utilization. This research aims to improve the knowledge of cryptographic systems, aiding in the informed design and deployment of efficient cryptographic solutions. By providing a comprehensive overview of SHA-3 implementations, this study offers a clear understanding of the available options and equips professionals and researchers with the necessary insights to make informed decisions in their cryptographic endeavors.

    Non-Full Sbox Linearization: Applications to Collision Attacks on Round-Reduced Keccak

    The Keccak hash function is the winner of the SHA-3 competition and became the SHA-3 standard of NIST in 2015. In this paper, we focus on practical collision attacks against the round-reduced Keccak hash function, and two main results are achieved: the first practical collision attacks against 5-round Keccak-224 and an instance of the 6-round Keccak collision challenge. Both improve the number of practically attacked rounds by one. These results are obtained by carefully studying the algebraic properties of the nonlinear layer in the underlying permutation of Keccak and applying linearization to it. In particular, techniques for partially linearizing the output bits of the nonlinear layer are proposed, with which attack complexities are reduced significantly compared to the previous best results.

    High-performance FPGA implementation of the secure hash algorithm 3 for single and multi-message processing

    The secure hash function has become the default choice for information security, especially in applications that require data storage or manipulation. Consequently, implementations of these functions optimized for throughput or area are in high demand. In this work we propose a new design of the secure hash algorithm 3 (SHA-3), which aims to increase the performance of this function by using pipelining. Four pipelined variants are proposed, with two, three, four, and six pipeline stages. This approach allows us to design SHA-3 data paths with higher throughput and higher clock frequencies. The design reaches a maximum throughput of 102.98 Gbps on Virtex-5 and 115.124 Gbps on Virtex-6 in the six-stage case, for a 512-bit output length. However, resource utilization increases with the number of cores used in each case. The proposed designs are coded in very high-speed integrated circuit (VHSIC) hardware description language (VHDL), implemented on Xilinx Virtex-5 and Virtex-6 field-programmable gate array (FPGA) devices, and compared to existing FPGA implementations.