2,010 research outputs found

    High Speed and Low-Complexity Hardware Architectures for Elliptic Curve-Based Crypto-Processors

    Get PDF
    The elliptic curve cryptography (ECC) has been identified as an efficient scheme for public-key cryptography. This thesis studies efficient implementation of ECC crypto-processors on hardware platforms in a bottom-up approach. We first study efficient and low-complexity architectures for finite field multiplications over Gaussian normal basis (GNB). We propose three new low-complexity digit-level architectures for finite field multiplication. Architectures are modified in order to make them more suitable for hardware implementations specially focusing on reducing the area usage. Then, for the first time, we propose a hybrid digit-level multiplier architecture which performs two multiplications together (double-multiplication) with the same number of clock cycles required as the one for one multiplication. We propose a new hardware architecture for point multiplication on newly introduced binary Edwards and generalized Hessian curves. We investigate higher level parallelization and lower level scheduling for point multiplication on these curves. Also, we propose a highly parallel architecture for point multiplication on Koblitz curves by modifying the addition formulation. Several FPGA implementations exploiting these modifications are presented in this thesis. We employed the proposed hybrid multiplier architecture to reduce the latency of point multiplication in ECC crypto-processors as well as the double-exponentiation. This scheme is the first known method to increase the speed of point multiplication whenever parallelization fails due to the data dependencies amongst lower level arithmetic computations. Our comparison results show that our proposed multiplier architectures outperform the counterparts available in the literature. Furthermore, fast computation of point multiplication on different binary elliptic curves is achieved

    Hardware/software optimizations for elliptic curve scalar multiplication on hybrid FPGAs

    Get PDF
    Elliptic curve cryptography (ECC) offers a viable alternative to Rivest-Shamir-Adleman (RSA) by delivering equivalent security with a smaller key size. This has several advantages, including smaller bandwidth demands, faster key exchange, and lower latency encryption and decryption. The fundamental operation for ECC is scalar point multiplication, wherein a point P on an elliptic curve defined over a finite field is multiplied by a scalar k. The complexity of this operation requires a hardware implementation to achieve high performance. The algorithms involved in scalar point multiplication are constantly evolving, incorporating the latest developments in number theory to improve computation time. These competing needs, high performance and flexibility, have caused previous implementations to either limit their adaptability or to incur performance losses. This thesis explores the use of a hybrid-FPGA for scalar point multiplication. A hybrid- FPGA contains a general purpose processor (GPP) in addition to reconfigurable fabric. This allows for a software/hardware co-design with low latency communication between the GPP and custom hardware. The elliptic curve operations and finite field inversion are programmed in C code. All other finite field arithmetic is implemented in the FPGA hardware, providing higher performance while retaining flexibility. The resulting implementation achieves speedups ranging from 24 times to 55 times faster than an optimized software implementation executing on a Pentium II workstation. The scalability of the design is investigated in two directions: faster finite field multiplication and increased instruction level parallelism exploitation. Increasing the number of parallel arithmetic units beyond two is shown to be less efficient than increasing the speed of the finite field multiplier

    Throughput/Area-efficient ECC Processor Using Montgomery Point Multiplication on FPGA

    Get PDF
    High throughput while maintaining low resource is a key issue for elliptic curve cryptography (ECC) hardware implementations in many applications. In this brief, an ECC processor architecture over Galois fields is presented, which achieves the best reported throughput/area performance on field-programmable gate array (FPGA) to date. A novel segmented pipelining digit serial multiplier is developed to speed up ECC point multiplication. To achieve low latency, a new combined algorithm is developed for point addition and point doubling with careful scheduling. A compact and flexible distributed-RAM-based memory unit design is developed to increase speed while keeping area low. Further optimizations were made via timing constraints and logic level modifications at the implementation level. The proposed architecture is implemented on Virtex4 (V4), Virtex5 (V5), and Virtex7 (V7) FPGA technologies and, respectively, achieved throughout/slice figures of 19.65, 65.30, and 64.48 (106/(Seconds × Slices))

    High Speed and Low Latency ECC Implementation over GF(2m) on FPGA

    Get PDF
    In this paper, a novel high-speed elliptic curve cryptography (ECC) processor implementation for point multiplication (PM) on field-programmable gate array (FPGA) is proposed. A new segmented pipelined full-precision multiplier is used to reduce the latency, and the Lopez-Dahab Montgomery PM algorithm is modified for careful scheduling to avoid data dependency resulting in a drastic reduction in the number of clock cycles (CCs) required. The proposed ECC architecture has been implemented on Xilinx FPGAs' Virtex4, Virtex5, and Virtex7 families. To the best of our knowledge, our single- and three-multiplier-based designs show the fastest performance to date when compared with reported works individually. Our one-multiplier-based ECC processor also achieves the highest reported speed together with the best reported area-time performance on Virtex4 (5.32 μs at 210 MHz), on Virtex5 (4.91 μs at 228 MHz), and on the more advanced Virtex7 (3.18 μs at 352 MHz). Finally, the proposed three-multiplier-based ECC implementation is the first work reporting the lowest number of CCs and the fastest ECC processor design on FPGA (450 CCs to get 2.83 μs on Virtex7)

    A versatile Montgomery multiplier architecture with characteristic three support

    Get PDF
    We present a novel unified core design which is extended to realize Montgomery multiplication in the fields GF(2n), GF(3m), and GF(p). Our unified design supports RSA and elliptic curve schemes, as well as the identity-based encryption which requires a pairing computation on an elliptic curve. The architecture is pipelined and is highly scalable. The unified core utilizes the redundant signed digit representation to reduce the critical path delay. While the carry-save representation used in classical unified architectures is only good for addition and multiplication operations, the redundant signed digit representation also facilitates efficient computation of comparison and subtraction operations besides addition and multiplication. Thus, there is no need for a transformation between the redundant and the non-redundant representations of field elements, which would be required in the classical unified architectures to realize the subtraction and comparison operations. We also quantify the benefits of the unified architectures in terms of area and critical path delay. We provide detailed implementation results. The metric shows that the new unified architecture provides an improvement over a hypothetical non-unified architecture of at least 24.88%, while the improvement over a classical unified architecture is at least 32.07%

    A Brand-New, Area - Efficient Architecture for the FFT Algorithm Designed for Implementation of FPGAs

    Get PDF
    Elliptic curve cryptography, which is more commonly referred to by its acronym ECC, is widely regarded as one of the most effective new forms of cryptography developed in recent times. This is primarily due to the fact that elliptic curve cryptography utilises excellent performance across a wide range of hardware configurations in addition to having shorter key lengths. A High Throughput Multiplier design was described for Elliptic Cryptographic applications that are dependent on concurrent computations. A Proposed (Carry-Select) Division Architecture is explained and proposed throughout the whole of this work. Because of the carry-select architecture that was discussed in this article, the functionality of the divider has been significantly enhanced. The adder carry chain is reduced in length by this design by a factor of two, however this comes at the expense of additional adders and control. When it comes to designs for high throughput FFT, the total number of butterfly units that are implemented is what determines the amount of space that is needed by an FFT processor. In addition to blocks that may either add or subtract numbers, each butterfly unit also features blocks that can multiply numbers. The size of the region that is covered by these dual mathematical blocks is decided by the bit resolution of the models. When the bit resolution is increased, the area will also increase. The standard FFT approach requires that each stage contain  times as many butterfly units as the stage before it. This requirement must be met before moving on to the next stage

    Efficient hardware prototype of ECDSA modules for blockchain applications

    Get PDF
    This paper concentrates on the hardware implementation of efficient and re- configurable elliptic curve digital signature algorithm (ECDSA) that is suitable for verifying transactions in Blockchain related applications. Despite ECDSA architecture being computationally expensive, the usage of a dedicated stand-alone circuit enables speedy execution of arithmetic operations. The prototype put forth supports N-bit elliptic curve cryptography (ECC) group operations, signature generation and verification over a prime field for any elliptic curve. The research proposes new hardware framework for modular multiplication and modular multiplicative inverse which is adopted for group operations involved in ECDSA. Every hardware design offered are simulated using modelsim register transfer logic (RTL) simulator. Field programmable gate array (FPGA) implementation of var- ious modules within ECDSA circuit is compared with equivalent existing techniques that is both hardware and software based to highlight the superiority of the suggested work. The results showcased prove that the designs implemented are both area and speed efficient with faster execution and less resource utilization while maintaining the same level of security. The suggested ECDSA structure could replace the software equivalent of digital signatures in hardware blockchain to thwart software attacks and to provide better data protection
    corecore