Search CORE

38 research outputs found

Throughput/Area-Efficient Accelerator of Elliptic Curve Point Multiplication over GF(2233) on FPGA

Author: Alotaibi Saud S.
Arif Muhammad
Rashid Muhammad
Sajid Asher
Sonbul Omar S.
Zia Muhammad Yousuf Irfan
Publication venue: MDPI
Publication date: 01/08/2023
Field of study

This paper presents a throughput/area-efficient hardware accelerator architecture for elliptic curve point multiplication (ECPM) computation over GF(2233). The throughput of the proposed accelerator design is optimized by reducing the total clock cycles using a bit-parallel Karatsuba modular multiplier. We employ two techniques to minimize the hardware resources: (i) a consolidated arithmetic unit where we combine a single modular adder, multiplier, and square block instead of having multiple modular operators, and (ii) an Itoh–Tsujii inversion algorithm by leveraging the existing hardware resources of the multiplier and square units for multiplicative inverse computation. An efficient finite-state-machine (FSM) controller is implemented to facilitate control functionalities. To evaluate and compare the results of the proposed accelerator architecture against state-of-the-art solutions, a figure-of-merit (FoM) metric in terms of throughput/area is defined. The implementation results after post-place-and-route simulation are reported for reconfigurable field-programmable gate array (FPGA) devices. Particular to Virtex-7 FPGA, the accelerator utilizes 3584 slices, needs 7208 clock cycles, operates on a maximum frequency of 350 MHz, computes one ECPM operation in 20.59 s, and the calculated value of FoM is 13.54. Consequently, the results and comparisons reveal that our accelerator suits applications that demand throughput and area-optimized ECPM implementations

Repositorio Institucional Universidad de Málaga

Optimization of Supersingular Isogeny Cryptography for Deeply Embedded Systems

Author: Calhoun Jeffrey Denton
Publication venue: UNM Digital Repository
Publication date: 11/05/2018
Field of study

Public-key cryptography in use today can be broken by a quantum computer with sufficient resources. Microsoft Research has published an open-source library of quantum-secure supersingular isogeny (SI) algorithms including Diffie-Hellman key agreement and key encapsulation in portable C and optimized x86 and x64 implementations. For our research, we modified this library to target a deeply-embedded processor with instruction set extensions and a finite-field coprocessor originally designed to accelerate traditional elliptic curve cryptography (ECC). We observed a 6.3-7.5x improvement over a portable C implementation using instruction set extensions and a further 6.0-6.1x improvement with the addition of the coprocessor. Modification of the coprocessor to a wider datapath further increased performance 2.6-2.9x. Our results show that current traditional ECC implementations can be easily refactored to use supersingular elliptic curve arithmetic and achieve post-quantum security

Customisable arithmetic hardware designs

Author: Cheung Chak-Chung Ray
Cheung Chak-Chung Ray
Publication venue
Publication date: 01/01/2007
Field of study

Imperial Users onl

Spiral - Imperial College Digital Repository

Hardware implementation of elliptic curve Diffie-Hellman key agreement scheme in GF(p)

Author: Sangma Zerene
Publication venue: RIT Scholar Works
Publication date: 01/01/2008
Field of study

With the advent of technology there are many applications that require secure communication. Elliptic Curve Public-key Cryptosystems are increasingly becoming popular due to their small key size and efficient algorithm. Elliptic curves are widely used in various key exchange techniques including Diffie-Hellman Key Agreement scheme. Modular multiplication and modular division are one of the basic operations in elliptic curve cryptography. Much effort has been made in developing efficient modular multiplication designs, however few works has been proposed for the modular division. Nevertheless, these operations are needed in various cryptographic systems. This thesis examines various scalable implementations of elliptic curve scalar multiplication employing multiplicative inverse or field division in GF(p) focussing mainly on modular divison architectures. Next, this thesis presents a new architecture for modular division based on the variant of Extended Binary GCD algorithm. The main contribution at system level architecture to the modular division unit is use of counters in place of shift registers that are basis of the algorithm and modifying the algorithm to introduce a modular correction unit for the output logic. This results in 62% increase in speed with respect to a prototype design. Finally, using the modular division architecture an Elliptic Curve ALU in GF(p) was implemented which can be used as the core arithmetic unit of an elliptic curve processor. The resulting architecture was targeted to Xilinx Vertex2v6000-bf957 FPGA device and can be implemented for different elliptic curves for almost all practical values of field p. The frequency of the ALU is 58.8 MHz for 128-bits utilizing 20% of the device at 27712 gates which is 30% faster than a prototype implementation with a 2% increase in area utilization. The ALU was tested to perform Diffie-Hellman Key Agreement Scheme and is suitable for other public-key cryptographic algorithms

RIT Scholar Works

Pipelining GF(P) Elliptic Curve Cryptography Computation

Author: Gutub Adnan
Ibrahim Mohammad K.
Kayali Ahmad
Publication venue
Publication date: 11/03/2006
Field of study

This paper proposes a new method to compute Elliptic Curve Cryptography in Galois Fields GF(p). The method incorporates pipelining to utilize the benefit of both parallel and serial methodology used before. It allows the exploitation of the inherited independency that exists in elliptic curve point addition and doubling operations. The results showed attraction because of its improvement over many parallel and serial techniques of elliptic curve crypto-computations

KFUPM ePrints

Merging GF(p) Elliptic Curve Point Adding and Doubling on Pipelined VLSI Cryptographic ASIC Architecture

Author: Gutub Adnan
Publication venue: IJCSNS, Dae-Sang Office 301, Sangdo 5 dong 509-1, Dongjack Gu, Seoul 156-743, Korea
Publication date: 01/03/2006
Field of study

This paper merges between elliptic curve addition presents a modified processor architecture for Elliptic Curve Cryptography computations in Galois Fields GF(p). The architecture incorporates the methodology of pipelining to utilize the benefit of both parallel and serial implementations. It allows the exploitation of the inherited independency that exists in elliptic curve point addition and doubling operations using a single pipelined core. The processor architecture showed attraction because of its improvement over many parallel and serial implementations of elliptic curve crypto-systems. It proved to be efficient having better performance with regard to area, speed, and power consumption

KFUPM ePrints

Merging GF(p) Elliptic Curve Point Adding and Doubling on Pipelined VLSI Cryptographic ASIC Architecture

Author: Gutub Adnan
Publication venue: IJCSNS, Dae-Sang Office 301, Sangdo 5 dong 509-1, Dongjack Gu, Seoul 156-743, Korea
Publication date: 01/03/2006
Field of study

KFUPM ePrints

An algorithmic and architectural study on Montgomery exponentiation in RNS

Author: Bajard J.C.
Gandino Filippo
Lamberti Fabrizio
Montuschi Paolo
Paravati Gianluca
Publication venue: Piscataway, N.J. : IEEE
Publication date: 01/01/2012
Field of study

The modular exponentiation on large numbers is computationally intensive. An effective way for performing this operation consists in using Montgomery exponentiation in the Residue Number System (RNS). This paper presents an algorithmic and architectural study of such exponentiation approach. From the algorithmic point of view, new and state-of-the-art opportunities that come from the reorganization of operations and precomputations are considered. From the architectural perspective, the design opportunities offered by well-known computer arithmetic techniques are studied, with the aim of developing an efficient arithmetic cell architecture. Furthermore, since the use of efficient RNS bases with a low Hamming weight are being considered with ever more interest, four additional cell architectures specifically tailored to these bases are developed and the tradeoff between benefits and drawbacks is carefully explored. An overall comparison among all the considered algorithmic approaches and cell architectures is presented, with the aim of providing the reader with an extensive overview of the Montgomery exponentiation opportunities in RNS

Crossref

HAL Descartes

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Hal-Diderot

PORTO Publications Open Repository TOrino

Modular Hardware Architecture for Somewhat Homomorphic Function Evaluation

Author: Frederik Vercauteren
Ingrid Verbauwhede
Kimmo Järvinen
Sujoy Sinha Roy
Vassil Dimitrov
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 11/09/2015
Field of study

We present a hardware architecture for all building blocks required in polynomial ring based fully homomorphic schemes and use it to instantiate the somewhat homomorphic encryption scheme YASHE. Our implementation is the first FPGA implementation that is designed for evaluating functions on homomorphically encrypted data (up to a certain multiplicative depth) and we illustrate this capability by evaluating the SIMON-64/128 block cipher in the encrypted domain. Our implementation provides a fast polynomial operations unit using CRT and NTT for multiplication combined with an optimized memory access scheme; a fast Barrett like polynomial reduction method; an efficient divide and round unit required in the multiplication of ciphertexts and an efficient CRT unit. These building blocks are integrated in an instruction-set coprocessor to execute YASHE, which can be controlled by a computer for evaluating arbitrary functions (up to the multiplicative depth 44 and 128-bit security level). Our architecture was compiled for a single Virtex-7 XC7V1140T FPGA, where it consumes 23\% of registers, 53\% of LUTs, 53\% of DSP slices, and 38\% of BlockRAM memory. The implementation evaluates SIMON-64/128 in approximately 171.3s (at 143 MHz) and it processes 2048 ciphertexts at once giving a relative time of only 83.6 ms per block. This is 24.5 times faster than the leading software implementation on a 4-core Intel Core-i7 processor running at 3.4 GHz

Cryptology ePrint Archive

Implementação de um co-processador RSA

Author: Costa Alcides Silveira
Publication venue
Publication date: 01/01/2004
Field of study

Lume 5.8