523 research outputs found
Generalised Mersenne Numbers Revisited
Generalised Mersenne Numbers (GMNs) were defined by Solinas in 1999 and
feature in the NIST (FIPS 186-2) and SECG standards for use in elliptic curve
cryptography. Their form is such that modular reduction is extremely efficient,
thus making them an attractive choice for modular multiplication
implementation. However, the issue of residue multiplication efficiency seems
to have been overlooked. Asymptotically, using a cyclic rather than a linear
convolution, residue multiplication modulo a Mersenne number is twice as fast
as integer multiplication; this property does not hold for prime GMNs, unless
they are of Mersenne's form. In this work we exploit an alternative
generalisation of Mersenne numbers for which an analogue of the above property
--- and hence the same efficiency ratio --- holds, even at bitlengths for which
schoolbook multiplication is optimal, while also maintaining very efficient
reduction. Moreover, our proposed primes are abundant at any bitlength, whereas
GMNs are extremely rare. Our multiplication and reduction algorithms can also
be easily parallelised, making our arithmetic particularly suitable for
hardware implementation. Furthermore, the field representation we propose also
naturally protects against side-channel attacks, including timing attacks,
simple power analysis and differential power analysis, which is essential in
many cryptographic scenarios, in constrast to GMNs.Comment: 32 pages. Accepted to Mathematics of Computatio
Analysis of Parallel Montgomery Multiplication in CUDA
For a given level of security, elliptic curve cryptography (ECC) offers improved efficiency over classic public key implementations. Point multiplication is the most common operation in ECC and, consequently, any significant improvement in perfor- mance will likely require accelerating point multiplication. In ECC, the Montgomery algorithm is widely used for point multiplication. The primary purpose of this project is to implement and analyze a parallel implementation of the Montgomery algorithm as it is used in ECC. Specifically, the performance of CPU-based Montgomery multiplication and a GPU-based implementation in CUDA are compared
Quantum resource estimates for computing elliptic curve discrete logarithms
We give precise quantum resource estimates for Shor's algorithm to compute
discrete logarithms on elliptic curves over prime fields. The estimates are
derived from a simulation of a Toffoli gate network for controlled elliptic
curve point addition, implemented within the framework of the quantum computing
software tool suite LIQ. We determine circuit implementations for
reversible modular arithmetic, including modular addition, multiplication and
inversion, as well as reversible elliptic curve point addition. We conclude
that elliptic curve discrete logarithms on an elliptic curve defined over an
-bit prime field can be computed on a quantum computer with at most qubits using a quantum circuit of at most Toffoli gates. We are able to classically simulate the
Toffoli networks corresponding to the controlled elliptic curve point addition
as the core piece of Shor's algorithm for the NIST standard curves P-192,
P-224, P-256, P-384 and P-521. Our approach allows gate-level comparisons to
recent resource estimates for Shor's factoring algorithm. The results also
support estimates given earlier by Proos and Zalka and indicate that, for
current parameters at comparable classical security levels, the number of
qubits required to tackle elliptic curves is less than for attacking RSA,
suggesting that indeed ECC is an easier target than RSA.Comment: 24 pages, 2 tables, 11 figures. v2: typos fixed and reference added.
ASIACRYPT 201
Low-Weight Primes for Lightweight Elliptic Curve Cryptography on 8-bit AVR Processors
Small 8-bit RISC processors and micro-controllers based on the AVR instruction set architecture are widely used in the embedded domain with applications ranging from smartcards over control systems to wireless sensor nodes. Many of these applications require asymmetric encryption or authentication, which has spurred a body of research into implementation aspects of Elliptic Curve Cryptography (ECC) on the AVR platform. In this paper, we study the suitability of a special class of finite fields, the so-called Optimal Prime Fields (OPFs), for a "lightweight" implementation of ECC with a view towards high performance and security. An OPF is a finite field Fp defined by a prime of the form p = u*2^k + v, whereby both u and v are "small" (in relation to 2^k) so that they fit into one or two registers of an AVR processor. OPFs have a low Hamming weight, which allows for a very efficient implementation of the modular reduction since only the non-zero words of p need to be processed. We describe a special variant of Montgomery multiplication for OPFs that does not execute any input-dependent conditional statements (e.g. branch instructions) and is, hence, resistant against certain side-channel attacks. When executed on an Atmel ATmega processor, a multiplication in a 160-bit OPF takes just 3237 cycles, which compares favorably with other implementations of 160-bit modular multiplication on an 8-bit processor. We also describe a performance-optimized and a security-optimized implementation of elliptic curve scalar multiplication over OPFs. The former uses a GLV curve and executes in 4.19M cycles (over a 160-bit OPF), while the latter is based on a Montgomery curve and has an execution time of approximately 5.93M cycles. Both results improve the state-of-the-art in lightweight ECC on 8-bit processors
Efficient Implementations of Pairing-Based Cryptography on Embedded Systems
Many cryptographic applications use bilinear pairing such as identity based signature, instance identity-based key agreement, searchable public-key encryption, short signature scheme, certificate less encryption and blind signature. Elliptic curves over finite field are the most secure and efficient way to implement bilinear pairings for the these applications. Pairing based cryptosystems are being implemented on different platforms such as low-power and mobile devices. Recently, hardware capabilities of embedded devices have been emerging which can support efficient and faster implementations of pairings on hand-held devices. In this thesis, the main focus is optimization of Optimal Ate-pairing using special class of ordinary curves, Barreto-Naehring (BN), for different security levels on low-resource devices with ARM processors. Latest ARM architectures are using SIMD instructions based NEON engine and are helpful to optimize basic algorithms. Pairing implementations are being done using tower field which use field multiplication as the most important computation. This work presents NEON implementation of two multipliers (Karatsuba and Schoolbook) and compare the performance of these multipliers with different multipliers present in the literature for different field sizes. This work reports the fastest implementation timing of pairing for BN254, BN446 and BN638 curves for ARMv7 architecture which have security levels as 128-, 164-, and 192-bit, respectively. This work also presents comparison of code performance for ARMv8 architectures
- …