1,684 research outputs found
Generalised Mersenne Numbers Revisited
Generalised Mersenne Numbers (GMNs) were defined by Solinas in 1999 and
feature in the NIST (FIPS 186-2) and SECG standards for use in elliptic curve
cryptography. Their form is such that modular reduction is extremely efficient,
thus making them an attractive choice for modular multiplication
implementation. However, the issue of residue multiplication efficiency seems
to have been overlooked. Asymptotically, using a cyclic rather than a linear
convolution, residue multiplication modulo a Mersenne number is twice as fast
as integer multiplication; this property does not hold for prime GMNs, unless
they are of Mersenne's form. In this work we exploit an alternative
generalisation of Mersenne numbers for which an analogue of the above property
--- and hence the same efficiency ratio --- holds, even at bitlengths for which
schoolbook multiplication is optimal, while also maintaining very efficient
reduction. Moreover, our proposed primes are abundant at any bitlength, whereas
GMNs are extremely rare. Our multiplication and reduction algorithms can also
be easily parallelised, making our arithmetic particularly suitable for
hardware implementation. Furthermore, the field representation we propose also
naturally protects against side-channel attacks, including timing attacks,
simple power analysis and differential power analysis, which is essential in
many cryptographic scenarios, in constrast to GMNs.Comment: 32 pages. Accepted to Mathematics of Computatio
Analysis of Parallel Montgomery Multiplication in CUDA
For a given level of security, elliptic curve cryptography (ECC) offers improved efficiency over classic public key implementations. Point multiplication is the most common operation in ECC and, consequently, any significant improvement in perfor- mance will likely require accelerating point multiplication. In ECC, the Montgomery algorithm is widely used for point multiplication. The primary purpose of this project is to implement and analyze a parallel implementation of the Montgomery algorithm as it is used in ECC. Specifically, the performance of CPU-based Montgomery multiplication and a GPU-based implementation in CUDA are compared
Quantum resource estimates for computing elliptic curve discrete logarithms
We give precise quantum resource estimates for Shor's algorithm to compute
discrete logarithms on elliptic curves over prime fields. The estimates are
derived from a simulation of a Toffoli gate network for controlled elliptic
curve point addition, implemented within the framework of the quantum computing
software tool suite LIQ. We determine circuit implementations for
reversible modular arithmetic, including modular addition, multiplication and
inversion, as well as reversible elliptic curve point addition. We conclude
that elliptic curve discrete logarithms on an elliptic curve defined over an
-bit prime field can be computed on a quantum computer with at most qubits using a quantum circuit of at most Toffoli gates. We are able to classically simulate the
Toffoli networks corresponding to the controlled elliptic curve point addition
as the core piece of Shor's algorithm for the NIST standard curves P-192,
P-224, P-256, P-384 and P-521. Our approach allows gate-level comparisons to
recent resource estimates for Shor's factoring algorithm. The results also
support estimates given earlier by Proos and Zalka and indicate that, for
current parameters at comparable classical security levels, the number of
qubits required to tackle elliptic curves is less than for attacking RSA,
suggesting that indeed ECC is an easier target than RSA.Comment: 24 pages, 2 tables, 11 figures. v2: typos fixed and reference added.
ASIACRYPT 201
Enhancing an Embedded Processor Core with a Cryptographic Unit for Performance and Security
We present a set of low-cost architectural enhancements to accelerate the execution of certain arithmetic operations common in cryptographic applications on an extensible embedded processor core. The proposed enhancements are generic in the sense that they can be beneficially applied in almost any RISC processor. We implemented the enhancements in form of a cryptographic unit (CU) that offers the programmer an extended instruction set. The CU features a 128-bit wide register file and datapath, which enables it to process 128-bit words and perform 128-bit loads/stores. We analyze the speed-up factors for some arithmetic operations and public-key cryptographic algorithms obtained through
these enhancements. In addition, we evaluate the hardware overhead (i.e. silicon area) of integrating the CU into an embedded RISC processor. Our experimental results show that the proposed architectural enhancements allow for a
significant performance gain for both RSA and ECC at the expense of an acceptable increase in silicon area. We also demonstrate that the proposed enhancements facilitate the protection of cryptographic algorithms against certain types of side-channel attacks and present an AES implementation
hardened against cache-based attacks as a case study
- …