Parametric, Secure and Compact Implementation of RSA on FPGA
We present a fast, efficient, and parameterized modular multiplier and a secure exponentiation circuit intended especially for FPGAs at the low end of the price range. The design uses dedicated block multipliers as the main functional unit and Block-RAM as the storage unit for the operands. The adopted design methodology allows the number of multipliers, the radix used in the multipliers, and the number of words to be adjusted to meet system requirements such as available resources, precision, and timing constraints. The architecture, based on the Montgomery modular multiplication algorithm, uses a pipelining technique that allows concurrent operation of the hardwired multipliers. Our design completes 1020-bit and 2040-bit modular multiplications in 7.62 μs and 27.0 μs, respectively. The multiplier uses a moderate amount of system resources while achieving the best area-time product in the literature. A 2040-bit modular exponentiation engine easily fits into a Xilinx Spartan-3E 500; moreover, the exponentiation circuit withstands known side-channel attacks.
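For readers unfamiliar with the Montgomery algorithm this abstract builds on, the following is a minimal software sketch of Montgomery modular multiplication. It is not the pipelined, word-serial FPGA architecture described above: it works on whole Python integers, the function names (`montgomery_setup`, `mont_mul`, `mont_modmul`) are illustrative assumptions, and the final conditional subtraction is not constant-time.

```python
def montgomery_setup(n, k):
    """Precompute n' with n * n' = -1 (mod R), where R = 2**k and n is odd."""
    r = 1 << k
    return (-pow(n, -1, r)) % r  # pow(n, -1, r) needs Python 3.8+


def mont_mul(a, b, n, k, n_prime):
    """Return a * b * R^{-1} mod n for 0 <= a, b < n and R = 2**k > n."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> k               # exact division by R, result < 2n
    return u - n if u >= n else u      # one conditional subtraction


def mont_modmul(x, y, n):
    """Compute x * y mod n by a round trip through the Montgomery domain."""
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    k = n.bit_length()
    n_prime = montgomery_setup(n, k)
    xm = (x << k) % n                  # x * R mod n
    ym = (y << k) % n                  # y * R mod n
    zm = mont_mul(xm, ym, n, k, n_prime)
    return mont_mul(zm, 1, n, k, n_prime)  # leave the Montgomery domain
```

In an exponentiation, the conversions into and out of the Montgomery domain are paid once, and every intermediate product uses only `mont_mul`, which replaces division by `n` with shifts and masks.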
An algorithmic and architectural study on Montgomery exponentiation in RNS
Modular exponentiation on large numbers is computationally intensive. An effective way of performing this operation is to use Montgomery exponentiation in the Residue Number System (RNS). This paper presents an algorithmic and architectural study of this exponentiation approach. From the algorithmic point of view, new and state-of-the-art opportunities that come from the reorganization of operations and precomputations are considered. From the architectural perspective, the design opportunities offered by well-known computer arithmetic techniques are studied, with the aim of developing an efficient arithmetic cell architecture. Furthermore, since the use of efficient RNS bases with a low Hamming weight is being considered with ever more interest, four additional cell architectures specifically tailored to these bases are developed, and the tradeoff between benefits and drawbacks is carefully explored. An overall comparison among all the considered algorithmic approaches and cell architectures is presented, with the aim of providing the reader with an extensive overview of the Montgomery exponentiation opportunities in RNS.
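As background for the RNS setting, the sketch below shows only the basic representation: conversion to residues, channel-wise multiplication, and reconstruction by the Chinese Remainder Theorem. It does not implement Montgomery reduction or base extension in RNS, and the function names and the small moduli in the usage note are assumptions chosen for illustration.

```python
from math import gcd, prod  # math.prod needs Python 3.8+


def to_rns(x, moduli):
    """Represent x by its residues modulo each element of the basis."""
    return [x % m for m in moduli]


def rns_mul(a, b, moduli):
    """Multiply channel by channel; no carries cross between channels."""
    return [(ai * bi) % m for ai, bi, m in zip(a, b, moduli)]


def from_rns(residues, moduli):
    """Reconstruct the integer in [0, M) with M = prod(moduli) via the CRT."""
    for i, mi in enumerate(moduli):
        for mj in moduli[i + 1:]:
            if gcd(mi, mj) != 1:
                raise ValueError("moduli must be pairwise coprime")
    big_m = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        x += r * m_i * pow(m_i, -1, m)  # m_i * (m_i^{-1} mod m) is 1 mod m, 0 mod the others
    return x % big_m
```

For example, with the pairwise coprime basis `[251, 253, 255, 256]`, `from_rns(rns_mul(to_rns(x, b), to_rns(y, b), b), b)` equals `x * y` whenever the product is smaller than the product of the moduli. Because each channel is independent, the channels can be processed in parallel, which is what makes RNS attractive for hardware cells.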
Secure and Efficient RNS Approach for Elliptic Curve Cryptography
Scalar multiplication, the main operation in elliptic
curve cryptographic protocols, is vulnerable to side-channel
(SCA) and fault injection (FA) attacks. An efficient countermeasure
for scalar multiplication can be provided by using alternative
number systems like the Residue Number System (RNS). In RNS,
a number is represented as a set of smaller numbers, where each
one is the residue of the number modulo one element of a given moduli
basis. Under certain requirements, a number can be uniquely
transformed from the integers to the RNS domain (and vice
versa) and all arithmetic operations can be performed in RNS.
This representation provides an inherent SCA and FA resistance
to many attacks and can be further enhanced by RNS arithmetic
manipulation or more traditional algorithmic countermeasures.
In this paper, extending our previous work, we explore the
potentials of RNS as an SCA and FA countermeasure and provide
a description of RNS-based SCA and FA resistance means. We
propose a secure and efficient Montgomery Power Ladder-based
scalar multiplication algorithm on RNS and discuss its SCA-FA
resistance. The proposed algorithm is implemented on an
ARM Cortex A7 processor and its SCA-FA resistance is evaluated
by collecting preliminary leakage trace results that validate our
initial assumptions.
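The Montgomery Power Ladder mentioned in this abstract can be illustrated with a short sketch. To keep it self-contained, the version below uses modular exponentiation as a stand-in for elliptic curve scalar multiplication (squaring plays the role of point doubling, multiplication the role of point addition), and it is not the RNS algorithm the authors propose. The name `ladder_pow` is an assumption, and the Python branch on the key bit is itself not side-channel safe.

```python
def ladder_pow(base, exponent, modulus):
    """Compute base**exponent mod modulus with the Montgomery ladder.

    Invariant after every step: r1 == r0 * base (mod modulus).
    Each key bit costs exactly one multiplication and one squaring,
    regardless of the bit value, which is what gives the ladder its
    regular operation pattern.
    """
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    r0, r1 = 1 % modulus, base % modulus
    for i in reversed(range(exponent.bit_length())):
        if (exponent >> i) & 1:
            r0, r1 = (r0 * r1) % modulus, (r1 * r1) % modulus
        else:
            r0, r1 = (r0 * r0) % modulus, (r0 * r1) % modulus
    return r0
```

The invariant between the two registers is also what fault countermeasures can check: if a fault corrupts one register, the relation `r1 == r0 * base` no longer holds at the end of the computation.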
Generalised Mersenne Numbers Revisited
Generalised Mersenne Numbers (GMNs) were defined by Solinas in 1999 and
feature in the NIST (FIPS 186-2) and SECG standards for use in elliptic curve
cryptography. Their form is such that modular reduction is extremely efficient,
thus making them an attractive choice for modular multiplication
implementation. However, the issue of residue multiplication efficiency seems
to have been overlooked. Asymptotically, using a cyclic rather than a linear
convolution, residue multiplication modulo a Mersenne number is twice as fast
as integer multiplication; this property does not hold for prime GMNs, unless
they are of Mersenne's form. In this work we exploit an alternative
generalisation of Mersenne numbers for which an analogue of the above property
--- and hence the same efficiency ratio --- holds, even at bitlengths for which
schoolbook multiplication is optimal, while also maintaining very efficient
reduction. Moreover, our proposed primes are abundant at any bitlength, whereas
GMNs are extremely rare. Our multiplication and reduction algorithms can also
be easily parallelised, making our arithmetic particularly suitable for
hardware implementation. Furthermore, the field representation we propose also
naturally protects against side-channel attacks, including timing attacks,
simple power analysis and differential power analysis, which is essential in
many cryptographic scenarios, in contrast to GMNs.
Comment: 32 pages. Accepted to Mathematics of Computatio
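The two classical properties of Mersenne moduli that this abstract starts from can be sketched as follows: reduction modulo 2^k - 1 needs only shifts and additions, and multiplication modulo 2^(w·n) - 1 is a cyclic convolution of the n limbs. This illustrates the plain Mersenne case only, not the alternative generalisation the paper proposes; the function names are assumptions, and the convolution is written in schoolbook form purely to make the index wrap-around visible.

```python
def mod_mersenne(x, k):
    """Reduce a non-negative x modulo p = 2**k - 1 by folding, since 2**k = 1 (mod p)."""
    p = (1 << k) - 1
    while x > p:
        x = (x & p) + (x >> k)  # low k bits plus the remaining high part
    return 0 if x == p else x


def cyclic_mul(a, b, w, n):
    """Multiply a and b modulo 2**(w*n) - 1 via cyclic convolution of n w-bit limbs."""
    mask = (1 << w) - 1
    a_limbs = [(a >> (w * i)) & mask for i in range(n)]
    b_limbs = [(b >> (w * i)) & mask for i in range(n)]
    c = [0] * n
    for i in range(n):
        for j in range(n):
            # 2**(w*(i+j)) wraps around to 2**(w*((i+j) % n)) modulo 2**(w*n) - 1
            c[(i + j) % n] += a_limbs[i] * b_limbs[j]
    total = sum(ci << (w * i) for i, ci in enumerate(c))
    return mod_mersenne(total, w * n)
```

The cyclic product has n output coefficients instead of the 2n - 1 of a linear convolution, which is the source of the factor-of-two advantage the abstract refers to.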
Efficient long division via Montgomery multiply
We present a novel right-to-left long division algorithm based on the
Montgomery modular multiply, consisting of separate highly efficient loops with
simple carry structure for computing first the remainder (x mod q) and then the
quotient floor(x/q). These loops are ideally suited for the case where x
occupies many more machine words than the divide modulus q, and are strictly
linear time in the "bitsize ratio" lg(x)/lg(q). For the paradigmatic
performance test of multiword dividend and single 64-bit-word divisor,
exploitation of the inherent data-parallelism of the algorithm effectively
mitigates the long latency of hardware integer MUL operations, as a result of
which we are able to achieve respective costs for remainder-only and full-DIV
(remainder and quotient) of 6 and 12.5 cycles per dividend word on the Intel
Core 2 implementation of the x86_64 architecture, in single-threaded execution
mode. We further describe a simple "bit-doubling modular inversion" scheme,
which allows the entire iterative computation of the mod-inverse required by
the Montgomery multiply at arbitrarily large precision to be performed with
cost less than that of a single Newtonian iteration performed at the full
precision of the final result. We also show how the Montgomery-multiply-based
powering can be efficiently used in Mersenne and Fermat-number trial
factorization via direct computation of a modular inverse power of 2, without
any need for explicit radix-mod scalings.
Comment: 23 pages; 8 tables. v2: Tweak formatting, pagecount -= 2. v3: Fix
incorrect powers of R in formulae [7] and [11]. v4: Add Eldridge & Walter ref.
v5: Clarify relation between Algos A/A',D and Hensel-div; clarify
true-quotient mechanics; add Haswell timings, refs to Agner Fog timings pdf
and GMP asm-timings ref-page. v6: Remove stray +bw in MULL line of Algo D
listing; add note re byte-LUT for qinv_
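The idea behind a bit-doubling modular inversion can be sketched with the standard Newton (Hensel) iteration for the inverse of an odd q modulo a power of two, where each step doubles the number of correct low-order bits and is carried out only at the precision it needs. This is a generic illustration under that assumption, not the paper's exact scheme: the name `inv_mod_pow2`, the 64-bit default, and the 3-bit seed `x = q` are choices made here.

```python
def inv_mod_pow2(q, bits=64):
    """Return x with q * x = 1 (mod 2**bits) for odd q.

    If q*x = 1 (mod 2**k), then x' = x * (2 - q*x) satisfies
    q*x' = 1 - (1 - q*x)**2 = 1 (mod 2**(2k)), so the number of
    correct bits doubles per iteration.
    """
    if q % 2 == 0:
        raise ValueError("q must be odd")
    x = q          # q*q = 1 (mod 8) for every odd q: 3 correct bits
    correct = 3
    while correct < bits:
        correct = min(2 * correct, bits)
        mask = (1 << correct) - 1
        x = (x * (2 - q * x)) & mask  # work only at the precision reached so far
    return x & ((1 << bits) - 1)
```

Because the early iterations run at low precision, their cost is small compared with the final full-precision step, which is the effect the abstract describes.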