82 research outputs found
Secure and Efficient RNS Approach for Elliptic Curve Cryptography
Scalar multiplication, the main operation in elliptic
curve cryptographic protocols, is vulnerable to side-channel
(SCA) and fault injection (FA) attacks. An efficient countermeasure
for scalar multiplication can be provided by using alternative
number systems like the Residue Number System (RNS). In RNS,
a number is represented as a set of smaller numbers, where each
one is the residue of the number modulo one element of a given moduli
basis. Under certain requirements, a number can be uniquely
transformed from the integers to the RNS domain (and vice
versa) and all arithmetic operations can be performed in RNS.
This representation provides an inherent SCA and FA resistance
to many attacks and can be further enhanced by RNS arithmetic
manipulation or more traditional algorithmic countermeasures.
In this paper, extending our previous work, we explore the
potential of RNS as an SCA and FA countermeasure and provide
a description of RNS-based SCA and FA resistance means. We
propose a secure and efficient Montgomery Power Ladder based
scalar multiplication algorithm on RNS and discuss its SCA-FA
resistance. The proposed algorithm is implemented on an
ARM Cortex A7 processor and its SCA-FA resistance is evaluated
by collecting preliminary leakage trace results that validate our
initial assumptions.
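As an illustration of the RNS representation described in the abstract above, the following is a minimal Python sketch, not taken from the paper. The moduli basis, the function names (`to_rns`, `from_rns`, `rns_add`, `rns_mul`) and the sample values are assumptions chosen for illustration. It assumes a basis of pairwise coprime moduli, which is the usual requirement for a unique conversion, and rebuilds the integer with the Chinese Remainder Theorem.

```python
from math import prod

# Hypothetical moduli basis: pairwise coprime, dynamic range M = 13*17*19*23 = 96577.
BASIS = [13, 17, 19, 23]

def to_rns(x, basis=BASIS):
    """Forward conversion: one residue per modulus of the basis."""
    return [x % m for m in basis]

def from_rns(residues, basis=BASIS):
    """Reverse conversion via the Chinese Remainder Theorem (needs Python 3.8+)."""
    M = prod(basis)
    x = 0
    for r, m in zip(residues, basis):
        Mi = M // m                       # product of all the other moduli
        x += r * Mi * pow(Mi, -1, m)      # pow(Mi, -1, m) is the inverse of Mi mod m
    return x % M

def rns_add(a, b, basis=BASIS):
    """Channel-wise addition: no carries propagate between channels."""
    return [(x + y) % m for x, y, m in zip(a, b, basis)]

def rns_mul(a, b, basis=BASIS):
    """Channel-wise multiplication: each channel is independent of the others."""
    return [(x * y) % m for x, y, m in zip(a, b, basis)]

if __name__ == "__main__":
    a, b = to_rns(1234), to_rns(56)
    print(a)                              # residues of 1234 in the basis
    print(from_rns(rns_mul(a, b)))        # equals 1234 * 56, since it is below M
```

Results are only meaningful while they stay below the dynamic range M. Larger values wrap around modulo M, which is why cryptographic uses pair this representation with a dedicated modular reduction such as RNS Montgomery multiplication.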
Residue Number System Hardware Emulator and Instructions Generator
The Residue Number System (RNS) is an alternative
form of representing integers in which a large value is
represented by a set of smaller and independent integers.
Cryptographic and signal filtering algorithms benefit from the
use of RNS, due to its capability to increase performance and
security. Herein, a simulation tool is presented which emulates
the hardware implementation of an actual RNS co-processor. A
"high-level to assembly" instructions generator is also built into
this tool. The programmability and scalable architecture of the
considered processor, along with the high-level description of the
algorithm, allow researchers and developers to easily evaluate
and test their RNS algorithms on an actual architecture, using
Java.
Exact Error Bound of Cox-Rower Architecture for RNS Arithmetic
The Residue Number System (RNS) is a method for representing an integer as an n-tuple of its residues with respect to a given base. Since RNS has inherent parallelism, it is actively researched as a basis for fast public-key cryptography. This paper derives the exact error bound of the approximation used in the Cox-Rower architecture, which was proposed for RNS modular multiplication. This is the tightest bound ever found and enables us to find new parameter sets for the Cox-Rower architecture that cannot be found with the old bounds.
An algorithmic and architectural study on Montgomery exponentiation in RNS
Modular exponentiation on large numbers is computationally intensive. An effective way of performing this operation is to use Montgomery exponentiation in the Residue Number System (RNS). This paper presents an algorithmic and architectural study of this exponentiation approach. From the algorithmic point of view, new and state-of-the-art opportunities that come from the reorganization of operations and precomputations are considered. From the architectural perspective, the design opportunities offered by well-known computer arithmetic techniques are studied, with the aim of developing an efficient arithmetic cell architecture. Furthermore, since the use of efficient RNS bases with a low Hamming weight is being considered with ever more interest, four additional cell architectures specifically tailored to these bases are developed and the tradeoff between benefits and drawbacks is carefully explored. An overall comparison among all the considered algorithmic approaches and cell architectures is presented, with the aim of providing the reader with an extensive overview of the Montgomery exponentiation opportunities in RNS.
A coprocessor for secure and high speed modular arithmetic
We present a coprocessor design for fast arithmetic over large numbers of cryptographic sizes. Our design provides an efficient way to prevent side channel analysis as well as fault analysis targeting modular arithmetic with large prime or composite numbers. These two countermeasures are thus suitable both for Elliptic Curve Cryptography over prime fields and for RSA, with or without CRT. To do so, we use the residue number system (RNS) in an efficient manner to protect against leakage and faults, while keeping its ability to execute modular arithmetic with large numbers quickly. We illustrate our countermeasure with a fully protected RSA-CRT implementation using our architecture, and show that it is possible to execute a secure 1024 bit RSA-CRT in less than 0.7 ms on an FPGA.
Improving Modular Inversion in RNS using the Plus-Minus Method
The paper describes a new RNS modular inversion algorithm based on the extended Euclidean algorithm and the plus-minus trick. In our algorithm, comparisons over large RNS values are replaced by cheap computations modulo 4. Comparisons to an RNS version based on Fermat's little theorem were carried out. The number of elementary modular operations is significantly reduced: by a factor of 12 to 26 for multiplications and 6 to 21 for additions. Virtex 5 FPGA implementations show that, for a similar area, our plus-minus RNS modular inversion is 6 to 10 times faster.
High-Speed and Unified ECC Processor for Generic Weierstrass Curves over GF(p) on FPGA
In this paper, we present a high-speed, unified elliptic curve cryptography (ECC) processor for arbitrary Weierstrass curves over GF(p), which, to the best of our knowledge, outperforms other similar works in terms of execution time. Our approach employs a combination of the schoolbook long and Karatsuba multiplication algorithms for the elliptic curve point multiplication (ECPM) to achieve better parallelization while retaining low complexity. In the hardware implementation, the substantial gain in speed also comes from our n-bit pipelined Montgomery Modular Multiplier (pMMM), which is constructed from our n-bit pipelined multiplier-accumulators that utilize digital signal processor (DSP) primitives as digit multipliers. Additionally, we introduce our unified, pipelined modular adder-subtractor (pMAS) for the underlying field arithmetic, and leverage a more efficient yet compact scheduling of the Montgomery ladder algorithm. The implementation for a 256-bit modulus size on the 7-series FPGAs Virtex-7, Kintex-7, and XC7Z020 yields 0.139, 0.138, and 0.206 ms of execution time, respectively. Furthermore, since our pMMM module is generic for any curve in Weierstrass form, we support multi-curve parameters, resulting in a unified ECC architecture. Lastly, our method also works in constant time, making it suitable for applications requiring high speed and SCA-resistant characteristics.
A FPGA pairing implementation using the Residue Number System
Recently, a lot of progress has been made in software implementations of pairings
at the 128-bit security level in large characteristic. In this work, we obtain analogous progress
for hardware implementations. For this, we use the RNS representation of numbers, which is
especially well suited for pairing computation in a hardware context. An FPGA implementation
is proposed, based on an adaptation of Guillermin's architecture, which computes a pairing in
1.07 ms. It is 2 times faster than all previous hardware implementations (including ASIC and
small characteristic implementations) and almost as fast as the best software implementations.
Parallel FPGA Implementation of RSA with Residue Number Systems - Can side-channel threats be avoided? - Extended version
In this paper, we present a new parallel architecture to avoid
side-channel analyses such as: timing attack, simple/differential
power analysis, fault induction attack and simple/differential
electromagnetic analysis. We use a Montgomery Multiplication based
on Residue Number Systems. Thanks to RNS, we develop a design able
to perform an RSA signature in parallel on a set of identical and
independent coprocessors. Of independent interest, we propose a
new DPA countermeasure in the framework of RNS. It is only
(slightly) memory consuming (1.5 KBytes). Finally, we synthesized
our new architecture on FPGA and it presents promising performance
results. Even if our aim is to sketch a secure architecture, the
RSA signature is performed in less than 160 ms, with competitive
hardware resources. To our knowledge, this is the first proposal
of an architecture counteracting electromagnetic analysis apart
from hardware countermeasures reducing electromagnetic radiations
- …