Search CORE

24,099 research outputs found

Realizing arbitrary-precision modular multiplication with a fixed-precision multiplier datapath

Author: Grossschaedl Johann
Savas Erkay
Savaş Erkay
Yumbul Kazım
Yumbul Kazim
Publication venue: IEEE (Institute of Electrical and Electronics Engineers)
Publication date: 30/09/2009
Field of study

Within the context of cryptographic hardware, the term scalability refers to the ability to process operands of any size, regardless of the precision of the underlying data path or registers. In this paper we present a simple yet effective technique for increasing the scalability of a fixed-precision Montgomery multiplier. Our idea is to extend the datapath of a Montgomery multiplier in such a way that it can also perform an ordinary multiplication of two n-bit operands (without modular reduction), yielding a 2n-bit result. This conventional (nxn->2n)-bit multiplication is then used as a “sub-routine” to realize arbitrary-precision Montgomery multiplication according to standard software algorithms such as Coarsely Integrated Operand Scanning (CIOS). We show that performing a 2n-bit modular multiplication on an n-bit multiplier can be done in 5n clock cycles, whereby we assume that the n-bit modular multiplication takes n cycles. Extending a Montgomery multiplier for this extra functionality requires just some minor modifications of the datapath and entails a slight increase in silicon area

Crossref

Sabanci University Research Database

Open Repository and Bibliography - Luxembourg

Analysis of Parallel Montgomery Multiplication in CUDA

Author: Liu Yuheng
Publication venue: SJSU ScholarWorks
Publication date: 01/04/2013
Field of study

For a given level of security, elliptic curve cryptography (ECC) offers improved efficiency over classic public key implementations. Point multiplication is the most common operation in ECC and, consequently, any significant improvement in perfor- mance will likely require accelerating point multiplication. In ECC, the Montgomery algorithm is widely used for point multiplication. The primary purpose of this project is to implement and analyze a parallel implementation of the Montgomery algorithm as it is used in ECC. Specifically, the performance of CPU-based Montgomery multiplication and a GPU-based implementation in CUDA are compared

SJSU ScholarWorks

Efficient Pipelining for Modular Multiplication Architectures in Prime Fields

Author: Bart Preneel
Ingrid Verbauwhede
Kazuo Sakiyama
Nele Mentens
Publication venue
Publication date: 01/01/2007
Field of study

This paper presents a pipelined architecture of a modular Montgomery multiplier, which is suitable to be used in public key coprocessors. Starting from a baseline implementation of the Montgomery algorithm, a more compact pipelined version is derived. The design makes use of 16bit integer multiplication blocks that are available on recently manufactured FPGAs. The critical path is optimized by omitting the exact computation of intermediate results in the Montgomery algorithm using a 6-2 carry-save notation. This results in a high-speed architecture, which outperforms previously designed Montgomery multipliers. Because a very popular application of Montgomery multiplication is public key cryptography, we compare our implementation to the state-of-the-art in Montgomery multipliers on the basis of performance results for 1024-bit RSA

CiteSeerX

Crossref

Edwards curves and CM curves

Author: Morain François
Publication venue
Publication date: 14/04/2009
Field of study

Edwards curves are a particular form of elliptic curves that admit a fast, unified and complete addition law. Relations between Edwards curves and Montgomery curves have already been described. Our work takes the view of parameterizing elliptic curves given by their j-invariant, a problematic that arises from using curves with complex multiplication, for instance. We add to the catalogue the links with Kubert parameterizations of X0(2) and X0(4). We classify CM curves that admit an Edwards or Montgomery form over a finite field, and justify the use of isogenous curves when needed

arXiv.org e-Print Archive

HAL-Polytechnique

Survey on Hardware Implementation of Montgomery Modular

Author: Muthaiah Rajappa
Pratibha K
Publication venue: http://www.ijpam.eu
Publication date: 01/01/2018
Field of study

This paper gives the information regarding different methodology for modular multiplication with the modification of Montgomery algorithm. Montgomery multiplier proved to be more efficient multiplier which replaces division by the modulus with series of shifting by a number and an adder block. For larger number of bits, Modular multiplication takes more time to compute and also takes more area of the chip. Different methods ensure more speed and less chip size of the system. The speed of the multiplier is decided by the multiplier. Here three modified Montgomery algorithm discussed with their output compared with each other. The three methods are Iterative architecture, Montgomery multiplier for faster Cryptography and Vedic multipliers used in Montgomery algorithm for multiplication.Here three boards have been used for the analysis and they are Altera DE2-70, FPGA board Virtex 6 and Kintex 7

E-LIS

A versatile Montgomery multiplier architecture with characteristic three support

Author: Ozturk Erdinc
Savas Erkay
Savaş Erkay
Sunar Berk
Öztürk Erdinç
Publication venue: 'Elsevier BV'
Publication date: 08/05/2008
Field of study

We present a novel unified core design which is extended to realize Montgomery multiplication in the fields GF(2n), GF(3m), and GF(p). Our unified design supports RSA and elliptic curve schemes, as well as the identity-based encryption which requires a pairing computation on an elliptic curve. The architecture is pipelined and is highly scalable. The unified core utilizes the redundant signed digit representation to reduce the critical path delay. While the carry-save representation used in classical unified architectures is only good for addition and multiplication operations, the redundant signed digit representation also facilitates efficient computation of comparison and subtraction operations besides addition and multiplication. Thus, there is no need for a transformation between the redundant and the non-redundant representations of field elements, which would be required in the classical unified architectures to realize the subtraction and comparison operations. We also quantify the benefits of the unified architectures in terms of area and critical path delay. We provide detailed implementation results. The metric shows that the new unified architecture provides an improvement over a hypothetical non-unified architecture of at least 24.88%, while the improvement over a classical unified architecture is at least 32.07%

Sabanci University Research Database

Secure and Efficient RNS Approach for Elliptic Curve Cryptography

Author: Batina Lejla
Fournaris Apostolos P.
Papachristodoulou Louiza
Sklavos Nicolas
Publication venue
Publication date: 15/11/2016
Field of study

Scalar multiplication, the main operation in elliptic curve cryptographic protocols, is vulnerable to side-channel (SCA) and fault injection (FA) attacks. An efficient countermeasure for scalar multiplication can be provided by using alternative number systems like the Residue Number System (RNS). In RNS, a number is represented as a set of smaller numbers, where each one is the result of the modular reduction with a given moduli basis. Under certain requirements, a number can be uniquely transformed from the integers to the RNS domain (and vice versa) and all arithmetic operations can be performed in RNS. This representation provides an inherent SCA and FA resistance to many attacks and can be further enhanced by RNS arithmetic manipulation or more traditional algorithmic countermeasures. In this paper, extending our previous work, we explore the potentials of RNS as an SCA and FA countermeasure and provide an description of RNS based SCA and FA resistance means. We propose a secure and efficient Montgomery Power Ladder based scalar multiplication algorithm on RNS and discuss its SCAFA resistance. The proposed algorithm is implemented on an ARM Cortex A7 processor and its SCA-FA resistance is evaluated by collecting preliminary leakage trace results that validate our initial assumptions

UPCommons. Portal del coneixement obert de la UPC