1,856 research outputs found

Efficient Software Implementations of Modular Exponentiation

Author: Shay Gueron
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 28/06/2011

RSA computations have a significant effect on the workloads of SSL/TLS servers, and therefore their software implementations on general purpose processors are an important target for optimization. We concentrate here on 512-bit modular exponentiation, used for 1024-bit RSA. We propose optimizations in two directions. At the primitives’ level, we study and improve the performance of an “Almost” Montgomery Multiplication. At the exponentiation level, we propose a method to reduce the cost of protecting the w-ary exponentiation algorithm against cache/timing side channel attacks. Together, these lead to an efficient software implementation of 512-bit modular exponentiation, which outperforms the currently fastest publicly available alternative. When measured on the latest x86-64 architecture, the 2nd Generation Intel® Core™ processor, our implementation is 43% faster than that of the current version of OpenSSL (1.0.0d).
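
As a rough illustration of the w-ary exponentiation mentioned in this abstract, the sketch below implements fixed-window exponentiation in which every table lookup scans all entries and picks one with a mask, which is the generic way to hide the access pattern from a cache observer. It is not the cheaper protection method the paper proposes (the abstract does not describe it), and Python integers are not constant-time, so only the structure is shown. The names `select` and `window_pow` and the default window width `w=5` are assumptions made for this example.

```python
def select(table, index):
    """Return table[index] while reading every entry of the table."""
    result = 0
    for i, entry in enumerate(table):
        mask = -int(i == index)   # -1 (all ones) when i == index, otherwise 0
        result |= entry & mask    # keeps the entry only for the matching index
    return result


def window_pow(base, exponent, modulus, w=5):
    """Compute base**exponent mod modulus with w-bit fixed windows."""
    # Precompute base**0 .. base**(2**w - 1) mod modulus.
    table = [1 % modulus]
    for _ in range((1 << w) - 1):
        table.append(table[-1] * base % modulus)

    nbits = max(exponent.bit_length(), 1)
    nwindows = -(-nbits // w)     # ceiling division
    acc = 1 % modulus
    for k in reversed(range(nwindows)):
        for _ in range(w):        # w squarings per window
            acc = acc * acc % modulus
        digit = (exponent >> (k * w)) & ((1 << w) - 1)
        # Always multiply, even for digit 0, so the operation sequence is fixed.
        acc = acc * select(table, digit) % modulus
    return acc
```

Each window costs w squarings plus one multiplication, and the full-table scan in `select` is the overhead that a cheaper protection scheme would aim to reduce.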

Cryptology ePrint Archive

Fast modular squaring with AVX512IFMA

Author: Nir Drucker
Shay Gueron
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 11/04/2018

Modular exponentiation represents a significant workload for public key cryptosystems. Examples include not only the classical RSA, DSA, and DH algorithms, but also the partially homomorphic Paillier encryption. As a result, efficient software implementations of modular exponentiation are an important target for optimization. This paper studies methods for using Intel's forthcoming AVX512 Integer Fused Multiply Accumulate (AVX512IFMA) instructions in order to speed up modular (Montgomery) squaring, which dominates the cost of the exponentiation. We further show how a minor tweak in the architectural definition of AVX512IFMA has the potential to further speed up modular squaring.
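
To make concrete why squaring is cheaper than a general multiplication, the sketch below squares a big integer held in 52-bit limbs (the operand width that the AVX512IFMA multiply-accumulate instructions work on). Each cross product a[i]*a[j] with i < j is computed once and doubled, so n limbs need n(n+1)/2 limb products instead of n*n. This is plain schoolbook integer squaring in Python, not the paper's vectorized algorithm, and it omits the Montgomery reduction step; the helper names are assumptions made for this example.

```python
LIMB_BITS = 52
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(x, n):
    """Split a non-negative integer into n little-endian 52-bit limbs."""
    return [(x >> (LIMB_BITS * i)) & LIMB_MASK for i in range(n)]


def from_limbs(limbs):
    """Recombine little-endian 52-bit limbs into an integer."""
    return sum(v << (LIMB_BITS * i) for i, v in enumerate(limbs))


def square_limbs(a):
    """Square a limb vector, returning 2*len(a) normalized limbs."""
    n = len(a)
    acc = [0] * (2 * n)
    # Off-diagonal products: computed once for each pair i < j.
    for i in range(n):
        for j in range(i + 1, n):
            acc[i + j] += a[i] * a[j]
    # Double them, since a[i]*a[j] and a[j]*a[i] both appear in the square.
    acc = [2 * v for v in acc]
    # Diagonal products a[i]**2 appear only once.
    for i in range(n):
        acc[2 * i] += a[i] * a[i]
    # Propagate carries so every limb fits in 52 bits again.
    out, carry = [], 0
    for v in acc:
        v += carry
        out.append(v & LIMB_MASK)
        carry = v >> LIMB_BITS
    return out  # the final carry is zero because a**2 < 2**(104*n)
```

For a 512-bit operand stored in ten limbs, this performs 55 limb products rather than 100.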

Cryptology ePrint Archive

Parametric, Secure and Compact Implementation of RSA on FPGA

Author: Öksüzoğlu Ersin
Savaş Erkay
Publication venue: IEEE Computer Society
Publication date: 09/09/2008

We present a fast, efficient, and parameterized modular multiplier and a secure exponentiation circuit especially intended for FPGAs on the low end of the price range. The design utilizes dedicated block multipliers as the main functional unit and Block-RAM as storage unit for the operands. The adopted design methodology allows adjusting the number of multipliers, the radix used in the multipliers, and number of words to meet the system requirements such as available resources, precision and timing constraints. The architecture, based on the Montgomery modular multiplication algorithm, utilizes a pipelining technique that allows concurrent operation of hardwired multipliers. Our design completes 1020-bit and 2040-bit modular multiplications in 7.62 μs and 27.0 μs, respectively. The multiplier uses a moderate amount of system resources while achieving the best area-time product in the literature. A 2040-bit modular exponentiation engine fits easily into a Xilinx Spartan-3E 500; moreover, the exponentiation circuit withstands known side channel attacks.

Sabanci University Research Database

Implementing a protected zone in a reconfigurable processor for isolated execution of cryptographic algorithms

Author: Durahim Onur Ahmet
Savaş Erkay
Yumbul Kazım
Publication venue: IEEE (Institute of Electrical and Electronics Engineers)
Publication date: 01/01/2009

We design and realize a protected zone inside a reconfigurable and extensible embedded RISC processor for isolated execution of cryptographic algorithms. The protected zone is a collection of processor subsystems such as functional units optimized for high-speed execution of integer operations, a small amount of local memory, and general and special-purpose registers. We outline the principles for secure software implementation of cryptographic algorithms in a processor equipped with the protected zone. We also demonstrate the efficiency and effectiveness of the protected zone by implementing major cryptographic algorithms, namely RSA, elliptic curve cryptography, and AES in the protected zone. In terms of time efficiency, software implementations of these three cryptographic algorithms outperform equivalent software implementations on similar processors reported in the literature. The protected zone is designed in such a modular fashion that it can easily be integrated into any RISC processor; its area overhead is moderate enough that it can be used in the vast majority of embedded processors. The protected zone can also provide the necessary support to implement TPM functionality within the boundary of a processor.

Sabanci University Research Database

Realizing arbitrary-precision modular multiplication with a fixed-precision multiplier datapath

Author: Grossschaedl Johann
Savaş Erkay
Yumbul Kazım
Publication venue: IEEE (Institute of Electrical and Electronics Engineers)
Publication date: 30/09/2009

Within the context of cryptographic hardware, the term scalability refers to the ability to process operands of any size, regardless of the precision of the underlying data path or registers. In this paper we present a simple yet effective technique for increasing the scalability of a fixed-precision Montgomery multiplier. Our idea is to extend the datapath of a Montgomery multiplier in such a way that it can also perform an ordinary multiplication of two n-bit operands (without modular reduction), yielding a 2n-bit result. This conventional (n × n → 2n)-bit multiplication is then used as a “sub-routine” to realize arbitrary-precision Montgomery multiplication according to standard software algorithms such as Coarsely Integrated Operand Scanning (CIOS). We show that performing a 2n-bit modular multiplication on an n-bit multiplier can be done in 5n clock cycles, whereby we assume that the n-bit modular multiplication takes n cycles. Extending a Montgomery multiplier for this extra functionality requires just some minor modifications of the datapath and entails a slight increase in silicon area.
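
The CIOS method named in this abstract can be sketched in software as follows. Every product in the inner loops (`A[j] * B[i]` and `m * N[j]`) is a w × w → 2w-bit word multiplication, which is the role the abstract assigns to the multiplier's ordinary (non-modular) mode. This is a textbook-style Python sketch, not the paper's hardware schedule; the function names and the parameters `s` (word count) and `w` (word size in bits) are assumptions made for this example, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
def to_words(x, s, w):
    """Split x into s little-endian w-bit words."""
    mask = (1 << w) - 1
    return [(x >> (w * i)) & mask for i in range(s)]


def from_words(words, w):
    """Recombine little-endian w-bit words into an integer."""
    return sum(v << (w * i) for i, v in enumerate(words))


def cios_mont_mul(a, b, n, s, w):
    """Return a*b*R^-1 mod n, with R = 2**(w*s), n odd, and a, b < n."""
    mask = (1 << w) - 1
    A, B, N = to_words(a, s, w), to_words(b, s, w), to_words(n, s, w)
    n0_inv = (-pow(N[0], -1, 1 << w)) & mask   # -n^-1 mod 2**w
    t = [0] * (s + 2)
    for i in range(s):
        # Multiplication step: t += a * b[i], one word product at a time.
        carry = 0
        for j in range(s):
            cs = t[j] + A[j] * B[i] + carry
            t[j], carry = cs & mask, cs >> w
        cs = t[s] + carry
        t[s], t[s + 1] = cs & mask, cs >> w
        # Reduction step: add m*n so the low word becomes zero, then shift.
        m = (t[0] * n0_inv) & mask
        carry = (t[0] + m * N[0]) >> w
        for j in range(1, s):
            cs = t[j] + m * N[j] + carry
            t[j - 1], carry = cs & mask, cs >> w
        cs = t[s] + carry
        t[s - 1], carry = cs & mask, cs >> w
        t[s] = t[s + 1] + carry
    result = from_words(t[: s + 1], w)
    # The intermediate result is below 2n, so one subtraction suffices.
    return result - n if result >= n else result
```

Interleaving the multiplication and reduction steps per word of `b` is what keeps the temporary `t` at only s + 2 words.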

Crossref

Sabanci University Research Database

Open Repository and Bibliography - Luxembourg

Secure and Efficient RNS Approach for Elliptic Curve Cryptography

Author: Batina Lejla
Fournaris Apostolos P.
Papachristodoulou Louiza
Sklavos Nicolas
Publication date: 15/11/2016

Scalar multiplication, the main operation in elliptic curve cryptographic protocols, is vulnerable to side-channel (SCA) and fault injection (FA) attacks. An efficient countermeasure for scalar multiplication can be provided by using alternative number systems like the Residue Number System (RNS). In RNS, a number is represented as a set of smaller numbers, where each one is the result of the modular reduction with a given moduli basis. Under certain requirements, a number can be uniquely transformed from the integers to the RNS domain (and vice versa) and all arithmetic operations can be performed in RNS. This representation provides an inherent SCA and FA resistance to many attacks and can be further enhanced by RNS arithmetic manipulation or more traditional algorithmic countermeasures. In this paper, extending our previous work, we explore the potential of RNS as an SCA and FA countermeasure and provide a description of RNS-based SCA and FA resistance means. We propose a secure and efficient Montgomery Power Ladder based scalar multiplication algorithm on RNS and discuss its SCA-FA resistance. The proposed algorithm is implemented on an ARM Cortex A7 processor and its SCA-FA resistance is evaluated by collecting preliminary leakage trace results that validate our initial assumptions.
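
The RNS representation described in this abstract can be illustrated with a small sketch: an integer is mapped to its residues modulo a basis of pairwise coprime moduli, addition and multiplication act independently on each residue, and the Chinese Remainder Theorem maps the result back. The four-element basis and the function names below are assumptions chosen for readability; they are far smaller than anything suited to elliptic curve arithmetic, and the sketch includes none of the paper's countermeasures.

```python
from math import prod

# Illustrative basis: 251 is prime, 253 = 11*23, 255 = 3*5*17, 256 = 2**8,
# so the moduli are pairwise coprime.
MODULI = (251, 253, 255, 256)


def to_rns(x):
    """Map an integer to its tuple of residues."""
    return tuple(x % m for m in MODULI)


def rns_add(a, b):
    """Add two RNS values channel by channel."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def rns_mul(a, b):
    """Multiply two RNS values channel by channel."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues):
    """Recover the integer modulo the product of the moduli (CRT)."""
    M = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)   # pow(..., -1, m) needs Python 3.8+
    return total % M
```

Results are unique only while they stay below the product of the moduli, which is the kind of requirement the abstract alludes to for a unique transformation.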

UPCommons. Portal del coneixement obert de la UPC

ASIC design and implementation of a parallel exponentiation algorithm using optimized scalable Montgomery multipliers

Author: Kurniawan Budiyoso
Publication venue: 'Oregon State University'

Modular exponentiation and modular multiplication are the most used operations in current cryptographic systems. Some well-known cryptographic algorithms, such as RSA, Diffie-Hellman key exchange, and DSA, require modular exponentiation operations. Modular exponentiation is performed as a series of modular multiplications whose number and order depend on the exponent and on the exponentiation algorithm used. Cryptographic functions are very likely to be applied in current applications that perform information exchange to secure, verify, or authenticate data. Most notable is the use of such applications in Internet based information exchange. Smart cards, hand-helds, cell phones and many other small devices also need to perform information exchange and are likely to apply cryptographic functions. A hardware solution to perform a cryptographic function is generally faster and more secure than a software solution. Thus, a fast and area efficient modular exponentiation hardware solution would provide a better infrastructure for current cryptographic techniques. In certain cryptographic algorithms, very large precisions are used. Further, the precision may vary. Most of the hardware designs for modular multiplication and modular exponentiation are fixed-precision solutions. A scalable Montgomery Multiplier (MM) to perform modular multiplication has been proposed and can operate on input values of any bit-size, but the maximum bit-size should be known and is the limiting factor. The multiplier can calculate any operand size less than the maximal precision. However, this design's parameters should be optimized depending on the operand precision for which the design is used. A software application was developed in C to find the optimized design for the scalable MM module. It performs area-time trade-off for the most commonly used precisions in order to obtain a fast and area efficient solution for the common case.
A modular exponentiation system is developed using this scalable multiplier design. Since the multiplier can operate on any operand size up to a certain maximum value, the exponentiation system that utilizes the multiplier will inherit the same capability. This thesis work presents the design and implementation of an exponentiation algorithm in hardware utilizing the optimized scalable Montgomery Multiplier. The design uses a parallel exponentiation algorithm to reduce the total computation time. The modular exponentiation system experimental results are analyzed and compared with software and other hardware implementations.
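
The abstract does not say which parallel exponentiation algorithm the thesis uses. One common candidate, shown here purely as an assumption, is the right-to-left binary method: in each iteration the conditional multiplication and the squaring both read the same old value of `square`, so two multiplier units could run them side by side. The Python sketch below runs them sequentially, uses ordinary modular arithmetic in place of Montgomery multiplication, and uses names invented for this example.

```python
def rtl_pow(base, exponent, modulus):
    """Right-to-left binary exponentiation: base**exponent mod modulus."""
    result = 1 % modulus
    square = base % modulus           # holds base**(2**k) at iteration k
    while exponent:
        if exponent & 1:
            # Uses the current square; independent of the squaring below.
            result = result * square % modulus
        # Also uses only the current square, so hardware could overlap it
        # with the multiplication above.
        square = square * square % modulus
        exponent >>= 1
    return result
```

With two multipliers working in parallel, the latency is roughly one multiplication per exponent bit instead of up to two.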

ScholarsArchive@OSU