Realizing arbitrary-precision modular multiplication with a fixed-precision multiplier datapath

Author: Grossschaedl Johann
Savaş Erkay
Yumbul Kazım
Publication venue: IEEE (Institute of Electrical and Electronics Engineers)
Publication date: 30/09/2009

Within the context of cryptographic hardware, the term scalability refers to the ability to process operands of any size, regardless of the precision of the underlying datapath or registers. In this paper we present a simple yet effective technique for increasing the scalability of a fixed-precision Montgomery multiplier. Our idea is to extend the datapath of a Montgomery multiplier in such a way that it can also perform an ordinary multiplication of two n-bit operands (without modular reduction), yielding a 2n-bit result. This conventional (n × n → 2n)-bit multiplication is then used as a “sub-routine” to realize arbitrary-precision Montgomery multiplication according to standard software algorithms such as Coarsely Integrated Operand Scanning (CIOS). We show that performing a 2n-bit modular multiplication on an n-bit multiplier can be done in 5n clock cycles, whereby we assume that the n-bit modular multiplication takes n cycles. Extending a Montgomery multiplier for this extra functionality requires just some minor modifications of the datapath and entails a slight increase in silicon area.
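The CIOS word loop mentioned in this abstract can be sketched in software as follows. This is a minimal illustration, not the paper's hardware design: the function name `mont_mul_cios` and the word size `w` are assumptions, and each `A[j] * B[i]` product stands in for the plain word-by-word multiplication that the extended datapath would provide. It assumes an odd modulus, operands smaller than the modulus, and Python 3.8+ for the modular inverse via `pow`.

```python
def mont_mul_cios(a, b, n, w=32):
    """Return a*b*R^(-1) mod n with R = 2^(w*s), using the CIOS word loop.

    Assumes n is odd and 0 <= a, b < n. The word size w models a
    fixed-precision datapath; s is the number of w-bit words in n.
    """
    mask = (1 << w) - 1
    s = (n.bit_length() + w - 1) // w

    def words(x):
        # Split x into s little-endian w-bit words.
        return [(x >> (w * i)) & mask for i in range(s)]

    A, B, N = words(a), words(b), words(n)
    n0_inv = (-pow(n, -1, 1 << w)) & mask  # -n^(-1) mod 2^w
    t = [0] * (s + 2)

    for i in range(s):
        # Multiplication step: t += A * B[i]
        c = 0
        for j in range(s):
            cs = t[j] + A[j] * B[i] + c  # word product, double-width result
            t[j] = cs & mask
            c = cs >> w
        cs = t[s] + c
        t[s] = cs & mask
        t[s + 1] = cs >> w

        # Reduction step: add m*N so the low word becomes zero, then shift.
        m = (t[0] * n0_inv) & mask
        c = (t[0] + m * N[0]) >> w
        for j in range(1, s):
            cs = t[j] + m * N[j] + c
            t[j - 1] = cs & mask
            c = cs >> w
        cs = t[s] + c
        t[s - 1] = cs & mask
        t[s] = t[s + 1] + (cs >> w)

    result = sum(t[j] << (w * j) for j in range(s + 1))
    if result >= n:  # single final subtraction
        result -= n
    return result
```

Calling `mont_mul_cios(a, b, n)` returns the product in the Montgomery domain, so a caller would normally convert operands into and out of that domain around a chain of multiplications.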

Crossref

Sabanci University Research Database

Open Repository and Bibliography - Luxembourg

Montgomery and RNS for RSA Hardware Implementation

Author: Manochehri Kooroush
Pourmozafari Saadat
Sadeghian Babak
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 26/01/2012

Many architectures for RSA hardware implementation aim to improve its performance. The two main methods for this purpose are Montgomery multiplication and the residue number system (RNS). Both are fast methods for converting plaintext to ciphertext in a hardware implementation of the RSA algorithm. RNS is faster than Montgomery but uses more area. The goal of this paper is to compare these two methods in terms of speed and area. For this purpose, the best-performing architecture for each method is selected, and some modifications are made to enhance its performance. This comparison can be used to select the proper method for hardware implementation in both FPGA and ASIC design.
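To illustrate why RNS trades area for speed, the sketch below multiplies two integers channel by channel over small coprime moduli and reconstructs the result with the Chinese Remainder Theorem. The moduli and the function names (`to_rns`, `rns_mul`, `from_rns`) are illustrative assumptions, not taken from the paper, and the modular reduction needed for RSA is left out. It requires Python 3.8+.

```python
from math import prod

# Pairwise coprime moduli, chosen only for illustration.
MODULI = (251, 253, 255, 256)
M = prod(MODULI)  # dynamic range of this residue number system


def to_rns(x):
    """Represent x by its residues modulo each channel modulus."""
    return tuple(x % m for m in MODULI)


def rns_mul(xs, ys):
    """Multiply channel by channel; channels are independent of each other."""
    return tuple((x * y) % m for x, y, m in zip(xs, ys, MODULI))


def from_rns(rs):
    """Reconstruct the integer modulo M via the Chinese Remainder Theorem."""
    total = 0
    for r, m in zip(rs, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)
    return total % M
```

Each channel works on short residues with no carries between channels, which is what allows parallel hardware units, at the cost of one multiplier per modulus.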

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Low Power and Improved Speed Montgomery Multiplier using Universal Building Blocks

Author: Chinnayan Senthilpari
Joseph Sheela Francisca
Pitchandi Velrajkumar
Raj Nirmal
Publication venue: Electronics and Telecommunications Committee
Publication date: 01/01/2019

This paper describes arithmetic blocks for a Montgomery Multiplier (MM) that reduce complexity, lower power dissipation and raise operating frequency. The main objective in designing these arithmetic blocks is to use a modified full adder structure and a carry save adder structure that can be implemented in an algorithm-based MM circuit. Three full adder designs are considered: the conventional full adder, which acts as a benchmark for comparison; a full adder based on a modified Boolean equation; and a full adder consisting of two XOR gates and a 2-to-1 multiplexer. Besides universal gates such as NOR and NAND gates, full adder circuits are used to further improve the speed of the circuit. The MM circuit is evaluated on parameters such as operating frequency, power dissipation and occupied area on an FPGA board. The schematic designs of the arithmetic components and the MM architecture are constructed using the Quartus II tool, while simulation for verification of circuit functionality is done using ModelSim. The results show that the full adder design with two XOR gates and one 2-to-1 multiplexer improves power dissipation, operating frequency and area.
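A full adder built from two XOR gates and a 2-to-1 multiplexer can be modelled at the bit level as below. This is a behavioural sketch of the well-known structure, not the paper's circuit; the names `mux2` and `full_adder`, and the choice of input `a` as the multiplexer's default data input, are assumptions.

```python
def mux2(sel, d0, d1):
    """2-to-1 multiplexer: output d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def full_adder(a, b, cin):
    """One-bit full adder from two XOR gates and one 2-to-1 multiplexer."""
    p = a ^ b                # first XOR: propagate signal
    s = p ^ cin              # second XOR: sum bit
    cout = mux2(p, a, cin)   # carry is cin when a != b, otherwise a (== b)
    return s, cout
```

The multiplexer replaces the AND/OR carry logic of the conventional design: when the inputs differ the carry-in passes through, and when they agree the carry equals either input.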

International Journal of Electronics and Telecommunications (Warsaw University of Technology)

SHDL@MMU Digital Repository

Maximizing the Efficiency using Montgomery Multipliers on FPGA in RSA Cryptography for Wireless Sensor Networks

Author: Leelavathi G, Shaila K, Venugopal K R
Publication venue: Auricle Global Society of Education and Research
Publication date: 30/11/2017

The architecture and modeling of RSA public key encryption/decryption systems are presented in this work. Two different architectures are proposed, mMMM42 (modified Montgomery Modular Multiplier 4 to 2 Carry Save Architecture) and RSACIPHER128, to check their suitability for implementation in Wireless Sensor Nodes for use in Wireless Sensor Networks. The design can easily be fitted into systems that require different levels of security by changing the key size. The processing time is increased and space utilization is reduced in FPGA due to its reusability. VHDL code is synthesized and simulated using Xilinx-ISE for both architectures. The architectures are compared in terms of area and time. It is verified that this architecture supports a key size of 128 bits. The implementation of the RSA encryption/decryption algorithm on FPGA using 128-bit data and key size with RSACIPHER128 gives good results with 50% less hardware utilization. This design is also implemented for ASIC using Mentor Graphics.

International Journal on Future Revolution in Computer Science & Communication Engineering

A high-speed integrated circuit with applications to RSA Cryptography

Author: Onions Paul David
Publication venue: University of Plymouth
Publication date: 01/01/1995

The rapid growth in the use of computers and networks in government, commercial and private communications systems has led to an increasing need for these systems to be secure against unauthorised access and eavesdropping. To this end, modern computer security systems employ public-key ciphers, of which probably the most well known is the RSA ciphersystem, to provide both secrecy and authentication facilities. The basic RSA cryptographic operation is a modular exponentiation where the modulus and exponent are integers typically greater than 500 bits long. Therefore, to obtain reasonable encryption rates using the RSA cipher requires that it be implemented in hardware. This thesis presents the design of a high-performance VLSI device, called the WHiSpER chip, that can perform the modular exponentiations required by the RSA cryptosystem for moduli and exponents up to 506 bits long. The design has an expected throughput in excess of 64kbit/s making it attractive for use both as a general RSA processor within the security function provider of a security system, and for direct use on moderate-speed public communication networks such as ISDN. The thesis investigates the low-level techniques used for implementing high-speed arithmetic hardware in general, and reviews the methods used by designers of existing modular multiplication/exponentiation circuits with respect to circuit speed and efficiency. A new modular multiplication algorithm, MMDDAMMM, based on Montgomery arithmetic, together with an efficient multiplier architecture, are proposed that remove the speed bottleneck of previous designs. Finally, the implementation of the new algorithm and architecture within the WHiSpER chip is detailed, along with a discussion of the application of the chip to ciphering and key generation.
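The modular exponentiation at the core of RSA can be sketched with the textbook left-to-right square-and-multiply method. This is not the thesis's MMDDAMMM algorithm or the WHiSpER architecture; the function name `mod_exp` is an assumption, and the plain `%` reductions stand in for whatever modular multiplier the hardware provides.

```python
def mod_exp(base, exponent, modulus):
    """Left-to-right binary (square-and-multiply) modular exponentiation.

    Assumes exponent >= 0 and modulus >= 1. Each loop iteration costs one
    modular squaring plus, for a 1 bit, one modular multiplication.
    """
    result = 1 % modulus
    base %= modulus
    for bit in bin(exponent)[2:]:  # most significant bit first
        result = (result * result) % modulus
        if bit == "1":
            result = (result * base) % modulus
    return result
```

For a 500-bit exponent this amounts to roughly 500 squarings and, on average, about 250 multiplications, which is why the speed of the modular multiplier dominates RSA throughput.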

Plymouth Electronic Archive and Research Library

OpenGrey Repository

Comparison of Scalable Montgomery Modular Multiplication Implementations Embedded in Reconfigurable Hardware

Author: Drutarovský Milos
Fischer Viktor
Simka Martin
Publication venue: Corporacion Universitaria Latinoamericana CUL
Publication date: 01/01/2006

This paper presents a comparison of possible approaches for an efficient implementation of Multiple-word radix-2 Montgomery Modular Multiplication (MM) on modern Field Programmable Gate Arrays (FPGAs). The hardware implementation of the MM coprocessor is fully scalable, which means that it can be reused to generate long-precision results independently of the word length of the originally proposed coprocessor. The first of the analyzed implementations uses a data path based on the traditionally used redundant carry-save adders; the second exploits standard carry-propagate adders with fast carry chain logic, which have not yet been applied in scalable designs. An embedded soft-core processor, Altera NIOS, is employed as a control unit and as a platform for a purely software implementation. All implementations use large embedded memory blocks available in recent FPGAs. Speed and logic requirements are compared for the optimized software and combined hardware-software designs in Altera FPGAs. The issues of targeting a design specifically for an FPGA are considered, taking into account the underlying architecture imposed by the target FPGA technology. It is shown that the coprocessors based on carry-save adders and carry-propagate adders provide comparable results in constrained FPGA implementations, but in the case of carry-propagate logic the solution requires less embedded memory and provides some additional implementation advantages presented in the paper.
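The bit-serial recurrence underlying radix-2 Montgomery multiplication can be sketched as follows. This is the single-word base algorithm only, an assumption made for brevity: the multiple-word variant compared in the paper processes the operands word by word, which is not modelled here. The function name `mont_mul_radix2` is hypothetical; the modulus must be odd and the operands smaller than the modulus.

```python
def mont_mul_radix2(a, b, n):
    """Return a*b*2^(-k) mod n, where k is the bit length of n.

    One bit of a is consumed per iteration. Assumes n odd, 0 <= a, b < n.
    """
    k = n.bit_length()
    s = 0
    for i in range(k):
        if (a >> i) & 1:   # add b when the current bit of a is set
            s += b
        if s & 1:          # make s even by adding the odd modulus
            s += n
        s >>= 1            # exact division by 2
    if s >= n:             # s < 2n holds, so one subtraction suffices
        s -= n
    return s
```

In hardware, the two conditional additions per iteration are where the choice between carry-save and carry-propagate adders matters.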

HAL-UJM

Low-cost, low-power FPGA implementation of ED25519 and CURVE25519 point multiplication

Author: Doche Christophe
Mehrabi Ali
Publication venue: MDPI AG
Publication date: 01/01/2019

Twisted Edwards curves have been at the center of attention since their introduction by Bernstein et al. in 2007. The curve ED25519, used for the Edwards-curve Digital Signature Algorithm (EdDSA), provides faster digital signatures than existing schemes without sacrificing security. CURVE25519 is a Montgomery curve that is closely related to ED25519. It provides a simple, constant-time, and fast point multiplication, which is used by the key exchange protocol X25519. Software implementations of EdDSA and X25519 are used in many web-based PC and mobile applications. In this paper, we introduce a low-power, low-area FPGA implementation of the ED25519 and CURVE25519 scalar multiplication that is particularly relevant for Internet of Things (IoT) applications. The efficiency of the arithmetic modulo the prime number 2^255 − 19, in particular the modular reduction and modular multiplication, is key to the efficiency of both EdDSA and X25519. To reduce the complexity of the hardware implementation, we propose a high-radix interleaved modular multiplication algorithm. One benefit of this architecture is to avoid the use of large-integer multipliers relying on FPGA DSP modules.
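The special form of the prime 2^255 − 19 makes reduction cheap, because 2^255 is congruent to 19 modulo the prime. The sketch below shows the standard folding trick; it is not the paper's high-radix interleaved algorithm, and the names `P25519` and `reduce_p25519` are assumptions.

```python
P25519 = (1 << 255) - 19


def reduce_p25519(x):
    """Reduce a non-negative integer modulo 2^255 - 19 by folding.

    Since 2^255 = 19 (mod p), the bits above position 255 can be
    multiplied by 19 and added back onto the low 255 bits.
    """
    mask = (1 << 255) - 1
    while x >> 255:
        x = (x & mask) + 19 * (x >> 255)
    if x >= P25519:  # x < 2^255 here, so one subtraction is enough
        x -= P25519
    return x
```

For the product of two reduced field elements the loop runs only a couple of times, so the cost is a few shifts, small multiplications by 19, and additions rather than a general division.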

Western Sydney ResearchDirect

Hardware-Software Codesign of a Vector Co-processor for Public Key Cryptography

Author: Fournier Jacques Jean-Alain
Moore Simon
Publication venue: IEEE Computer Society
Publication date: 01/01/2006

Until now, most cryptography implementations on parallel architectures have focused on adapting the software to SIMD architectures initially meant for media applications. In this paper, we review some of the most significant contributions in this area. We then propose a vector architecture to efficiently implement long precision modular multiplications. Having such a data level parallel hardware provides a circuit whose decode and schedule units are at least of the same complexity as those of a scalar processor. The excess transistors are mainly found in the data path. Moreover, the vector approach gives a very modular architecture where resources can be easily redefined. We built a functional simulator onto which we performed a quantitative analysis to study how the resizing of those resources affects the performance of the modular multiplication operation. Hence we not only propose a vector architecture for our Public Key cryptographic operations but also show how we can analyze the impact of design choices on performance. The proposed architecture is also flexible in the sense that the software running on it would offer room for the implementation of counter-measures against side-channel or fault attacks.

Crossref

HAL-EMSE

Performance Analysis of Montgomery Multiplier using 32nm CNTFET Technology

Author: Alias Nurul Ezaila
Jayashri S
Mathan N
Tan Michael Loong Peng
Publication venue: IAES Indonesia Section
Publication date: 02/01/2020

In VLSI design, varying the parameters results in variation of critical factors like area, power and delay. Digital multipliers are the dominant sources of power dissipation in digital systems. A digital multiplier plays a major role in a variety of arithmetic operations in digital signal processing applications that hinge on add-and-shift algorithms. To achieve high execution speed, parallel array multipliers are widely used. The crucial drawback of these multipliers is that they consume more power than other multiplier architectures. Montgomery Multiplication is a popularly used algorithm, as it is the most efficient technique to perform arithmetic-based calculations. A high-speed multiplier is highly desirable. The primary blocks of a multiplier are adders. Thus, a significant reduction in power consumption at the chip level can be attained by decreasing the power used in the adders. To obtain the desired performance parameters of the multiplier, an efficient and dynamic adder is proposed and incorporated in the Montgomery multiplier. The Carbon Nanotube field effect transistor (CNTFET) is a promising new device that may overcome some of the fundamental limitations of a silicon-based MOSFET. The architecture has been designed in 130nm and 32nm CMOS and CNTFET technology in Synopsys HSpice. The parameters analysed to determine performance are power, delay and power delay product, and a comparison is made between the two technologies. The simulation results show that the CNTFET-based Montgomery multiplier improved power consumption by 76.47%, speed by 72.67% and overall energy by 67.76% compared to the MOSFET-based Montgomery multiplier.

Indonesian Journal of Electrical Engineering and Informatics (IJEEI)

Efficient Side-Channel Aware Elliptic Curve Cryptosystems over Prime Fields

Author: Karakoyunlu Deniz
Publication venue: Digital WPI
Publication date: 08/08/2010

Elliptic Curve Cryptosystems (ECCs) are utilized as an alternative to traditional public-key cryptosystems, and are more suitable for resource limited environments due to smaller parameter size. In this dissertation we carry out a thorough investigation of side-channel attack aware ECC implementations over finite fields of prime characteristic including the recently introduced Edwards formulation of elliptic curves, which have built-in resiliency against simple side-channel attacks. We implement Joye's highly regular add-always scalar multiplication algorithm both with the Weierstrass and Edwards formulation of elliptic curves. We also propose a technique to apply non-adjacent form (NAF) scalar multiplication algorithm with side-channel security using the Edwards formulation. Our results show that the Edwards formulation allows increased area-time performance with projective coordinates. However, the Weierstrass formulation with affine coordinates results in the simplest architecture, and therefore has the best area-time performance as long as an efficient modular divider is available.
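The non-adjacent form of a scalar can be computed with the textbook recoding shown below. This sketch covers only the recoding, not the dissertation's side-channel-secure technique, and it is not constant-time; the function name `naf` and the little-endian digit order are assumptions.

```python
def naf(k):
    """Return the non-adjacent form of k >= 0 as little-endian digits.

    Each digit is -1, 0 or 1, and no two adjacent digits are nonzero.
    """
    digits = []
    while k > 0:
        if k & 1:
            d = 2 - (k % 4)  # 1 if k = 1 (mod 4), -1 if k = 3 (mod 4)
            k -= d           # k is now divisible by 4
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits
```

In scalar multiplication a digit of 1 triggers a point addition and a digit of -1 a point subtraction, and NAF reduces the average number of nonzero digits to about one third of the scalar length.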

DigitalCommons@WPI