57 research outputs found

    Maximizing the Efficiency using Montgomery Multipliers on FPGA in RSA Cryptography for Wireless Sensor Networks

    Get PDF
    The architecture and modeling of RSA public key encryption/decryption systems are presented in this work. Two different architectures are proposed, mMMM42 (modified Montgomery Modular Multiplier 4 to 2 Carry Save Architecture) and RSACIPHER128 to check the suitability for implementation in Wireless Sensor Nodes to utilize the same in Wireless Sensor Networks. It can easily be fitting into systems that require different levels of security by changing the key size. The processing time is increased and space utilization is reduced in FPGA due to its reusability. VHDL code is synthesized and simulated using Xilinx-ISE for both the architectures. Architectures are compared in terms of area and time. It is verified that this architecture support for a key size of 128bits. The implementation of RSA encryption/decryption algorithm on FPGA using 128 bits data and key size with RSACIPHER128 gives good result with 50% less utilization of hardware. This design is also implemented for ASIC using Mentor Graphics

    Implementing Homomorphic Encryption Based Secure Feedback Control for Physical Systems

    Full text link
    This paper is about an encryption based approach to the secure implementation of feedback controllers for physical systems. Specifically, Paillier's homomorphic encryption is used to digitally implement a class of linear dynamic controllers, which includes the commonplace static gain and PID type feedback control laws as special cases. The developed implementation is amenable to Field Programmable Gate Array (FPGA) realization. Experimental results, including timing analysis and resource usage characteristics for different encryption key lengths, are presented for the realization of an inverted pendulum controller; as this is an unstable plant, the control is necessarily fast

    Efficient Arithmetic for the Implementation of Elliptic Curve Cryptography

    Get PDF
    The technology of elliptic curve cryptography is now an important branch in public-key based crypto-system. Cryptographic mechanisms based on elliptic curves depend on the arithmetic of points on the curve. The most important arithmetic is multiplying a point on the curve by an integer. This operation is known as elliptic curve scalar (or point) multiplication operation. A cryptographic device is supposed to perform this operation efficiently and securely. The elliptic curve scalar multiplication operation is performed by combining the elliptic curve point routines that are defined in terms of the underlying finite field arithmetic operations. This thesis focuses on hardware architecture designs of elliptic curve operations. In the first part, we aim at finding new architectures to implement the finite field arithmetic multiplication operation more efficiently. In this regard, we propose novel schemes for the serial-out bit-level (SOBL) arithmetic multiplication operation in the polynomial basis over F_2^m. We show that the smallest SOBL scheme presented here can provide about 26-30\% reduction in area-complexity cost and about 22-24\% reduction in power consumptions for F_2^{163} compared to the current state-of-the-art bit-level multiplier schemes. Then, we employ the proposed SOBL schemes to present new hybrid-double multiplication architectures that perform two multiplications with latency comparable to the latency of a single multiplication. Then, in the second part of this thesis, we investigate the different algorithms for the implementation of elliptic curve scalar multiplication operation. We focus our interest in three aspects, namely, the finite field arithmetic cost, the critical path delay, and the protection strength from side-channel attacks (SCAs) based on simple power analysis. In this regard, we propose a novel scheme for the scalar multiplication operation that is based on processing three bits of the scalar in the exact same sequence of five point arithmetic operations. We analyse the security of our scheme and show that its security holds against both SCAs and safe-error fault attacks. In addition, we show how the properties of the proposed elliptic curve scalar multiplication scheme yields an efficient hardware design for the implementation of a single scalar multiplication on a prime extended twisted Edwards curve incorporating 8 parallel multiplication operations. Our comparison results show that the proposed hardware architecture for the twisted Edwards curve model implemented using the proposed scalar multiplication scheme is the fastest secure SCA protected scalar multiplication scheme over prime field reported in the literature

    Low-Latency Elliptic Curve Scalar Multiplication

    Get PDF
    This paper presents a low-latency algorithm designed for parallel computer architectures to compute the scalar multiplication of elliptic curve points based on approaches from cryptographic side-channel analysis. A graphics processing unit implementation using a standardized elliptic curve over a 224-bit prime field, complying with the new 112-bit security level, computes the scalar multiplication in 1.9ms on the NVIDIA GTX 500 architecture family. The presented methods and implementation considerations can be applied to any parallel 32-bit architectur

    VLSI IMPLEMENTATION OF HIGH PERFORMANCE MONTGOMERY MODULAR MULTIPLICATION FOR CRYPTO GRAPHICAL

    Get PDF
    The multiplier factor receives and outputs the data with binary illustration and uses solely one-level Carry Save Adder (CSA) to avoid the carry propagation at each addition operation. This CSA is additionally accustomed perform operand pre-computation and format conversion from the carry save format to the binary illustration, leading to an occasional hardware price and short important path delay at the expense of additional clock cycles for finishing one standard multiplication. To beat the weakness, a Configurable CSA (CCSA), that may be one full-adder or 2 serial half-adders, is projected to scale back the additional clock cycles for quantity pre-computation and format conversion by 0.5. The mechanism which will notice and skip the surplus carry-save addition operations in the one-level CCSA design whereas maintaining the short important path delay is developed. The additional clock cycles for quantity pre-computation and format conversion is hidden and the high turnout is obtained. AES relies on a style principle called a substitution-permutation network, combination of each substitution and permutation, and is quick in each software package and hardware. AES doesn't use a Festal network. AES is a variant of Irondale that encompasses a fastened block size of 128 bits, and a key size of 128, 192, or 256 bits. By contrast, the Irondale specification in and of itself is nominative with block and key sizes that will be any multiple of thirty-two bits, both with a minimum of 128 and a most of 256 bits.AES operates on a 4×4 column-major order matrix of bytes, termed the state, though some versions of Irondale has a bigger block size and have further columns within the state. Most AES calculations square measure tired a special finite field

    Adiabatic Circuits for Power-Constrained Cryptographic Computations

    Get PDF
    This thesis tackles the need for ultra-low power operation in power-constrained cryptographic computations. An example of such an application could be smartcards. One of the techniques which has proven to have the potential of rendering ultra-low power operation is ‘Adiabatic Logic Technique’. However, the adiabatic circuits has associated challenges due to high energy dissipation of the Power-Clock Generator (PCG) and complexity of the multi-phase power-clocking scheme. Energy efficiency of the adiabatic system is often degraded due to the high energy dissipation of the PCG. In this thesis, nstep charging strategy using tank capacitors is considered for the power-clock generation and several design rules and trade-offs between the circuit complexity and energy efficiency of the PCG using n-step charging circuits have been proposed. Since pipelining is inherent in adiabatic logic design, careful selection of architecture is essential, as otherwise overhead in terms of area and energy due to synchronization buffers is induced specifically, in the case of adiabatic designs using 4-phase power-clocking scheme. Several architectures for the Montgomery multiplier using adiabatic logic technique are implemented and compared. An architecture which constitutes an appropriate trade-off between energy efficiency and throughput is proposed along with its methodology. Also, a strategy to reduce the overhead due to synchronization buffers is proposed. A modification in the Montgomery multiplication algorithm is proposed. Furthermore, a problem due to the application of power-clock gating in cascade stages of adiabatic logic is identified. The problem degrades the energy savings that would otherwise be obtained by the application of power-clock gating. A solution to this problem is proposed. Cryptographic implementations also present an obvious target for Power Analysis Attacks (PAA). There are several existing secure adiabatic logic designs which are proposed as a countermeasure against PAA. Shortcomings of the existing logic designs are identified, and two novel secure adiabatic logic designs are proposed as the countermeasures against PAA and improvement over the existing logic designs

    Highly secure cryptographic computations against side-channel attacks

    Get PDF
    Side channel attacks (SCAs) have been considered as great threats to modern cryptosystems, including RSA and elliptic curve public key cryptosystems. This is because the main computations involved in these systems, as the Modular Exponentiation (ME) in RSA and scalar multiplication (SM) in elliptic curve system, are potentially vulnerable to SCAs. Montgomery Powering Ladder (MPL) has been shown to be a good choice for ME and SM with counter-measures against certain side-channel attacks. However, recent research shows that MPL is still vulnerable to some advanced attacks [21, 30 and 34]. In this thesis, an improved sequence masking technique is proposed to enhance the MPL\u27s resistance towards Differential Power Analysis (DPA). Based on the new technique, a modified MPL with countermeasure in both data and computation sequence is developed and presented. Two efficient hardware architectures for original MPL algorithm are also presented by using binary and radix-4 representations, respectively

    Towards Optimized and Constant-Time CSIDH on Embedded Devices

    Get PDF
    We present an optimized, constant-time software library for commutative supersingular isogeny Diffie-Hellman key exchange (CSIDH) proposed by Castryck et al. which targets 64-bit ARM processors. The proposed library is implemented based on highly-optimized field arithmetic operations and computes the entire key exchange in constant-time. The proposed implementation is resistant to timing attacks. We adopt optimization techniques to evaluate the highest performance CSIDH on ARM-powered embedded devices such as cellphones, analyzing the possibility of using such a scheme in the quantum era. To the best of our knowledge, the proposed implementation is the first constant-time implementation of CSIDH and the first evaluation of this scheme on embedded devices. The benchmark result on a Google Pixel 2 smartphone equipped with 64-bit high-performance ARM Cortex-A72 core shows that it takes almost 12 seconds for each party to compute a commutative action operation in constant-time over the 511-bit finite field proposed by Castryck et al. However, using uniform but variable-time Montgomery ladder with security considerations improves these results significantly

    Implementation of a Generic Modular Cryptosystem for the RSA on Reconfigurable Hardware

    Get PDF
    This report summarizes the work that was initiated from the summer of 2008, on the study and analysis of cryptographic design techniques and their implementation on an FPGA board,i.e. the Virtex II pro. The study began with the understanding of a popular HDL language, namely, Verilog. Based on the study an implementation of a modular cryptosystem based on the RSA and generic upto a 256 bit modulus was realized. Optimal techniques for developing a high speed RSA cryptosystem is presented in this work. Through out the thesis the primary tool was the Xilinx based ISE toolkit. However for validation purposes other simulators such as ModelSim was also used. However, the simulations presented in this work utilizes the Xilinx ISE 10.1 Simulator environment. The Xilinx XST 10.1 was used in the synthesis of the implementation. The division technique utilized a modified non-restoring division scheme. The multiplication scheme used the Karatsuba-Ofman technique. The exponentiation scheme used was the Montgomery Modular exponentiation. The inversion scheme used a modified form of the Extended Euclidean Algorithm which involves no division or multiplication as suggested by Laszlo Hars. The thesis concludes with suggestions on extending the present implementation of RSA on FPGA
    corecore