9 research outputs found

    Efficient Arithmetic on ARM-NEON and Its Application for High-Speed RSA Implementation

    Modern processors support Single Instruction Multiple Data (SIMD) instructions (e.g. Intel-AVX, ARM-NEON), and a large body of research has been conducted on vector-parallel implementations of modular arithmetic, a crucial component of modern public-key cryptography such as RSA, ElGamal, DSA and ECC. In this paper, we introduce a novel Double Operand Scanning (DOS) method to speed up multi-precision squaring with non-redundant representations on SIMD architectures. The DOS technique partly doubles the operands and computes the squaring operation without Read-After-Write (RAW) dependencies between source and destination variables. Furthermore, we present Karatsuba Cascade Operand Scanning (KCOS) multiplication and Karatsuba Double Operand Scanning (KDOS) squaring by adopting additive and subtractive Karatsuba's methods, respectively. The proposed multiplication and squaring methods are compatible with separated Montgomery algorithms and are highly efficient for the RSA cryptosystem. Finally, our proposed multiplication/squaring, separated Montgomery multiplication/squaring and RSA encryption outperform the best-known results by 22/41%, 25/33% and 30% on the Cortex-A15 platform.

    The complete cost of cofactor h=1

    This paper presents optimized software for constant-time variable-base scalar multiplication on prime-order Weierstraß curves using the complete addition and doubling formulas presented by Renes, Costello, and Batina in 2016. Our software targets three different microarchitectures: Intel Sandy Bridge, Intel Haswell, and ARM Cortex-M4. We use a 255-bit elliptic curve over $\mathbb{F}_{2^{255}-19}$ that was proposed by Barreto in 2017. We chose this curve because it allows the most meaningful comparison of our results with optimized software for Curve25519. The goal of this comparison is to understand the cost of using cofactor-one curves with complete formulas compared to widely used Montgomery (or twisted Edwards) curves, which inherently have a non-trivial cofactor.

    RISC-V implementation of the NaCl-library

    High-speed Curve25519 on 8-bit, 16-bit, and 32-bit microcontrollers

    This paper presents new speed records for 128-bit secure elliptic-curve Diffie-Hellman key-exchange software on three different popular microcontroller architectures. We consider a 255-bit curve proposed by Bernstein known as Curve25519, which has also been adopted by the IETF. We optimize the X25519 key-exchange protocol proposed by Bernstein in 2006 for AVR ATmega 8-bit microcontrollers, MSP430X 16-bit microcontrollers, and ARM Cortex-M0 32-bit microcontrollers. Our software for the AVR takes only 13 900 397 cycles to compute a Diffie-Hellman shared secret, and is the first to perform this computation in less than a second at a security level of 128 bits when clocked at 16 MHz. Our MSP430X software computes a shared secret in 5 301 792 cycles on MSP430X microcontrollers that have a 32-bit hardware multiplier and in 7 933 296 cycles on MSP430X microcontrollers that have a 16-bit multiplier. It thus outperforms previous constant-time ECDH software at the 128-bit security level on the MSP430X by more than a factor of 1.2 and 1.15, respectively. Our implementation on the Cortex-M0 runs in only 3 589 850 cycles and outperforms previous 128-bit secure ECDH software by a factor of 3.
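    As a quick check of the "less than a second" claim, the AVR cycle count can be converted to wall-clock time at the stated 16 MHz clock. This is only an illustrative sketch: the function name `cycles_to_seconds` is hypothetical, and the other platforms are left out because the abstract gives no clock rates for them.

```python
# Figures quoted in the abstract: AVR cycle count and the 16 MHz clock.
AVR_CYCLES = 13_900_397
AVR_CLOCK_HZ = 16_000_000


def cycles_to_seconds(cycles: int, clock_hz: int) -> float:
    """Convert a cycle count to seconds at a given clock frequency."""
    return cycles / clock_hz


avr_seconds = cycles_to_seconds(AVR_CYCLES, AVR_CLOCK_HZ)
print(f"AVR shared-secret computation: {avr_seconds:.3f} s")  # about 0.869 s
```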

    Efficient Elliptic Curve Cryptography Software Implementation on Embedded Platforms

    Multiprecision multiplication on AVR revisited

    Contains fulltext: 141398.pdf (publisher's version) (Closed access)

    Efficient and Side-Channel Resistant Implementations of Next-Generation Cryptography

    The rapid development of emerging information technologies, such as quantum computing and the Internet of Things (IoT), will have or has already had a huge impact on the world. These technologies can not only improve industrial productivity but also bring more convenience to people’s daily lives. However, these technologies have “side effects” in the world of cryptography: they pose new difficulties and challenges from theory to practice. Specifically, when quantum computing capability (i.e., logical qubits) reaches a certain level, Shor’s algorithm will be able to break almost all public-key cryptosystems currently in use. On the other hand, a great number of devices deployed in IoT environments have very constrained computing and storage resources, so the currently widely used cryptographic algorithms may not run efficiently on those devices. A new generation of cryptography has thus emerged, including Post-Quantum Cryptography (PQC), which remains secure under both classical and quantum attacks, and LightWeight Cryptography (LWC), which is tailored for resource-constrained devices. Research on next-generation cryptography is of great importance and urgency, and the US National Institute of Standards and Technology in particular initiated standardization processes for PQC and LWC in 2016 and 2018, respectively. Since next-generation cryptography is still immature and has developed rapidly in recent years, its theoretical security and practical deployment are not well explored and are in significant need of evaluation. This thesis looks into the engineering aspects of next-generation cryptography, i.e., the problems concerning implementation efficiency (e.g., execution time and memory consumption) and security (e.g., countermeasures against timing attacks and power side-channel attacks). In more detail, we first explore efficient software implementation approaches for lattice-based PQC on constrained devices. Then, we study how to speed up isogeny-based PQC on modern high-performance processors, especially by using their powerful vector units. Moreover, we research how to design sophisticated yet low-area instruction set extensions to further accelerate software implementations of LWC and long-integer-arithmetic-based PQC. Finally, to address the threats from potential power side-channel attacks, we present a concept of using special leakage-aware instructions to eliminate overwriting leakage for masked software implementations (of next-generation cryptography).