
34 research outputs found

Efficient Arithmetic on ARM-NEON and Its Application for High-Speed RSA Implementation

Author: Howon Kim
Hwajeong Seo
Johann Großschädl
Zhe Liu
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 20/05/2015

Advanced modern processors support Single Instruction Multiple Data (SIMD) instructions (e.g. Intel-AVX, ARM-NEON), and a massive body of research has been conducted on vector-parallel implementations of modular arithmetic, a crucial component of modern public-key cryptography ranging from RSA, ElGamal and DSA to ECC. In this paper, we introduce a novel Double Operand Scanning (DOS) method to speed up multi-precision squaring with non-redundant representations on SIMD architectures. The DOS technique partly doubles the operands and computes the squaring operation without Read-After-Write (RAW) dependencies between source and destination variables. Furthermore, we present Karatsuba Cascade Operand Scanning (KCOS) multiplication and Karatsuba Double Operand Scanning (KDOS) squaring by adopting the additive and subtractive Karatsuba methods, respectively. The proposed multiplication and squaring methods are compatible with separated Montgomery algorithms and are highly efficient for the RSA cryptosystem. Finally, our proposed multiplication/squaring, separated Montgomery multiplication/squaring and RSA encryption outperform the best-known results by 22/41%, 25/33% and 30%, respectively, on the Cortex-A15 platform.
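The abstract's point that squaring can be made cheaper than general multiplication can be illustrated with a minimal Python sketch. It is not the DOS method and says nothing about NEON scheduling; it only shows the underlying observation that each cross product a[i]*a[j] with i < j is needed once and can then be doubled. The 32-bit limb width and all function names are assumptions made for illustration.

```python
# Sketch only: shows why squaring needs fewer limb products than a general
# multiplication. This is NOT the paper's DOS method.

LIMB_BITS = 32                      # assumed limb width
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(x, n):
    """Split a non-negative integer into n little-endian limbs."""
    return [(x >> (LIMB_BITS * i)) & LIMB_MASK for i in range(n)]


def from_limbs(limbs):
    """Recombine little-endian limbs into one integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def square_limbs(a):
    """Square a multi-limb integer, computing each cross product once."""
    n = len(a)
    acc = 0  # a Python big integer stands in for the carry-propagating accumulator
    for i in range(n):
        # Diagonal term a[i]^2 appears exactly once.
        acc += (a[i] * a[i]) << (2 * LIMB_BITS * i)
        for j in range(i + 1, n):
            # Off-diagonal term appears twice (i,j and j,i), so double it.
            acc += (2 * a[i] * a[j]) << (LIMB_BITS * (i + j))
    return to_limbs(acc, 2 * n)
```

For n limbs this performs n(n+1)/2 limb products instead of n squared, which is the saving that dedicated squaring routines build on.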

Cryptology ePrint Archive

NEON PQCryto: Fast and Parallel Ring-LWE Encryption on ARM NEON Architecture

Author: Howon Kim
Hwajeong Seo
Reza Azarderakhsh
Zhe Liu
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 09/11/2015

Recently, the ARM NEON architecture has occupied a significant share of the tablet and smartphone markets due to its low cost and high performance. This paper studies efficient techniques for lattice-based cryptography on ARM processors and presents the first implementation of ring-LWE encryption on the ARM NEON architecture. In particular, we propose a vectorized version of the iterative Number Theoretic Transform (NTT) for high-speed computation. We present a 32-bit variant of the SAMS2 technique, originally proposed at CHES'15, for fast reduction. A combination of proposed and previous optimizations results in a very efficient implementation. For the 128-bit security level, our ring-LWE implementation requires only 145,200 clock cycles for encryption and 32,800 cycles for decryption. These results are more than 17.6 times faster than the fastest ECC implementation on ARM NEON at the same security level.
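As a rough illustration of what an iterative NTT does in ring-LWE, the following scalar Python sketch multiplies two polynomials modulo x^n + 1 and checks the result against schoolbook multiplication. It is not the vectorized NEON implementation from the paper. The modulus q = 7681, the function names, and the root-finding helper are assumptions chosen for the example.

```python
# Sketch only: scalar iterative NTT for negacyclic polynomial multiplication.
# The modulus 7681 is an assumed example parameter, not taken from the abstract.

def find_psi(n, q):
    """Return a primitive 2n-th root of unity mod q (so psi**n == -1 mod q)."""
    assert (q - 1) % (2 * n) == 0
    for x in range(2, q):
        psi = pow(x, (q - 1) // (2 * n), q)
        if pow(psi, n, q) == q - 1:
            return psi
    raise ValueError("no primitive 2n-th root of unity found")


def ntt(a, omega, q):
    """Iterative Cooley-Tukey NTT of a (length a power of two) with root omega."""
    n = len(a)
    a = list(a)
    j = 0
    for i in range(1, n):            # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:               # butterfly layers
        w_len = pow(omega, n // length, q)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w % q
                a[start + k] = (u + v) % q
                a[start + k + half] = (u - v) % q
                w = w * w_len % q
        length <<= 1
    return a


def negacyclic_mul(a, b, q):
    """Multiply a and b modulo (x^n + 1, q) using the NTT."""
    n = len(a)
    psi = find_psi(n, q)
    omega = psi * psi % q
    psi_pow = [pow(psi, i, q) for i in range(n)]
    fa = ntt([x * p % q for x, p in zip(a, psi_pow)], omega, q)
    fb = ntt([x * p % q for x, p in zip(b, psi_pow)], omega, q)
    fc = [x * y % q for x, y in zip(fa, fb)]
    c = ntt(fc, pow(omega, -1, q), q)          # inverse transform, unscaled
    n_inv, psi_inv = pow(n, -1, q), pow(psi, -1, q)
    return [x * n_inv * pow(psi_inv, i, q) % q for i, x in enumerate(c)]


def schoolbook_negacyclic(a, b, q):
    """Reference O(n^2) multiplication modulo (x^n + 1, q)."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c
```

Pre-multiplying the inputs by powers of psi turns the cyclic convolution computed by the NTT into the negacyclic one that reduction modulo x^n + 1 requires.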

Cryptology ePrint Archive

Compact Implementations of LEA Block Cipher for Low-End Microprocessors

Author: Howon Kim
Hwajeong Seo
Jongseok Choi
Taehwan Park
Zhe Liu
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 24/07/2015

In WISA'13, a novel lightweight block cipher named LEA was released. This algorithm has certain useful features for hardware and software implementations, i.e., simple ARX operations, a non-S-box architecture, and a 32-bit word size. These features have been realized on several platforms for practical usage with high performance and low overheads. In this paper, we further improve 128-, 192- and 256-bit LEA encryption for low-end embedded processors. Firstly, we present speed optimization methods. The methods split a 32-bit word operation into four byte-wise operations and avoid several rotation operations by taking advantage of efficient byte-wise rotations. Secondly, we reduce the code size to a minimum. We find the minimum inner loops and optimize them at the instruction-set level. We then construct the whole algorithm in a partly unrolled fashion with reasonable speed. Finally, we achieve the fastest LEA implementations, which improve performance by 10.9% over the previous best-known results. For size optimization, our implementation occupies only 280 B to conduct LEA encryption. After scaling, our implementation achieves the smallest ARX implementation so far, compared with other state-of-the-art ARX block ciphers such as SPECK and SIMON.
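The byte-wise rotation idea mentioned in the abstract can be sketched as follows: on an 8-bit target, a 32-bit rotation by a multiple of 8 costs nothing beyond renaming registers, so only the remaining few bits need actual shifts. The Python sketch below is an illustration under assumptions, not the paper's code; the rotation amount 9 is used purely as an example, and all function names are hypothetical.

```python
# Sketch only: a 32-bit word held as four bytes (little-endian), as on an
# 8-bit microcontroller. Rotating by 8 is a byte renaming; the remaining
# 1-bit rotation is a shift-through-carry chain.

def rotl32(w, r):
    """Reference 32-bit left rotation, 0 < r < 32."""
    return ((w << r) | (w >> (32 - r))) & 0xFFFFFFFF


def to_bytes_le(w):
    return [(w >> (8 * i)) & 0xFF for i in range(4)]


def from_bytes_le(b):
    return sum(x << (8 * i) for i, x in enumerate(b))


def rotl8_bytes(b):
    """Rotate left by 8 bits: no shifts, only a byte permutation."""
    return [b[3], b[0], b[1], b[2]]


def rotl1_bytes(b):
    """Rotate left by 1 bit, byte by byte, passing the carry upward."""
    carry = b[3] >> 7               # top bit wraps around into the lowest byte
    out = []
    for x in b:
        out.append(((x << 1) & 0xFF) | carry)
        carry = x >> 7
    return out


def rotl9_bytes(b):
    """Rotate left by 9 = 8 (free byte renaming) + 1 (four byte shifts)."""
    return rotl1_bytes(rotl8_bytes(b))
```

A rotation by 9 therefore costs one pass of single-bit shifts instead of nine, which is the kind of saving the abstract attributes to byte-wise rotations.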

CiteSeerX

Crossref

Cryptology ePrint Archive

Binary Field Multiplication on ARMv8

Author: Howon Kim
Hwajeong Seo
Jongseok Choi
Yasuyuki Nogami
Zhe Liu
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 24/07/2015

In this paper, we show efficient implementations of binary field multiplication over ARMv8. We exploit an advanced 64-bit polynomial multiplication (PMULL) supported by ARMv8 and conduct multiple levels of asymptotically faster Karatsuba multiplication. Finally, our method conducts binary field multiplication within 57 clock cycles for B-251. Our proposed method on ARMv8 improves the performance by a factor of 5.5 over previous techniques on ARMv7.
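One level of Karatsuba over a 64-bit carry-less multiplier can be sketched in Python as below. The function clmul is a software stand-in for a 64-bit polynomial multiply such as PMULL; the names, the 128-bit operand size, and the single Karatsuba level are assumptions for illustration, and no field reduction is performed.

```python
# Sketch only: one Karatsuba level for polynomial multiplication over GF(2).
# clmul() stands in for a hardware carry-less multiply; reduction modulo the
# field polynomial is deliberately left out.

MASK64 = (1 << 64) - 1


def clmul(a, b):
    """Carry-less product of two non-negative ints (polynomials over GF(2))."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def clmul128_karatsuba(a, b):
    """128x128-bit carry-less product from three 64x64-bit products."""
    a0, a1 = a & MASK64, a >> 64
    b0, b1 = b & MASK64, b >> 64
    lo = clmul(a0, b0)
    hi = clmul(a1, b1)
    # In GF(2) addition and subtraction are both XOR, so the middle term
    # is (a0+a1)(b0+b1) + lo + hi.
    mid = clmul(a0 ^ a1, b0 ^ b1) ^ lo ^ hi
    return lo ^ (mid << 64) ^ (hi << 128)
```

The schoolbook approach needs four 64-bit products for the same result; applying the split recursively gives the multiple Karatsuba levels the abstract refers to.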

CiteSeerX

Cryptology ePrint Archive

Secure Number Theoretic Transform and Speed Record for Ring-LWE Encryption on Embedded Processors

Author: Kim Howon
Kwon Hyeokchan
Lee Sokjoon
Liu Zhe
Park Taehwan
Seo Hwajeong
Publication date: 30/12/2017

Open Repository and Bibliography - Luxembourg

Secure Binary Field Multiplication

Author: Chien-Ning Chen
Howon Kim
Hwajeong Seo
Jongseok Choi
Taehwan Park
Yasuyuki Nogami
Zhe Liu
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 10/08/2015

Binary field multiplication is the most fundamental building block of binary field Elliptic Curve Cryptography (ECC) and Galois/Counter Mode (GCM). Both bit-wise scanning and Look-Up Table (LUT) based methods are commonly used for binary field multiplication. In terms of Side Channel Attacks (SCA), bit-wise scanning uses insecure branch operations, which leak information in the form of timing and power consumption. On the other hand, the LUT-based method is regarded as a relatively secure approach because LUT access can be conducted in a regular and atomic form. This ensures a constant-time solution as well. In this paper, we conduct an SCA on LUT-based binary field multiplication. The attack exploits horizontal Correlation Power Analysis (CPA) on the weights of the LUT. We identify the operand with only a single power trace of binary field multiplication. In order to prevent SCA, we also suggest a mask-based binary field multiplication which ensures a regular and constant-time solution without LUT and branch statements.
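The contrast between branch-based bit scanning and a mask-based alternative can be sketched as follows. This is an illustrative guess at the general idea, not the countermeasure from the paper: each bit of one operand is expanded into an all-zeros or all-ones mask, so the same XOR and AND operations run for every bit. Function names are hypothetical, and Python integers are not constant-time, so the sketch shows only the control-flow structure.

```python
# Sketch only: branch-free polynomial multiplication over GF(2) using masks.
# Python big integers do not execute in constant time; this illustrates the
# structure (no data-dependent branch, no table lookup), not a secure routine.

def branchy_clmul(a, b, nbits):
    """Bit-wise scanning with a data-dependent branch (the leaky pattern)."""
    acc = 0
    for i in range(nbits):
        if (b >> i) & 1:            # branch depends on a secret bit
            acc ^= a << i
    return acc


def masked_clmul(a, b, nbits):
    """Same product, but every iteration performs identical operations."""
    acc = 0
    for i in range(nbits):
        bit = (b >> i) & 1
        mask = -bit                 # 0 when bit == 0, all ones when bit == 1
        acc ^= (a << i) & mask
    return acc
```

In a fixed-width language the mask would be computed the same way on an unsigned word, for example by negating the extracted bit.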

CiteSeerX

Cryptology ePrint Archive

Efficient Ring-LWE Encryption on 8-bit AVR Processors

Author: Howon Kim
Hwajeong Seo
Ingrid Verbauwhede
Johann Großschädl
Sujoy Sinha Roy
Zhe Liu
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 20/09/2015

Public-key cryptography based on the "ring-variant" of the Learning with Errors (ring-LWE) problem is both efficient and believed to remain secure in a post-quantum world. In this paper, we introduce a carefully optimized implementation of a ring-LWE encryption scheme for 8-bit AVR processors like the ATxmega128. Our research contributions include several optimizations for the Number Theoretic Transform (NTT) used for polynomial multiplication. More concretely, we describe the Move-and-Add (MA) and the Shift-Add-Multiply-Subtract-Subtract (SAMS2) techniques to speed up the performance-critical multiplication and modular reduction of coefficients, respectively. We take advantage of incompletely reduced intermediate results to minimize the total number of reduction operations and use a special coefficient-storage method to decrease the RAM footprint of NTT multiplications. In addition, we propose a byte-wise scanning strategy to improve the performance of a discrete Gaussian sampler based on the Knuth-Yao random walk algorithm. For medium-term security, our ring-LWE implementation needs 590k, 672k, and 276k clock cycles for key generation, encryption, and decryption, respectively. For long-term security, the execution times of key generation, encryption, and decryption amount to 2.2M, 2.6M, and 686k cycles, respectively. These results set new speed records for ring-LWE encryption on an 8-bit processor and outperform related RSA and ECC implementations by an order of magnitude.
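To give a feel for division-free modular reduction of NTT coefficients, the sketch below reduces a value modulo a prime of special form using only shifts, additions and subtractions. It is not the SAMS2 technique itself, whose exact steps the abstract does not spell out. The modulus 7681 = 2^13 - 2^9 + 1 is an assumed example parameter, and the function name is hypothetical.

```python
# Sketch only: shift-and-add reduction for an assumed modulus of special form.
# Because 2**13 is congruent to 2**9 - 1 modulo Q, the high part of x can be
# folded back into the low part without any division.

Q = 7681                            # assumed example modulus: 2**13 - 2**9 + 1


def reduce_shift_add(x):
    """Return x mod Q for a non-negative integer x, using shifts and adds."""
    assert x >= 0
    while x >> 13:                  # fold while x has more than 13 bits
        hi, lo = x >> 13, x & 0x1FFF
        x = (hi << 9) - hi + lo     # hi * 2**13 == hi * (2**9 - 1) mod Q
    if x >= Q:                      # x is now below 2**13, at most one subtraction
        x -= Q
    return x
```

Each fold lowers x by hi * Q, so the loop always terminates, and a product of two coefficients below Q needs only a handful of folds.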

Cryptology ePrint Archive

Metastability-based Feedback Method for Enhancing FPGA-based TRNG

Author: Donggeon Lee
Howon Kim
Hwajeong Seo

This paper presents a novel and efficient method to enhance the randomness of a Programmable Delay Line (PDL)-based True Random Number Generator (TRNG) by introducing a Metastability-based Feedback scheme. As a principal tool for the security of sensor networks, the random number generator is one of the important security primitives. At the CHES 2011 conference, a new method of generating random numbers by inducing metastability using a precise PDL and PI (Proportional-Integral) control was proposed. As the proposed TRNG could not achieve sufficient randomness on its own, a filtering scheme was used for higher entropy. Unfortunately, the intrinsic characteristics of filters made the throughput of the TRNG decrease by approximately 50%. To preserve the original throughput and attain a high level of randomness, we present a simple solution that analyzes the probability of outputs and assigns a long time to the metastable state. The proposed scheme shows high randomness in the NIST randomness test suite without any of the throughput loss caused by filtering.

CiteSeerX

Efficient Implementation of NIST-Compliant Elliptic Curve Cryptography for 8-bit AVR-Based Sensor Nodes

Author: Groszschädl Johann
Kim Howon
Liu Zhe
Seo Hwajeong
Publication date: 01/07/2016

In this paper, we introduce a highly optimized software implementation of standards-compliant elliptic curve cryptography (ECC) for wireless sensor nodes equipped with an 8-bit AVR microcontroller. We exploit state-of-the-art optimizations and propose novel techniques to further push the performance envelope of a scalar multiplication on the NIST P-192 curve. To illustrate the performance of our ECC software, we develop prototype implementations of different cryptographic schemes for securing communication in a wireless sensor network, including elliptic curve Diffie-Hellman (ECDH) key exchange, the elliptic curve digital signature algorithm (ECDSA), and the elliptic curve Menezes-Qu-Vanstone (ECMQV) protocol. We obtain record-setting execution times for fixed-base, point variable-base, and double-base scalar multiplication. Compared with related work, our ECDH key exchange achieves a performance gain of roughly 27% over the best previously published result using the NIST P-192 curve on the same platform, while our ECDSA performs twice as fast as the ECDSA implementation of the well-known TinyECC library. We also evaluate the impact of Karatsuba's multiplication technique on the overall execution time of a scalar multiplication. In addition to offering high performance, our implementation of scalar multiplication has a highly regular execution profile, which helps to protect against certain side-channel attacks. Our results show that NIST-compliant ECC can be implemented efficiently enough to be suitable for resource-constrained sensor nodes.
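What a "regular execution profile" means for scalar multiplication can be illustrated with a Montgomery ladder, which performs one point addition and one doubling for every scalar bit regardless of its value. The abstract does not say which regular algorithm the authors use, so the ladder here is a stand-in. The tiny curve y^2 = x^3 + 2x + 3 over GF(97), the base point, and all names are assumptions for illustration, not NIST P-192.

```python
# Sketch only: Montgomery ladder on a toy curve (NOT P-192). The Python `if`
# on a scalar bit is still a branch; real implementations replace it with a
# constant-time conditional swap. The point here is the fixed add+double
# pattern per bit.

P, A, B = 97, 2, 3                  # assumed toy curve y^2 = x^3 + A*x + B mod P
G = (3, 6)                          # a point on that curve: 6^2 == 27 + 6 + 3


def add(p1, p2):
    """Affine point addition; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                 # p2 is the inverse of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def ladder(k, point, nbits):
    """Compute k*point with exactly one add and one double per scalar bit."""
    r0, r1 = None, point            # invariant: r1 - r0 == point
    for i in reversed(range(nbits)):
        if (k >> i) & 1:
            r0, r1 = add(r0, r1), add(r1, r1)
        else:
            r0, r1 = add(r0, r0), add(r0, r1)
    return r0


def naive_mul(k, point):
    """Reference: k repeated additions."""
    r = None
    for _ in range(k):
        r = add(r, point)
    return r
```

By contrast, plain double-and-add performs an addition only for the 1 bits of the scalar, and that difference is exactly what timing and simple power analysis can pick up.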

Crossref

Open Repository and Bibliography - Luxembourg

Reset Tree-Based Optical Fault Detection

Author: Dong-Geon Lee
Dooho Choi
Howon Kim
Jungtaek Seo
Publication venue: MDPI AG
Publication date: 01/05/2013

In this paper, we present a new reset tree-based scheme to protect cryptographic hardware against optical fault injection attacks. As one of the most powerful invasive attacks on cryptographic hardware, optical fault attacks cause semiconductors to misbehave by injecting high-energy light into a decapped integrated circuit. The contaminated result from the affected chip is then used to reveal secret information, such as a key, from the cryptographic hardware. Since the advent of such attacks, various countermeasures have been proposed. Although most of these countermeasures are strong, there is still the possibility of attack. In this paper, we present a novel optical fault detection scheme that utilizes the buffers on a circuit’s reset signal tree as a fault detection sensor. To evaluate our proposal, we model radiation-induced currents into circuit components and perform a SPICE simulation. The proposed scheme is expected to be used as a supplemental security tool

Directory of Open Access Journals