Binary Field Multiplication on ARMv8

Authors: Howon Kim; Hwajeong Seo; Jongseok Choi; Yasuyuki Nogami; Zhe Liu
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 24/07/2015

In this paper, we show efficient implementations of binary field multiplication over ARMv8. We exploit the advanced 64-bit polynomial multiplication instruction (PMULL) supported by ARMv8 and apply multiple levels of the asymptotically faster Karatsuba multiplication. Our method performs binary field multiplication within 57 clock cycles for B-251. The proposed method on ARMv8 improves performance by a factor of 5.5 over previous techniques on ARMv7.
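As a rough illustration of the technique named in this abstract, the following sketch builds a wide carry-less (binary polynomial) product from 64-bit base products using recursive Karatsuba. It is not the authors' implementation: the function names `clmul` and `karatsuba_clmul` are hypothetical, the 64-bit base case is emulated in software rather than issued as a PMULL instruction, and the reduction modulo the B-251 field polynomial is left out.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two non-negative integers viewed as GF(2)[x] polynomials.

    Software stand-in (assumption) for a hardware 64x64 polynomial multiply.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in GF(2)[x] is XOR
        a <<= 1
        b >>= 1
    return result


def karatsuba_clmul(a: int, b: int, bits: int = 256, base: int = 64) -> int:
    """Karatsuba carry-less product of two operands that each fit in `bits` bits.

    `bits` is assumed to be `base` times a power of two.
    """
    if bits <= base:
        return clmul(a, b)
    half = bits // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = karatsuba_clmul(a_lo, b_lo, half, base)
    hi = karatsuba_clmul(a_hi, b_hi, half, base)
    # Three half-size products instead of four; subtraction is also XOR.
    mid = karatsuba_clmul(a_lo ^ a_hi, b_lo ^ b_hi, half, base) ^ lo ^ hi
    return lo ^ (mid << half) ^ (hi << (2 * half))
```

With the default parameters, a 256-bit product uses two Karatsuba levels and nine 64-bit base products instead of the sixteen a schoolbook approach would need.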

CiteSeerX

Cryptology ePrint Archive

Efficient Implementations of Pairing-Based Cryptography on Embedded Systems

Author: Verma Rajeev
Publication venue: RIT Scholar Works
Publication date: 04/12/2015
Field of study

Many cryptographic applications use bilinear pairings, such as identity-based signature, instance identity-based key agreement, searchable public-key encryption, short signature schemes, certificateless encryption and blind signature. Elliptic curves over finite fields are the most secure and efficient way to implement bilinear pairings for these applications. Pairing-based cryptosystems are being implemented on different platforms such as low-power and mobile devices. Recently, hardware capabilities have been emerging in embedded devices that can support efficient and faster implementations of pairings on hand-held devices. In this thesis, the main focus is the optimization of the Optimal Ate pairing using a special class of ordinary curves, Barreto-Naehrig (BN), for different security levels on low-resource devices with ARM processors. The latest ARM architectures use the SIMD-based NEON engine, which helps to optimize basic algorithms. Pairing implementations use tower fields, in which field multiplication is the most important computation. This work presents NEON implementations of two multipliers (Karatsuba and Schoolbook) and compares their performance with different multipliers present in the literature for different field sizes. This work reports the fastest implementation timing of pairing for the BN254, BN446 and BN638 curves on the ARMv7 architecture, which have security levels of 128, 164, and 192 bits, respectively. This work also presents a comparison of code performance for ARMv8 architectures.

RIT Scholar Works

Faster ECC over F_{2^571} (feat. PMULL)

Author: Hwajeong Seo
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 14/09/2016

In this paper, we show efficient elliptic curve cryptography implementations for B-571 over ARMv8. We improve the previous binary field multiplication with finely aligned multiplication and incomplete reduction techniques by taking advantage of the advanced 64-bit polynomial multiplication instruction (PMULL) supported by ARMv8. This approach improves performance by a factor of 1.34 over previous binary field implementations. For point addition and doubling, the special types of multiplication, squaring and addition operations are combined and optimized, where one reduction operation is optimized in each case. The scalar multiplication is implemented with the constant-time Montgomery ladder algorithm, which is secure against timing attacks. Finally, the proposed implementations achieve 759,630/331,944 clock cycles for random/fixed scalar multiplications for B-571 over ARMv8, respectively.
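The Montgomery ladder mentioned in this abstract can be sketched generically as below. This is an illustration only, not the paper's code: `montgomery_ladder` and its parameters are hypothetical names, the group operations are passed in as plain functions, and the usage example substitutes integers modulo a small number for B-571 curve points. Python itself gives no constant-time guarantees, so the sketch shows only the regular structure (one addition and one doubling per scalar bit) that a constant-time implementation relies on.

```python
def montgomery_ladder(k, point, add, double, identity, bits):
    """Compute k * point with one `add` and one `double` per scalar bit.

    `add`, `double` and `identity` describe an abstract group (assumption:
    supplied by the caller); `bits` fixes the number of iterations so the
    loop length does not depend on the value of k.
    """
    r0, r1 = identity, point  # invariant: r1 - r0 == point
    for i in reversed(range(bits)):
        bit = (k >> i) & 1
        # Conditional swap; real code would use a branch-free masked swap.
        if bit:
            r0, r1 = r1, r0
        r1 = add(r0, r1)
        r0 = double(r0)
        if bit:
            r0, r1 = r1, r0
    return r0


# Toy group for demonstration: integers under addition modulo 101.
MODULUS = 101
toy_add = lambda x, y: (x + y) % MODULUS
toy_double = lambda x: (2 * x) % MODULUS

result = montgomery_ladder(13, 5, toy_add, toy_double, 0, bits=8)
```

Here `result` equals `(13 * 5) % 101`; on a curve, `add` and `double` would be the point addition and point doubling formulas.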

Cryptology ePrint Archive

2DT-GLS: Faster and exception-free scalar multiplication in the GLS254 binary curve

Authors: Diego F. Aranha; Marius A. Aardal
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 01/10/2022

We revisit and improve performance of arithmetic in the binary GLS254 curve by introducing the 2DT-GLS scalar multiplication algorithm. The algorithm includes theoretical and practice-oriented contributions of potential independent interest: (i) for the first time, a proof that the GLS scalar multiplication algorithm does not incur exceptions, such that faster incomplete formulas can be used; (ii) faster dedicated atomic formulas that alleviate the cost of precomputation; (iii) a table compression technique that reduces the storage needed for precomputed points; (iv) a refined constant-time scalar decomposition algorithm that is more robust to rounding. We also present the first GLS254 implementation for Armv8. With our contributions, we set new speed records for constant-time scalar multiplication by 34.5% and 6% on 64-bit Arm and Intel platforms, respectively.

Cryptology ePrint Archive

DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization

Authors: Courville Vanessa; Li Xinlin; Liu Bang; Nia Vahid Partovi; Xing Chao; Yang Rui Heng
Publication date: 23/09/2023

Efficiently deploying deep neural networks on low-resource edge devices is challenging due to their ever-increasing resource requirements. To address this issue, researchers have proposed multiplication-free neural networks, such as Power-of-Two quantization, also known as Shift networks, which aim to reduce memory usage and simplify computation. However, existing low-bit Shift networks are not as accurate as their full-precision counterparts, typically suffering from limited weight range encoding schemes and quantization loss. In this paper, we propose the DenseShift network, which significantly improves the accuracy of Shift networks, achieving competitive performance to full-precision networks for vision and speech applications. In addition, we introduce a method to deploy an efficient DenseShift network using non-quantized floating-point activations, while obtaining a 1.6X speed-up over existing methods. To achieve this, we demonstrate that zero-weight values in low-bit Shift networks do not contribute to model capacity and negatively impact inference computation. To address this issue, we propose a zero-free shifting mechanism that simplifies inference and increases model capacity. We further propose a sign-scale decomposition design to enhance training efficiency and a low-variance random initialization strategy to improve the model's transfer learning performance. Our extensive experiments on various computer vision and speech tasks demonstrate that DenseShift outperforms existing low-bit multiplication-free networks and achieves competitive performance compared to full-precision networks. Furthermore, our proposed approach exhibits strong transfer learning performance without a drop in accuracy. Our code was released on GitHub.
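To make the "multiplication-free" idea in this abstract concrete, the sketch below computes a dot product in which every weight is a signed power of two, so each multiply becomes a bit shift. It is a generic illustration of Power-of-Two (Shift) weights under stated assumptions, not the DenseShift method itself: `shift_dot` is a hypothetical name, activations are plain integers, and each weight is stored as a sign (+1 or -1) and a non-negative shift amount with no zero weights.

```python
def shift_dot(activations, signs, shifts):
    """Dot product with weights of the form sign * 2**shift, using shifts only.

    Assumptions: integer activations, signs in {+1, -1}, shifts >= 0.
    """
    total = 0
    for x, sign, s in zip(activations, signs, shifts):
        term = x << s  # same value as x * 2**s, without a multiply
        total += term if sign > 0 else -term
    return total


activations = [3, 5, 2]
signs = [1, -1, 1]
shifts = [2, 0, 3]  # encodes the weights 4, -1, 8

shifted = shift_dot(activations, signs, shifts)
reference = sum(x * w for x, w in zip(activations, [4, -1, 8]))
```

Both `shifted` and `reference` evaluate to the same integer, which shows the shift form is an exact replacement for multiplication by these weights.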

arXiv.org e-Print Archive

Optimized Implementation of Encapsulation and Decapsulation of Classic McEliece on ARMv8

Authors: Hwajeong Seo; Hyeokdong Kwon; Hyunjun Kim; Minjoo Sim; Siwoo Eum
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 09/12/2022

Recently, the results of the NIST PQC contest were announced. Classic McEliece, one of the third-round candidates, was selected as a fourth-round candidate. Classic McEliece is the only code-based cipher among the NIST PQC third-round finalists, and the algorithm is regarded as secure. However, it has low efficiency. In this paper, we propose an efficient software implementation of Classic McEliece, a code-based cipher, on 64-bit ARMv8 processors. Classic McEliece can be divided into Key Generation, Encapsulation, and Decapsulation. Among them, we propose an optimal implementation for Encapsulation and Decapsulation. The optimized Encapsulation implementation uses vector registers to perform 16-byte parallel operations and exploits the specific structure of the identity matrix. The Decapsulation implementation uses efficient multiplication and inversion over the F_{2^m} field. Compared with previous results, Encapsulation showed a performance improvement of up to 1.99× over state-of-the-art works.

Cryptology ePrint Archive

Multiprecision Multiplication on ARMv8

Authors: Järvinen Kimmo; Liu Weiqiang; Liu Zhe; Seo Hwajeong
Publication venue: IEEE
Publication date: 01/08/2017

Peer reviewed

Crossref

Helsingin yliopiston digitaalinen arkisto

Open Repository and Bibliography - Luxembourg