Search CORE

10 research outputs found

Saber on ARM: CCA-secure module lattice-based key encapsulation on ARM

Author: Angshuman Karmakar
Ingrid Verbauwhede
Jose Maria Bermudo Mera
Sujoy Sinha Roy
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 05/03/2020

The CCA-secure lattice-based post-quantum key encapsulation scheme Saber is a candidate in the NIST's post-quantum cryptography standardization process. In this paper, we study the implementation aspects of Saber in resource-constrained microcontrollers from the ARM Cortex-M series which are very popular for realizing IoT applications. In this work, we carefully optimize various parts of Saber for speed and memory. We exploit digital signal processing instructions and efficient memory access for a fast implementation of polynomial multiplication. We also use memory efficient Karatsuba and just-in-time strategy for generating the public matrix of the module lattice to reduce the memory footprint. We also show that our optimizations can be combined with each other seamlessly to provide various speed-memory trade-offs. Our speed optimized software takes just 1,147K, 1,444K, and 1,543K clock cycles on a Cortex-M4 platform for key generation, encapsulation and decapsulation respectively. Our memory efficient software takes 4,786K, 6,328K, and 7,509K clock cycles on an ultra resource-constrained Cortex-M0 platform for key generation, encapsulation, and decapsulation respectively while consuming only 6.2 KB of memory at most. These results show that lattice-based key encapsulation schemes are perfectly practical for securing IoT devices from quantum computing attacks.

Cryptology ePrint Archive

Faster NTRU on ARM Cortex-M4 with TMVP-based multiplication

Author: Irem Keskinkurt Paksoy
Murat Cenk
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 07/03/2022

The Number Theoretic Transform (NTT), Toom-Cook, and Karatsuba are the most commonly used algorithms for implementing lattice-based finalists of the NIST PQC competition. In this paper, we propose Toeplitz matrix-vector product (TMVP) based algorithms for multiplication for all parameter sets of NTRU. We implement the proposed algorithms on ARM Cortex-M4. The results show that TMVP-based multiplication algorithms using the four-way TMVP formula are more efficient for NTRU. Our algorithms outperform the Toom-Cook method by up to 25.3%, and the NTT method by up to 19.8%. Moreover, our algorithms require less stack space than the others in most cases. We also observe the impact of these improvements on the overall performance of NTRU. We speed up the encryption, decryption, encapsulation, and decapsulation by up to 13.7%, 17.5%, 3.5%, and 14.1%, respectively, compared to state-of-the-art implementation.

Cryptology ePrint Archive

Pushing the speed limit of constant-time discrete Gaussian sampling. A case study on the Falcon signature scheme

Author: Alkim Erdem
Cheon Jung Hee
Du Chaohui
Fouque Pierre-Alain
Peikert Chris
Publication venue: Association for Computing Machinery (ACM)
Publication date: 02/06/2019

Crossref

University of Birmingham Research Portal

Memory-Efficient High-Speed Implementation of Kyber on Cortex-M4

Author: Leon Botros
Matthias J. Kannwischer
Peter Schwabe
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 20/05/2019

This paper presents an optimized software implementation of the module-lattice-based key-encapsulation mechanism Kyber for the ARM Cortex-M4 microcontroller. Kyber is one of the round-2 candidates in the NIST post-quantum project. In the center of our work are novel optimization techniques for the number-theoretic transform (NTT) inside Kyber, which make very efficient use of the computational power offered by the “vector” DSP instructions of the target architecture. We also present results for the recently updated parameter sets of Kyber which equally benefit from our optimizations. As a result of our efforts we present software that is 18% faster than an earlier implementation of Kyber optimized for the Cortex-M4 by the Kyber submitters. Our NTT is more than twice as fast as the NTT in that software. Our software runs at about the same speed as the latest speed-optimized implementation of the other module-lattice based round-2 NIST PQC candidate Saber. However, for our Kyber software, this performance is achieved with a much smaller RAM footprint. Kyber needs less than half of the RAM of what the considerably slower RAM-optimized version of Saber uses. Our software does not make use of any secret-dependent branches or memory access and thus offers state-of-the-art protection against timing attacks.

Cryptology ePrint Archive

Saber on ARM: CCA-secure module lattice-based key encapsulation on ARM

Author: Bermudo Mera Jose Maria
Karmakar Angshuman
Sinha Roy Sujoy
Verbauwhede Ingrid
Publication venue: Universitätsbibliothek der Ruhr-Universität Bochum
Publication date: 14/08/2018

The CCA-secure lattice-based post-quantum key encapsulation scheme Saber is a candidate in the NIST’s post-quantum cryptography standardization process. In this paper, we study the implementation aspects of Saber in resource-constrained microcontrollers from the ARM Cortex-M series which are very popular for realizing IoT applications. In this work, we carefully optimize various parts of Saber for speed and memory. We exploit digital signal processing instructions and efficient memory access for a fast implementation of polynomial multiplication. We also use memory efficient Karatsuba and just-in-time strategy for generating the public matrix of the module lattice to reduce the memory footprint. We also show that our optimizations can be combined with each other seamlessly to provide various speed-memory trade-offs. Our speed optimized software takes just 1,147K, 1,444K, and 1,543K clock cycles on a Cortex-M4 platform for key generation, encapsulation and decapsulation respectively. Our memory efficient software takes 4,786K, 6,328K, and 7,509K clock cycles on an ultra resource-constrained Cortex-M0 platform for key generation, encapsulation, and decapsulation respectively while consuming only 6.2 KB of memory at most. These results show that lattice-based key encapsulation schemes are perfectly practical for securing IoT devices from quantum computing attacks.

Ruhr-Universität Bochum (RUB): Open Journal Systems

Saber on ARM. CCA-secure module lattice-based key encapsulation on ARM

Author: Bermudo Mera Jose Maria
Karmakar Angshuman
Sinha Roy Sujoy
Verbauwhede Ingrid
Publication venue: Society of Rubber Industry, Japan
Publication date: 07/09/2018

status: published

Lirias

Saber on ESP32

Author: A Karmakar
A Schönhage
D Harvey
D Hofheinz
H Nussbaumer
JP D’Anvers
MJ Kannwischer
MR Albrecht
PW Shor
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 17/12/2019

Saber, a CCA-secure lattice-based post-quantum key encapsulation scheme, is one of the second-round candidate algorithms in the post-quantum cryptography standardization process of the US National Institute of Standards and Technology (NIST) in 2019. In this work, we provide an efficient implementation of Saber on ESP32, an embedded microcontroller designed for IoT environments with WiFi and Bluetooth support. An RSA coprocessor was used to speed up the polynomial multiplications for a Kyber variant in a CHES 2019 paper. We propose an improved implementation utilizing the big integer coprocessor for the polynomial multiplications in Saber, which has significantly lower software overhead and takes better advantage of the big integer coprocessor on ESP32. By using the fast implementation of polynomial multiplications, our single-core implementation of Saber takes 1639K, 2123K, and 2193K clock cycles on ESP32 for key generation, encapsulation and decapsulation respectively. Benefiting from the dual-core feature of ESP32, we speed up the implementation of Saber by rearranging the computing steps and assigning proper tasks to two cores executing in parallel. Our dual-core implementation takes 1176K, 1625K, and 1514K clock cycles for key generation, encapsulation and decapsulation respectively.

Crossref

Cryptology ePrint Archive

Analysis of Implementations and Side-Channel Security of Frodo on Embedded Devices

Author: Martinoli Marco
Publication date: 29/09/2020

Explore Bristol Research

Post-Quantum Cryptography: Cryptanalysis and Implementation

Author: Virdia Fernando
Publication date: 01/01/2021

Royal Holloway - Pure