59 research outputs found

    Algorithmic Security is Insufficient: A Comprehensive Survey on Implementation Attacks Haunting Post-Quantum Security

    Full text link
    This survey is on forward-looking, emerging security concerns in post-quantum era, i.e., the implementation attacks for 2022 winners of NIST post-quantum cryptography (PQC) competition and thus the visions, insights, and discussions can be used as a step forward towards scrutinizing the new standards for applications ranging from Metaverse, Web 3.0 to deeply-embedded systems. The rapid advances in quantum computing have brought immense opportunities for scientific discovery and technological progress; however, it poses a major risk to today's security since advanced quantum computers are believed to break all traditional public-key cryptographic algorithms. This has led to active research on PQC algorithms that are believed to be secure against classical and powerful quantum computers. However, algorithmic security is unfortunately insufficient, and many cryptographic algorithms are vulnerable to side-channel attacks (SCA), where an attacker passively or actively gets side-channel data to compromise the security properties that are assumed to be safe theoretically. In this survey, we explore such imminent threats and their countermeasures with respect to PQC. We provide the respective, latest advancements in PQC research, as well as assessments and providing visions on the different types of SCAs

    A Side-Channel Resistant Implementation of SABER

    Get PDF
    The candidates for the NIST Post-Quantum Cryptography standardization have undergone extensive studies on efficiency and theoretical security, but research on their side-channel security is largely lacking. This remains a considerable obstacle for their real-world deployment, where side-channel security can be a critical requirement. This work describes a side-channel resistant instance of Saber, one of the lattice-based candidates, using masking as a countermeasure. Saber proves to be very efficient to mask due to two specific design choices: power-of-two moduli, and limited noise sampling of learning with rounding. A major challenge in masking lattice-based cryptosystems is the integration of bit-wise operations with arithmetic masking, requiring algorithms to securely convert between masked representations. The described design includes a novel primitive for masked logical shifting on arithmetic shares, as well as adapts an existing masked binomial sampler for Saber. An implementation is provided for an ARM Cortex-M4 microcontroller, and its side-channel resistance is experimentally demonstrated. The masked implementation features a 2.5x overhead factor, significantly lower than the 5.7x previously reported for a masked variant of NewHope. Masked key decapsulation requires less than 3,000,000 cycles on the Cortex-M4 and consumes less than 12kB of dynamic memory, making it suitable for deployment in embedded platforms. We have made our implementation available at https://github.com/KULeuven-COSIC/SABER-masking

    Higher-order masked Saber

    Get PDF
    Side-channel attacks are formidable threats to the cryptosystems deployed in the real world. An effective and provably secure countermeasure against side-channel attacks is masking. In this work, we present a detailed study of higher-order masking techniques for the key-encapsulation mechanism Saber. Saber is one of the lattice-based finalist candidates in the National Institute of Standards of Technology\u27s post-quantum standardization procedure. We provide a detailed analysis of different masking algorithms proposed for Saber in the recent past and propose an optimized implementation of higher-order masked Saber. Our proposed techniques for first-, second-, and third-order masked Saber have performance overheads of 2.7x, 5x, and 7.7x respectively compared to the unmasked Saber. We show that compared to Kyber which is another lattice-based finalist scheme, Saber\u27s performance degrades less with an increase in the order of masking. We also show that higher-order masked Saber needs fewer random bytes than higher-order masked Kyber. Additionally, we adapt our masked implementation to uSaber, a variant of Saber that was specifically designed to allow an efficient masked implementation. We present the first masked implementation of uSaber, showing that it indeed outperforms masked Saber by at least 12% for any order. We provide optimized implementations of all our proposed masking schemes on ARM Cortex-M4 microcontrollers

    Efficiently Masking Polynomial Inversion at Arbitrary Order

    Get PDF
    Physical side-channel analysis poses a huge threat to post-quantum cryptographic schemes implemented on embedded devices. Still, secure implementations are missing for many schemes. In this paper, we present an efficient solution for masked polynomial inversion, a main component of the key generation of multiple post-quantum KEMs. For this, we introduce a polynomial-multiplicative masking scheme with efficient arbitrary order conversions from and to additive masking. Furthermore, we show how to integrate polynomial inversion and multiplication into the masking schemes to reduce costs considerably. We demonstrate the performance of our algorithms for two different post-quantum cryptographic schemes on the Cortex-M4. For NTRU, we measure an overhead of 35% for the first-order masked inversion compared to the unmasked inversion while for BIKE the overhead is as little as 11%. Lastly, we verify the security of our algorithms for the first masking order by measuring and performing a TVLA based side-channel analysis

    On the Masking-Friendly Designs for Post-Quantum Cryptography

    Get PDF
    Masking is a well-known and provably secure countermeasure against side-channel attacks. However, due to additional redundant computations, integrating masking schemes is expensive in terms of performance. The performance overhead of integrating masking countermeasures is heavily influenced by the design choices of a cryptographic algorithm and is often not considered during the design phase. In this work, we deliberate on the effect of design choices on integrating masking techniques into lattice-based cryptography. We select Scabbard, a suite of three lattice-based post-quantum key-encapsulation mechanisms (KEM), namely Florete, Espada, and Sable. We provide arbitrary-order masked implementations of all the constituent KEMs of the Scabbard suite by exploiting their specific design elements. We show that the masked implementations of Florete, Espada, and Sable outperform the masked implementations of Kyber in terms of speed for any order masking. Masked Florete exhibits a 73%73\%, 71%71\%, and 70%70\% performance improvement over masked Kyber corresponding to the first-, second-, and third-order. Similarly, Espada exhibits 56%56\%, 59%59\%, and 60%60\% and Sable exhibits 75%75\%, 74%74\%, and 73%73\% enhanced performance for first-, second-, and third-order masking compared to Kyber respectively. Our results show that the design decisions have a significant impact on the efficiency of integrating masking countermeasures into lattice-based cryptography

    When Bad News Become Good News

    Get PDF
    Hard physical learning problems have been introduced as an alternative option to implement cryptosystems based on hard learning problems. Their high-level idea is to use inexact computing to generate erroneous computations directly, rather than to first compute correctly and add errors afterwards. Previous works focused on the applicability of this idea to the Learning Parity with Noise (LPN) problem as a first step, and formalized it as Learning Parity with Physical Noise (LPPN). In this work, we generalize it to the Learning With Errors (LWE) problem, formalized as Learning With Physical Errors (LWPE). We first show that the direct application of the design ideas used for LPPN prototypes leads to a new source of (mathematical) data dependencies in the error distributions that can reduce the security of the underlying problem. We then show that design tweaks can be used to avoid this issue, making LWPE samples natively robust against such data dependencies. We additionally put forward that these ideas open a quite wide design space that could make hard physical learning problems relevant in various applications. And we conclude by presenting a first prototype FPGA design confirming our claims

    When Bad News Become Good News Towards Usable Instances of Learning with Physical Errors

    Get PDF
    Hard physical learning problems have been introduced as an alternative option to implement cryptosystems based on hard learning problems. Their high-level idea is to use inexact computing to generate erroneous computations directly, rather than to first compute correctly and add errors afterwards. Previous works focused on the applicability of this idea to the Learning Parity with Noise (LPN) problem as a first step, and formalized it as Learning Parity with Physical Noise (LPPN). In this work, we generalize it to the Learning With Errors (LWE) problem, formalized as Learning With Physical Errors (LWPE). We first show that the direct application of the design ideas used for LPPN prototypes leads to a new source of (mathematical) data dependencies in the error distributions that can reduce the security of the underlying problem. We then show that design tweaks can be used to avoid this issue, making LWPE samples natively robust against such data dependencies. We additionally put forward that these ideas open a quite wide design space that could make hard physical learning problems relevant in various applications. And we conclude by presenting a first prototype FPGA design confirming our claims

    Sapphire: A Configurable Crypto-Processor for Post-Quantum Lattice-based Protocols (Extended Version)

    Get PDF
    Public key cryptography protocols, such as RSA and elliptic curve cryptography, will be rendered insecure by Shor’s algorithm when large-scale quantum computers are built. Cryptographers are working on quantum-resistant algorithms, and lattice-based cryptography has emerged as a prime candidate. However, high computational complexity of these algorithms makes it challenging to implement lattice-based protocols on low-power embedded devices. To address this challenge, we present Sapphire – a lattice cryptography processor with configurable parameters. Efficient sampling, with a SHA-3-based PRNG, provides two orders of magnitude energy savings; a single-port RAM-based number theoretic transform memory architecture is proposed, which provides 124k-gate area savings; while a low-power modular arithmetic unit accelerates polynomial computations. Our test chip was fabricated in TSMC 40nm low-power CMOS process, with the Sapphire cryptographic core occupying 0.28 mm2 area consisting of 106k logic gates and 40.25 KB SRAM. Sapphire can be programmed with custom instructions for polynomial arithmetic and sampling, and it is coupled with a low-power RISC-V micro-processor to demonstrate NIST Round 2 lattice-based CCA-secure key encapsulation and signature protocols Frodo, NewHope, qTESLA, CRYSTALS-Kyber and CRYSTALS-Dilithium, achieving up to an order of magnitude improvement in performance and energy-efficiency compared to state-of-the-art hardware implementations. All key building blocks of Sapphire are constant-time and secure against timing and simple power analysis side-channel attacks. We also discuss how masking-based DPA countermeasures can be implemented on the Sapphire core without any changes to the hardware

    A Side-Channel Attack on a Bitsliced Higher-Order Masked CRYSTALS-Kyber Implementation

    Get PDF
    In response to side-channel attacks on masked implementations of post-quantum cryptographic algorithms, a new bitsliced higher-order masked implementation of CRYSTALS-Kyber has been presented at CHES\u272022. The bitsliced implementations are typically more difficult to break by side-channel analysis because they execute a single instruction across multiple bits in parallel. However, in this paper, we reveal new vulnerabilities in the masked Boolean to arithmetic conversion procedure of this implementation that make the shared and secret key recovery possible. We also present a new chosen ciphertext construction method which maximizes secret key recovery probability for a given message bit recovery probability. We demonstrate practical shared and secret key recovery attacks on the first-, second- and third-order masked implementations of Kyber-768 in ARM Cortex-M4 using profiled deep learning-based power analysis

    Formal Verification of Arithmetic Masking in Hardware and Software

    Get PDF
    Masking is a popular secret-sharing technique that is used to protect cryptographic implementations against physical attacks like differential power analysis. So far, most research in this direction has focused on finding efficient Boolean masking schemes for well-known symmetric cryptographic algorithms like AES and Keccak. However, especially with the advent of post-quantum cryptography (PQC), arithmetic masking has received increasing attention from the research community. In practice, many PQC algorithms require a combination of arithmetic and Boolean masking, which makes the search for secure and efficient conversion algorithms between these domains (A2B/B2A) an interesting but very challenging research topic. While there already exist lots of tools that can help with the formal verification of Boolean masked implementations, the same cannot be said about arithmetic masking and accompanying mask conversion algorithms. In this work, we demonstrate the first formal verification approach for (any-order) Boolean and arithmetic masking which can be applied to both hardware and software, while considering side-effects such as glitches and transitions. First, we show how a formal verification approach for Boolean masking can be used in the context of arithmetic masking such that we can verify A2B/B2A conversions for arbitrary masking orders. We investigate various conversion algorithms in hardware and software, and point out several new findings such as glitch-based issues for straightforward implementations of [CGV14]-A2B in hardware, transition-based leakage in Goubin-A2B in software, and more general implementation pitfalls when utilizing common optimization techniques in PQC. We provide the first formal analysis of table-based A2Bs from a probing security perspective and point out that they might not be easy to implement securely on processors that use of memory buffers or caches
    • …
    corecore