46 research outputs found

    Bitslicing Arithmetic/Boolean Masking Conversions for Fun and Profit

    Get PDF
    The performance of higher-order masked implementations of lattice-based based key encapsulation mechanisms (KEM) is currently limited by the costly conversions between arithmetic and Boolean masking. While bitslicing has been shown to strongly speed up masked implementations of symmetric primitives, its use in arithmetic-to-Boolean and Boolean-to-arithmetic masking conversion gadgets has never been thoroughly investigated. In this paper, we first show that bitslicing can indeed accelerate existing conversion gadgets. We then optimize these gadgets, exploiting the degrees of freedom offered by bitsliced implementations. As a result, we introduce new arbitrary-order Boolean masked addition, arithmetic-to-Boolean and Boolean-to-arithmetic masking conversion gadgets, each in two variants: modulo 2k and modulo p (for any integers k and p). Practically, our new gadgets achieve a speedup of up to 25x over the state of the art. Turning to the KEM application, we develop the first open-source embedded (Cortex-M4) implementations of Kyber768 and Saber masked at arbitrary order. The implementations based on the new bitsliced gadgets achieve a speedup of 1.8x for Kyber and 3x for Saber, compared to the implementation based on state-of-the-art gadgets. The bottleneck of the bitslice implementations is the masked Keccak-f[1600] permutation

    Give Me 5 Minutes: Attacking ASCAD with a Single Side-Channel Trace

    Get PDF
    In this note, we describe an attack against the ANSSI Side-Channel Analysis Database (ASCAD), which recovers the full key using the leakage of a single masked block cipher execution. The attack uses a new open-source Side-Channel Analysis Library (SCALib), which allows running the leakage profiling and attacking in less than 5 minutes. It exploits well-known techniques, yet improves significantly over the best known attacks against ASCAD. We conclude by questioning the impact of these experimental findings for side-channel security evaluations

    Handcrafting: Improving Automated Masking in Hardware with Manual Optimizations

    Get PDF
    Masking is an important countermeasure against side-channel attacks, but its secure implementation is known to be error-prone. The automated verification and generation of masked designs is therefore an important theoretical and practical challenge. In a recent work, Knichel et al. proposed a tool for the automated generation of masked hardware implementations satisfying strong security properties (e.g., glitch-freeness and composability). In this paper, we study the possibility to improve their results based on manual performance optimizations for the AES algorithm. Our main conclusion is that as the target architecture becomes more serial, such a handcrafted approach gains interest. For example, we reach latency reductions by a factor six for 8-bit architectures. We conclude the paper by discussing the extent to which such optimizations could be integrated in the tool of Knichel et al. As a bonus, we adapt a composition-based verification tool to check that our implementations are robust against glitches & transitions, and confirm the security order of exemplary implementations with preliminary leakage assessment

    Towards Tight Random Probing Security

    Get PDF
    Proving the security of masked implementations in theoretical models that are relevant to practice and match the best known attacks of the side-channel literature is a notoriously hard problem. The random probing model is a good candidate to contribute to this challenge, due to its ability to capture the continuous nature of physical leakage (contrary to the threshold probing model), while also being convenient to manipulate in proofs and to automate with verification tools. Yet, despite recent progresses in the design of masked circuits with good asymptotic security guarantees in this model, existing results still fall short when it comes to analyze the security of concretely useful circuits under realistic noise levels and with low number of shares. In this paper, we contribute to this issue by introducing a new composability notion, the Probe Distribution Table (PDT), and a new tool (called STRAPS, for the Sampled Testing of the RAndom Probing Security). Their combination allows us to significantly improve the tightness of existing analyses in the most practical (low noise, low number of shares) region of the design space. We illustrate these improvements by quantifying the random probing security of an AES S-box circuit, masked with the popular multiplication gadget of Ishai, Sahai and Wagner from Crypto 2003, with up to six shares

    Unifying Freedom and Separation for Tight Probing-Secure Composition

    Get PDF
    The masking countermeasure is often analyzed in the probing model. Proving the probing security of large circuits at high masking orders is achieved by composing gadgets that satisfy security definitions such as non-interference (NI), strong non-interference (SNI) or free SNI. The region probing model is a variant of the probing model, where the probing capabilities of the adversary scale with the number of regions in a masked circuit. This model is of interest as it allows better reductions to the more realistic noisy leakage model. The efficiency of composable region probing secure masking has been recently improved with the introduction of the input-output separation (IOS) definition. In this paper, we first establish equivalences between the non-interference framework and the IOS formalism. We also generalize the security definitions to multiple-input gadgets and systematically show implications and separations between these notions. Then, we study which gadgets from the literature satisfy these. We give new security proofs for some well-known arbitrary-order gadgets, and also some automated proofs for fixed-order, special-case gadgets. To this end, we introduce a new automated formal verification algorithm that solves the open problem of verifying free SNI, which is not a purely simulation-based definition. Using the relationships between the security notions, we adapt this algorithm to further verify IOS. Finally, we look at composition theorems. In the probing model, we use the link between free SNI and the IOS formalism to generalize and improve the efficiency of the tight private circuit (Asiacrypt 2018) construction, also fixing a flaw in the original proof. In the region probing model, we relax the assumptions for IOS composition (TCHES 2021), which allows to save many refresh gadgets, hence improving the efficiency

    Information Bounds and Convergence Rates for Side-Channel Security Evaluators

    Get PDF
    Current side-channel evaluation methodologies exhibit a gap between inefficient tools offering strong theoretical guarantees and efficient tools only offering heuristic (sometimes case-specific) guarantees. Profiled attacks based on the empirical leakage distribution correspond to the first category. Bronchain et al. showed at Crypto 2019 that they allow bounding the worst-case security level of an implementation, but the bounds become loose as the leakage dimensionality increases. Template attacks and machine learning models are examples of the second category. In view of the increasing popularity of such parametric tools in the literature, a natural question is whether the information they can extract can be bounded. In this paper, we first show that a metric conjectured to be useful for this purpose, the hypothetical information, does not offer such a general bound. It only does when the assumptions exploited by a parametric model match the true leakage distribution. We therefore introduce a new metric, the training information, that provides the guarantees that were conjectured for the hypothetical information for practically-relevant models. We next initiate a study of the convergence rates of profiled side-channel distinguishers which clarifies, to the best of our knowledge for the first time, the parameters that influence the complexity of a profiling. On the one hand, the latter has practical consequences for evaluators as it can guide them in choosing the appropriate modeling tool depending on the implementation (e.g., protected or not) and contexts (e.g., granting them access to the countermeasures’ randomness or not). It also allows anticipating the amount of measurements needed to guarantee a sufficient model quality. On the other hand, our results connect and exhibit differences between side-channel analysis and statistical learning theory

    Compress: Reducing Area and Latency of Masked Pipelined Circuits

    Get PDF
    Masking is an effective countermeasure against side-channel attacks. It replaces every logic gate in a computation by a gadget that performs the operation over secret sharings of the circuit\u27s variables. When masking is implemented in hardware, care should be taken to protect against leakage from glitches, which could otherwise undermine the security of masking. This is generally done by adding registers, which stop the propagation of glitches, but introduce additional latency and area cost. In masked pipeline circuits, a high latency further increases the area overheads of masking, due to the need for additional registers that synchronize signals between pipeline stages. In this work, we propose a technique to minimize the number of such pipelining registers, which relies on optimizing the scheduling of the computations across the pipeline stages. We release an implementation of this technique as an open-source tool, COMPRESS. Further, we introduce other optimizations to deduplicate logic between gadgets, perform an optimal selection of masked gadgets, and introduce new gadgets with smaller area. Overall, our optimizations lead to circuits that improve the state-of-the art in area and achieve minimal latency. For example, a masked AES based on an S-box generated by COMPRESS reduces latency by 19% and area by 10% over the state of the art

    Triplex: an Efficient and One-Pass Leakage-Resistant Mode of Operation

    Get PDF
    This paper introduces and analyzes Triplex, a leakage-resistant mode of operation based on Tweakable Block Ciphers (TBCs) with 2n-bit tweaks. Triplex enjoys beyond-birthday ciphertext integrity in the presence of encryption and decryption leakage in a liberal model where all intermediate computations are leaked in full and only two TBC calls operating a long-term secret are protected with implementationlevel countermeasures. It provides beyond-birthday confidentiality guarantees without leakage, and standard confidentiality guarantees with leakage for a single-pass mode embedding a re-keying process for the bulk of its computations (i.e., birthday confidentiality with encryption leakage under a bounded leakage assumption). Triplex improves leakage-resistant modes of operation relying on TBCs with n-bit tweaks when instantiated with large-tweak TBCs like Deoxys-TBC (a CAESAR competition laureate) or Skinny (used by the Romulus finalist of the NIST lightweight crypto competition). Its security guarantees are maintained in the multi-user setting

    Randomness Generation for Secure Hardware Masking - Unrolled Trivium to the Rescue

    Get PDF
    Masking is a prominent strategy to protect cryptographic implementations against side-channel analysis. Its popularity arises from the exponential security gains that can be achieved for (approximately) quadratic resource utilization. Many variants of the countermeasure tailored for different optimization goals have been proposed over the past decades. The common denominator among all of them is the implicit demand for robust and high entropy randomness. Simply assuming that uniformly distributed random bits are available, without taking the cost of their generation into account, leads to a poor understanding of the efficiency and performance of secure implementations. This is especially relevant in case of hardware masking schemes which are known to consume large amounts of random bits per cycle due to parallelism. Currently, there seems to be no consensus on how to most efficiently derive many pseudo-random bits per clock cycle from an initial seed and with properties suitable for masked hardware implementations. In this work, we evaluate a number of building blocks for this purpose and find that hardware-oriented stream ciphers like Trivium and its reduced-security variant Bivium B outperform all competitors when implemented in an unrolled fashion. Unrolled implementations of these primitives enable the flexible generation of many bits per cycle while maintaining high performance, which is crucial for satisfying the large randomness demands of state-of-the-art masking schemes. According to our analysis, only Linear Feedback Shift Registers (LFSRs), when also unrolled, are capable of producing long non-repetitive sequences of random-looking bits at a high rate per cycle even more efficiently than Trivium and Bivium B. Yet, these instances do not provide black-box security as they generate only linear outputs. We experimentally demonstrate that using multiple output bits from an LFSR in the same masked implementation can violate probing security and even lead to harmful randomness cancellations. Circumventing these problems, and enabling an independent analysis of randomness generation and masking scheme, requires the use of cryptographically stronger primitives like stream ciphers. As a result of our studies, we provide an evidence-based estimate for the cost of securely generating n fresh random bits per cycle. Depending on the desired level of black-box security and operating frequency, this cost can be as low as 20n to 30n ASIC gate equivalents (GE) or 3n to 4n FPGA look-up tables (LUTs), where n is the number of random bits required. Our results demonstrate that the cost per bit is (sometimes significantly) lower than estimated in previous works, incentivizing parallelism whenever exploitable and potentially moving low randomness usage in hardware masking research from a primary to secondary design goal

    Towards Achieving Provable Side-Channel Security in Practice

    Get PDF
    Physical side-channel attacks are powerful attacks that exploit a device\u27s physical emanations to break the security of cryptographic implementations. Many countermeasures have been proposed against these attacks, especially the widely-used and efficient masking countermeasure. Nevertheless, proving the security of masked implementations is challenging. Current techniques rely on empirical approaches to validate the security of such implementations. On the other hand, the theoretical community introduced leakage models to provide formal proofs of the security of masked implementations. Meanwhile, these leakage models rely on physical assumptions that are difficult to satisfy in practice, and the literature lacks a clear framework to implement proven secure constructions on a physical device while preserving the proven security. In this paper, we present a complete methodology describing the steps to turn an abstract masking scheme proven secure in a theoretical leakage model into a physical implementation satisfying provable security against side-channel attacks in practice. We propose new tools to enforce or relax the physical assumptions the indeal noisy leakage model rely on and provide novel ways of including them in a physical implementation. We also highlight the design goals for an embedded device to reach high levels of proven security, discussing the limitations and open problems of the practical usability of the leakage models. Our goal is to show that it is possible to bridge theory and practice and to motivate further research to fully close the gap and get practical implementations proven secure against side-channel attacks on a physical device without any ideal assumption about the leakage
    corecore