120 research outputs found

    Implementing and Optimizing Matrix Triples with Homomorphic Encryption

    Get PDF
    In today’s interconnected world, data has become a valuable asset, leading to a growing interest in protecting it through techniques such as privacy-preserving computation. Two well-known approaches are multi-party computation and homomorphic encryption with use cases such as privacy-preserving machine learning evaluating or training neural networks. For multi-party computation, one of the fundamental arithmetic operations is the secure multiplication in the malicious security model and by extension the multiplication of matrices which is expensive to compute in the malicious model. Transferring the problem of secure matrix multiplication to the homomorphic domain enables savings in communication complexity, reducing the main bottleneck. In this work, we implement and optimize the homomorphic generation of matrix triples. We provide an open-source implementation for the leveled BGV (Brakerski Gentry Vaikuntanathan) scheme supporting plaintext moduli of arbitrary size using state-of-the-art implementation techniques. We also provide a new, use-case specific approach to parameter generation for leveled BGV-like schemes heuristically optimizing for computation time and taking into account architecture-specific constraints. Finally, we provide an in-depth analysis of the homomorphic circuit enabling the re-use of key switching keys and eliminating constant multiplications, combining our results in an implementation to generate homomorphic matrix triples for arbitrary plaintext moduli. Our implementation is publicly available and up to 2.1×2.1\times faster compared to previous work while also providing new time-memory trade-offs for different computing environments. Furthermore, we implement and evaluate additional, use-case specific optimization opportunities such as matrix slicing for the matrix triple generation

    Closing the Gap in RFC 7748: Implementing Curve448 in Hardware

    Get PDF
    With the evidence on comprised cryptographic standards in the context of elliptic curves, the IETF TLS working group has issued a request to the IETF Crypto Forum Research Group (CFRG) to recommend new elliptic curves that do not leave a doubt regarding their rigidity or any backdoors. This initiative has recently published RFC 7748 proposing two elliptic curves, known as Curve25519 and Curve448, for use with the next generation of TLS. This choice of elliptic curves was already picked up by the IETF working group curdle for adoption in further security protocols, such as DNSSEC. Hence it can be expected that these two curves will become predominant in the Internet and will form one basis for future secure communication. Unfortunately, both curves were solely designed and optimized for pure software implementation; their implementation in hardware or their physical protection against side-channel attacks were not considered at any time. However, for Curve25519 it has been shown recently that efficient implementations in hardware along with side-channel protection are possible. In this work we aim to close this gap and demonstrate that fortunately the second curve can be efficiently implemented in hardware as well. More precisely, we demonstrate that the high-security Curve448 can be implemented on a Xilinx XC7Z7020 at moderate costs of just 963 logic and 30 DSP slices and performs a scalar multiplication in 2.5ms

    A New Perspective on Key Switching for BGV-like Schemes

    Get PDF
    Fully homomorphic encryption is a promising solution for privacy-preserving computation. For BFV, BGV, and CKKS, three state-of-the-art fully homomorphic encryption schemes, the so-called key switching is one of the primary bottlenecks when evaluating homomorphic circuits. While a large body of work explores optimal selection for scheme parameters such as the polynomial degree or the ciphertext modulus, the realm of key switching parameters is relatively unexplored. This work closes this gap, formally exploring the parameter space for BGV-like key switching. We introduce a new asymptotic bound for key switching complexity, thereby providing a new perspective on this crucial operation. We also explore the parameter space for the recently proposed double-decomposition technique by Kim et al. [24], which outperforms current state-of-the-art only in very specific circumstances. Furthermore, we revisit an idea by Gentry, Halevi, and Smart [19] switching primes in and out of the ciphertext and find novel opportunities for constant folding, speeding up key switching by up to 50% and up to 11.6%, respectively

    Improved Side-Channel Resistance by Dynamic Fault-Injection Countermeasures

    Get PDF
    Side-channel analysis and fault-injection attacks are known as serious threats to cryptographic hardware implementations and the combined protection against both is currently an open line of research. A promising countermeasure with considerable implementation overhead appears to be a mix of first-order secure Threshold Implementations and linear Error-Correcting Codes. In this paper we employ for the first time the inherent structure of non-systematic codes as fault countermeasure which dynamically mutates the applied generator matrices to achieve a higher-order side-channel and fault-protected design. As a case study, we apply our scheme to the PRESENT block cipher that do not show any higher-order side-channel leakage after measuring 150 million power traces

    A Configurable Hardware Implementation of XMSS

    Get PDF
    Quantum computers are about to herald a new age of cryptography. As a fundamental building block in today’s digitalized world, Digital Signature Schemes (DSS) provide the ability to authenticate messages exchanged over untrusted channels. Unfortunately, virtually all currently used DSS are built upon mathematical problems that can efficiently be solved using quantum computers, thus rendering schemes such as RSA and ECC insecure. Due to its conservative security properties, the eXtended Merkle Signature Scheme (XMSS) is an outstanding candidate for a quantum-secure DSS which has already been standardized by NIST and IETF. In this paper we present the first full hardware accelerator for XMSS whose generic design approach allows matching the requirements of several projected use-cases. In particular, we provide a full design exploration regarding the choice of parameters and hash functions to identify configurations for optimal performance and area utilization

    A Hard Crystal - Implementing Dilithium on Reconfigurable Hardware

    Get PDF
    CRYSTALS-Dilithium as a lattice-based digital signature scheme has been selected as a finalist in the PQC standardization process of NIST. As part of this selection, a variety of software implementations have been evaluated regarding their performance and memory requirements for platforms like x86 or ARM Cortex-M4. In this work, we present a first set of FPGA implementations for the low-end Xilinx Artix-7 platform, evaluating the peculiarities of the scheme in hardware, reflecting all available round-3 parameter sets. As a key component in our analysis, we present results for a specifically adapted NTT core for the Dilithium cryptosystem, optimizing this component for an optimal LUT and FF utilization by efficient use of special purpose DSPs. Presenting our results, we aim to shed further light on the performance of lattice-based cryptography in low-cost and high-throughput configurations and their respective potential use-cases in practice

    Affine Equivalence and its Application to Tightening Threshold Implementations

    Get PDF
    Motivated by the development of Side-Channel Analysis (SCA) countermeasures which can provide security up to a certain order, defeating higher-order attacks has become amongst the most challenging issues. For instance, Threshold Implementation (TI) which nicely solves the problem of glitches in masked hardware designs is able to avoid first-order leakages. Hence, its extension to higher orders aims at counteracting SCA attacks at higher orders, that might be limited to univariate scenarios. Although with respect to the number of traces as well as sensitivity to noise the higher the order, the harder it is to mount the attack, a d-order TI design is vulnerable to an attack at order d+1. In this work we look at the feasibility of higher-order attacks on first-order TI from another perspective. Instead of increasing the order of resistance by employing higher-order TIs, we go toward introducing structured randomness into the implementation. Our construction, which is a combination of masking and hiding, is dedicated to TI designs and deals with the concept of affine equivalence of Boolean functions. Such a combination hardens a design practically against higher-order attacks so that these attacks cannot be successfully mounted. We show that the area overhead of our construction is paid off by its ability to avoid higher-order leakages to be practically exploitable

    Robust and One-Pass Parallel Computation of Correlation-Based Attacks at Arbitrary Order - Extended Version

    Get PDF
    The protection of cryptographic implementations against higher-order attacks has risen to an important topic in the side-channel community after the advent of enhanced measurement equipment that enables the capture of millions of power traces in reasonably short time. However, the preprocessing of multi-million traces for such an attack is still challenging, in particular when in the case of (multivariate) higher-order attacks all traces need to be parsed at least two times. Even worse, partitioning the captured traces into smaller groups to parallelize computations is hardly possible with current techniques. In this work we introduce procedures that allow iterative computation of correlation in a side-channel analysis attack at any arbitrary order in both univariate and multivariate settings. The advantages of our proposed solutions are manifold: i) they provide stable results, i.e., by increasing the number of used traces high accuracy of the estimations is still maintained, ii) each trace needs to be processed only once and at any time the result of the attack can be obtained (without requiring to reparse the whole trace pull when adding more traces), and iii) the computations can be efficiently parallelized, e.g., by splitting the trace pull into smaller subsets and processing each by a single thread on a multi-threading or cloud-computing platform. In short, our constructions allow efficiently performing higher-order side-channel analysis attacks (e.g., on hundreds of million traces) which is of crucial importance when practical evaluation of the masking schemes need to be performed

    Quantitative Fault Injection Analysis

    Get PDF
    Active fault injection is a credible threat to real-world digital systems computing on sensitive data. Arguing about security in the presence of faults is non-trivial, and state-of-the-art criteria are overly conservative and lack the ability of fine-grained comparison. However, comparing two alternative implementations for their security is required to find a satisfying compromise between security and performance. In addition, the comparison of alternative fault scenarios can help optimize the implementation of effective countermeasures. In this work, we use quantitative information flow analysis to establish a vulnerability metric for hardware circuits under fault injection that measures the severity of an attack in terms of information leakage. Potential use cases range from comparing implementations with respect to their vulnerability to specific fault scenarios to optimizing countermeasures. We automate the computation of our metric by integrating it into a state-of-the-art evaluation tool for physical attacks and provide new insights into the security under an active fault attacker
    corecore