
    Combined Fault and DPA Protection for Lattice-Based Cryptography

    Progress in constructing quantum computers and the ongoing standardization of post-quantum cryptography (PQC) have led to the development and refinement of promising new digital signature schemes and key encapsulation mechanisms (KEM). Lattice-based schemes in particular have gained popularity in the research community, presumably due to acceptable key, ciphertext, and signature sizes as well as good performance results and cryptographic strength. However, in some practical applications like smart cards, it is also crucial to secure cryptographic implementations against side-channel and fault attacks. In this work, we analyze the so-called redundant number representation (RNR) that can be used to counter side-channel attacks. We show how to avoid security issues with the RNR caused by unexpected de-randomization, apply it to the Kyber KEM, and show that the RNR incurs very low overhead. We then verify the RNR countermeasure in practical experiments, using the non-specific t-test methodology and the ChipWhisperer platform. Furthermore, we present a novel countermeasure against fault attacks based on the Chinese remainder theorem (CRT). On an ARM Cortex-M4, our implementation of the RNR and the fault countermeasure offers better performance than masking and redundant calculation. Our methods thus have the potential to expand the toolbox of a defender implementing lattice-based cryptography with protection against two common physical attacks.
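
    As a rough illustration of the redundant number representation idea, the sketch below keeps each coefficient modulo a multiple K*q of the Kyber modulus q = 3329 and re-randomizes the stored representative by adding random multiples of q, so the value modulo q is unchanged while the bits that could leak through side channels vary between executions. This is a minimal sketch of the general principle only; the redundancy factor K and the helper functions are illustrative choices, and the paper's concrete RNR scheme, its de-randomization fixes, and the CRT-based fault countermeasure are not reproduced here.

        import secrets

        Q = 3329          # Kyber modulus
        K = 16            # redundancy factor (illustrative choice)
        M = K * Q         # coefficients are stored modulo K*Q

        def randomize(x):
            """Return a random representative of x modulo Q, stored modulo K*Q."""
            r = secrets.randbelow(K)          # fresh randomness per call
            return (x + r * Q) % M

        def add(a, b):
            """Addition on redundant representatives; correct modulo Q."""
            return randomize((a + b) % M)

        def mul(a, b):
            """Multiplication on redundant representatives; correct modulo Q."""
            return randomize((a * b) % M)

        def canonical(x):
            """Strip the redundancy to recover the value modulo Q."""
            return x % Q

        # The stored bit patterns differ between runs, but results agree modulo Q.
        a, b = randomize(1234), randomize(3000)
        assert canonical(add(a, b)) == (1234 + 3000) % Q
        assert canonical(mul(a, b)) == (1234 * 3000) % Q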

    Fault-enabled chosen-ciphertext attacks on Kyber

    NIST's PQC standardization process is in the third round, and a first selection among the three remaining lattice-based key encapsulation mechanisms is expected by the end of 2021. This makes studying the implementation-security aspects of the candidates a pressing matter. However, while the development of side-channel attacks and corresponding countermeasures has seen continuous interest, fault attacks are still a vastly underdeveloped field. In fact, a first practical fault attack on lattice-based KEMs was demonstrated only very recently by Pessl and Prokop. However, while their attack can bypass some standard fault countermeasures, it may be defeated by shuffling, and their use of skipping faults also makes it highly implementation-dependent. Thus, the vulnerability of implementations to fault attacks and the concrete need for countermeasures are still not well understood. In this work, we shed light on this problem and demonstrate new attack paths. Concretely, we show that combining fault injections with chosen-ciphertext attacks is a significant threat to implementations and can bypass several countermeasures. We present an attack on Kyber that combines ciphertext manipulation - flipping a single bit of an otherwise valid ciphertext - with a fault that corrects the ciphertext again during decapsulation. By then using the Fujisaki-Okamoto transform as an oracle, i.e., observing whether or not decapsulation fails, we derive inequalities involving secret data, from which we may recover the private key. Our attack is not defeated by many standard countermeasures such as shuffling in time or Boolean masking, and the fault may be introduced over a large execution-time interval at several places. In addition, we improve a known recovery technique to efficiently and practically recover the secret key from a smaller number of inequalities than previously required.
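
    The key-recovery step sketched in this abstract reduces to learning a small secret from inequalities on inner products with known vectors. The toy example below (not the paper's attack or its improved recovery technique) illustrates the principle in dimension 4: it enumerates all small candidate secrets and discards those inconsistent with the observed sign information. The real attack works at Kyber dimensions and uses far more efficient solvers.

        import itertools, random

        n = 4                                   # toy dimension (Kyber uses 256 and more)
        secret = [random.choice([-1, 0, 1]) for _ in range(n)]

        def dot(u, v):
            return sum(x * y for x, y in zip(u, v))

        # Each "oracle observation" reveals only whether <a, secret> is negative,
        # mimicking inequalities obtained from decapsulation success or failure.
        observations = []
        for _ in range(60):
            a = [random.randint(-5, 5) for _ in range(n)]
            observations.append((a, dot(a, secret) < 0))

        # Keep every small candidate that is consistent with all observations.
        candidates = [c for c in itertools.product([-1, 0, 1], repeat=n)
                      if all((dot(a, c) < 0) == sign for a, sign in observations)]

        print("true secret:", secret)
        print("surviving candidates:", candidates)   # typically only the true secret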

    Enhanced Lattice-Based Signatures on Reconfigurable Hardware

    The recent Bimodal Lattice Signature Scheme (BLISS) showed that lattice-based constructions have evolved into practical alternatives to RSA or ECC. It offers small signatures of 5600 bits for a 128-bit level of security and has proved to be very fast in software. However, due to the complex sampling of Gaussian noise with high precision, it is not clear whether this scheme can be mapped efficiently to embedded devices. Even though the authors of BLISS also proposed a new sampling algorithm using Bernoulli variables, this approach is more complex than previous methods using large precomputed tables. The clear disadvantage of using large tables for high performance is that they cannot be used in constrained computing environments, such as FPGAs, with limited memory. In this work we thus present techniques for an efficient Cumulative Distribution Table (CDT) based Gaussian sampler on reconfigurable hardware involving Peikert's convolution lemma and the Kullback-Leibler divergence. Based on our enhanced sampler design, we provide a scalable implementation of BLISS signing and verification on a Xilinx Spartan-6 FPGA supporting either 128-bit, 160-bit, or 192-bit security. For high speed we integrate fast FFT/NTT-based polynomial multiplication, parallel sparse multiplication, Huffman compression of signatures, and Keccak as the hash function. Additionally, we compare the CDT with the Bernoulli approach and show that for the particular BLISS-I parameter set the improved CDT approach is faster with lower area consumption. Our BLISS-I core uses 2,291 slices, 5.5 BRAMs, and 5 DSPs and performs a signing operation in 114.1 µs on average. Verification is even faster, with a latency of 61.2 µs, supporting 17,101 verification operations per second.
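
    A CDT-based Gaussian sampler of the kind discussed here precomputes the cumulative probabilities of a discrete Gaussian with fixed-point precision and draws a sample by binary-searching a uniformly random value in that table. The plain-Python sketch below shows only this basic mechanism with illustrative parameters; it does not include the convolution lemma, the Kullback-Leibler analysis, or the constant-time and hardware aspects of the paper's design.

        import bisect, math, secrets

        SIGMA = 3.33          # illustrative standard deviation
        TAIL = 12             # tail cut in multiples of sigma, illustrative
        PREC = 64             # fixed-point precision of the table in bits

        def build_cdt(sigma=SIGMA, tail=TAIL, prec=PREC):
            """Cumulative table for |x| drawn from a discrete Gaussian around 0."""
            bound = int(math.ceil(tail * sigma))
            weights = [math.exp(-x * x / (2 * sigma * sigma)) for x in range(bound + 1)]
            weights[0] /= 2                      # halve x = 0 so it is not counted once per sign
            total = sum(weights)
            cdf, acc = [], 0.0
            for w in weights:
                acc += w
                cdf.append(int(acc / total * (1 << prec)))
            cdf[-1] = 1 << prec                  # avoid rounding gaps at the end
            return cdf

        CDT = build_cdt()

        def sample():
            """Sample from the discrete Gaussian by binary search in the CDT."""
            u = secrets.randbits(PREC)
            magnitude = bisect.bisect_right(CDT, u)
            sign = 1 if secrets.randbits(1) else -1
            return sign * magnitude

        print([sample() for _ in range(16)])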

    High-Performance Ideal Lattice-Based Cryptography on 8-bit ATxmega Microcontrollers

    Over the last years, lattice-based cryptography has received much attention due to versatile average-case problems like Ring-LWE or Ring-SIS that appear to be intractable for quantum computers. But despite promising constructions, only few results have been published on implementation issues for very constrained platforms. In this work we therefore study and compare implementations of Ring-LWE encryption and the bimodal lattice signature scheme (BLISS) on an 8-bit Atmel ATxmega128 microcontroller. Since the number theoretic transform (NTT) is one of the core components in implementations of lattice-based cryptosystems, we review the application of the NTT in previous implementations and present an improved approach that significantly lowers the runtime for polynomial multiplication. Our implementation of Ring-LWE encryption takes 27 ms for encryption and 6.7 ms for decryption. Computing a BLISS signature takes 329 ms, and verification takes 88 ms. These results outperform implementations on similar platforms and underline the feasibility of lattice-based cryptography on constrained devices.
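
    The core arithmetic referred to above, polynomial multiplication in Z_q[x]/(x^n + 1) via the number theoretic transform, can be written compactly in Python. The sketch below uses the classic Ring-LWE-style parameters n = 256 and q = 7681 (chosen so that 2n divides q - 1) and a textbook recursive transform; it only demonstrates the arithmetic and none of the memory- or cycle-level optimizations needed on an 8-bit ATxmega.

        import random

        Q, N = 7681, 256                         # 2*N divides Q - 1, so 2N-th roots exist

        def find_root(order, q=Q):
            """Find an element of multiplicative order `order` (a power of two) mod q."""
            for g in range(2, q):
                x = pow(g, (q - 1) // order, q)
                if pow(x, order // 2, q) != 1:   # exact order, since `order` is a power of two
                    return x
            raise ValueError("no root of the requested order")

        def ntt(a, omega, q=Q):
            """Recursive radix-2 NTT: returns [sum_j a_j * omega^(j*k) mod q for each k]."""
            n = len(a)
            if n == 1:
                return a[:]
            even = ntt(a[0::2], omega * omega % q, q)
            odd = ntt(a[1::2], omega * omega % q, q)
            out, w = [0] * n, 1
            for k in range(n // 2):
                t = w * odd[k] % q
                out[k] = (even[k] + t) % q
                out[k + n // 2] = (even[k] - t) % q
                w = w * omega % q
            return out

        def negacyclic_mul(a, b, q=Q, n=N):
            """Compute a*b mod (x^n + 1, q) with two forward and one inverse NTT."""
            psi = find_root(2 * n, q)                        # psi^n = -1 mod q
            omega, n_inv = psi * psi % q, pow(n, -1, q)
            fa = ntt([x * pow(psi, i, q) % q for i, x in enumerate(a)], omega, q)
            fb = ntt([x * pow(psi, i, q) % q for i, x in enumerate(b)], omega, q)
            c = ntt([x * y % q for x, y in zip(fa, fb)], pow(omega, -1, q), q)
            return [x * n_inv % q * pow(psi, -i, q) % q for i, x in enumerate(c)]

        # Sanity check against a schoolbook negacyclic multiplication.
        a = [random.randrange(Q) for _ in range(N)]
        b = [random.randrange(Q) for _ in range(N)]
        ref = [0] * N
        for i in range(N):
            for j in range(N):
                k, s = (i + j) % N, 1 if i + j < N else -1
                ref[k] = (ref[k] + s * a[i] * b[j]) % Q
        assert negacyclic_mul(a, b) == ref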

    Post-quantum key exchange - a new hope

    In 2015, Bos, Costello, Naehrig, and Stebila (IEEE Security & Privacy 2015) proposed an instantiation of Ding's ring-learning-with-errors (Ring-LWE) based key-exchange protocol (also including the tweaks proposed by Peikert at PQCrypto 2014), together with an implementation integrated into OpenSSL, with the affirmed goal of providing post-quantum security for TLS. In this work we revisit their instantiation and stand-alone implementation. Specifically, we propose new parameters and a better suited error distribution, analyze the scheme's hardness against attacks by quantum computers in a conservative way, introduce a new and more efficient error-reconciliation mechanism, and propose a defense against backdoors and all-for-the-price-of-one attacks. By these measures and for the same lattice dimension, we more than double the security parameter, halve the communication overhead, and speed up computation by more than a factor of 8 in a portable C implementation and by more than a factor of 27 in an optimized implementation targeting current Intel CPUs. These speedups are achieved with comprehensive protection against timing attacks.
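
    One of the changes described above is a better suited error distribution: NewHope samples noise from a centered binomial distribution, which only requires subtracting the Hamming weights of two short uniformly random bit strings instead of high-precision Gaussian sampling. The sketch below shows that sampling step with NewHope's parameters (k = 16, n = 1024); it is the noise-sampling ingredient only, not the key exchange itself.

        import secrets

        def centered_binomial(k=16):
            """Sample from the centered binomial distribution psi_k:
            the difference of the Hamming weights of two k-bit strings."""
            a = bin(secrets.randbits(k)).count("1")
            b = bin(secrets.randbits(k)).count("1")
            return a - b                          # value in [-k, k], mean 0, variance k/2

        def noise_poly(n=1024, k=16):
            """A NewHope-sized noise polynomial (n = 1024 coefficients)."""
            return [centered_binomial(k) for _ in range(n)]

        e = noise_poly()
        print(min(e), max(e), sum(e) / len(e))    # small values centered around 0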

    NewHope without reconciliation

    In this paper we introduce NewHope-Simple, a variant of the NewHope Ring-LWE-based key exchange that uses a straightforward transformation from Ring-LWE encryption to a passively secure KEM (or key-exchange scheme). The main advantage of NewHope-Simple over NewHope is simplicity. In particular, it avoids the error-reconciliation mechanism originally proposed by Ding. The explanation of his method, combined with other tricks, like unbiasing the key following Peikert's tweak and using the quantizer D_4 to extract one key bit from multiple coefficients, takes more than three pages in the NewHope paper. The price for this simplicity is small: one of the exchanged messages increases in size by 6.25%, from 2048 bytes to 2176 bytes. The security of NewHope-Simple is the same as the security of NewHope; the performance is very similar.
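
    The encoding that replaces reconciliation in NewHope-Simple spreads each of the 256 key bits over four of the 1024 coefficients, encodes a bit as 0 or roughly q/2 in each of them, and decodes with a distance threshold that absorbs the accumulated noise. The sketch below illustrates this encode/decode idea with NewHope's q = 12289; the grouping of coefficients, compression, and all remaining parts of the scheme are omitted, and the threshold rule here is a simplified stand-in for the one in the paper.

        Q = 12289                  # NewHope modulus
        HALF_Q = Q // 2

        def encode_bit(bit):
            """Spread one key bit over 4 coefficients, each 0 or floor(q/2)."""
            return [bit * HALF_Q] * 4

        def decode_bit(coeffs):
            """Recover the bit from the summed distance of the coefficients to 0."""
            dist = sum(min(c % Q, Q - c % Q) for c in coeffs)   # distance to 0 mod q
            return 1 if dist > Q else 0   # threshold q, since 4 coefficients are used

        # The decoder tolerates per-coefficient noise as long as the total
        # deviation stays well below the threshold.
        noise = [500, -700, 300, -200]
        noisy_one = [c + d for c, d in zip(encode_bit(1), noise)]
        noisy_zero = [c + d for c, d in zip(encode_bit(0), noise)]
        assert decode_bit(noisy_one) == 1 and decode_bit(noisy_zero) == 0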

    Accelerating Homomorphic Evaluation on Reconfigurable Hardware

    Homomorphic encryption allows computation on encrypted data and makes it possible to securely outsource computational tasks to untrusted environments. However, all proposed schemes are quite inefficient, and homomorphic evaluation of ciphertexts usually takes several seconds on high-end CPUs, even for evaluating simple functions. In this work we investigate the potential of FPGAs for speeding up those evaluation operations. We propose an architecture to accelerate schemes based on the ring learning with errors (RLWE) problem and specifically implement the somewhat homomorphic encryption scheme YASHE, which was proposed by Bos, Lauter, Loftus, and Naehrig in 2013. Due to the large size of ciphertexts and evaluation keys, on-chip storage of all data is not possible and external memory is required. For efficient utilization of the external memory we propose a double-buffered memory access scheme and a polynomial multiplier based on the number theoretic transform (NTT). For the parameter set (n = 16384, log_2(q) = 512), capable of evaluating 9 levels of multiplications, we can perform a homomorphic addition in 0.94 ms and a homomorphic multiplication in 48.67 ms.
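
    Because a single ciphertext is far too large for on-chip memory, the accelerator streams data from external memory and overlaps the transfer of the next block with computation on the current one. The sketch below shows this double-buffering pattern in a generic, software-only form, with a background prefetch standing in for the memory controller; block sizes and function names are made up for illustration, and it says nothing about the actual FPGA design.

        from concurrent.futures import ThreadPoolExecutor

        BLOCKS = 8
        BLOCK_WORDS = 1024          # made-up block size for illustration

        def fetch(i):
            """Stand-in for a transfer of block i from external memory."""
            return [i] * BLOCK_WORDS

        def process(block):
            """Stand-in for the NTT-based computation on one block."""
            return sum(block)

        results = []
        with ThreadPoolExecutor(max_workers=1) as dma:
            pending = dma.submit(fetch, 0)              # prefetch the first block
            for i in range(BLOCKS):
                current = pending.result()              # wait until block i has arrived
                if i + 1 < BLOCKS:
                    pending = dma.submit(fetch, i + 1)  # start fetching block i+1 ...
                results.append(process(current))        # ... while block i is processed

        print(results)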

    A Lightweight Identification Protocol Based on Lattices

    In this work we present a lightweight lattice-based identification protocol based on the CPA-secure public-key encryption scheme Kyber. It is designed as a replacement for existing classical ECC- or RSA-based identification protocols in IoT or smart card applications, and for device authentication. The proposed protocol is simple and efficient, and implementations are intended to be easy to harden against side-channel attacks. Compared to standard constructions of identification protocols from lattice-based KEMs, our construction achieves this by avoiding the Fujisaki-Okamoto transform and its impact on implementation security. Moreover, contrary to prior lattice-based identification protocols and standard constructions using signatures, our work does not require rejection sampling and can use more efficient parameters than signature schemes. We provide a generic construction from CPA-secure public-key encryption schemes to identification protocols and give a security proof of the protocol in the ROM. Furthermore, we instantiate the generic construction with Kyber, for which we use the proposed parameter sets for NIST security levels I, III, and V. To show that the protocol is suitable for constrained devices, we implemented one selected parameter set on an ARM Cortex-M4 microcontroller. As the protocol is based on existing algorithms for Kyber, our implementation makes use of existing software components (e.g., fast NTT implementations).
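
    A common way to turn a public-key encryption scheme into an identification protocol is a challenge-response flow: the verifier encrypts a fresh random challenge under the prover's public key, and the prover demonstrates possession of the secret key by returning a hash of the decrypted challenge. The sketch below shows only this generic shape, with a deliberately insecure toy "encryption" so that it runs stand-alone; the paper's actual construction, its ROM proof, and the Kyber instantiation differ in important details.

        import hashlib, secrets

        # --- toy placeholder PKE (NOT secure, only to make the flow runnable) ---
        def keygen():
            sk = secrets.token_bytes(32)
            pk = hashlib.sha256(sk).digest()          # toy "public key"
            return pk, sk

        def encrypt(pk, m):
            pad = hashlib.sha256(pk + b"pad").digest()
            return bytes(a ^ b for a, b in zip(m, pad))

        def decrypt(sk, c):
            pk = hashlib.sha256(sk).digest()
            pad = hashlib.sha256(pk + b"pad").digest()
            return bytes(a ^ b for a, b in zip(c, pad))
        # -------------------------------------------------------------------------

        def identification_round(pk, sk):
            """One challenge-response round: verifier -> prover -> verifier."""
            challenge = secrets.token_bytes(32)                 # verifier picks a nonce
            ciphertext = encrypt(pk, challenge)                 # ... and encrypts it
            response = hashlib.sha256(decrypt(sk, ciphertext)).digest()   # prover
            return response == hashlib.sha256(challenge).digest()         # verifier

        pk, sk = keygen()
        assert identification_round(pk, sk)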

    Implementing RLWE-based Schemes Using an RSA Co-Processor

    We repurpose existing RSA/ECC co-processors for (ideal) lattice-based cryptography by exploiting the availability of fast long-integer multiplication. Such co-processors are deployed in smart cards in passports and identity cards, in secured microcontrollers, and in hardware security modules (HSM). In particular, we demonstrate an implementation of a variant of the Module-LWE-based Kyber Key Encapsulation Mechanism (KEM) that is tailored for high performance on a commercially available smart card chip (SLE 78). To benefit from the RSA/ECC co-processor we use Kronecker substitution in combination with schoolbook and Karatsuba polynomial multiplication. Moreover, we speed up symmetric operations in our Kyber variant using the AES co-processor to implement a PRNG and a SHA-256 co-processor to realise hash functions. This allows us to execute CCA-secure Kyber768 key generation in 79.6 ms, encapsulation in 102.4 ms, and decapsulation in 132.7 ms.
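
    Kronecker substitution, the trick mentioned above, turns polynomial multiplication into a single multiplication of long integers: the coefficients are packed as digits in a base 2^L chosen large enough that the product's coefficients cannot overlap, the two integers are multiplied (on the target chip, by the RSA co-processor), and the digits of the product are read back out. The sketch below shows the idea in plain Python for Kyber-like parameters, with Python's built-in big-integer multiplication standing in for the co-processor; the paper additionally combines this with schoolbook and Karatsuba splitting and further optimizations.

        import random

        Q, N = 3329, 256                     # Kyber coefficient modulus and degree
        # Product coefficients are sums of at most N terms below Q^2, so they fit in L bits:
        L = (N * (Q - 1) ** 2).bit_length()  # evaluating at 2**L avoids digit overlap

        def pack(poly):
            """Evaluate the polynomial at 2**L, i.e. pack coefficients as digits."""
            return sum(c << (L * i) for i, c in enumerate(poly))

        def unpack(x, count):
            """Read the base-2**L digits back out of the big integer."""
            mask = (1 << L) - 1
            return [(x >> (L * i)) & mask for i in range(count)]

        def poly_mul(a, b):
            """Multiply two polynomials (coefficients in [0, Q)) via one big-integer
            multiplication, then reduce the result modulo (x^N + 1, Q)."""
            digits = unpack(pack(a) * pack(b), 2 * N - 1)   # all product coefficients
            c = [0] * N
            for i, d in enumerate(digits):                  # fold x^N = -1 and reduce mod Q
                if i < N:
                    c[i] = (c[i] + d) % Q
                else:
                    c[i - N] = (c[i - N] - d) % Q
            return c

        # Quick check against a schoolbook negacyclic multiplication.
        a = [random.randrange(Q) for _ in range(N)]
        b = [random.randrange(Q) for _ in range(N)]
        ref = [0] * N
        for i in range(N):
            for j in range(N):
                k, s = (i + j) % N, 1 if i + j < N else -1
                ref[k] = (ref[k] + s * a[i] * b[j]) % Q
        assert poly_mul(a, b) == ref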