14 research outputs found

    Hardware Architectures for Post-Quantum Cryptography

    Get PDF
    The rapid development of quantum computers poses severe threats to many commonly-used cryptographic algorithms that are embedded in different hardware devices to ensure the security and privacy of data and communication. Seeking for new solutions that are potentially resistant against attacks from quantum computers, a new research field called Post-Quantum Cryptography (PQC) has emerged, that is, cryptosystems deployed in classical computers conjectured to be secure against attacks utilizing large-scale quantum computers. In order to secure data during storage or communication, and many other applications in the future, this dissertation focuses on the design, implementation, and evaluation of efficient PQC schemes in hardware. Four PQC algorithms, each from a different family, are studied in this dissertation. The first hardware architecture presented in this dissertation is focused on the code-based scheme Classic McEliece. The research presented in this dissertation is the first that builds the hardware architecture for the Classic McEliece cryptosystem. This research successfully demonstrated that complex code-based PQC algorithm can be run efficiently on hardware. Furthermore, this dissertation shows that implementation of this scheme on hardware can be easily tuned to different configurations by implementing support for flexible choices of security parameters as well as configurable hardware performance parameters. The successful prototype of the Classic McEliece scheme on hardware increased confidence in this scheme, and helped Classic McEliece to get recognized as one of seven finalists in the third round of the NIST PQC standardization process. While Classic McEliece serves as a ready-to-use candidate for many high-end applications, PQC solutions are also needed for low-end embedded devices. Embedded devices play an important role in our daily life. Despite their typically constrained resources, these devices require strong security measures to protect them against cyber attacks. Towards securing this type of devices, the second research presented in this dissertation focuses on the hash-based digital signature scheme XMSS. This research is the first that explores and presents practical hardware based XMSS solution for low-end embedded devices. In the design of XMSS hardware, a heterogenous software-hardware co-design approach was adopted, which combined the flexibility of the soft core with the acceleration from the hard core. The practicability and efficiency of the XMSS software-hardware co-design is further demonstrated by providing a hardware prototype on an open-source RISC-V based System-on-a-Chip (SoC) platform. The third research direction covered in this dissertation focuses on lattice-based cryptography, which represents one of the most promising and popular alternatives to today\u27s widely adopted public key solutions. Prior research has presented hardware designs targeting the computing blocks that are necessary for the implementation of lattice-based systems. However, a recurrent issue in most existing designs is that these hardware designs are not fully scalable or parameterized, hence limited to specific cryptographic primitives and security parameter sets. The research presented in this dissertation is the first that develops hardware accelerators that are designed to be fully parameterized to support different lattice-based schemes and parameters. Further, these accelerators are utilized to realize the first software-harware co-design of provably-secure instances of qTESLA, which is a lattice-based digital signature scheme. This dissertation demonstrates that even demanding, provably-secure schemes can be realized efficiently with proper use of software-hardware co-design. The final research presented in this dissertation is focused on the isogeny-based scheme SIKE, which recently made it to the final round of the PQC standardization process. This research shows that hardware accelerators can be designed to offload compute-intensive elliptic curve and isogeny computations to hardware in a versatile fashion. These hardware accelerators are designed to be fully parameterized to support different security parameter sets of SIKE as well as flexible hardware configurations targeting different user applications. This research is the first that presents versatile hardware accelerators for SIKE that can be mapped efficiently to both FPGA and ASIC platforms. Based on these accelerators, an efficient software-hardwareco-design is constructed for speeding up SIKE. In the end, this dissertation demonstrates that, despite being embedded with expensive arithmetic, the isogeny-based SIKE scheme can be run efficiently by exploiting specialized hardware. These four research directions combined demonstrate the practicability of building efficient hardware architectures for complex PQC algorithms. The exploration of efficient PQC solutions for different hardware platforms will eventually help migrate high-end servers and low-end embedded devices towards the post-quantum era

    Post-Quantum Secure Architectures for Automotive Hardware Secure Modules

    Get PDF
    The rapid development of information technology in the automotive industry has driven increasing requirements on incorporating security functionalities in the in-vehicle architecture, which is usually realized by adding a Hardware Secure Module (HSM) in the Electronic Central Unit (ECU). Therefore, secure communications can be enforced by carrying out secret cryptographic computations within the HSM by use of the embedded hardware accelerators. However, there is no common standard for designing the architecture for an automotive HSM. A future design of a common automotive HSM is desired by the automotive industry which not only fits to the increasing performance demand, but also further defends against future attacks by attackers exploiting large-scale quantum computers. The arrival of future quantum computers motivates the investigation into post-quantum cryptography (PQC), which will retain the security of an HSM in the future. We analyzed the candidates in NIST\u27s PQC standardization process, and proposed new sets of hardware accelerators for the future generation of the automotive HSMs. Our evaluation results show that building a post-quantum secure automotive HSM is feasible and can meet the hard requirements imposed by a modern vehicle ECU

    To Be, or Not to Be Stateful: Post-Quantum Secure Boot using Hash-Based Signatures

    Get PDF
    While research in post-quantum cryptography (PQC) has gained significant momentum, it is only slowly adopted for real-world products. This is largely due to concerns about practicability and maturity. The secure boot process of embedded devices is one s- cenario where such restraints can result in fundamental security problems. In this work, we present a flexible hardware/software co-design for hash-based signature (HBS) schemes which enables the move to a post-quantum secure boot today. These signature schemes stand out due to their straightforward security proofs and are on the fast track to standardisation. In contrast to previous works, we exploit the performance intensive similarities of the s- tateful LMS and XMSS schemes as well as the stateless SPHINCS+ scheme. Thus, we enable designers to use a stateful or stateless scheme depending on the constraints of each individual application. To show the feasibility of our approach, we compare our results with hardware accelerated implementations of classical asymmetric algorithms. Further, we lay out the usage of different HBS schemes during the boot process. We compare different schemes, show the importance of parameter choices, and demonstrate the performance gain with different levels of hardware acceleration

    LMS vs XMSS: Comparison of Stateful Hash-Based Signature Schemes on ARM Cortex-M4

    Get PDF
    Stateful hash-based signature schemes are among the most efficient approaches for post-quantum signature schemes. Although not suitable for general use, they may be suitable for some use cases on constrained devices. LMS and XMSS are hash-based signature schemes that are conjectured to be quantum secure. In this work, we compared multiple instantiations of both schemes on an ARM Cortex-M4. More precisely, we compared performance, stack consumption, and other figures for key generation, signing and verifying. To achieve this, we evaluated LMS and XMSS using optimised implementations of SHA-256, SHAKE256, Gimli-Hash, and different variants of Keccak. Furthermore, we present slightly optimised implementations of XMSS achieving speedups of up to 3.11x for key generation, 3.11x for signing, and 4.32x for verifying

    Faulting Winternitz One-Time Signatures to forge LMS, XMSS, or SPHINCS+ signatures

    Get PDF
    Hash-based signature (HBS) schemes are an efficient method of guaranteeing the authenticity of data in a post-quantum world. The stateful schemes LMS and XMSS and the stateless scheme SPHINCS+ are already standardised or will be in the near future. The Winternitz one-time signature (WOTS) scheme is one of the fundamental building blocks used in all these HBS standardisation proposals. We present a new fault injection attack targeting WOTS that allows an adversary to forge signatures for arbitrary messages. The attack affects both the signing and verification processes of all current stateful and stateless schemes. Our attack renders the checksum calculation within WOTS useless. A successful fault injection allows at least an existential forgery attack and, in more advanced settings, a universal forgery attack. While checksum computation is clearly a critical point in WOTS, and thus in any of the relevant HBS schemes, its resilience against a fault attack has never been considered. To fill this gap, we theoretically explain the attack, estimate its practicability, and derive the brute-force complexity to achieve signature forgery for a variety of parameter sets. We analyse the reference implementations of LMS, XMSS and SPHINCS+ and pinpoint the vulnerable points. To harden these implementations, we propose countermeasures and evaluate their effectiveness and efficiency. Our work shows that exposed devices running signature generation or verification with any of these three schemes must have countermeasures in place

    A Survey of Recent Developments in Testability, Safety and Security of RISC-V Processors

    Get PDF
    With the continued success of the open RISC-V architecture, practical deployment of RISC-V processors necessitates an in-depth consideration of their testability, safety and security aspects. This survey provides an overview of recent developments in this quickly-evolving field. We start with discussing the application of state-of-the-art functional and system-level test solutions to RISC-V processors. Then, we discuss the use of RISC-V processors for safety-related applications; to this end, we outline the essential techniques necessary to obtain safety both in the functional and in the timing domain and review recent processor designs with safety features. Finally, we survey the different aspects of security with respect to RISC-V implementations and discuss the relationship between cryptographic protocols and primitives on the one hand and the RISC-V processor architecture and hardware implementation on the other. We also comment on the role of a RISC-V processor for system security and its resilience against side-channel attacks

    Erforschung und Implementierung von Zeitmess- und Zeitkontrollinstruktionen für RISC-V-Cores

    Get PDF
    Real Time Systems are integrated with physical processes such as sensors and actuators. Real Time Systems are time critical and hence they must be able to handle upper time bounds along with the correct functional behavior. The correct implementation of the functionality specified in the software is done using the Instruction Set Architecture (ISA) by the processor. But, neither the software nor ISA has a time measuring or time controlling role here and if time properties have to be guaranteed designers are required to reach beneath the abstraction layers which increases design complexity and effort. This thesis proposes a solution to bring control over time to the software by examining the Instruction Set Architecture (ISA) layer. The ISA defines the contract between software instructions and hardware implementations. But ISAs usually do not have timing properties imbibed in them. Hence, this work will investigate the instructions extension feature offered by RISC-V platform to allow programs to specify execution time properties in software. These include the ability to specify a minimum execution time for code blocks, and the ability to detect and handle missed deadlines from code blocks that exhibit variable execution times. This thesis investigates the RISC-V CPU named VexRiscV built by Charles Papon on the platform SpinalHDL where in which Custom Instructions would be added into the RISC-V ISA through the instruction extension featured allowed by the RISC-V platform. This was implemented and tested on an Arty A7-100 FPGA with the help of Xilinx tools

    Assembly or Optimized C for Lightweight Cryptography on RISC-V?

    Get PDF
    A major challenge when applying cryptography on constrained environments is the trade-off between performance and security. In this work, we analyzed different strategies for the optimization of several candidates of NIST\u27s lightweight cryptography standardization project on a RISC-V architecture. In particular, we studied the general impact of optimizing symmetric-key algorithms in assembly and in plain C. Furthermore, we present optimized implementations, achieving a speed-up of up to 81% over available implementations, and discuss general implementation strategies

    Implementation and Benchmarking of Round 2 Candidates in the NIST Post-Quantum Cryptography Standardization Process Using Hardware and Software/Hardware Co-design Approaches

    Get PDF
    Performance in hardware has typically played a major role in differentiating among leading candidates in cryptographic standardization efforts. Winners of two past NIST cryptographic contests (Rijndael in case of AES and Keccak in case of SHA-3) were ranked consistently among the two fastest candidates when implemented using FPGAs and ASICs. Hardware implementations of cryptographic operations may quite easily outperform software implementations for at least a subset of major performance metrics, such as speed, power consumption, and energy usage, as well as in terms of security against physical attacks, including side-channel analysis. Using hardware also permits much higher flexibility in trading one subset of these properties for another. A large number of candidates at the early stages of the standardization process makes the accurate and fair comparison very challenging. Nevertheless, in all major past cryptographic standardization efforts, future winners were identified quite early in the evaluation process and held their lead until the standard was selected. Additionally, identifying some candidates as either inherently slow or costly in hardware helped to eliminate a subset of candidates, saving countless hours of cryptanalysis. Finally, early implementations provided a baseline for future design space explorations, paving a way to more comprehensive and fairer benchmarking at the later stages of a given cryptographic competition. In this paper, we first summarize, compare, and analyze results reported by other groups until mid-May 2020, i.e., until the end of Round 2 of the NIST PQC process. We then outline our own methodology for implementing and benchmarking PQC candidates using both hardware and software/hardware co-design approaches. We apply our hardware approach to 6 lattice-based CCA-secure Key Encapsulation Mechanisms (KEMs), representing 4 NIST PQC submissions. We then apply a software-hardware co-design approach to 12 lattice-based CCA-secure KEMs, representing 8 Round 2 submissions. We hope that, combined with results reported by other groups, our study will provide NIST with helpful information regarding the relative performance of a significant subset of Round 2 PQC candidates, assuming that at least their major operations, and possibly the entire algorithms, are off-loaded to hardware
    corecore