6 research outputs found

    Low-cost, low-power FPGA implementation of ED25519 and CURVE25519 point multiplication

    Get PDF
    Twisted Edwards curves have been at the center of attention since their introduction by Bernstein et al. in 2007. The curve ED25519, used for Edwards-curve Digital Signature Algorithm (EdDSA), provides faster digital signatures than existing schemes without sacrificing security. The CURVE25519 is a Montgomery curve that is closely related to ED25519. It provides a simple, constant time, and fast point multiplication, which is used by the key exchange protocol X25519. Software implementations of EdDSA and X25519 are used in many web-based PC and Mobile applications. In this paper, we introduce a low-power, low-area FPGA implementation of the ED25519 and CURVE25519 scalar multiplication that is particularly relevant for Internet of Things (IoT) applications. The efficiency of the arithmetic modulo the prime number 2 255 − 19, in particular the modular reduction and modular multiplication, are key to the efficiency of both EdDSA and X25519. To reduce the complexity of the hardware implementation, we propose a high-radix interleaved modular multiplication algorithm. One benefit of this architecture is to avoid the use of large-integer multipliers relying on FPGA DSP modules

    Fast, Small, and Area-Time Efficient Architectures for Key-Exchange on Curve25519

    Get PDF
    Abstract--- This paper demonstrates fast and compact implementations of Elliptic Curve Cryptography (ECC) for efficient key agreement over Curve25519. Curve25519 has been recently adopted as a key exchange method for several applications such as connected small devices as well as cloud, and included in the National Institute of Standards and Technology (NIST) recommendations for public key cryptography. This paper presents three different performance level designs including lightweight, area-time efficient, and high-performance architectures. Lightweight hardware implementations are used for several Internet of Things (IoT) applications due to their resources being at premium. Our lightweight architecture utilizes 90% less resources compared to the best previous work while it is still more optimized in term of A\cdot T (area\timestime). For efficient implementation from either time or utilized resources, our area-time efficient architecture can establish almost 7,000 key sessions per second which is 64% faster than the previous works. The area-time efficient architecture uses well scheduled interleaved multiplication combined with a reduction algorithm. Additionally, we offer a fast architecture for high performance applications based on the 4-level Karatsuba method and Carry-Compact Addition (CCA). Our high-performance architecture also outperforms previous work in terms of A\cdot T. The results show 9% and 29% improvement in A\cdot T and A_{d}\cdot T (DSP_count\timestime), respectively. All architectures are variable-base-point implemented on the Xilinx Zynq-7020 FPGA family where performance and implementation metrics are reported and compared. Finally, various side-channel attack countermeasures are embedded in the proposed architectures

    Time-Efficient Finite Field Microarchitecture Design for Curve448 and Ed448 on Cortex-M4

    Get PDF
    The elliptic curve family of schemes has the lowest computational latency, memory use, energy consumption, and bandwidth requirements, making it the most preferred public key method for adoption into network protocols. Being suitable for embedded devices and applicable for key exchange and authentication, ECC is assuming a prominent position in the field of IoT cryptography. The attractive properties of the relatively new curve Curve448 contribute to its inclusion in the TLS1.3 protocol and pique the interest of academics and engineers aiming at studying and optimizing the schemes. When addressing low-end IoT devices, however, the literature indicates little work on these curves. In this paper, we present an efficient design for both protocols based on Montgomery curve Curve448 and its birationally equivalent Edwards curve Ed448 used for key agreement and digital signature algorithm, specifically the X448 function and the Ed448 DSA, relying on efficient low-level arithmetic operations targeting the ARM-based Cortex-M4 platform. Our design performs point multiplication, the base of the Elliptic Curve Diffie-Hellman (ECDH), in 3,2KCCs, resulting in more than 48% improvement compared to the best previous work based on Curve448, and performs sign and verify, the main operations of the Edwards-curves Digital Signature Algorithm (EdDSA), in 6,038KCCs and 7,404KCCs, showing a speedup of around 11% compared to the counterparts. We present novel modular multiplication and squaring architectures reaching ~25% and ~35% faster runtime than the previous best-reported results, respectively, based on Curve448 key exchange counterparts, and ~13% and ~25% better latency results than the Ed448-based digital signature counterparts targeting Cortex-M4 platform

    Towards attack-tolerant trusted execution environments: Secure remote attestation in the presence of side channels

    Get PDF
    In recent years, trusted execution environments (TEEs) have seen increasing deployment in computing devices to protect security-critical software from run-time attacks and provide isolation from an untrustworthy operating system (OS). A trusted party verifies the software that runs in a TEE using remote attestation procedures. However, the publication of transient execution attacks such as Spectre and Meltdown revealed fundamental weaknesses in many TEE architectures, including Intel Software Guard Exentsions (SGX) and Arm TrustZone. These attacks can extract cryptographic secrets, thereby compromising the integrity of the remote attestation procedure. In this work, we design and develop a TEE architecture that provides remote attestation integrity protection even when confidentiality of the TEE is compromised. We use the formally verified seL4 microkernel to build the TEE, which ensures strong isolation and integrity. We offload cryptographic operations to a secure co-processor that does not share any vulnerable microarchitectural hardware units with the main processor, to protect against transient execution attacks. Our design guarantees integrity of the remote attestation procedure. It can be extended to leverage co-processors from Google and Apple, for wide-scale deployment on mobile devices

    Compact and Flexible FPGA Implementation of Ed25519 and X25519

    No full text
    © 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM. This article describes a field-programmable gate array (FPGA) cryptographic architecture, which combines the elliptic curve-based Ed25519 digital signature algorithm and the X25519 key establishment scheme in a single module. Cryptographically, these are high-security elliptic curve cryptography algorithms with short key sizes and impressive execution times in software. Our goal is to provide a lightweight FPGA module that enables them on resource-constrained devices, specifically for Internet of Things (IoT) applications. In addition, we aim at extensibility with customisable countermeasures against timing and differential power analysis side-channel attacks and fault-injection attacks. For the former, we offer a choice between time-optimised versus constant-time execution, with or without Z-coordinate randomisation and base-point blinding; and for the latter, we offer enabling or disabling default-case statements in the Finite State Machine (FSM) descriptions. To obtain compactness and at the same time fast execution times, we make maximum use of the Digital Signal Processing (DSP) slices on the FPGA. We designed a single arithmetic unit that is flexible to support operations with two moduli and non-modulus arithmetic. In addition, our design benefits in-place memory management and the local storage of inputs into DSP slices' pipeline registers and takes advantage of distributed memory. These eliminate a memory access bottleneck. The flexibility is offered by a microcode supported instruction-set architecture. Our design targets 7-Series Xilinx FPGAs and is prototyped on a Zynq System-on-Chip (SoC). The base design combining Ed25519 and X25519 in a single module, and its implementation requires only around 11.1K Lookup Tables (LUTs), 2.6K registers, and 16 DSP slices. Also, it achieves performance of 1.6ms for a signature generation and 3.6ms for a signature verification for a 1024-bit message with an 82MHz clock. Moreover, the design can be optimised only for X25519, which gives the most compact FPGA implementation compared to previously published X25519 implementations.status: publishe
    corecore