453 research outputs found

    Design and realization of an embedded processor for cryptographic applications

    Get PDF
    Architectural enhancements are a set of modifications in a general-purpose processor to improve the processing of a given workload such as multimedia applications and cryptographic operations. Employing faster/enhanced arithmetic units for the existing instruction set architecture (ISA), introducing application-specific instructions to the ISA, and adding a new set of registers are common practices employed as architectural enhancements. In this thesis, we introduce and implement a set of relatively low-cost enhancement techniques to accelerate certain arithmetic operations common in cryptographic applications on a configurable and extensible embedded processor core. The proposed enhancements are generic in the sense that they can profitably be applied in many RISC processors. These enhancements are organized into, what we prefer to call as, cryptographic unit (CU) that offers an extended ISA to the programmer. We then present the speedup values obtained for various arithmetic operations and public key cryptography algorithms through these enhancements. Furthermore, hardware overhead of introducing the enhancements to the embedded extensible processor is provided in terms of chip area. Our experimental results show that the proposed architectural enhancements provides significant amount of speedup (up to one order of magnitude) in elliptic curve cryptography and RSA with a conservative increase in hardware. Last but not the least, we demonstrate that the proposed enhancements facilitate protection of cryptographic algorithms against certain side-channel attacks by reporting our case study of AES implementation hardened against cache-based attacks

    Efficient Elliptic Curve Cryptography Software Implementation on Embedded Platforms

    Get PDF

    Hardware-Software Codesign of a Vector Co-processor for Public Key Cryptography

    Get PDF
    International audienceUntil now, most cryptography implementations on parallel architectures have focused on adapting the software to SIMD architectures initially meant for media applications. In this paper, we review some of the most significant contributions in this area. We then propose a vector architecture to efficiently implement long precision modular multiplications. Having such a data level parallel hardware provides a circuit whose decode and schedule units are at least of the same complexity as those of a scalar processor. The excess transistors are mainly found in the data path. Moreover, the vector approach gives a very modular architecture where resources can be easily redefined. We built a functional simulator onto which we performed a quantitative analysis to study how the resizing of those resources affects the performance of the modular multiplication operation. Hence we not only propose a vector architecture for our Public Key cryptographic operations but also show how we can analyze the impact of design choices on performance. The proposed architecture is also flexible in the sense that the software running on it would offer room for the implementation of counter-measures against side-channel or fault attacks

    Low-Weight Primes for Lightweight Elliptic Curve Cryptography on 8-bit AVR Processors

    Get PDF
    Small 8-bit RISC processors and micro-controllers based on the AVR instruction set architecture are widely used in the embedded domain with applications ranging from smartcards over control systems to wireless sensor nodes. Many of these applications require asymmetric encryption or authentication, which has spurred a body of research into implementation aspects of Elliptic Curve Cryptography (ECC) on the AVR platform. In this paper, we study the suitability of a special class of finite fields, the so-called Optimal Prime Fields (OPFs), for a "lightweight" implementation of ECC with a view towards high performance and security. An OPF is a finite field Fp defined by a prime of the form p = u*2^k + v, whereby both u and v are "small" (in relation to 2^k) so that they fit into one or two registers of an AVR processor. OPFs have a low Hamming weight, which allows for a very efficient implementation of the modular reduction since only the non-zero words of p need to be processed. We describe a special variant of Montgomery multiplication for OPFs that does not execute any input-dependent conditional statements (e.g. branch instructions) and is, hence, resistant against certain side-channel attacks. When executed on an Atmel ATmega processor, a multiplication in a 160-bit OPF takes just 3237 cycles, which compares favorably with other implementations of 160-bit modular multiplication on an 8-bit processor. We also describe a performance-optimized and a security-optimized implementation of elliptic curve scalar multiplication over OPFs. The former uses a GLV curve and executes in 4.19M cycles (over a 160-bit OPF), while the latter is based on a Montgomery curve and has an execution time of approximately 5.93M cycles. Both results improve the state-of-the-art in lightweight ECC on 8-bit processors

    Optimization of Supersingular Isogeny Cryptography for Deeply Embedded Systems

    Get PDF
    Public-key cryptography in use today can be broken by a quantum computer with sufficient resources. Microsoft Research has published an open-source library of quantum-secure supersingular isogeny (SI) algorithms including Diffie-Hellman key agreement and key encapsulation in portable C and optimized x86 and x64 implementations. For our research, we modified this library to target a deeply-embedded processor with instruction set extensions and a finite-field coprocessor originally designed to accelerate traditional elliptic curve cryptography (ECC). We observed a 6.3-7.5x improvement over a portable C implementation using instruction set extensions and a further 6.0-6.1x improvement with the addition of the coprocessor. Modification of the coprocessor to a wider datapath further increased performance 2.6-2.9x. Our results show that current traditional ECC implementations can be easily refactored to use supersingular elliptic curve arithmetic and achieve post-quantum security

    Compressed SIKE Round 3 on ARM Cortex-M4

    Get PDF
    In 2016, the National Institute of Standards and Technology (NIST) initiated a standardization process among the post-quantum secure algorithms. Forming part of the alternate group of candidates after Round 2 of the process is the Supersingular Isogeny Key Encapsulation (SIKE) mechanism which attracts with the smallest key sizes offering post-quantum security in scenarios of limited bandwidth and memory resources. Even further reduction of the exchanged information is offered by the compression mechanism, proposed by Azarderakhsh et al., which, however, introduces a significant time overhead and increases the memory requirements of the protocol, making it challenging to integrate it into an embedded system. In this paper, we propose the first compressed SIKE implementation for a resource-constrained device, where we targeted the NIST recommended platform STM32F407VG featuring ARM Cortex-M4 processor. We integrate the isogeny-based implementation strategies described previously in the literature into the compressed version of SIKE. Additionally, we propose a new assembly design for the finite field operations particular for the compressed SIKE, and observe a speedup of up to 16% and up to 25% compared to the last best-reported assembly implementations for p434, p503, and p610

    A Vector Approach to Cryptography Implementation

    Get PDF
    International audienceThe current deployment of Digital Right Management (DRM) schemes to distribute protected contents and rights is leading the way to massive use of sophisticated embedded cryptographic applications. Embedded microprocessors have been equipped with bulky and power-consuming co-processors designed to suit particular data sizes. However, flexible cryptographic platforms are more desirable than devices dedicated to a particular cryptographic algorithm as the increasing cost of fabrication chips favors large volume production. This paper proposes a novel approach to embedded cryptography whereby we propose a vector-based general purpose machine capable of implementing a range of cryptographic algorithms. We show that vector processing ideas can be used to perform cryptography in an e±cient manner which we believe is appropriate for high performance, flexible and power efficient embedded systems

    Implementação eficiente da Curve25519 para microcontroladores ARM

    Get PDF
    Orientador: Diego de Freitas AranhaDissertação (mestrado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Com o advento da computação ubíqua, o fenômeno da Internet das Coisas (de Internet of Things) fará que com inúmeros dispositivos conectem-se um com os outros, enquanto trocam dados muitas vezes sensíveis pela sua natureza. Danos irreparáveis podem ser causados caso o sigilo destes seja quebrado. Isso causa preocupações acerca da segurança da comunicação e dos próprios dispositivos, que geralmente têm carência de mecanismos de proteção contra interferências físicas e pouca ou nenhuma medida de segurança. Enquanto desenvolver criptografia segura e eficiente como um meio de prover segurança à informação não é inédito, esse novo ambiente, com uma grande superfície de ataque, tem imposto novos desafios para a engenharia criptográfica. Uma abordagem segura para resolver este problema é utilizar blocos bem conhecidos e profundamente analisados, tal como o protocolo Segurança da Camada de Transporte (de Transport Layer Security, TLS). Na última versão desse padrão, as opções para Criptografia de Curvas Elípticas (de Elliptic Curve Cryptography - ECC) são expandidas para além de parâmetros estabelecidos por governos, tal como a proposta Curve25519 e protocolos criptográficos relacionados. Esse trabalho pesquisa implementações seguras e eficientes de Curve25519 para construir um esquema de troca de chaves em um microcontrolador ARM Cortex-M4, além do esquema de assinatura digital Ed25519 e a proposta de esquema de assinaturas digitais qDSA. Como resultado, operações de desempenho crítico, tal como o multiplicador de 256 bits, foram otimizadas; em particular, aceleração de 50% foi alcançada, impactando o desempenho de protocolos em alto nívelAbstract: With the advent of ubiquitous computing, the Internet of Things will undertake numerous devices connected to each other, while exchanging data often sensitive by nature. Breaching the secrecy of this data may cause irreparable damage. This raises concerns about the security of their communication and the devices themselves, which usually lack tamper resistance mechanisms or physical protection and even low to no security mesures. While developing efficient and secure cryptography as a mean to provide information security services is not a new problem, this new environment, with a wide attack surface, imposes new challenges to cryptographic engineering. A safe approach to solve this problem is reusing well-known and thoroughly analyzed blocks, such as the Transport Layer Security (TLS) protocol. In the last version of this standard, Elliptic Curve Cryptography options were expanded beyond government-backed parameters, such as the Curve25519 proposal and related cryptographic protocols. This work investigates efficient and secure implementations of Curve25519 to build a key exchange protocol on an ARM Cortex-M4 microcontroller, along the related signature scheme Ed25519 and a digital signature scheme proposal called qDSA. As result, performance-critical operations, such as a 256-bit multiplier, are greatly optimized; in this particular case, a 50% speedup is achieved, impacting the performance of higher-level protocolsMestradoCiência da ComputaçãoMestre em Ciência da ComputaçãoCAPESFuncam
    corecore