360 research outputs found

    Automatic generation of high speed elliptic curve cryptography code

    Get PDF
    Apparently, trust is a rare commodity when power, money or life itself are at stake. History is full of examples. Julius Caesar did not trust his generals, so that: ``If he had anything confidential to say, he wrote it in cipher, that is, by so changing the order of the letters of the alphabet, that not a word could be made out. If anyone wishes to decipher these, and get at their meaning, he must substitute the fourth letter of the alphabet, namely D, for A, and so with the others.'' And so the history of cryptography began moving its first steps. Nowadays, encryption has decayed from being an emperor's prerogative and became a daily life operation. Cryptography is pervasive, ubiquitous and, the best of all, completely transparent to the unaware user. Each time we buy something on the Internet we use it. Each time we search something on Google we use it. Everything without (almost) realizing that it silently protects our privacy and our secrets. Encryption is a very interesting instrument in the "toolbox of security" because it has very few side effects, at least on the user side. A particularly important one is the intrinsic slow down that its use imposes in the communications. High speed cryptography is very important for the Internet, where busy servers proliferate. Being faster is a double advantage: more throughput and less server overhead. In this context, however, the public key algorithms starts with a big handicap. They have very bad performances if compared to their symmetric counterparts. Due to this reason their use is often reduced to the essential operations, most notably key exchanges and digital signatures. The high speed public key cryptography challenge is a very practical topic with serious repercussions in our technocentric world. Using weak algorithms with a reduced key length to increase the performances of a system can lead to catastrophic results. In 1985, Miller and Koblitz independently proposed to use the group of rational points of an elliptic curve over a finite field to create an asymmetric algorithm. Elliptic Curve Cryptography (ECC) is based on a problem known as the ECDLP (Elliptic Curve Discrete Logarithm Problem) and offers several advantages with respect to other more traditional encryption systems such as RSA and DSA. The main benefit is that it requires smaller keys to provide the same security level since breaking the ECDLP is much harder. In addition, a good ECC implementation can be very efficient both in time and memory consumption, thus being a good candidate for performing high speed public key cryptography. Moreover, some elliptic curve based techniques are known to be extremely resilient to quantum computing attacks, such as the SIDH (Supersingular Isogeny Diffie-Hellman). Traditional elliptic curve cryptography implementations are optimized by hand taking into account the mathematical properties of the underlying algebraic structures, the target machine architecture and the compiler facilities. This process is time consuming, requires a high degree of expertise and, ultimately, error prone. This dissertation' ultimate goal is to automatize the whole optimization process of cryptographic code, with a special focus on ECC. The framework presented in this thesis is able to produce high speed cryptographic code by automatically choosing the best algorithms and applying a number of code-improving techniques inspired by the compiler theory. Its central component is a flexible and powerful compiler able to translate an algorithm written in a high level language and produce a highly optimized C code for a particular algebraic structure and hardware platform. The system is generic enough to accommodate a wide array of number theory related algorithms, however this document focuses only on optimizing primitives based on elliptic curves defined over binary fields

    Elliptic Curve Cryptography on Modern Processor Architectures

    Get PDF
    Abstract Elliptic Curve Cryptography (ECC) has been adopted by the US National Security Agency (NSA) in Suite "B" as part of its "Cryptographic Modernisation Program ". Additionally, it has been favoured by an entire host of mobile devices due to its superior performance characteristics. ECC is also the building block on which the exciting field of pairing/identity based cryptography is based. This widespread use means that there is potentially a lot to be gained by researching efficient implementations on modern processors such as IBM's Cell Broadband Engine and Philip's next generation smart card cores. ECC operations can be thought of as a pyramid of building blocks, from instructions on a core, modular operations on a finite field, point addition & doubling, elliptic curve scalar multiplication to application level protocols. In this thesis we examine an implementation of these components for ECC focusing on a range of optimising techniques for the Cell's SPU and the MIPS smart card. We show significant performance improvements that can be achieved through of adoption of EC

    Efficient software implementation of elliptic curves and bilinear pairings

    Get PDF
    Orientador: Júlio César Lopez HernándezTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: O advento da criptografia assimétrica ou de chave pública possibilitou a aplicação de criptografia em novos cenários, como assinaturas digitais e comércio eletrônico, tornando-a componente vital para o fornecimento de confidencialidade e autenticação em meios de comunicação. Dentre os métodos mais eficientes de criptografia assimétrica, a criptografia de curvas elípticas destaca-se pelos baixos requisitos de armazenamento para chaves e custo computacional para execução. A descoberta relativamente recente da criptografia baseada em emparelhamentos bilineares sobre curvas elípticas permitiu ainda sua flexibilização e a construção de sistemas criptográficos com propriedades inovadoras, como sistemas baseados em identidades e suas variantes. Porém, o custo computacional de criptossistemas baseados em emparelhamentos ainda permanece significativamente maior do que os assimétricos tradicionais, representando um obstáculo para sua adoção, especialmente em dispositivos com recursos limitados. As contribuições deste trabalho objetivam aprimorar o desempenho de criptossistemas baseados em curvas elípticas e emparelhamentos bilineares e consistem em: (i) implementação eficiente de corpos binários em arquiteturas embutidas de 8 bits (microcontroladores presentes em sensores sem fio); (ii) formulação eficiente de aritmética em corpos binários para conjuntos vetoriais de arquiteturas de 64 bits e famílias mais recentes de processadores desktop dotadas de suporte nativo à multiplicação em corpos binários; (iii) técnicas para implementação serial e paralela de curvas elípticas binárias e emparelhamentos bilineares simétricos e assimétricos definidos sobre corpos primos ou binários. Estas contribuições permitiram obter significativos ganhos de desempenho e, conseqüentemente, uma série de recordes de velocidade para o cálculo de diversos algoritmos criptográficos relevantes em arquiteturas modernas que vão de sistemas embarcados de 8 bits a processadores com 8 coresAbstract: The development of asymmetric or public key cryptography made possible new applications of cryptography such as digital signatures and electronic commerce. Cryptography is now a vital component for providing confidentiality and authentication in communication infra-structures. Elliptic Curve Cryptography is among the most efficient public-key methods because of its low storage and computational requirements. The relatively recent advent of Pairing-Based Cryptography allowed the further construction of flexible and innovative cryptographic solutions like Identity-Based Cryptography and variants. However, the computational cost of pairing-based cryptosystems remains significantly higher than traditional public key cryptosystems and thus an important obstacle for adoption, specially in resource-constrained devices. The main contributions of this work aim to improve the performance of curve-based cryptosystems, consisting of: (i) efficient implementation of binary fields in 8-bit microcontrollers embedded in sensor network nodes; (ii) efficient formulation of binary field arithmetic in terms of vector instructions present in 64-bit architectures, and on the recently-introduced native support for binary field multiplication in the latest Intel microarchitecture families; (iii) techniques for serial and parallel implementation of binary elliptic curves and symmetric and asymmetric pairings defined over prime and binary fields. These contributions produced important performance improvements and, consequently, several speed records for computing relevant cryptographic algorithms in modern computer architectures ranging from embedded 8-bit microcontrollers to 8-core processorsDoutoradoCiência da ComputaçãoDoutor em Ciência da Computaçã

    CryptOpt: Verified Compilation with Random Program Search for Cryptographic Primitives

    Full text link
    Most software domains rely on compilers to translate high-level code to multiple different machine languages, with performance not too much worse than what developers would have the patience to write directly in assembly language. However, cryptography has been an exception, where many performance-critical routines have been written directly in assembly (sometimes through metaprogramming layers). Some past work has shown how to do formal verification of that assembly, and other work has shown how to generate C code automatically along with formal proof, but with consequent performance penalties vs. the best-known assembly. We present CryptOpt, the first compilation pipeline that specializes high-level cryptographic functional programs into assembly code significantly faster than what GCC or Clang produce, with mechanized proof (in Coq) whose final theorem statement mentions little beyond the input functional program and the operational semantics of x86-64 assembly. On the optimization side, we apply randomized search through the space of assembly programs, with repeated automatic benchmarking on target CPUs. On the formal-verification side, we connect to the Fiat Cryptography framework (which translates functional programs into C-like IR code) and extend it with a new formally verified program-equivalence checker, incorporating a modest subset of known features of SMT solvers and symbolic-execution engines. The overall prototype is quite practical, e.g. producing new fastest-known implementations for the relatively new Intel i9 12G, of finite-field arithmetic for both Curve25519 (part of the TLS standard) and the Bitcoin elliptic curve secp256k1

    Efficient Implementations of Pairing-Based Cryptography on Embedded Systems

    Get PDF
    Many cryptographic applications use bilinear pairing such as identity based signature, instance identity-based key agreement, searchable public-key encryption, short signature scheme, certificate less encryption and blind signature. Elliptic curves over finite field are the most secure and efficient way to implement bilinear pairings for the these applications. Pairing based cryptosystems are being implemented on different platforms such as low-power and mobile devices. Recently, hardware capabilities of embedded devices have been emerging which can support efficient and faster implementations of pairings on hand-held devices. In this thesis, the main focus is optimization of Optimal Ate-pairing using special class of ordinary curves, Barreto-Naehring (BN), for different security levels on low-resource devices with ARM processors. Latest ARM architectures are using SIMD instructions based NEON engine and are helpful to optimize basic algorithms. Pairing implementations are being done using tower field which use field multiplication as the most important computation. This work presents NEON implementation of two multipliers (Karatsuba and Schoolbook) and compare the performance of these multipliers with different multipliers present in the literature for different field sizes. This work reports the fastest implementation timing of pairing for BN254, BN446 and BN638 curves for ARMv7 architecture which have security levels as 128-, 164-, and 192-bit, respectively. This work also presents comparison of code performance for ARMv8 architectures

    Implementing efficient 384-Bit NIST elliptic curves over prime fields on an ARM946E

    Get PDF
    This thesis presents a performance evaluation of a 384-bit NIST elliptic curve over prime fields on a 32-bit ARM946E microprocessor running at 100-MHz. While adhering to the constraints of an embedded system, the following items were investigated to decrease computation time: the importance of the underlying finite arithmetic, the use of hardware accelerators, the use of memory options, and the use of available processor features. The elliptic curve implementation utilized existing finite arithmetic C code to interface to an AiMEC Montgomery Exponentiator Core. The exponentiator core supports modular addition, modular multiplication, and exponentiation. The finite arithmetic C code also contained functions to perform operations which are not performed by the exponentiator such as non-modular multiplication, non-modular addition, and modular subtraction. Multiple enhancements were made to the finite field arithmetic. These provided a 22% time reduction in execution time of the 384-bit elliptic curve multiplication. Enhancements included writing assembly functions, adding checks prior to performing a modular reduction, utilizing the exponentiator core only when modulus reduction was necessary, using multiplication if more than two additions are required and placing the finite arithmetic into its own library and using ARM mode. Other optimizations investigated including: cache usage, compiler options (speed vs. size), and Thumb instruction set vs. ARM instruction set provided minimal reduction, 3.6%, in the execution time

    Reconfigurable elliptic curve cryptography

    Get PDF
    Elliptic Curve Cryptosystems (ECC) have been proposed as an alternative to other established public key cryptosystems such as RSA (Rivest Shamir Adleman). ECC provide more security per bit than other known public key schemes based on the discrete logarithm problem. Smaller key sizes result in faster computations, lower power consumption and memory and bandwidth savings, thus making ECC a fast, flexible and cost-effective solution for providing security in constrained environments. Implementing ECC on reconfigurable platform combines the speed, security and concurrency of hardware along with the flexibility of the software approach. This work proposes a generic architecture for elliptic curve cryptosystem on a Field Programmable Gate Array (FPGA) that performs an elliptic curve scalar multiplication in 1.16milliseconds for GF (2163), which is considerably faster than most other documented implementations. One of the benefits of the proposed processor architecture is that it is easily reprogrammable to use different algorithms and is adaptable to any field order. Also through reconfiguration the arithmetic unit can be optimized for different area/speed requirements. The mathematics involved uses binary extension field of the form GF (2n) as the underlying field and polynomial basis for the representation of the elements in the field. A significant gain in performance is obtained by using projective coordinates for the points on the curve during the computation process

    Hardware Architectures for Post-Quantum Cryptography

    Get PDF
    The rapid development of quantum computers poses severe threats to many commonly-used cryptographic algorithms that are embedded in different hardware devices to ensure the security and privacy of data and communication. Seeking for new solutions that are potentially resistant against attacks from quantum computers, a new research field called Post-Quantum Cryptography (PQC) has emerged, that is, cryptosystems deployed in classical computers conjectured to be secure against attacks utilizing large-scale quantum computers. In order to secure data during storage or communication, and many other applications in the future, this dissertation focuses on the design, implementation, and evaluation of efficient PQC schemes in hardware. Four PQC algorithms, each from a different family, are studied in this dissertation. The first hardware architecture presented in this dissertation is focused on the code-based scheme Classic McEliece. The research presented in this dissertation is the first that builds the hardware architecture for the Classic McEliece cryptosystem. This research successfully demonstrated that complex code-based PQC algorithm can be run efficiently on hardware. Furthermore, this dissertation shows that implementation of this scheme on hardware can be easily tuned to different configurations by implementing support for flexible choices of security parameters as well as configurable hardware performance parameters. The successful prototype of the Classic McEliece scheme on hardware increased confidence in this scheme, and helped Classic McEliece to get recognized as one of seven finalists in the third round of the NIST PQC standardization process. While Classic McEliece serves as a ready-to-use candidate for many high-end applications, PQC solutions are also needed for low-end embedded devices. Embedded devices play an important role in our daily life. Despite their typically constrained resources, these devices require strong security measures to protect them against cyber attacks. Towards securing this type of devices, the second research presented in this dissertation focuses on the hash-based digital signature scheme XMSS. This research is the first that explores and presents practical hardware based XMSS solution for low-end embedded devices. In the design of XMSS hardware, a heterogenous software-hardware co-design approach was adopted, which combined the flexibility of the soft core with the acceleration from the hard core. The practicability and efficiency of the XMSS software-hardware co-design is further demonstrated by providing a hardware prototype on an open-source RISC-V based System-on-a-Chip (SoC) platform. The third research direction covered in this dissertation focuses on lattice-based cryptography, which represents one of the most promising and popular alternatives to today\u27s widely adopted public key solutions. Prior research has presented hardware designs targeting the computing blocks that are necessary for the implementation of lattice-based systems. However, a recurrent issue in most existing designs is that these hardware designs are not fully scalable or parameterized, hence limited to specific cryptographic primitives and security parameter sets. The research presented in this dissertation is the first that develops hardware accelerators that are designed to be fully parameterized to support different lattice-based schemes and parameters. Further, these accelerators are utilized to realize the first software-harware co-design of provably-secure instances of qTESLA, which is a lattice-based digital signature scheme. This dissertation demonstrates that even demanding, provably-secure schemes can be realized efficiently with proper use of software-hardware co-design. The final research presented in this dissertation is focused on the isogeny-based scheme SIKE, which recently made it to the final round of the PQC standardization process. This research shows that hardware accelerators can be designed to offload compute-intensive elliptic curve and isogeny computations to hardware in a versatile fashion. These hardware accelerators are designed to be fully parameterized to support different security parameter sets of SIKE as well as flexible hardware configurations targeting different user applications. This research is the first that presents versatile hardware accelerators for SIKE that can be mapped efficiently to both FPGA and ASIC platforms. Based on these accelerators, an efficient software-hardwareco-design is constructed for speeding up SIKE. In the end, this dissertation demonstrates that, despite being embedded with expensive arithmetic, the isogeny-based SIKE scheme can be run efficiently by exploiting specialized hardware. These four research directions combined demonstrate the practicability of building efficient hardware architectures for complex PQC algorithms. The exploration of efficient PQC solutions for different hardware platforms will eventually help migrate high-end servers and low-end embedded devices towards the post-quantum era

    Studies on high-speed hardware implementation of cryptographic algorithms

    Get PDF
    Cryptographic algorithms are ubiquitous in modern communication systems where they have a central role in ensuring information security. This thesis studies efficient implementation of certain widely-used cryptographic algorithms. Cryptographic algorithms are computationally demanding and software-based implementations are often too slow or power consuming which yields a need for hardware implementation. Field Programmable Gate Arrays (FPGAs) are programmable logic devices which have proven to be highly feasible implementation platforms for cryptographic algorithms because they provide both speed and programmability. Hence, the use of FPGAs for cryptography has been intensively studied in the research community and FPGAs are also the primary implementation platforms in this thesis. This thesis presents techniques allowing faster implementations than existing ones. Such techniques are necessary in order to use high-security cryptographic algorithms in applications requiring high data rates, for example, in heavily loaded network servers. The focus is on Advanced Encryption Standard (AES), the most commonly used secret-key cryptographic algorithm, and Elliptic Curve Cryptography (ECC), public-key cryptographic algorithms which have gained popularity in the recent years and are replacing traditional public-key cryptosystems, such as RSA. Because these algorithms are well-defined and widely-used, the results of this thesis can be directly applied in practice. The contributions of this thesis include improvements to both algorithms and techniques for implementing them. Algorithms are modified in order to make them more suitable for hardware implementation, especially, focusing on increasing parallelism. Several FPGA implementations exploiting these modifications are presented in the thesis including some of the fastest implementations available in the literature. The most important contributions of this thesis relate to ECC and, specifically, to a family of elliptic curves providing faster computations called Koblitz curves. The results of this thesis can, in their part, enable increasing use of cryptographic algorithms in various practical applications where high computation speed is an issue
    corecore