54 research outputs found

    Optimizing MAKWA on GPU and CPU

    We present here optimized implementations of the MAKWA password hashing function on an AMD Radeon HD 7990 GPU, and compare its efficiency with an Intel i7 4770K CPU for systematic dictionary attacks. We find that the GPU seems to get more hashing done for a given budget, but not by a large amount (the GPU is less than twice as efficient as the CPU). Raising the MAKWA modulus size to 4096 bits, instead of the default 2048 bits, should restore the balance in favour of the CPU. We also find that power consumption, not hardware retail price, is likely to become the dominant factor for industrialized, long-term attacking efforts

    EcGFp5: a Specialized Elliptic Curve

    We present here the design and implementation of ecGFp5, an elliptic curve meant for a specific compute model in which operations modulo a given 64-bit prime are especially efficient. This model is primarily intended for running operations in a virtual machine that produces and verifies zero-knowledge STARK proofs. We describe here the choice of a secure curve, amenable to safe cryptographic operations such as digital signatures, that maps to such models, while still providing reasonable performance on general purpose computers

    Optimized Discrete Logarithm Computation for Faster Square Roots in Finite Fields

    For computing square roots in a finite field $GF(q)$ where $q - 1 = 2^n m$ for an odd integer $m$ and some integer $n$, the classic Tonelli-Shanks algorithm starts with an exponentiation (the exponent has size about $\log_2 q - n$ bits), followed by a discrete logarithm computation in the subgroup of $2^n$-th roots of unity in $GF(q)$; the latter operation has cost $O(n^2)$ multiplications in the field, which is prohibitive when $n$ is large. Bernstein proposed an optimized variant with lookup tables, leading to a runtime cost of $O((n/w)^2)$, using $w$-bit tables of cumulative size $O(2^w n/w)$. Sarkar recently improved on the runtime cost, down to $O((n/w)^{1.5})$, with the same overall storage cost. In this short note, we explore the use of a straightforward divide-and-conquer variant of the Pohlig-Hellman algorithm, bringing the asymptotic cost down to $O(n\log n)$, and further study some additional optimizations. The result appears to be competitive, at least in terms of number of multiplications, for some well-known fields such as the 224-bit field used in NIST standard elliptic curve P-224 (for which $n = 96$)
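
As a point of reference for the abstract above, here is a minimal sketch of the classic Tonelli-Shanks algorithm over a prime field, i.e. the $O(n^2)$ baseline, not the optimized divide-and-conquer variant the note proposes. The function name `tonelli_shanks` is a hypothetical choice, and the code assumes `p` is an odd prime; it is not constant-time.

```python
from typing import Optional


def tonelli_shanks(a: int, p: int) -> Optional[int]:
    """Return x with x*x % p == a % p for an odd prime p, or None if a has no square root."""
    a %= p
    if a == 0:
        return 0
    # Euler's criterion: a is a square iff a^((p-1)/2) == 1 mod p.
    if pow(a, (p - 1) // 2, p) != 1:
        return None
    # Write p - 1 = 2^n * m with m odd.
    n, m = 0, p - 1
    while m % 2 == 0:
        n += 1
        m //= 2
    # Find a quadratic non-residue z by trial.
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:
        z += 1
    c = pow(z, m, p)             # generator of the subgroup of 2^n-th roots of unity
    x = pow(a, (m + 1) // 2, p)  # initial exponentiation
    t = pow(a, m, p)             # invariant: x*x == a*t mod p, t is a 2^(k-1)-th root of unity
    k = n
    # Implicit discrete logarithm in the 2^n-th roots of unity, one bit per outer iteration.
    while t != 1:
        # Find the least i such that t^(2^i) == 1.
        i, t2 = 0, t
        while t2 != 1:
            t2 = t2 * t2 % p
            i += 1
        b = pow(c, 1 << (k - i - 1), p)
        x = x * b % p
        c = b * b % p
        t = t * c % p
        k = i
    return x
```

For the P-224 field, `p = 2**224 - 2**96 + 1` gives `n = 96`, so the inner squaring loop is where the quadratic cost mentioned in the abstract shows up.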

    Paradoxical Compression with Verifiable Delay Functions

    Lossless compression algorithms such as DEFLATE strive to reliably process arbitrary inputs, while achieving compressed sizes as low as possible for commonly encountered data inputs. It is well-known that it is mathematically impossible for a compression algorithm to simultaneously achieve non-trivial compression on some inputs (i.e. compress these inputs into strictly shorter outputs) and to never expand any other input (i.e. guaranteeing that all inputs will be compressed into an output which is no longer than the input); this is a direct application of the pigeonhole principle. Despite their mathematical impossibility, we show in this paper how to build such paradoxical compression and decompression algorithms, with the aid of some tools from cryptography, notably verifiable delay functions, and, of course, by slightly cheating
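
The pigeonhole argument mentioned in the abstract above can be written out as a short counting derivation. This is a standard textbook formulation, not taken from the paper; the symbols $C$, $S_n$ and $|x|$ are notation introduced here for illustration.

```latex
% C is a lossless (injective) compressor on bit strings; |x| is the length of x.
% Assume C never expands: |C(x)| \le |x| for all x.
Let $S_n = \{ x \in \{0,1\}^* : |x| \le n \}$, so that
\[
  |S_n| = \sum_{k=0}^{n} 2^k = 2^{n+1} - 1 .
\]
Since $|C(x)| \le |x|$, $C$ maps $S_n$ into $S_n$; being injective on a finite
set, it is a bijection of $S_n$ onto itself, for every $n \ge 0$.
Hence $C$ maps $S_n \setminus S_{n-1}$ (the strings of length exactly $n$)
onto $S_n \setminus S_{n-1}$, that is,
\[
  |C(x)| = |x| \quad \text{for all } x .
\]
So a compressor that never expands any input cannot strictly shorten any input.
```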

    Truncated EdDSA/ECDSA Signatures

    This note presents some techniques to slightly reduce the size of EdDSA and ECDSA signatures without lowering their security or breaking compatibility with existing signers, at the cost of an increase in signature verification time; verifying a 64-byte Ed25519 signature truncated to 60 bytes has an average cost of 4.1 million cycles on 64-bit x86 (i.e. about 35 times the cost of verifying a normal, untruncated signature)

    Double-Odd Elliptic Curves

    This article explores the use of elliptic curves with order 2r = 2 mod 4, which we call double-odd elliptic curves. This is a very large class, comprising about 1/4th of all curves over a given field. On such curves, we manage to define a prime order group with appropriate characteristics for building cryptographic protocols:
    - Element encoding is canonical, and verified upon decoding. For a 2n-bit group (with n-bit security), encoding size is 2n + 1 bits, i.e. as good as compressed points on classic prime order curves.
    - Unified and complete formulas allow secure and efficient computations in the group.
    - Efficiency is on par with twisted Edwards curves, and in some respects slightly better; e.g. half of double-odd curves have formulas for computing point doublings with only six multiplications (down to 1M+5S per doubling on some curves).
    We describe here various formulas and discuss implementations. We also define two specific parameter choices for curves with 128-bit security, called do255e and do255s. Our own implementations on 64-bit x86 (Coffee Lake) and low-end ARM Cortex M0+ achieve generic point multiplication in 76696 and 2.19 million cycles, respectively, with curve do255e

    More Efficient Algorithms for the NTRU Key Generation using the Field Norm

    NTRU lattices are a class of polynomial rings which allow for compact and efficient representations of the lattice basis, thereby offering very good performance characteristics for the asymmetric algorithms that use them. Signature algorithms based on NTRU lattices have fast signature generation and verification, and relatively small signatures, public keys and private keys. A few lattice-based cryptographic schemes entail, generally during the key generation, solving the NTRU equation: $fG - gF = q \bmod (x^n + 1)$. Here $f$ and $g$ are fixed, the goal is to compute solutions $F$ and $G$ to the equation, and all the polynomials are in $\mathbb{Z}[x]/(x^n + 1)$. The existing methods for solving this equation are quite cumbersome: their time and space complexities are at least cubic and quadratic in the dimension $n$, respectively, and for typical parameters they therefore require several megabytes of RAM and take more than a second on a typical laptop, precluding onboard key generation in embedded systems such as smart cards. In this work, we present two new algorithms for solving the NTRU equation. Both algorithms make repeated use of the field norm in a tower of fields, which allows them to be faster and more compact than existing algorithms by factors $\tilde{O}(n)$. For lattice-based schemes considered in practice, this reduces both the computation time and RAM usage by factors of at least 100, making key pair generation within range of smart card abilities
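
The abstract above does not spell out how the field norm is computed, so the following is only a sketch under a stated assumption: for even $n$, the norm from $\mathbb{Z}[x]/(x^n+1)$ down to $\mathbb{Z}[y]/(y^{n/2}+1)$ with $y = x^2$ satisfies $N(f)(x^2) = f(x) f(-x)$, which gives $N(f)(y) = f_e(y)^2 - y f_o(y)^2$ for the even and odd coefficient halves $f_e$, $f_o$. The helper names `mul_negacyclic` and `field_norm` are hypothetical, and the schoolbook multiplication is chosen for clarity, not speed.

```python
def mul_negacyclic(a, b):
    """Multiply two polynomials (coefficient lists, lowest degree first) modulo x^n + 1."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] += ai * bj
            else:
                out[k - n] -= ai * bj  # wrap around using x^n == -1
    return out


def field_norm(f):
    """Map f in Z[x]/(x^n + 1), n even, to N(f) in Z[y]/(y^(n/2) + 1) with y = x^2."""
    fe, fo = f[0::2], f[1::2]          # f(x) = fe(x^2) + x * fo(x^2)
    fe2 = mul_negacyclic(fe, fe)
    fo2 = mul_negacyclic(fo, fo)
    # Multiply fo2 by y modulo y^(n/2) + 1: shift up by one, negate the wrapped term.
    y_fo2 = [-fo2[-1]] + fo2[:-1]
    return [u - v for u, v in zip(fe2, y_fo2)]
```

Each application of `field_norm` halves the degree, which is what makes a recursive descent through the tower of fields possible; for `n = 2` the result is the familiar norm $a^2 + b^2$ of the Gaussian integer $a + bx$.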

    Improved Key Pair Generation for Falcon, BAT and Hawk

    In this short note, we describe a few implementation techniques that allow performing key pair generation for the Falcon and Hawk lattice-based signature schemes, and for the BAT key encapsulation scheme, in a fully constant-time way and without any use of floating-point operations. Our new code is faster than previously published implementations, especially when running on small embedded systems, and uses less RAM

    Optimized Binary GCD for Modular Inversion

    In this short note, we describe a practical optimization of the well-known extended binary GCD algorithm, for the purpose of computing modular inverses. The method is conceptually simple and is applicable to all odd moduli (including non-prime moduli). When implemented for inversion in the field of integers modulo the prime $2^{255}-19$, on a recent x86 CPU (Coffee Lake core), we compute the inverse in 6253 cycles, with a fully constant-time implementation
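
For context, the well-known algorithm that the abstract above starts from can be sketched as follows. This is the plain textbook extended binary GCD for an odd modulus, not the optimized method of the note, and it is neither constant-time nor tuned for speed. The function name `modinv_binary_gcd` is a hypothetical choice.

```python
def modinv_binary_gcd(y: int, m: int) -> int:
    """Return the inverse of y modulo an odd m >= 3, using the extended binary GCD."""
    if m < 3 or m % 2 == 0:
        raise ValueError("modulus must be odd and at least 3")

    def half(u: int) -> int:
        # Division by 2 modulo the odd modulus m, for 0 <= u < m.
        return u // 2 if u % 2 == 0 else (u + m) // 2

    a, b = y % m, m
    u, v = 1, 0
    # Invariants: a == u*y mod m and b == v*y mod m; b is always odd.
    while a != 0:
        if a % 2 == 0:
            a //= 2
            u = half(u)
        else:
            if a < b:
                a, b = b, a
                u, v = v, u
            a = (a - b) // 2          # both odd, so the difference is even
            u = half((u - v) % m)
    # Here b == gcd(y, m); an inverse exists only if it is 1.
    if b != 1:
        raise ValueError("value is not invertible modulo m")
    return v
```

Because only halvings and subtractions are used, the same loop works for non-prime odd moduli, as the abstract notes; for example `modinv_binary_gcd(4, 15)` returns an inverse while `modinv_binary_gcd(3, 15)` raises.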

    Efficient and Complete Formulas for Binary Curves

    Binary elliptic curves are elliptic curves defined over finite fields of characteristic 2. On software platforms that offer carryless multiplication opcodes (e.g. pclmul on x86), they have very good performance. However, they suffer from some drawbacks, in particular that non-supersingular binary curves have an even order, and that most known formulas for point operations have exceptional cases that are detrimental to safe implementation. In this paper, we show how to make a prime order group abstraction out of standard binary curves. We describe a new canonical compression scheme that yields a canonical and compact encoding. We also describe complete formulas for operations on the group. The formulas have no exceptional case, and are furthermore faster than previously known complete and incomplete formulas (general point addition in cost 8M+2S+2mb on all curves, 7M+2S+2mb on half of the curves). We also show how the same formulas can be applied to computations on the entire original curve, if full backward compatibility with standard curves is needed. Finally, we implemented our method over the standard NIST curves B-233 and K-233. Our strictly constant-time code achieves generic point multiplication by a scalar on curve K-233 in as little as 29600 clock cycles on an Intel x86 CPU (Coffee Lake core)