28 research outputs found

    Efficient and Secure Algorithms for GLV-Based Scalar Multiplication and their Implementation on GLV-GLS Curves (Extended Version)

    Get PDF
    We propose efficient algorithms and formulas that improve the performance of side-channel protected elliptic curve computations with special focus on scalar multiplication exploiting the Gallant-Lambert-Vanstone (CRYPTO 2001) and Galbraith-Lin-Scott (EUROCRYPT 2009) methods. Firstly, by adapting Feng et al.\u27s recoding to the GLV setting, we derive new regular algorithms for variable-base scalar multiplication that offer protection against simple side-channel and timing attacks. Secondly, we propose an efficient, side-channel protected algorithm for fixed-base scalar multiplication which combines Feng et al.\u27s recoding with Lim-Lee\u27s comb method. Thirdly, we propose an efficient technique that interleaves ARM and NEON-based multiprecision operations over an extension field to improve performance of GLS curves on modern ARM processors. Finally, we showcase the efficiency of the proposed techniques by implementing a state-of-the-art GLV-GLS curve in twisted Edwards form defined over GF(p^2), which supports a four dimensional decomposition of the scalar and is fully protected against timing attacks. Analysis and performance results are reported for modern x64 and ARM processors. For instance, we compute a variable-base scalar multiplication in 89,000 and 244,000 cycles on an Intel Ivy Bridge and an ARM Cortex-A15 processor (respect.); using a precomputed table of 6KB, we compute a fixed-base scalar multiplication in 49,000 and 116,000 cycles (respect.); and using a precomputed table of 3KB, we compute a double scalar multiplication in 115,000 and 285,000 cycles (respect.). The proposed techniques represent an important improvement of the state-of-the-art performance of elliptic curve computations, and allow us to set new speed records in several modern processors. The techniques also reduce the cost of adding protection against timing attacks in the computation of GLV-based variable-base scalar multiplication to below 10%

    Families of fast elliptic curves from Q-curves

    Get PDF
    We construct new families of elliptic curves over \FF_{p^2} with efficiently computable endomorphisms, which can be used to accelerate elliptic curve-based cryptosystems in the same way as Gallant-Lambert-Vanstone (GLV) and Galbraith-Lin-Scott (GLS) endomorphisms. Our construction is based on reducing \QQ-curves-curves over quadratic number fields without complex multiplication, but with isogenies to their Galois conjugates-modulo inert primes. As a first application of the general theory we construct, for every p>3p > 3, two one-parameter families of elliptic curves over \FF_{p^2} equipped with endomorphisms that are faster than doubling. Like GLS (which appears as a degenerate case of our construction), we offer the advantage over GLV of selecting from a much wider range of curves, and thus finding secure group orders when pp is fixed. Unlike GLS, we also offer the possibility of constructing twist-secure curves. Among our examples are prime-order curves equipped with fast endomorphisms, with almost-prime-order twists, over \FF_{p^2} for p=21271p = 2^{127}-1 and p=225519p = 2^{255}-19

    The Q-curve construction for endomorphism-accelerated elliptic curves

    Get PDF
    We give a detailed account of the use of Q\mathbb{Q}-curve reductions to construct elliptic curves over F_p2\mathbb{F}\_{p^2} with efficiently computable endomorphisms, which can be used to accelerate elliptic curve-based cryptosystems in the same way as Gallant--Lambert--Vanstone (GLV) and Galbraith--Lin--Scott (GLS) endomorphisms. Like GLS (which is a degenerate case of our construction), we offer the advantage over GLV of selecting from a much wider range of curves, and thus finding secure group orders when pp is fixed for efficient implementation. Unlike GLS, we also offer the possibility of constructing twist-secure curves. We construct several one-parameter families of elliptic curves over F_p2\mathbb{F}\_{p^2} equipped with efficient endomorphisms for every p \textgreater{} 3, and exhibit examples of twist-secure curves over F_p2\mathbb{F}\_{p^2} for the efficient Mersenne prime p=21271p = 2^{127}-1.Comment: To appear in the Journal of Cryptology. arXiv admin note: text overlap with arXiv:1305.540

    2DT-GLS: Faster and exception-free scalar multiplication in the GLS254 binary curve

    Get PDF
    We revisit and improve performance of arithmetic in the binary GLS254 curve by introducing the 2DT-GLS scalar multiplication algorithm. The algorithm includes theoretical and practice-oriented contributions of potential independent interest: (i) for the first time, a proof that the GLS scalar multiplication algorithm does not incur exceptions, such that faster incomplete formulas can be used; (ii) faster dedicated atomic formulas that alleviate the cost of precomputation; (iii) a table compression technique that reduces the storage needed for precomputed points; (iv) a refined constant-time scalar decomposition algorithm that is more robust to rounding. We also present the first GLS254 implementation for Armv8. With our contributions, we set new speed records for constant-time scalar multiplication by 34.5%34.5\% and 6%6\% on 64-bit Arm and Intel platforms, respectively

    Four-Dimensional Gallant-Lambert-Vanstone Scalar Multiplication

    Get PDF
    The GLV method of Gallant, Lambert and Vanstone~(CRYPTO 2001) computes any multiple kPkP of a point PP of prime order nn lying on an elliptic curve with a low-degree endomorphism Φ\Phi (called GLV curve) over Fp\mathbb{F}_p as kP=k1P+k2Φ(P)kP = k_1P + k_2\Phi(P), with max{k1,k2}C1n\max\{|k_1|,|k_2|\}\leq C_1\sqrt n for some explicit constant C1>0C_1>0. Recently, Galbraith, Lin and Scott (EUROCRYPT 2009) extended this method to all curves over Fp2\mathbb{F}_{p^2} which are twists of curves defined over Fp\mathbb{F}_p. We show in this work how to merge the two approaches in order to get, for twists of any GLV curve over Fp2\mathbb{F}_{p^2}, a four-dimensional decomposition together with fast endomorphisms Φ,Ψ\Phi, \Psi over Fp2\mathbb{F}_{p^2} acting on the group generated by a point PP of prime order nn, resulting in a proven decomposition for any scalar k[1,n]k\in[1,n] given by kP=k1P+k2Φ(P)+k3Ψ(P)+k4ΨΦ(P)kP=k_1P+ k_2\Phi(P)+ k_3\Psi(P) + k_4\Psi\Phi(P), with maxi(ki)0\max_i (|k_i|)0. Remarkably, taking the best C1,C2C_1, C_2, we obtain C2/C1<412C_2/C_1<412, independently of the curve, ensuring in theory an almost constant relative speedup. In practice, our experiments reveal that the use of the merged GLV-GLS approach supports a scalar multiplication that runs up to 50\% faster than the original GLV method. We then improve this performance even further by exploiting the Twisted Edwards model and show that curves originally slower may become extremely efficient on this model. In addition, we analyze the performance of the method on a multicore setting and describe how to efficiently protect GLV-based scalar multiplication against several side-channel attacks. Our implementations improve the state-of-the-art performance of point multiplication for a variety of scenarios including side-channel protected and unprotected cases with sequential and multicore execution

    Two is the fastest prime: lambda coordinates for binary elliptic curves

    Get PDF
    In this work, we present new arithmetic formulas for a projective version of the affine point representation (x,x+y/x),(x,x+y/x), for x0,x\ne 0, which leads to an efficient computation of the scalar multiplication operation over binary elliptic curves.A software implementation of our formulas applied to a binary Galbraith-Lin-Scott elliptic curve defined over the field F2254\mathbb{F}_{2^{254}} allows us to achieve speed records for protected/unprotected single/multi-core random-point elliptic curve scalar multiplication at the 127-bit security level. When executed on a Sandy Bridge 3.4GHz Intel Xeon processor, our software is able to compute a single/multi-core unprotected scalar multiplication in 69,50069,500 and 47,90047,900 clock cycles, respectively; and a protected single-core scalar multiplication in 114,800114,800 cycles. These numbers are improved by around 2\% and 46\% on the newer Ivy Bridge and Haswell platforms, respectively, achieving in the latter a protected random-point scalar multiplication in 60,000 clock cycles

    FourQNEON: Faster Elliptic Curve Scalar Multiplications on ARM Processors

    Get PDF
    We present a high-speed, high-security implementation of the recently proposed elliptic curve FourQ (ASIACRYPT 2015) for 32-bit ARM processors with NEON support. Exploiting the versatile and compact arithmetic of this curve, we design a vectorized implementation that achieves high-performance across a large variety of ARM platforms. Our software is fully protected against timing and cache attacks, and showcases the impressive speed of FourQ when compared with other curve-based alternatives. For example, one single variable-base scalar multiplication is computed in about 235,000 Cortex-A8 cycles or 132,000 Cortex-A15 cycles which, compared to the results of the fastest genus 2 Kummer and Curve25519 implementations on the same platforms, offer speedups between 1.3x-1.7x and between 2.1x-2.4x, respectively. In comparison with the NIST standard curve K-283, we achieve speedups above 4x and 5.5x

    FourQ: four-dimensional decompositions on a Q-curve over the Mersenne prime

    Get PDF
    We introduce FourQ, a high-security, high-performance elliptic curve that targets the 128-bit security level. At the highest arithmetic level, cryptographic scalar multiplications on FourQ can use a four-dimensional Gallant-Lambert-Vanstone decomposition to minimize the total number of elliptic curve group operations. At the group arithmetic level, FourQ admits the use of extended twisted Edwards coordinates and can therefore exploit the fastest known elliptic curve addition formulas over large prime characteristic fields. Finally, at the finite field level, arithmetic is performed modulo the extremely fast Mersenne prime p=21271p=2^{127}-1. We show that this powerful combination facilitates scalar multiplications that are significantly faster than all prior works. On Intel\u27s Broadwell, Haswell, Ivy Bridge and Sandy Bridge architectures, our software computes a variable-base scalar multiplication in 50,000, 56,000, 69,000 cycles and 72,000 cycles, respectively; and, on the same platforms, our software computes a Diffie-Hellman shared secret in 80,000, 88,000, 104,000 cycles and 112,000 cycles, respectively. These results show that, in practice, FourQ is around four to five times faster than the original NIST P-256 curve and between two and three times faster than curves that are currently under consideration as NIST alternatives, such as Curve25519

    High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition

    Get PDF
    This paper explores the potential for using genus~2 curves over quadratic extension fields in cryptography, motivated by the fact that they allow for an 8-dimensional scalar decomposition when using a combination of the GLV/GLS algorithms. Besides lowering the number of doublings required in a scalar multiplication, this approach has the advantage of performing arithmetic operations in a 64-bit ground field, making it an attractive candidate for embedded devices. We found cryptographically secure genus 2 curves which, although susceptible to index calculus attacks, aim for the standardized 112-bit security level. Our implementation results on both high-end architectures (Ivy Bridge) and low-end ARM platforms (Cortex-A8) highlight the practical benefits of this approach

    Don’t Forget Pairing-Friendly Curves with Odd Prime Embedding Degrees

    Get PDF
    Pairing-friendly curves with odd prime embedding degrees at the 128-bit security level, such as BW13-310 and BW19-286, sparked interest in the field of public-key cryptography as small sizes of the prime fields. However, compared to mainstream pairing-friendly curves at the same security level, i.e., BN446 and BLS12-446, the performance of pairing computations on BW13-310 and BW19-286 is usually considered ineffcient. In this paper we investigate high performance software implementations of pairing computation on BW13-310 and corresponding building blocks used in pairing-based protocols, including hashing, group exponentiations and membership testings. Firstly, we propose effcient explicit formulas for pairing computation on this curve. Moreover, we also exploit the state-of-art techniques to implement hashing in G1 and G2, group exponentiations and membership testings. In particular, for exponentiations in G2 and GT , we present new optimizations to speed up computational effciency. Our implementation results on a 64-bit processor show that the gap in the performance of pairing computation between BW13-310 and BN446 (resp. BLS12-446) is only up to 4.9% (resp. 26%). More importantly, compared to BN446 and BLS12-446, BW13- 310 is about 109.1% − 227.3%, 100% − 192.6%, 24.5% − 108.5% and 68.2% − 145.5% faster in terms of hashing to G1, exponentiations in G1 and GT , and membership testing for GT , respectively. These results reveal that BW13-310 would be an interesting candidate in pairing-based cryptographic protocols
    corecore