23 research outputs found

    Efficient Arithmetic for the Implementation of Elliptic Curve Cryptography

    Get PDF
    The technology of elliptic curve cryptography is now an important branch in public-key based crypto-system. Cryptographic mechanisms based on elliptic curves depend on the arithmetic of points on the curve. The most important arithmetic is multiplying a point on the curve by an integer. This operation is known as elliptic curve scalar (or point) multiplication operation. A cryptographic device is supposed to perform this operation efficiently and securely. The elliptic curve scalar multiplication operation is performed by combining the elliptic curve point routines that are defined in terms of the underlying finite field arithmetic operations. This thesis focuses on hardware architecture designs of elliptic curve operations. In the first part, we aim at finding new architectures to implement the finite field arithmetic multiplication operation more efficiently. In this regard, we propose novel schemes for the serial-out bit-level (SOBL) arithmetic multiplication operation in the polynomial basis over F_2^m. We show that the smallest SOBL scheme presented here can provide about 26-30\% reduction in area-complexity cost and about 22-24\% reduction in power consumptions for F_2^{163} compared to the current state-of-the-art bit-level multiplier schemes. Then, we employ the proposed SOBL schemes to present new hybrid-double multiplication architectures that perform two multiplications with latency comparable to the latency of a single multiplication. Then, in the second part of this thesis, we investigate the different algorithms for the implementation of elliptic curve scalar multiplication operation. We focus our interest in three aspects, namely, the finite field arithmetic cost, the critical path delay, and the protection strength from side-channel attacks (SCAs) based on simple power analysis. In this regard, we propose a novel scheme for the scalar multiplication operation that is based on processing three bits of the scalar in the exact same sequence of five point arithmetic operations. We analyse the security of our scheme and show that its security holds against both SCAs and safe-error fault attacks. In addition, we show how the properties of the proposed elliptic curve scalar multiplication scheme yields an efficient hardware design for the implementation of a single scalar multiplication on a prime extended twisted Edwards curve incorporating 8 parallel multiplication operations. Our comparison results show that the proposed hardware architecture for the twisted Edwards curve model implemented using the proposed scalar multiplication scheme is the fastest secure SCA protected scalar multiplication scheme over prime field reported in the literature

    N-term Karatsuba Algorithm and its Application to Multiplier designs for Special Trinomials

    Get PDF
    In this paper, we propose a new type of non-recursive Mastrovito multiplier for GF(2m)GF(2^m) using a nn-term Karatsuba algorithm (KA), where GF(2m)GF(2^m) is defined by an irreducible trinomial, xm+xk+1,m=nkx^m+x^k+1, m=nk. We show that such a type of trinomial combined with the nn-term KA can fully exploit the spatial correlation of entries in related Mastrovito product matrices and lead to a low complexity architecture. The optimal parameter nn is further studied. As the main contribution of this study, the lower bound of the space complexity of our proposal is about O(m22+m3/2)O(\frac{m^2}{2}+m^{3/2}). Meanwhile, the time complexity matches the best Karatsuba multiplier known to date. To the best of our knowledge, it is the first time that Karatsuba-based multiplier has reached such a space complexity bound while maintaining relatively low time delay

    Mastrovito Form of Non-recursive Karatsuba Multiplier for All Trinomials

    Get PDF
    We present a new type of bit-parallel non-recursive Karatsuba multiplier over GF(2m)GF(2^m) generated by an arbitrary irreducible trinomial. This design effectively exploits Mastrovito approach and shifted polynomial basis (SPB) to reduce the time complexity and Karatsuba algorithm to reduce its space complexity. We show that this type of multiplier is only one TXT_X slower than the fastest bit-parallel multiplier for all trinomials, where TXT_X is the delay of one 2-input XOR gate. Meanwhile, its space complexity is roughly 3/4 of those multipliers. To the best of our knowledge, it is the first time that our scheme has reached such a time delay bound. This result outperforms previously proposed non-recursive Karatsuba multipliers

    On Space-Time Trade-Off for Montgomery Multipliers over Finite Fields

    Get PDF
    La multiplication dans le corps de Galois à 2^m éléments (i.e. GF(2^m)) est une opérations très importante pour les applications de la théorie des correcteurs et de la cryptographie. Dans ce mémoire, nous nous intéressons aux réalisations parallèles de multiplicateurs dans GF(2^m) lorsque ce dernier est généré par des trinômes irréductibles. Notre point de départ est le multiplicateur de Montgomery qui calcule A(x)B(x)x^(-u) efficacement, étant donné A(x), B(x) in GF(2^m) pour u choisi judicieusement. Nous étudions ensuite l'algorithme diviser pour régner PCHS qui permet de partitionner les multiplicandes d'un produit dans GF(2^m) lorsque m est impair. Nous l'appliquons pour la partitionnement de A(x) et de B(x) dans la multiplication de Montgomery A(x)B(x)x^(-u) pour GF(2^m) même si m est pair. Basé sur cette nouvelle approche, nous construisons un multiplicateur dans GF(2^m) généré par des trinôme irréductibles. Une nouvelle astuce de réutilisation des résultats intermédiaires nous permet d'éliminer plusieurs portes XOR redondantes. Les complexités de temps (i.e. le délais) et d'espace (i.e. le nombre de portes logiques) du nouveau multiplicateur sont ensuite analysées: 1. Le nouveau multiplicateur demande environ 25% moins de portes logiques que les multiplicateurs de Montgomery et de Mastrovito lorsque GF(2^m) est généré par des trinômes irréductible et m est suffisamment grand. Le nombre de portes du nouveau multiplicateur est presque identique à celui du multiplicateur de Karatsuba proposé par Elia. 2. Le délai de calcul du nouveau multiplicateur excède celui des meilleurs multiplicateurs d'au plus deux évaluations de portes XOR. 3. Nous determinons le délai et le nombre de portes logiques du nouveau multiplicateur sur les deux corps de Galois recommandés par le National Institute of Standards and Technology (NIST). Nous montrons que notre multiplicateurs contient 15% moins de portes logiques que les multiplicateurs de Montgomery et de Mastrovito au coût d'un délai d'au plus une porte XOR supplémentaire. De plus, notre multiplicateur a un délai d'une porte XOR moindre que celui du multiplicateur d'Elia au coût d'une augmentation de moins de 1% du nombre total de portes logiques.The multiplication in a Galois field with 2^m elements (i.e. GF(2^m)) is an important arithmetic operation in coding theory and cryptography. In this thesis, we focus on the bit- parallel multipliers over the Galois fields generated by trinomials. We start by introducing the GF(2^m) Montgomery multiplication, which calculates A(x)B(x)x^{-u} in GF(2^m) with two polynomials A(x), B(x) in GF(2^m) and a properly chosen u. Then, we investigate the rule for multiplicand partition used by a divide-and-conquer algorithm PCHS originally proposed for the multiplication over GF(2^m) with odd m. By adopting similar rules for splitting A(x) and B(x) in A(x)B(x)x^{-u}, we develop new Montgomery multiplication formulae for GF(2^m) with m either odd or even. Based on this new approach, we develop the corresponding bit-parallel Montgomery multipliers for the Galois fields generated by trinomials. A new bit-reusing trick is applied to eliminate redundant XOR gates from the new multiplier. The time complexity (i.e. the delay) and the space complexity (i.e. the logic gate number) of the new multiplier are explicitly analysed: 1. This new multiplier is about 25% more efficient in the number of logic gates than the previous trinomial-based Montgomery multipliers or trinomial-based Mastrovito multipliers on GF(2^m) with m big enough. It has a number of logic gates very close to that of the Karatsuba multiplier proposed by Elia. 2. While having a significantly smaller number of logic gates, this new multiplier is at most two T_X larger in the total delay than the fastest bit-parallel multiplier on GF(2^m), where T_X is the XOR gate delay. 3. We determine the space and time complexities of our multiplier on the two fields recommended by the National Institute of Standards and Technology (NIST). Having at most one more T_X in the total delay, our multiplier has a more-than-15% reduced logic gate number compared with the other Montgomery or Mastrovito multipliers. Moreover, our multiplier is one T_X smaller in delay than the Elia's multiplier at the cost of a less-than-1% increase in the logic gate number

    A Chinese Remainder Theorem Approach to Bit-Parallel GF(2^n) Polynomial Basis Multipliers for Irreducible Trinomials

    Get PDF
    We show that the step “modulo the degree-n field generating irreducible polynomial” in the classical definition of the GF(2^n) multiplication operation can be avoided. This leads to an alternative representation of the finite field multiplication operation. Combining this representation and the Chinese Remainder Theorem, we design bit-parallel GF(2^n) multipliers for irreducible trinomials u^n + u^k + 1 on GF(2) where 1 < k &#8804; n=2. For some values of n, our architectures have the same time complexity as the fastest bit-parallel multipliers – the quadratic multipliers, but their space complexities are reduced. Take the special irreducible trinomial u^(2k) + u^k + 1 for example, the space complexity of the proposed design is reduced by about 1=8, while the time complexity matches the best result. Our experimental results show that among the 539 values of n such that 4 < n < 1000 and x^n+x^k+1 is irreducible over GF(2) for some k in the range 1 < k &#8804; n=2, the proposed multipliers beat the current fastest parallel multipliers for 290 values of n when (n &#8722; 1)=3 &#8804; k &#8804; n=2: they have the same time complexity, but the space complexities are reduced by 8.4% on average

    Efficient Bit-parallel Multiplication with Subquadratic Space Complexity in Binary Extension Field

    Get PDF
    Bit-parallel multiplication in GF(2^n) with subquadratic space complexity has been explored in recent years due to its lower area cost compared with traditional parallel multiplications. Based on \u27divide and conquer\u27 technique, several algorithms have been proposed to build subquadratic space complexity multipliers. Among them, Karatsuba algorithm and its generalizations are most often used to construct multiplication architectures with significantly improved efficiency. However, recursively using one type of Karatsuba formula may not result in an optimal structure for many finite fields. It has been shown that improvements on multiplier complexity can be achieved by using a combination of several methods. After completion of a detailed study of existing subquadratic multipliers, this thesis has proposed a new algorithm to find the best combination of selected methods through comprehensive search for constructing polynomial multiplication over GF(2^n). Using this algorithm, ameliorated architectures with shortened critical path or reduced gates cost will be obtained for the given value of n, where n is in the range of [126, 600] reflecting the key size for current cryptographic applications. With different input constraints the proposed algorithm can also yield subquadratic space multiplier architectures optimized for trade-offs between space and time. Optimized multiplication architectures over NIST recommended fields generated from the proposed algorithm are presented and analyzed in detail. Compared with existing works with subquadratic space complexity, the proposed architectures are highly modular and have improved efficiency on space or time complexity. Finally generalization of the proposed algorithm to be suitable for much larger size of fields discussed
    corecore