6 research outputs found

    Novel Single and Hybrid Finite Field Multipliers over GF(2m) for Emerging Cryptographic Systems

    Get PDF
    With the rapid development of economic and technical progress, designers and users of various kinds of ICs and emerging embedded systems like body-embedded chips and wearable devices are increasingly facing security issues. All of these demands from customers push the cryptographic systems to be faster, more efficient, more reliable and safer. On the other hand, multiplier over GF(2m) as the most important part of these emerging cryptographic systems, is expected to be high-throughput, low-complexity, and low-latency. Fortunately, very large scale integration (VLSI) digital signal processing techniques offer great facilities to design efficient multipliers over GF(2m). This dissertation focuses on designing novel VLSI implementation of high-throughput low-latency and low-complexity single and hybrid finite field multipliers over GF(2m) for emerging cryptographic systems. Low-latency (latency can be chosen without any restriction) high-speed pentanomial basis multipliers are presented. For the first time, the dissertation also develops three high-throughput digit-serial multipliers based on pentanomials. Then a novel realization of digit-level implementation of multipliers based on redundant basis is introduced. Finally, single and hybrid reordered normal basis bit-level and digit-level high-throughput multipliers are presented. To the authors knowledge, this is the first time ever reported on multipliers with multiple throughput rate choices. All the proposed designs are simple and modular, therefore suitable for VLSI implementation for various emerging cryptographic systems

    Bit Serial Systolic Architectures for Multiplicative Inversion and Division over GF(2<sup>m</sup>)

    Get PDF
    Systolic architectures are capable of achieving high throughput by maximizing pipelining and by eliminating global data interconnects. Recursive algorithms with regular data flows are suitable for systolization. The computation of multiplicative inversion using algorithms based on EEA (Extended Euclidean Algorithm) are particularly suitable for systolization. Implementations based on EEA present a high degree of parallelism and pipelinability at bit level which can be easily optimized to achieve local data flow and to eliminate the global interconnects which represent most important bottleneck in todays sub-micron design process. The net result is to have high clock rate and performance based on efficient systolic architectures. This thesis examines high performance but also scalable implementations of multiplicative inversion or field division over Galois fields GF(2m) in the specific case of cryptographic applications where field dimension m may be very large (greater than 400) and either m or defining irreducible polynomial may vary. For this purpose, many inversion schemes with different basis representation are studied and most importantly variants of EEA and binary (Stein's) GCD computation implementations are reviewed. A set of common as well as contrasting characteristics of these variants are discussed. As a result a generalized and optimized variant of EEA is proposed which can compute division, and multiplicative inversion as its subset, with divisor in either polynomial or triangular basis representation. Further results regarding Hankel matrix formation for double-basis inversion is provided. The validity of using the same architecture to compute field division with polynomial or triangular basis representation is proved. Next, a scalable unidirectional bit serial systolic array implementation of this proposed variant of EEA is implemented. Its complexity measures are defined and these are compared against the best known architectures. It is shown that assuming the requirements specified above, this proposed architecture may achieve a higher clock rate performance w. r. t. other designs while being more flexible, reliable and with minimum number of inter-cell interconnects. The main contribution at system level architecture is the substitution of all counter or adder/subtractor elements with a simpler distributed and free of carry propagation delays structure. Further a novel restoring mechanism for result sequences of EEA is proposed using a double delay element implementation. Finally, using this systolic architecture a CMD (Combined Multiplier Divider) datapath is designed which is used as the core of a novel systolic elliptic curve processor. This EC processor uses affine coordinates to compute scalar point multiplication which results in having a very small control unit and negligible with respect to the datapath for all practical values of m. The throughput of this EC based on this bit serial systolic architecture is comparable with designs many times larger than itself reported previously

    Efficient Arithmetic for the Implementation of Elliptic Curve Cryptography

    Get PDF
    The technology of elliptic curve cryptography is now an important branch in public-key based crypto-system. Cryptographic mechanisms based on elliptic curves depend on the arithmetic of points on the curve. The most important arithmetic is multiplying a point on the curve by an integer. This operation is known as elliptic curve scalar (or point) multiplication operation. A cryptographic device is supposed to perform this operation efficiently and securely. The elliptic curve scalar multiplication operation is performed by combining the elliptic curve point routines that are defined in terms of the underlying finite field arithmetic operations. This thesis focuses on hardware architecture designs of elliptic curve operations. In the first part, we aim at finding new architectures to implement the finite field arithmetic multiplication operation more efficiently. In this regard, we propose novel schemes for the serial-out bit-level (SOBL) arithmetic multiplication operation in the polynomial basis over F_2^m. We show that the smallest SOBL scheme presented here can provide about 26-30\% reduction in area-complexity cost and about 22-24\% reduction in power consumptions for F_2^{163} compared to the current state-of-the-art bit-level multiplier schemes. Then, we employ the proposed SOBL schemes to present new hybrid-double multiplication architectures that perform two multiplications with latency comparable to the latency of a single multiplication. Then, in the second part of this thesis, we investigate the different algorithms for the implementation of elliptic curve scalar multiplication operation. We focus our interest in three aspects, namely, the finite field arithmetic cost, the critical path delay, and the protection strength from side-channel attacks (SCAs) based on simple power analysis. In this regard, we propose a novel scheme for the scalar multiplication operation that is based on processing three bits of the scalar in the exact same sequence of five point arithmetic operations. We analyse the security of our scheme and show that its security holds against both SCAs and safe-error fault attacks. In addition, we show how the properties of the proposed elliptic curve scalar multiplication scheme yields an efficient hardware design for the implementation of a single scalar multiplication on a prime extended twisted Edwards curve incorporating 8 parallel multiplication operations. Our comparison results show that the proposed hardware architecture for the twisted Edwards curve model implemented using the proposed scalar multiplication scheme is the fastest secure SCA protected scalar multiplication scheme over prime field reported in the literature

    Studies on high-speed hardware implementation of cryptographic algorithms

    Get PDF
    Cryptographic algorithms are ubiquitous in modern communication systems where they have a central role in ensuring information security. This thesis studies efficient implementation of certain widely-used cryptographic algorithms. Cryptographic algorithms are computationally demanding and software-based implementations are often too slow or power consuming which yields a need for hardware implementation. Field Programmable Gate Arrays (FPGAs) are programmable logic devices which have proven to be highly feasible implementation platforms for cryptographic algorithms because they provide both speed and programmability. Hence, the use of FPGAs for cryptography has been intensively studied in the research community and FPGAs are also the primary implementation platforms in this thesis. This thesis presents techniques allowing faster implementations than existing ones. Such techniques are necessary in order to use high-security cryptographic algorithms in applications requiring high data rates, for example, in heavily loaded network servers. The focus is on Advanced Encryption Standard (AES), the most commonly used secret-key cryptographic algorithm, and Elliptic Curve Cryptography (ECC), public-key cryptographic algorithms which have gained popularity in the recent years and are replacing traditional public-key cryptosystems, such as RSA. Because these algorithms are well-defined and widely-used, the results of this thesis can be directly applied in practice. The contributions of this thesis include improvements to both algorithms and techniques for implementing them. Algorithms are modified in order to make them more suitable for hardware implementation, especially, focusing on increasing parallelism. Several FPGA implementations exploiting these modifications are presented in the thesis including some of the fastest implementations available in the literature. The most important contributions of this thesis relate to ECC and, specifically, to a family of elliptic curves providing faster computations called Koblitz curves. The results of this thesis can, in their part, enable increasing use of cryptographic algorithms in various practical applications where high computation speed is an issue

    Low complexity implementation of unified systolic multipliers for NIST pentanomials and trinomials over GF(2\u3csup\u3em\u3c/sup\u3e)

    No full text
    Systolic finite field multiplier over GF(2m) based on the National Institute of Standards and Technology (NIST) recommended pentanomials or trinomials can be used as a critical component in many cryptosystems. In this paper, for the first time, we propose a novel low-complexity unified (hybrid field size) systolic multiplier for NIST pentanomials and trinomials over GF(2m). We have proposed a computation-core-based design strategy to obtain the desired low-complexity unified multiplier for both NIST pentanomials and trinomials. The proposed multiplier can swift between pentanomial-based and trinomial-based multipliers through a control signal. First of all, a novel strategy is briefly introduced to implement a certain matrix-vector multiplication, which can be packed as a standard computation core (or computation core like). Then, based on the computation-core concept, a novel unified multiplication algorithm is derived that it can realize both the pentanomial-based and trinomial-based multiplications. After that, an efficient systolic structure is presented that it can fully employ the introduced computation core. A detailed example of the proposed unified multiplier (for GF(2163) and GF(2233) is also presented. Both the theoretical and field-programmable gate array implementation results show that the proposed design has efficient performance in area-time-power complexities, e.g., the proposed design (the one performs GF(2163) and GF(2233) multiplications) is found to have at least 14.2% and 13.3% less area-delay product and power-delay product than the combination of the existing individual GF(2163) and GF(2233) multipliers (best among all competing designs), respectively. Because of its structural regularity and functional flexibility, the proposed unified multiplier can be used as an intellectual property core for various cryptosystems

    Low complexity implementation of unified systolic multipliers for NIST pentanomials and trinomials over GF(2\u3csup\u3em\u3c/sup\u3e)

    No full text
    Systolic finite field multiplier over GF(2m) based on the National Institute of Standards and Technology (NIST) recommended pentanomials or trinomials can be used as a critical component in many cryptosystems. In this paper, for the first time, we propose a novel low-complexity unified (hybrid field size) systolic multiplier for NIST pentanomials and trinomials over GF(2m). We have proposed a computation-core-based design strategy to obtain the desired low-complexity unified multiplier for both NIST pentanomials and trinomials. The proposed multiplier can swift between pentanomial-based and trinomial-based multipliers through a control signal. First of all, a novel strategy is briefly introduced to implement a certain matrix-vector multiplication, which can be packed as a standard computation core (or computation core like). Then, based on the computation-core concept, a novel unified multiplication algorithm is derived that it can realize both the pentanomial-based and trinomial-based multiplications. After that, an efficient systolic structure is presented that it can fully employ the introduced computation core. A detailed example of the proposed unified multiplier (for GF(2163) and GF(2233) is also presented. Both the theoretical and field-programmable gate array implementation results show that the proposed design has efficient performance in area-time-power complexities, e.g., the proposed design (the one performs GF(2163) and GF(2233) multiplications) is found to have at least 14.2% and 13.3% less area-delay product and power-delay product than the combination of the existing individual GF(2163) and GF(2233) multipliers (best among all competing designs), respectively. Because of its structural regularity and functional flexibility, the proposed unified multiplier can be used as an intellectual property core for various cryptosystems
    corecore