69 research outputs found

    Area- Efficient VLSI Implementation of Serial-In Parallel-Out Multiplier Using Polynomial Representation in Finite Field GF(2m)

    Full text link
    Finite field multiplier is mainly used in elliptic curve cryptography, error-correcting codes and signal processing. Finite field multiplier is regarded as the bottleneck arithmetic unit for such applications and it is the most complicated operation over finite field GF(2m) which requires a huge amount of logic resources. In this paper, a new modified serial-in parallel-out multiplication algorithm with interleaved modular reduction is suggested. The proposed method offers efficient area architecture as compared to proposed algorithms in the literature. The reduced finite field multiplier complexity is achieved by means of utilizing logic NAND gate in a particular architecture. The efficiency of the proposed architecture is evaluated based on criteria such as time (latency, critical path) and space (gate-latch number) complexity. A detailed comparative analysis indicates that, the proposed finite field multiplier based on logic NAND gate outperforms previously known resultsComment: 19 pages, 4 figure

    Novel digit-serial systolic array implementation of Euclid's algorithm for division in GF(2m)

    Get PDF
    [[abstract]]In this paper, a novel digit-serial systolic array for computing divisions in GF(2m) over the standard basis is presented. To the authors' knowledge, this is the very first digit-serial systolic divider for GF(2m). The proposed architecture possesses the features of regularity, modularity, and unidirectional data flow. Thus, it is well suited to be implemented using VLSI techniques with fault-tolerant design. One important feature of the proposed architecture is that different throughput performances can be easily achieved by varying the digit size. By choosing the digit size appropriately, the proposed digit-serial architecture can meet the throughput requirement of a certain application with minimum hardware.[[fileno]]2030102030012[[department]]電機工程學

    Throughput/Area-efficient ECC Processor Using Montgomery Point Multiplication on FPGA

    Get PDF
    High throughput while maintaining low resource is a key issue for elliptic curve cryptography (ECC) hardware implementations in many applications. In this brief, an ECC processor architecture over Galois fields is presented, which achieves the best reported throughput/area performance on field-programmable gate array (FPGA) to date. A novel segmented pipelining digit serial multiplier is developed to speed up ECC point multiplication. To achieve low latency, a new combined algorithm is developed for point addition and point doubling with careful scheduling. A compact and flexible distributed-RAM-based memory unit design is developed to increase speed while keeping area low. Further optimizations were made via timing constraints and logic level modifications at the implementation level. The proposed architecture is implemented on Virtex4 (V4), Virtex5 (V5), and Virtex7 (V7) FPGA technologies and, respectively, achieved throughout/slice figures of 19.65, 65.30, and 64.48 (106/(Seconds × Slices))

    Novel Single and Hybrid Finite Field Multipliers over GF(2m) for Emerging Cryptographic Systems

    Get PDF
    With the rapid development of economic and technical progress, designers and users of various kinds of ICs and emerging embedded systems like body-embedded chips and wearable devices are increasingly facing security issues. All of these demands from customers push the cryptographic systems to be faster, more efficient, more reliable and safer. On the other hand, multiplier over GF(2m) as the most important part of these emerging cryptographic systems, is expected to be high-throughput, low-complexity, and low-latency. Fortunately, very large scale integration (VLSI) digital signal processing techniques offer great facilities to design efficient multipliers over GF(2m). This dissertation focuses on designing novel VLSI implementation of high-throughput low-latency and low-complexity single and hybrid finite field multipliers over GF(2m) for emerging cryptographic systems. Low-latency (latency can be chosen without any restriction) high-speed pentanomial basis multipliers are presented. For the first time, the dissertation also develops three high-throughput digit-serial multipliers based on pentanomials. Then a novel realization of digit-level implementation of multipliers based on redundant basis is introduced. Finally, single and hybrid reordered normal basis bit-level and digit-level high-throughput multipliers are presented. To the authors knowledge, this is the first time ever reported on multipliers with multiple throughput rate choices. All the proposed designs are simple and modular, therefore suitable for VLSI implementation for various emerging cryptographic systems

    Efficient finite field computations for elliptic curve cryptography

    Get PDF
    Finite field multiplication and inversion are two basic operations involved in Elliptic Cure Cryptosystem (ECC), high performance of field operations can be applied to provide efficient computation of ECC. In this thesis, two classes of fields are proposed for multipliers with much reduced time delay. A most-significant-digit first and a least-significant-digit first digit-serial Montgomery multiplications are also proposed, using novel fixed elements R(x) which are different from x m and x m-1 . Architectures of the proposed Montgomery multipliers are studied and obtained for the fields generated by the irreducible pentanomials, which are selected based on the proposed special finite fields. Complexities of the Montgomery multipliers in term of critical path delay and gate count of the architectures are investigated; the critical path delay of the proposed multipliers are found to be as good as or better than the existing works for the same class of fields. Then, implementation of the proposed multipliers (m=233) using Field Programmable Gate Array (FPGA) is provided. In addition, an FPGA implementation of an efficient normal basis inversion algorithm is also presented (m=163). The normal basis multiplication unit is implemented using a digit-level structure, and a C-code is written to generate the first coordinate of the product of two normal basis elements for all field size m

    Bit Serial Systolic Architectures for Multiplicative Inversion and Division over GF(2<sup>m</sup>)

    Get PDF
    Systolic architectures are capable of achieving high throughput by maximizing pipelining and by eliminating global data interconnects. Recursive algorithms with regular data flows are suitable for systolization. The computation of multiplicative inversion using algorithms based on EEA (Extended Euclidean Algorithm) are particularly suitable for systolization. Implementations based on EEA present a high degree of parallelism and pipelinability at bit level which can be easily optimized to achieve local data flow and to eliminate the global interconnects which represent most important bottleneck in todays sub-micron design process. The net result is to have high clock rate and performance based on efficient systolic architectures. This thesis examines high performance but also scalable implementations of multiplicative inversion or field division over Galois fields GF(2m) in the specific case of cryptographic applications where field dimension m may be very large (greater than 400) and either m or defining irreducible polynomial may vary. For this purpose, many inversion schemes with different basis representation are studied and most importantly variants of EEA and binary (Stein's) GCD computation implementations are reviewed. A set of common as well as contrasting characteristics of these variants are discussed. As a result a generalized and optimized variant of EEA is proposed which can compute division, and multiplicative inversion as its subset, with divisor in either polynomial or triangular basis representation. Further results regarding Hankel matrix formation for double-basis inversion is provided. The validity of using the same architecture to compute field division with polynomial or triangular basis representation is proved. Next, a scalable unidirectional bit serial systolic array implementation of this proposed variant of EEA is implemented. Its complexity measures are defined and these are compared against the best known architectures. It is shown that assuming the requirements specified above, this proposed architecture may achieve a higher clock rate performance w. r. t. other designs while being more flexible, reliable and with minimum number of inter-cell interconnects. The main contribution at system level architecture is the substitution of all counter or adder/subtractor elements with a simpler distributed and free of carry propagation delays structure. Further a novel restoring mechanism for result sequences of EEA is proposed using a double delay element implementation. Finally, using this systolic architecture a CMD (Combined Multiplier Divider) datapath is designed which is used as the core of a novel systolic elliptic curve processor. This EC processor uses affine coordinates to compute scalar point multiplication which results in having a very small control unit and negligible with respect to the datapath for all practical values of m. The throughput of this EC based on this bit serial systolic architecture is comparable with designs many times larger than itself reported previously

    A systolic array implementation of a Reed-Solomon encoder and decoder.

    Get PDF
    A systolic array is a natural architecture for the implementation of a Reed- Solomon (RS) encoder and decoder. It possesses many of the properties desired for a special-purpose application: simple and regular design, concurrency, modular expansibility, fast response time, cost- effectiveness, and high reliability. As a result, it is very well suited for the simple and regular design essential for VLSI implementation . This thesis takes a modular approach to the design of a systolic array based RS encoder and decoder. Initially, the concept of systolic arrays is discussed followed by an introduction to finite field theory and Reed- Solomon codes. Then it is shown how RS codes can be encoded and decoded with primitive shift registers and implemented using a systolic architecture. In this way, the reader can gain valuable insight and comprehension into how these entities are coalesced together to produce the overall implementation.http://archive.org/details/systolicarrayimp00mckeLieutenant, United States NavyApproved for public release; distribution is unlimited

    Serial-serial finite field multiplication

    Get PDF
    corecore