661 research outputs found
Area- Efficient VLSI Implementation of Serial-In Parallel-Out Multiplier Using Polynomial Representation in Finite Field GF(2m)
Finite field multiplier is mainly used in elliptic curve cryptography,
error-correcting codes and signal processing. Finite field multiplier is
regarded as the bottleneck arithmetic unit for such applications and it is the
most complicated operation over finite field GF(2m) which requires a huge
amount of logic resources. In this paper, a new modified serial-in parallel-out
multiplication algorithm with interleaved modular reduction is suggested. The
proposed method offers efficient area architecture as compared to proposed
algorithms in the literature. The reduced finite field multiplier complexity is
achieved by means of utilizing logic NAND gate in a particular architecture.
The efficiency of the proposed architecture is evaluated based on criteria such
as time (latency, critical path) and space (gate-latch number) complexity. A
detailed comparative analysis indicates that, the proposed finite field
multiplier based on logic NAND gate outperforms previously known resultsComment: 19 pages, 4 figure
High speed world level finite field multipliers in F2m
Finite fields have important applications in number theory, algebraic geometry, Galois theory, cryptography, and coding theory. Recently, the use of finite field arithmetic in the area of cryptography has increasingly gained importance. Elliptic curve and El-Gamal cryptosystems are two important examples of public key cryptosystems widely used today based on finite field arithmetic. Research in this area is moving toward finding new architectures to implement the arithmetic operations more efficiently.
Two types of finite fields are commonly used in practice, prime field GF(p) and the binary extension field GF(2 m). The binary extension fields are attractive for high speed cryptography applications since they are suitable for hardware implementations. Hardware implementation of finite field multipliers can usually be categorized into three categories: bit-serial, bit-parallel, and word-level architectures. The word-level multipliers provide architectural flexibility and trade-off between the performance and limitations of VLSI implementation and I/O ports, thus it is of more practical significance.
In this work, different word level architectures for multiplication using binary field are proposed. It has been shown that the proposed architectures are more efficient compared to similar proposals considering area/delay complexities as a measure of performance. Practical size multipliers for cryptography applications have been realized in hardware using FPGA or standard CMOS technology, to similar proposals considering area/delay complexities as a measure of performance. Practical size multipliers for cryptography applications have been realized in hardware using FPGA or standard CMOS technology. Also different VLSI implementations for multipliers were explored which resulted in more efficient implementations for some of the regular architectures. The new implementations use a simple module designed in domino logic as the main building block for the multiplier. Significant speed improvements was achieved designing practical size multipliers using the proposed methodology
An Efficient hardware implementation of the tate pairing in characteristic three
DL systems with bilinear structure recently became an important base for cryptographic protocols such as identity-based encryption (IBE). Since the main
computational task is the evaluation of the bilinear pairings over elliptic curves, known to be prohibitively expensive, efficient implementations are required to render them applicable in real life scenarios. We present an efficient accelerator for computing the Tate Pairing in characteristic 3, using the Modified Duursma-Lee algorithm. Our accelerator shows that it is possible to improve the area-time product by 12 times on FPGA, compared to estimated values from one of the best known hardware architecture [6] implemented on the same type of FPGA. Also the computation time is improved upto 16 times compared to software applications reported in [17]. In addition, we present the result of an ASIC implementation of the algorithm, which is the first hitherto
Efficient Bit-parallel Multiplication with Subquadratic Space Complexity in Binary Extension Field
Bit-parallel multiplication in GF(2^n) with subquadratic space complexity has been explored in recent years due to its lower area cost compared with traditional parallel multiplications. Based on \u27divide and conquer\u27 technique, several algorithms have been proposed to build subquadratic space complexity multipliers. Among them, Karatsuba algorithm and its generalizations are most often used to construct multiplication architectures with significantly improved efficiency. However, recursively using one type of Karatsuba formula may not result in an optimal structure for many finite fields. It has been shown that improvements on multiplier complexity can be achieved by using a combination of several methods. After completion of a detailed study of existing subquadratic multipliers, this thesis has proposed a new algorithm to find the best combination of selected methods through comprehensive search for constructing polynomial multiplication over GF(2^n). Using this algorithm, ameliorated architectures with shortened critical path or reduced gates cost will be obtained for the given value of n, where n is in the range of [126, 600] reflecting the key size for current cryptographic applications. With different input constraints the proposed algorithm can also yield subquadratic space multiplier architectures optimized for trade-offs between space and time. Optimized multiplication architectures over NIST recommended fields generated from the proposed algorithm are presented and analyzed in detail. Compared with existing works with subquadratic space complexity, the proposed architectures are highly modular and have improved efficiency on space or time complexity. Finally generalization of the proposed algorithm to be suitable for much larger size of fields discussed
Synthesis Optimization on Galois-Field Based Arithmetic Operators for Rijndael Cipher
A series of experiments has been conducted to show that FPGA synthesis of Galois-Field (GF) based arithmetic operators can be optimized automatically to improve Rijndael Cipher throughput. Moreover, it has been demonstrated that efficiency improvement in GF operators does not directly correspond to the system performance at application level. The experiments were motivated by so many research works that focused on improving performance of GF operators. Each of the variants has the most efficient form in either time (fastest) or space (smallest occupied area) when implemented in FPGA chips. In fact, GF operators are not utilized individually, but rather integrated one to the others to implement algorithms. Contribution of this paper is to raise issue on GF-based application performance and suggest alternative aspects that potentially affect it. Instead of focusing on GF operator efficiency, system characteristics are worth considered in optimizing application performance
Hardware Implementation of Bit-Parallel Finite Field Multipliers Based on Overlap-free Algorithm on FPGA
Cryptography can be divided into two fundamentally different classes: symmetric-key and public-key. Compared with symmetric-key cryptography, where the complexity of the security system relies on a single key between receiver and sender, public-key cryptographic system using two separate but mathematically related keys. Finite field multiplication is a key operation used in all cryptographic systems relied on finite field arithmetic as it not only is computationally complex but also one of the most frequently used finite field operations. Karatsuba algorithm and its generalization are most often used to construct multiplication architectures with significantly improved in these decades. However, one of its optimized architecture called Overlap-free Karatsuba algorithm has been mentioned by fewer people and even its implementation on FPGA has not been mentioned by anyone. After completion of a detailed study of this specific algorithm, this thesis has proposed implementation of modified Overlap-free Karatsuba algorithm on Xilinx Spartan-605. Applied this algorithm and its specific architecture, reduced gates or shorten critical path will be achieved for the given value of n.Optimized multiplication architecture, generated from proposed modified Overlap-free Karatsuba algorithm and applied on FPGA board,over NIST recommended fields (n = 128), are presented and analysed in detail. Compared with existing works with sub-quadratic space and time complexities, the proposed modified algorithm is highly recommended module and have improved on both space and time complexities. At last, generalization of proposed modified algorithm is suitable for much larger size of finite fields, and improvements of FPGA itself have been discussed
A new approach in building parallel finite field multipliers
A new method for building bit-parallel polynomial basis finite field multipliers is proposed in this thesis. Among the different approaches to build such multipliers, Mastrovito multipliers based on a trinomial, an all-one-polynomial, or an equally-spacedpolynomial have the lowest complexities. The next best in this category is a conventional multiplier based on a pentanomial. Any newly presented method should have complexity results which are at least better than those of a pentanomial based multiplier. By applying our method to certain classes of finite fields we have gained a space complexity as n2 + H - 4 and a time complexity as TA + ([ log2(n-l) ]+3)rx which are better than the lowest space and time complexities of a pentanomial based multiplier found in literature. Therefore this multiplier can serve as an alternative in those finite fields in which no trinomial, all-one-polynomial or equally-spaced-polynomial exists
High-Speed Area-Efficient Hardware Architecture for the Efficient Detection of Faults in a Bit-Parallel Multiplier Utilizing the Polynomial Basis of GF(2m)
The utilization of finite field multipliers is pervasive in contemporary
digital systems, with hardware implementation for bit parallel operation often
necessitating millions of logic gates. However, various digital design issues,
whether natural or stemming from soft errors, can result in gate malfunction,
ultimately leading to erroneous multiplier outputs. Thus, to prevent
susceptibility to error, it is imperative to employ an effective finite field
multiplier implementation that boasts a robust fault detection capability. This
study proposes a novel fault detection scheme for a recent bit-parallel
polynomial basis multiplier over GF(2m), intended to achieve optimal fault
detection performance for finite field multipliers while simultaneously
maintaining a low-complexity implementation, a favored attribute in
resource-constrained applications like smart cards. The primary concept behind
the proposed approach is centered on the implementation of a BCH decoder that
utilizes re-encoding technique and FIBM algorithm in its first and second
sub-modules, respectively. This approach serves to address hardware complexity
concerns while also making use of Berlekamp-Rumsey-Solomon (BRS) algorithm and
Chien search method in the third sub-module of the decoder to effectively
locate errors with minimal delay. The results of our synthesis indicate that
our proposed error detection and correction architecture for a 45-bit
multiplier with 5-bit errors achieves a 37% and 49% reduction in critical path
delay compared to existing designs. Furthermore, the hardware complexity
associated with a 45-bit multiplicand that contains 5 errors is confined to a
mere 80%, which is significantly lower than the most exceptional BCH-based
fault recognition methodologies, including TMR, Hamming's single error
correction, and LDPC-based procedures within the realm of finite field
multiplication.Comment: 9 pages, 4 figures. arXiv admin note: substantial text overlap with
arXiv:2209.1338
Low-Resource and Fast Elliptic Curve Implementations over Binary Edwards Curves
Elliptic curve cryptography (ECC) is an ideal choice for low-resource applications because it provides the same level of security with smaller key sizes than other existing public key encryption schemes. For low-resource applications, designing efficient functional units for elliptic curve computations over binary fields results in an effective platform for an embedded co-processor. This thesis investigates co-processor designs for area-constrained devices. Particularly, we discuss an implementation utilizing state of the art binary Edwards curve equations over mixed point addition and doubling. The binary Edwards curve offers the security advantage that it is complete and is, therefore, immune to the exceptional points attack. In conjunction with Montgomery ladder, such a curve is naturally immune to most types of simple power and timing attacks. Finite field operations were performed in the small and efficient Gaussian normal basis. The recently presented formulas for mixed point addition by K. Kim, C. Lee, and C. Negre at Indocrypt 2014 were found to be invalid, but were corrected such that the speed and register usage were maintained. We utilize corrected mixed point addition and doubling formulas to achieve a secure, but still fast implementation of a point multiplication on binary Edwards curves. Our synthesis results over NIST recommended fields for ECC indicate that the proposed co-processor requires about 50% fewer clock cycles for point multiplication and occupies a similar silicon area when compared to the most recent in literature
- …