32 research outputs found
Analysis of GF(2^m) Multiplication Algorithm: Classic Method v/s Karatsuba-Ofman Multiplication Method
In recent years, finite field multiplication in GF(2^m) has been widely used in applications such as error-correcting codes and cryptography. One motivation for fast, area-efficient hardware implementations of binary multiplication in the finite field GF(2^m) is that such multiplications are among the most time-consuming and frequently called operations in cryptography and other applications, so optimizing their hardware design is critical to the overall performance of a system. Since a finite field multiplier is a crucial unit for the overall performance of cryptographic systems, novel multiplier architectures whose performance can be chosen freely are necessary. In this paper, two Galois field multiplication algorithms (used in cryptography applications) are analyzed with respect to area, power, and delay, and the resulting Area×Time (AT) and Power×Delay characteristics. The objective of the analysis is to find the most efficient GF(2^m) multiplier algorithm among those considered.
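The two multiplication styles compared in this abstract can be sketched in software as follows. This is only an illustrative sketch, not the hardware designs evaluated in the paper: the function names, the bit-mask representation of polynomials, and the recursion threshold are assumptions made for the example.

```python
def clmul_classic(a: int, b: int) -> int:
    """Schoolbook (shift-and-XOR) product of two GF(2)[x] polynomials.

    Polynomials are stored as bit masks: bit i is the coefficient of x^i.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2) is XOR
        a <<= 1
        b >>= 1
    return result


def clmul_karatsuba(a: int, b: int, threshold: int = 8) -> int:
    """Karatsuba-Ofman product in GF(2)[x]; three half-size products per level."""
    n = max(a.bit_length(), b.bit_length())
    if n <= max(threshold, 1):   # small operands: fall back to the classic method
        return clmul_classic(a, b)
    half = n // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = clmul_karatsuba(a_lo, b_lo, threshold)
    hi = clmul_karatsuba(a_hi, b_hi, threshold)
    # In characteristic 2, subtraction is also XOR.
    mid = clmul_karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, threshold) ^ lo ^ hi
    return (hi << (2 * half)) ^ (mid << half) ^ lo


def reduce_mod(p: int, modulus: int) -> int:
    """Reduce polynomial p modulo a field polynomial of degree m."""
    m = modulus.bit_length() - 1
    while p.bit_length() > m:
        p ^= modulus << (p.bit_length() - 1 - m)
    return p


# Example in GF(2^8) with the AES field polynomial x^8 + x^4 + x^3 + x + 1 (0x11B).
product = reduce_mod(clmul_karatsuba(0x57, 0x83, threshold=2), 0x11B)
print(hex(product))
```

Both routines return the same unreduced product; they differ only in how many partial products are formed, which is the trade-off the area and delay figures in such a comparison reflect.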
On Polynomial Multiplication in Chebyshev Basis
In a recent paper, Lima, Panario and Wang provided a new method to
multiply polynomials in Chebyshev basis which aims at reducing the total number
of multiplications when the polynomials have small degree. Their idea is to use
Karatsuba's multiplication scheme to improve upon the naive method, but without
being able to get rid of its quadratic complexity. In this paper, we extend
their result by providing a reduction scheme which allows multiplying
polynomials in Chebyshev basis using algorithms from the monomial basis case,
and therefore obtaining the same asymptotic complexity estimate. Our reduction allows
the use of any of these algorithms without converting the input polynomials to monomial
basis, which therefore provides a more direct reduction scheme than the one using
conversions. We also demonstrate that our reduction is efficient in practice,
and even outperforms the best known algorithm for Chebyshev
basis when the polynomials have large degree. Finally, we demonstrate a linear-time
equivalence between the polynomial multiplication problem under monomial basis
and under Chebyshev basis.
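For orientation, the naive quadratic method that this abstract uses as its baseline can be sketched as below. It relies on the standard identity T_i * T_j = (T_{i+j} + T_{|i-j|}) / 2 for Chebyshev polynomials of the first kind. This is not the reduction scheme proposed in the paper; the function names are hypothetical and chosen only for the example.

```python
import math


def cheb_mul_naive(a, b):
    """Multiply two polynomials given by Chebyshev (first-kind) coefficients.

    a[k] and b[k] are the coefficients of T_k. Uses
    T_i * T_j = (T_{i+j} + T_{|i-j|}) / 2, so the cost is quadratic.
    """
    c = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            half = 0.5 * ai * bj
            c[i + j] += half
            c[abs(i - j)] += half
    return c


def cheb_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * T_k(x) for x in [-1, 1] via T_k(x) = cos(k * arccos x)."""
    theta = math.acos(x)
    return sum(ck * math.cos(k * theta) for k, ck in enumerate(coeffs))


# T_1 * T_1 = (T_2 + T_0) / 2
print(cheb_mul_naive([0.0, 1.0], [0.0, 1.0]))  # [0.5, 0.0, 0.5]
```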
Implementation of a Generic Modular Cryptosystem for the RSA on Reconfigurable Hardware
This report summarizes work begun in the summer of 2008 on the study and analysis of cryptographic design techniques and their implementation on an FPGA board, namely the Virtex-II Pro.
The study began with learning a popular HDL, namely Verilog. Based on this study, a modular cryptosystem based on RSA, generic up to a 256-bit modulus, was implemented. Optimal techniques for developing a high-speed RSA cryptosystem are presented in this work.
Throughout the thesis, the primary tool was the Xilinx ISE toolkit. For validation purposes, other simulators such as ModelSim were also used; however, the simulations presented in this work use the Xilinx ISE 10.1 Simulator environment. Xilinx XST 10.1 was used to synthesize the implementation.
The division technique used a modified non-restoring division scheme. The multiplication scheme used the Karatsuba-Ofman technique. The exponentiation scheme was Montgomery modular exponentiation. The inversion scheme used a modified form of the Extended Euclidean Algorithm that involves no division or multiplication, as suggested by Laszlo Hars.
The thesis concludes with suggestions on extending the present implementation of RSA on FPGA.
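A minimal software sketch of Montgomery modular exponentiation, the exponentiation scheme named in this abstract, might look as follows. It is not the Verilog design from the report: the function name is hypothetical, R is assumed to be the smallest power of two above the modulus, and the setup step uses Python's built-in modular inverse (Python 3.8+) rather than the division-free inversion the report attributes to Hars.

```python
def montgomery_pow(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent mod modulus (modulus odd) using Montgomery multiplication."""
    if modulus < 3 or modulus % 2 == 0:
        raise ValueError("modulus must be odd and at least 3")
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    k = modulus.bit_length()
    r = 1 << k                              # R = 2^k > modulus, gcd(R, modulus) = 1
    mask = r - 1
    n_prime = (-pow(modulus, -1, r)) % r    # modulus * n_prime == -1 (mod R)

    def redc(t: int) -> int:
        """Montgomery reduction: t * R^-1 mod modulus, for 0 <= t < R * modulus."""
        m = ((t & mask) * n_prime) & mask
        u = (t + m * modulus) >> k          # exact division by R
        return u - modulus if u >= modulus else u

    x_bar = r % modulus                     # Montgomery form of 1
    b_bar = (base % modulus) * r % modulus  # Montgomery form of base
    for bit in bin(exponent)[2:]:           # left-to-right square-and-multiply
        x_bar = redc(x_bar * x_bar)
        if bit == "1":
            x_bar = redc(x_bar * b_bar)
    return redc(x_bar)                      # convert back from Montgomery form


print(montgomery_pow(4, 13, 497) == pow(4, 13, 497))  # True
```

Inside the loop, every reduction is a shift and a mask rather than a division, which is what makes the method attractive for hardware.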
Hardware Implementation of Barrett Reduction Exploiting Constant Multiplication
The efficient realization of an Elliptic Curve Cryptosystem is contingent on the efficiency of scalar multiplication. These systems can be improved by optimizing the most costly underlying finite field arithmetic operations, such as modular reduction. There are elliptic curves over prime fields for which very efficient reduction formulas are possible due to the special structure of the moduli. For prime moduli of arbitrary form, however, the use of general reduction formulas, such as Barrett's reduction algorithm, is necessary. Barrett's algorithm performs modular reduction efficiently by using multiplication as opposed to division, an operation which is generally expensive to realize in hardware. We note, however, that when an Elliptic Curve Cryptosystem is defined over a fixed prime field, all multiplication steps in Barrett's scheme can be realized through constant multiplications; this allows for further optimization. In this thesis, we study the influence that using constant multipliers has on four different Barrett reduction variants targeting the Virtex-7 (xc7vx485tffg1157-1). We use the FloPoCo core generator to construct constant multiplier implementations for the different multiplication steps required in each scheme. Then, we create a hybrid constant multiplier circuit based on Karatsuba multiplication which uses smaller FloPoCo-generated base multipliers. It is shown that for certain multiplication steps, the hybrid design provides an improvement in the resource utilization of the constant multiplier circuit at the cost of an increase in the critical path delay. A performance comparison of different Barrett reduction circuits using different combinations of constant multiplier architectures is presented. Additionally, a fully pipelined implementation of each Barrett reduction variant is also designed, capable of achieving operational frequencies in the range of 496-504 MHz depending on the Barrett scheme considered.
With the addition of a 256-bit pipelined Karatsuba multiplier circuit, we also present a compact and fully pipelined modular multiplier based on these Barrett architectures capable of achieving very high throughput compared to others in the literature without the use of embedded multipliers
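The basic form of Barrett reduction referred to in this abstract can be sketched as below. This is a textbook variant, not one of the four variants studied in the thesis; the function names and the choice of shift width (twice the bit length of the modulus) are assumptions made for the example.

```python
def barrett_setup(modulus: int):
    """Precompute the Barrett constant mu = floor(4^k / modulus), with k = bit length."""
    k = modulus.bit_length()
    mu = (1 << (2 * k)) // modulus   # the only division, done once per modulus
    return k, mu


def barrett_reduce(x: int, modulus: int, k: int, mu: int) -> int:
    """Return x mod modulus for 0 <= x < modulus**2 using multiplications and shifts."""
    q = (x * mu) >> (2 * k)          # estimate of floor(x / modulus), never too large
    r = x - q * modulus
    while r >= modulus:              # the estimate is off by a small amount at most
        r -= modulus
    return r


# Example with a small prime modulus.
p = 1000003
k, mu = barrett_setup(p)
print(barrett_reduce(123456789 * 987654, p, k, mu) == (123456789 * 987654) % p)  # True
```

When the modulus is fixed, both `mu` and `modulus` are constants, so the two multiplications in `barrett_reduce` are multiplications by known constants. That is the property the abstract exploits with constant multiplier circuits.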
Fast integer multiplication using generalized Fermat primes
For almost 35 years, Schönhage-Strassen's algorithm has been the fastest
algorithm known for multiplying integers, with a time complexity O(n
log n log log n) for multiplying n-bit inputs. In 2007, Fürer
proved that there exists K > 1 and an algorithm performing this operation in
O(n log n K^(log^* n)). Recent work by Harvey, van der Hoeven,
and Lecerf showed that this complexity estimate can be improved in order to get
K = 8, and conjecturally K = 4. Using an alternative algorithm, which relies on
arithmetic modulo generalized Fermat primes, we obtain conjecturally the same
result K = 4 via a careful complexity analysis in the deterministic multitape
Turing model
How Fast Can We Multiply Large Integers on an Actual Computer?
We provide two complexity measures that can be used to measure the running
time of algorithms to compute multiplications of long integers. The random
access machine with unit or logarithmic cost is not adequate for measuring the
complexity of a task like multiplication of long integers. The Turing machine
is more useful here, but fails to take into account the multiplication
instruction for short integers, which is available on physical computing
devices. An interesting outcome is that the proposed refined complexity
measures do not rank the well known multiplication algorithms the same way as
the Turing machine model.
Comment: To appear in the proceedings of Latin 2014. Springer LNCS 839
Efficient Implementation of Elliptic Curve Cryptography on FPGAs
This work presents the design strategies of an FPGA-based elliptic curve co-processor. Elliptic curve cryptography is an important topic in cryptography due to its relatively short key length and higher efficiency compared to other well-known public-key cryptosystems like RSA. The most important contributions of this work are:
- Analyzing how different representations of finite fields and points on elliptic curves affect the performance of an elliptic curve co-processor, and implementing a high-performance co-processor.
- Proposing a novel dynamic programming approach to find the optimum combination of different recursive polynomial multiplication methods. Here optimum means the method which has the smallest number of bit operations.
- Designing a new normal-basis multiplier which is based on polynomial multipliers. The most important part of this multiplier is a circuit of size for changing the representation between polynomial and normal basis
Even faster integer multiplication
We give a new proof of Fürer's bound for the cost of multiplying n-bit
integers in the bit complexity model. Unlike Fürer, our method does not
require constructing special coefficient rings with "fast" roots of unity.
Moreover, we prove the more explicit bound O(n log n K^(log^* n)) with K = 8.
We show that an optimised variant of Fürer's algorithm achieves only K = 16,
suggesting that the new algorithm is faster than Fürer's by a factor of
2^(log^* n). Assuming standard conjectures about the distribution of Mersenne
primes, we give yet another algorithm that achieves K = 4
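To give a feel for the K^(log^* n) factor in these bounds, the iterated logarithm log^* n can be computed as in the sketch below. The base-2 logarithm and the stopping condition "value at most 1" are assumptions made for the illustration; the abstract itself does not fix these conventions, and the function name is hypothetical.

```python
import math


def log_star(n: int) -> int:
    """Iterated logarithm: how many times log2 must be applied until the value is <= 1."""
    count = 0
    x = n
    while x > 1:
        x = math.log2(x)
        count += 1
    return count


for n in (2, 16, 65536, 2**65536):
    s = log_star(n)
    print(s, 8**s, 4**s)   # log^* n, then K^(log^* n) for K = 8 and K = 4
```

Under these assumptions, even for n = 2^65536 the iterated logarithm is only 5, so K = 8 and K = 4 give factors of 32768 and 1024 respectively, a ratio of 2^5.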
Theoretical and practical efficiency aspects in cryptography