732 research outputs found
Efficient Bit-parallel Multiplication with Subquadratic Space Complexity in Binary Extension Field
Bit-parallel multiplication in GF(2^n) with subquadratic space complexity has been explored in recent years due to its lower area cost compared with traditional parallel multiplications. Based on \u27divide and conquer\u27 technique, several algorithms have been proposed to build subquadratic space complexity multipliers. Among them, Karatsuba algorithm and its generalizations are most often used to construct multiplication architectures with significantly improved efficiency. However, recursively using one type of Karatsuba formula may not result in an optimal structure for many finite fields. It has been shown that improvements on multiplier complexity can be achieved by using a combination of several methods. After completion of a detailed study of existing subquadratic multipliers, this thesis has proposed a new algorithm to find the best combination of selected methods through comprehensive search for constructing polynomial multiplication over GF(2^n). Using this algorithm, ameliorated architectures with shortened critical path or reduced gates cost will be obtained for the given value of n, where n is in the range of [126, 600] reflecting the key size for current cryptographic applications. With different input constraints the proposed algorithm can also yield subquadratic space multiplier architectures optimized for trade-offs between space and time. Optimized multiplication architectures over NIST recommended fields generated from the proposed algorithm are presented and analyzed in detail. Compared with existing works with subquadratic space complexity, the proposed architectures are highly modular and have improved efficiency on space or time complexity. Finally generalization of the proposed algorithm to be suitable for much larger size of fields discussed
Synthesis and Optimization of Reversible Circuits - A Survey
Reversible logic circuits have been historically motivated by theoretical
research in low-power electronics as well as practical improvement of
bit-manipulation transforms in cryptography and computer graphics. Recently,
reversible circuits have attracted interest as components of quantum
algorithms, as well as in photonic and nano-computing technologies where some
switching devices offer no signal gain. Research in generating reversible logic
distinguishes between circuit synthesis, post-synthesis optimization, and
technology mapping. In this survey, we review algorithmic paradigms ---
search-based, cycle-based, transformation-based, and BDD-based --- as well as
specific algorithms for reversible synthesis, both exact and heuristic. We
conclude the survey by outlining key open challenges in synthesis of reversible
and quantum logic, as well as most common misconceptions.Comment: 34 pages, 15 figures, 2 table
Secure and Efficient RNS Approach for Elliptic Curve Cryptography
Scalar multiplication, the main operation in elliptic
curve cryptographic protocols, is vulnerable to side-channel
(SCA) and fault injection (FA) attacks. An efficient countermeasure
for scalar multiplication can be provided by using alternative
number systems like the Residue Number System (RNS). In RNS,
a number is represented as a set of smaller numbers, where each
one is the result of the modular reduction with a given moduli
basis. Under certain requirements, a number can be uniquely
transformed from the integers to the RNS domain (and vice
versa) and all arithmetic operations can be performed in RNS.
This representation provides an inherent SCA and FA resistance
to many attacks and can be further enhanced by RNS arithmetic
manipulation or more traditional algorithmic countermeasures.
In this paper, extending our previous work, we explore the
potentials of RNS as an SCA and FA countermeasure and provide
an description of RNS based SCA and FA resistance means. We
propose a secure and efficient Montgomery Power Ladder based
scalar multiplication algorithm on RNS and discuss its SCAFA
resistance. The proposed algorithm is implemented on an
ARM Cortex A7 processor and its SCA-FA resistance is evaluated
by collecting preliminary leakage trace results that validate our
initial assumptions
Recommended from our members
Formal Analysis of Arithmetic Circuits using Computer Algebra - Verification, Abstraction and Reverse Engineering
Despite a considerable progress in verification and abstraction of random and control logic, advances in formal verification of arithmetic designs have been lagging. This can be attributed mostly to the difficulty in an efficient modeling of arithmetic circuits and datapaths without resorting to computationally expensive Boolean methods, such as Binary Decision Diagrams (BDDs) and Boolean Satisfiability (SAT), that require “bit blasting”, i.e., flattening the design to a bit-level netlist. Approaches that rely on computer algebra and Satisfiability Modulo Theories (SMT) methods are either too abstract to handle the bit-level nature of arithmetic designs or require solving computationally expensive decision or satisfiability problems. The work proposed in this thesis aims at overcoming the limitations of analyzing arithmetic circuits, specifically at the post-synthesized phase. It addresses the verification, abstraction and reverse engineering problems of arithmetic circuits at an algebraic level, treating an arithmetic circuit and its specification as a properly constructed algebraic system. The proposed technique solves these problems by function extraction, i.e., by deriving arithmetic function computed by the circuit from its low-level circuit implementation using computer algebraic rewriting technique. The proposed techniques work on large integer arithmetic circuits and finite field arithmetic circuits, up to 512-bit wide containing millions of logic gates
A new class of irreducible pentanomials for polynomial-based multipliers in binary fields
We introduce a new class of irreducible pentanomials over of
the form . Let and use
to define the finite field extension of degree . We give the exact number of
operations required for computing the reduction modulo . We also provide a
multiplier based on Karatsuba algorithm in combined with our
reduction process. We give the total cost of the multiplier and found that the
bit-parallel multiplier defined by this new class of polynomials has improved
XOR and AND complexity. Our multiplier has comparable time delay when compared
to other multipliers based on Karatsuba algorithm
High Speed and Low Latency ECC Implementation over GF(2m) on FPGA
In this paper, a novel high-speed elliptic curve cryptography (ECC) processor implementation for point multiplication (PM) on field-programmable gate array (FPGA) is proposed. A new segmented pipelined full-precision multiplier is used to reduce the latency, and the Lopez-Dahab Montgomery PM algorithm is modified for careful scheduling to avoid data dependency resulting in a drastic reduction in the number of clock cycles (CCs) required. The proposed ECC architecture has been implemented on Xilinx FPGAs' Virtex4, Virtex5, and Virtex7 families. To the best of our knowledge, our single- and three-multiplier-based designs show the fastest performance to date when compared with reported works individually. Our one-multiplier-based ECC processor also achieves the highest reported speed together with the best reported area-time performance on Virtex4 (5.32 ÎĽs at 210 MHz), on Virtex5 (4.91 ÎĽs at 228 MHz), and on the more advanced Virtex7 (3.18 ÎĽs at 352 MHz). Finally, the proposed three-multiplier-based ECC implementation is the first work reporting the lowest number of CCs and the fastest ECC processor design on FPGA (450 CCs to get 2.83 ÎĽs on Virtex7)
A Chinese Remainder Theorem Approach to Bit-Parallel GF(2^n) Polynomial Basis Multipliers for Irreducible Trinomials
We show that the step “modulo the degree-n field generating irreducible polynomial” in the classical definition of the GF(2^n) multiplication operation can be avoided. This leads to an alternative
representation of the finite field multiplication operation. Combining this representation and the Chinese Remainder Theorem, we design bit-parallel GF(2^n) multipliers for irreducible trinomials u^n + u^k + 1
on GF(2) where 1 < k ≤ n=2. For some values of n, our architectures have the same time complexity as the fastest bit-parallel multipliers – the quadratic multipliers, but their space complexities are reduced. Take the special irreducible trinomial u^(2k) + u^k + 1 for example, the space complexity of the proposed design is reduced by about 1=8, while the time complexity matches the best result. Our experimental results show that among the 539 values of n such that 4 < n < 1000 and x^n+x^k+1 is irreducible over GF(2) for some k in the range 1 < k ≤ n=2, the proposed multipliers beat the current fastest parallel multipliers for 290 values of n when (n − 1)=3 ≤ k ≤ n=2: they have the same time complexity, but the space complexities are reduced by 8.4% on average
Overlap-free Karatsuba-Ofman Polynomial Multiplication Algorithms
We describe how a simple way to split input operands allows for fast VLSI implementations of subquadratic Karatsuba-Ofman multipliers. The theoretical XOR gate delay of the resulting multipliers is reduced significantly. For example, it is reduced by about 33\% and 25\% for and , respectively. To the best of our knowledge, this parameter has never been improved since the original Karatsuba-Ofman algorithm was first used to design multipliers in 1990
Fast Modular Reduction for Large-Integer Multiplication
The work contained in this thesis is a representation of the successful attempt to speed-up the modular reduction as an independent step of modular multiplication, which is the central operation in public-key cryptosystems. Based on the properties of Mersenne and Quasi-Mersenne primes, four distinct sets of moduli have been described, which are responsible for converting the single-precision multiplication prevalent in many of today\u27s techniques into an addition operation and a few simple shift operations. A novel algorithm has been proposed for modular folding. With the backing of the special moduli sets, the proposed algorithm is shown to outperform (speed-wise) the Modified Barrett algorithm by 80% for operands of length 700 bits, the least speed-up being around 70% for smaller operands, in the range of around 100 bits
N-term Karatsuba Algorithm and its Application to Multiplier designs for Special Trinomials
In this paper, we propose a new type of non-recursive Mastrovito multiplier for using a -term Karatsuba algorithm (KA), where is defined by an irreducible trinomial, . We show that such a type of trinomial combined with the -term KA can fully exploit the spatial correlation of entries in related Mastrovito product matrices and lead to a low complexity architecture. The optimal parameter is further studied.
As the main contribution of this study, the lower bound of the space complexity of our proposal is about . Meanwhile, the time complexity matches the best Karatsuba multiplier known to date. To the best of our knowledge, it is the first time that Karatsuba-based multiplier has reached such a space complexity bound while maintaining relatively low time delay
- …