10 research outputs found
Solving RC5 challenges with hardware - a distributed.net perspective
Encryption is the basic means to enforce confidentiality in digital communications. This work explores a hardware design alternative and a cost assessment of an FPGA-based brute force attack against RSA secret-key challenge RC5-72. The aim is to develop an alternative to software-based solutions for distributed.net. Implementation results show that an 80 US$ FPGA can yield a throughput of 145 Mkeys/sec with a power consumption of 10 Watts. This is roughly an order of magnitude faster, cheaper and lower power, when compared with fully dedicated general-purpose computers.
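To put the quoted throughput in perspective, the short calculation below estimates how long a single such FPGA would need to sweep the entire 72-bit key space. It is only a sketch: it assumes the 145 Mkeys/sec figure is sustained, that "Mkeys" means 10^6 keys, and a 365.25-day year. The function name `years_to_exhaust` is illustrative and does not come from the paper.

```python
# Back-of-the-envelope estimate. Assumptions (not from the paper):
# a sustained rate of 145e6 keys/s per device and a 365.25-day year.
KEY_BITS = 72
KEYS_PER_SECOND = 145e6
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(key_bits, keys_per_second, devices=1):
    """Years needed to try every key using `devices` identical units."""
    total_keys = 2 ** key_bits
    return total_keys / (keys_per_second * devices) / SECONDS_PER_YEAR


if __name__ == "__main__":
    single = years_to_exhaust(KEY_BITS, KEYS_PER_SECOND)
    print(f"one device, full key space: about {single:.3g} years")
    print(f"one device, half the key space on average: about {single / 2:.3g} years")
```

Under these assumptions a single device needs on the order of a million years, which is why such an attack only makes sense when the key space is split across a very large number of devices.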
Area and time trade-offs for iterative modular division over GF(2^m): novel algorithm and implementations on FPGA
Public key cryptography is a concept used by many useful functionalities, such as digital signature, encryption, key agreement, etc. For those needs, elliptic curve cryptography is an attractive solution. Cryptosystems based on elliptic curves (EC) need a costly modular division. Efficient implementations of this operation are useful for both area-constrained designs working in affine coordinates and high-speed processors. For that purpose, this work highlights the most efficient iterative modular division algorithm and explores different time and area trade-offs on FPGA. In particular, a novel algorithm is proposed and a specific feature of the algorithm is exploited. To show the impact of the different trade-offs on a whole architecture, dividers are also integrated in a low-footprint ECC processor. To the best of our knowledge, this is the first report of an iterative digit-serial modular division algorithm, the first area and time trade-off analysis of an iterative algorithm and the best result among the very few implementations on FPGA.
An improved Montgomery modular inversion targeted for efficient implementation on FPGA
Modular multiplication and inversion/division are the most common primitives in today's public key cryptography. Elliptic curve public key cryptosystems (ECPKC) are becoming increasingly popular for use in mobile appliances where bandwidth and chip area are strongly constrained. For the same level of security, ECPKC use much smaller key lengths than the commonly used RSA but need modular inversion/division. This paper presents an improved algorithm for prime field Montgomery modular inversion. The first important contribution lies in the reduction of the number of operations needed. Resource sharing is also used to lighten the control part of the algorithm. The second contribution is the minimization of the set of different instructions to enable powerful FPGA implementations. The resulting 256-bit circuit achieves a throughput/area ratio improved by at least 70% compared to the only known Montgomery inverse design in FPGA technology. Though the implementations are primarily oriented towards FPGA, some improvements are generic, so they could also prove efficient for ASIC designs in terms of area and power consumption.
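For readers unfamiliar with Montgomery modular inversion, the sketch below shows the classic two-phase scheme due to Kaliski, which uses only shifts, additions and subtractions. It is the well-known baseline, not the improved algorithm of the paper, whose details the abstract does not give. The function names are illustrative, and the second phase here removes the full factor 2^k to return the ordinary inverse rather than a Montgomery-domain result.

```python
def almost_montgomery_inverse(a, p):
    """Phase 1 (Kaliski): return (r, k) with r = a^-1 * 2^k mod p.

    Assumes p is an odd prime and 0 < a < p.
    """
    u, v, r, s, k = p, a, 0, 1, 0
    while v > 0:
        if u % 2 == 0:
            u, s = u // 2, 2 * s
        elif v % 2 == 0:
            v, r = v // 2, 2 * r
        elif u > v:
            u, r, s = (u - v) // 2, r + s, 2 * s
        else:
            v, s, r = (v - u) // 2, s + r, 2 * r
        k += 1
    if r >= p:
        r -= p
    return p - r, k


def mod_inverse(a, p):
    """Phase 2: halve k times modulo p to strip the factor 2^k."""
    r, k = almost_montgomery_inverse(a, p)
    for _ in range(k):
        r = r // 2 if r % 2 == 0 else (r + p) // 2
    return r


if __name__ == "__main__":
    p = 2**255 - 19  # a 255-bit prime, chosen only as an example
    a = 123456789
    print((a * mod_inverse(a, p)) % p)
```

Every step is a shift or an add/subtract on full-width operands, which is what makes this family of algorithms attractive for hardware and explains why reducing the number of distinct operations matters there.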
Collision search for elliptic curve discrete logarithm over GF(2^m) with FPGA
Elliptic curve cryptography (ECC) has gained increasing acceptance in the industry and the academic community and has been the subject of several standards. This interest is mainly due to the high level of security with relatively small keys provided by ECC. Indeed, no sub-exponential algorithms are known to solve the underlying hard problem: the elliptic curve discrete logarithm. The aim of this work is to explore the possibilities of dedicated hardware implementing the best known algorithm for generic curves: the parallelized Pollard's rho method. This problem has specific constraints and therefore requires new architectures. Four different strategies were investigated with different FPGA families in order to provide the best area-time product, according to the capabilities of the chosen platforms. The approach yielding the best ratio of throughput to hardware cost is then fully described and was implemented in order to estimate the cost of an attack. Such results should help to improve the accuracy of the security level offered by a given key size, especially for the shorter parameters proposed for resource-constrained devices.
Iterative modular division over GF(2^m): novel algorithm and implementations on FPGA
Public key cryptography is a concept used by many useful functionalities such as digital signature, encryption, key agreement, etc. For those needs, elliptic curve cryptography is an attractive solution. Cryptosystems based on elliptic curves need a costly modular division. Depending on the choice of coordinates, this operation is required at each step of the algorithms, during a precomputation phase or at the end of the whole computation. As a result, efficient modular division implementations are useful for both area-constrained designs working in affine coordinates and high-speed processors. For that purpose, this paper highlights the most efficient iterative modular division algorithm and explores different time and area tradeoffs on FPGA. First, thanks to a novel algorithm, the computational time is divided by two with an area increase of one half. Second, using the single-instruction multiple-data feature of the selected algorithm, the area is divided by two with a doubling of the computational time. To the best of our knowledge, this is the first report of an iterative digit-serial modular division algorithm, the first area and time tradeoff analysis of an iterative algorithm and the best result among the very few implementations on FPGA.
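The two tradeoffs quoted above can be compared through the area-time product, a common figure of merit for hardware designs. The snippet below is a small illustration only: it normalizes the baseline divider to area 1 and time 1, and it reads "an area increase of one half" as a factor of 1.5. Both are assumptions made here, not figures from the paper.

```python
def area_time_product(area_factor, time_factor):
    """Area-time product relative to a baseline normalized to 1.0."""
    return area_factor * time_factor


# Novel algorithm: time halved, area increased by one half (assumed 1.5x).
faster = area_time_product(area_factor=1.5, time_factor=0.5)

# SIMD-style variant: area halved, time doubled.
smaller = area_time_product(area_factor=0.5, time_factor=2.0)

if __name__ == "__main__":
    print(f"faster design:  {faster:.2f} x baseline area-time product")
    print(f"smaller design: {smaller:.2f} x baseline area-time product")
```

Read this way, the faster variant lowers the area-time product to three quarters of the baseline, while the smaller variant leaves it unchanged and simply trades speed for footprint.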
Faster and smaller hardware implementation of XTR
Modular multiplication is the core of most public key cryptosystems and therefore its implementation plays a crucial role in the overall efficiency of asymmetric cryptosystems. Hardware approaches provide advantages over software in the framework of efficient dedicated accelerators. The concerns of the designers are mainly the die size, frequency, latency (throughput) and power consumption of those solutions. We show in this paper how Booth recoding, pipelining, Montgomery modular multiplication and carry-save adders offer an attractive solution for hardware modular multiplication. Although most of the techniques described hereafter are state-of-the-art, the combination described here is unique and particularly efficient in the context of constrained hardware design of the XTR cryptosystem. Our solution is implemented on an FPGA platform and compared with previous results. The area-time ratio is improved by around a factor of 3.
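As background for the Montgomery modular multiplication mentioned above, here is a minimal bit-serial (radix-2) sketch in software. It shows only the basic add, conditional add and shift recurrence that hardware designs build on. It does not model Booth recoding, pipelining or carry-save adders, and it is not the architecture of the paper. The names `montgomery_multiply` and `modmul` are illustrative.

```python
def montgomery_multiply(a, b, n, k):
    """Return a * b * 2^(-k) mod n.

    Assumes n is odd and 0 <= a, b < n < 2^k.
    """
    s = 0
    for i in range(k):
        if (a >> i) & 1:  # add b when bit i of a is set
            s += b
        if s & 1:         # make s even by adding the odd modulus
            s += n
        s >>= 1           # exact division by 2
    return s - n if s >= n else s


def modmul(a, b, n):
    """Plain a * b mod n computed with two Montgomery multiplications."""
    k = n.bit_length()
    r2 = pow(2, 2 * k, n)                    # R^2 mod n with R = 2^k
    a_mont = montgomery_multiply(a, r2, n, k)  # a * R mod n
    return montgomery_multiply(a_mont, b, n, k)  # (a*R) * b * R^-1 = a*b mod n


if __name__ == "__main__":
    n = 1000003  # any odd modulus works for this sketch
    print(modmul(123456, 654321, n) == (123456 * 654321) % n)
```

The loop never divides by the modulus: reduction is folded into the conditional addition of `n` and the right shift, which is why the method maps well onto adders and shift registers.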
Integer factorization based on elliptic curve method: towards better exploitation of reconfigurable hardware
Currently, the best known algorithm for factoring the modulus of the RSA public key cryptosystem is the Number Field Sieve. One of its important phases usually combines a sieving technique and a method for checking smoothness of mid-size numbers. For this factorization, the Elliptic Curve Method (ECM) is an attractive solution. As ECM is highly regular and many parallel computations are required, hardware-based platforms were shown to be more cost-effective than software solutions. The few papers dealing with implementation of ECM on FPGA are all based on bit-serial architectures. They use only general-purpose logic and low-cost FPGAs, which appear to be the best performance/cost solution. This work explores another approach, based on the exploitation of embedded multipliers available in modern FPGAs and the use of high-performance FPGAs. The proposed architecture - based on a fully parallel and pipelined modular multiplier circuit - exhibits a 15-fold improvement in throughput/hardware cost ratio over previously published results.
On solving RC5 challenges with FPGAs
This work explores a hardware design alternative and a cost assessment of an FPGA-based brute force attack against the challenge RC5-72. The aim is to develop an alternative to software-based solutions for distributed.net. Hardware platforms, particularly reconfigurable hardware, can offer significant cost, flexibility and performance advantages, while significantly reducing environmental energy costs and impacts. Implementation results show that an 80 US$ FPGA can yield a throughput of 145 Mkeys/sec with a power consumption of 10 Watts. This is roughly an order of magnitude faster, cheaper and lower power, when compared with fully dedicated general-purpose computers.
Improving Modular Inversion in RNS using the Plus-Minus Method
The paper describes a new RNS modular inversion algorithm based on the extended Euclidean algorithm and the plus-minus trick. In our algorithm, comparisons over large RNS values are replaced by cheap computations modulo 4. Comparisons to an RNS version based on Fermat's little theorem were carried out. The number of elementary modular operations is significantly reduced: by a factor of 12 to 26 for multiplications and 6 to 21 for additions. Virtex-5 FPGA implementations show that, for a similar area, our plus-minus RNS modular inversion is 6 to 10 times faster.