47 research outputs found
A hardware-accelerated ECDLP with high-performance modular multiplication
Elliptic curve cryptography (ECC) has become a popular public key cryptography standard. The security of ECC rests on the difficulty of solving the elliptic curve discrete logarithm problem (ECDLP). In this paper, we demonstrate a successful attack on ECC over a prime field using the Pollard rho algorithm implemented on a hardware-software co-integrated platform. We propose a high-performance architecture for multiplication over a prime field using specialized DSP blocks in the FPGA. We characterize this architecture by exploring the design space to determine the optimal integer basis for polynomial representation and we demonstrate an efficient mapping of this design to multiple standard prime field elliptic curves. We use the resulting modular multiplier to demonstrate low-latency multiplications for curves secp112r1 and P-192. We apply our modular multiplier to implement a complete attack on secp112r1 using a Nallatech FSB-Compute platform with a Virtex-5 FPGA. The measured performance of the resulting design is 114 cycles per Pollard rho step at 100 MHz, which gives 878 K iterations per second per ECC core. We extend this design to a multicore ECDLP implementation that achieves 14.05 M iterations per second with 16 parallel point addition cores
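The throughput figures in this abstract can be checked with a short back-of-the-envelope calculation. The sketch below is not taken from the paper: it assumes every core retires one Pollard rho step per 114 cycles and that 16 cores scale perfectly linearly, and the function name `steps_per_second` is hypothetical. Under those assumptions it yields roughly 877 K steps per second per core and about 14.04 M steps per second for 16 cores, close to the reported 878 K and 14.05 M; the abstract does not explain the small difference.

```python
CLOCK_HZ = 100e6          # reported clock frequency (100 MHz)
CYCLES_PER_STEP = 114     # reported cycles per Pollard rho step
CORES = 16                # reported number of parallel point addition cores


def steps_per_second(clock_hz, cycles_per_step, cores=1):
    """Ideal throughput, assuming each core finishes one step every
    `cycles_per_step` cycles and cores scale linearly (an assumption)."""
    return cores * clock_hz / cycles_per_step


per_core = steps_per_second(CLOCK_HZ, CYCLES_PER_STEP)
all_cores = steps_per_second(CLOCK_HZ, CYCLES_PER_STEP, CORES)

print(f"{per_core / 1e3:.0f} K steps/s per core")
print(f"{all_cores / 1e6:.2f} M steps/s on {CORES} cores")
```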
Constructing cluster of simple FPGA boards for cryptologic computations
In this paper, we propose an FPGA cluster infrastructure, which can be utilized in implementing cryptanalytic attacks and accelerating cryptographic operations. The cluster can be formed using simple and inexpensive, off-the-shelf FPGA boards featuring an FPGA device, local storage, CPLD, and network connection. Forming the cluster is simple and no effort for the hardware development is needed except for the hardware design for the actual computation. Using a softcore processor on FPGA, we are able to configure FPGA devices dynamically and change their configuration on the fly from a remote computer. The softcore on FPGA can execute relatively complicated programs for mundane tasks unworthy of FPGA resources. Finally, we propose and implement a fast and efficient dynamic configuration switch technique that is shown to be useful especially in cryptanalytic applications. Our infrastructure provides a cost-effective alternative for formerly proposed cryptanalytic engines based on FPGA devices
On the Analysis of Public-Key Cryptologic Algorithms
The RSA cryptosystem, introduced in 1977 by Ron Rivest, Adi Shamir and Len Adleman, is the most commonly deployed public-key cryptosystem. Elliptic curve cryptography (ECC), introduced in the mid-1980s by Neal Koblitz and Victor Miller, is becoming an increasingly popular alternative to RSA, offering competitive performance due to the use of smaller key sizes. Most recently, hyperelliptic curve cryptography (HECC) has been demonstrated to have comparable and in some cases better performance than ECC. The security of RSA relies on the integer factorization problem whereas the security of (H)ECC is based on the (hyper)elliptic curve discrete logarithm problem ((H)ECDLP). In this thesis the practical performance of the best methods to solve these problems is analyzed and a method to generate secure ephemeral ECC parameters is presented. The best publicly known algorithm to solve the integer factorization problem is the number field sieve (NFS). Its most time-consuming step is the relation collection step. We investigate the use of graphics processing units (GPUs) as accelerators for this step. In this context, methods to efficiently implement modular arithmetic and several factoring algorithms on GPUs are presented and their performance is analyzed in practice. In conclusion, it is shown that integrating state-of-the-art NFS software packages with our GPU software can lead to a speed-up of 50%. In the case of elliptic and hyperelliptic curves for cryptographic use, the best published method to solve the (H)ECDLP is the Pollard rho algorithm. This method can be made faster using equivalence classes induced by curve automorphisms like the negation map. We present a practical analysis of their use to speed up Pollard rho for elliptic curves and genus 2 hyperelliptic curves defined over prime fields. As a case study, 4 curves at the 128-bit theoretical security level are analyzed in our software framework for Pollard rho to estimate their practical security level.
In addition, we present a novel many-core architecture to solve the ECDLP using the Pollard rho algorithm with the negation map on FPGAs. This architecture is used to estimate the cost of solving the Certicom ECCp-131 challenge with a cluster of FPGAs. Our design achieves a speed-up factor of about 4 compared to the state-of-the-art. Finally, we present an efficient method to generate unique, secure and unpredictable ephemeral ECC parameters to be shared by a pair of authenticated users for a single communication. It provides an alternative to the customary use of fixed ECC parameters obtained from publicly available standards designed by untrusted third parties. The effectiveness of our method is demonstrated with a portable implementation for regular PCs and Android smartphones. On a Samsung Galaxy S4 smartphone our implementation generates unique 128-bit secure ECC parameters in 50 milliseconds on average
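The negation map mentioned in this abstract can be illustrated on a toy curve. The sketch below is an assumption-laden illustration, not the thesis's implementation: the curve y^2 = x^3 + 2x + 2 over F_17 and the helper names `negate` and `canonical` are chosen here for demonstration only. It maps each point P to one representative of the class {P, -P}, so a Pollard rho walk defined on representatives moves through roughly half as many states. Practical implementations additionally need to handle the short "fruitless" cycles that such walks can fall into, which this sketch does not address.

```python
P_MOD, A, B = 17, 2, 2   # toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative only)


def on_curve(x, y):
    """True if (x, y) satisfies the toy curve equation."""
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0


def negate(P):
    """Negation map on affine points: (x, y) -> (x, -y)."""
    x, y = P
    return (x, (-y) % P_MOD)


def canonical(P):
    """Representative of the class {P, -P}: the point with the smaller y."""
    return min(P, negate(P), key=lambda pt: pt[1])


points = [(x, y) for x in range(P_MOD) for y in range(P_MOD) if on_curve(x, y)]
classes = {canonical(P) for P in points}
print(len(points), "affine points fall into", len(classes), "classes")
```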
Breaking ECC2K-130
Elliptic-curve cryptography is becoming the standard public-key
primitive not only for mobile devices but also for high-security
applications.
Advantages are the higher cryptographic
strength per bit in comparison with RSA and the higher speed in
implementations.
To improve understanding of the exact strength of the elliptic-curve
discrete-logarithm problem, Certicom has published a series of
challenges. This paper describes breaking the ECC2K-130 challenge
using a parallelized version of Pollard's rho method.
This is a major computation bringing together the contributions of
several clusters of conventional computers, PlayStation 3 clusters,
computers with powerful graphics cards and FPGAs. We also give
estimates for an ASIC design. In particular we present
* our choice and analysis of the iteration function for the rho method;
* our choice of finite field arithmetic and representation;
* detailed descriptions of the implementations on a multitude of
  platforms: CPUs, Cells, GPUs, FPGAs, and ASICs;
* details about running the attack
Parallel cryptanalysis
Most of today’s cryptographic primitives are based on computations that are hard to perform for a potential attacker but easy to perform for somebody who is in possession of some secret information, the key, that opens a back door in these hard computations and allows them to be solved in a small amount of time. To estimate the strength of a cryptographic primitive it is important to know how hard it is to perform the computation without knowledge of the secret back door and to get an understanding of how much money or time the attacker has to spend. Usually a cryptographic primitive allows the cryptographer to choose parameters that make an attack harder at the cost of making the computations using the secret key harder as well. Therefore designing a cryptographic primitive imposes the dilemma of choosing the parameters strong enough to resist an attack up to a certain cost while choosing them small enough to allow usage of the primitive in the real world, e.g. on small computing devices like smart phones. This thesis investigates three different attacks on particular cryptographic systems: Wagner’s generalized birthday attack is applied to the compression function of the hash function FSB. Pollard’s rho algorithm is used for attacking Certicom’s ECC Challenge ECC2K-130. The implementation of the XL algorithm has not been specialized for an attack on a specific cryptographic primitive but can be used for attacking some cryptographic primitives by solving multivariate quadratic systems. All three attacks are general attacks, i.e. they apply to various cryptographic systems; the implementations of Wagner’s generalized birthday attack and Pollard’s rho algorithm can be adapted for attacking other primitives than those given in this thesis. The three attacks have been implemented on different parallel architectures. XL has been parallelized using the Block Wiedemann algorithm on a NUMA system using OpenMP and on an Infiniband cluster using MPI. 
Wagner’s attack was performed on a distributed system of 8 multi-core nodes connected by an Ethernet network. The work on Pollard’s Rho algorithm is part of a large research collaboration with several research groups; the computations are embarrassingly parallel and are executed in a distributed fashion in several facilities with almost negligible communication cost. This dissertation presents implementations of the iteration function of Pollard’s Rho algorithm on Graphics Processing Units and on the Cell Broadband Engine
Elliptic Curve Cryptography using Computational Intelligence
Public-key cryptography is a fundamental component of modern electronic communication that can be constructed with many different mathematical processes. Presently, cryptosystems based on elliptic curves are becoming popular because they offer high cryptographic strength at small key sizes. At the heart of these schemes is the complexity of the elliptic curve discrete logarithm problem (ECDLP).
Pollard’s Rho algorithm is a well-known method for solving the ECDLP and thereby breaking ciphers based on elliptic curves for reasonably small key sizes (up to approximately 100 bits in length). It has the same time complexity as other known methods but is advantageous due to smaller memory requirements. This study considers how to speed up the Rho process by modifying a key component: the iterating function, which is the part of the algorithm responsible for determining what point is considered next when looking for the solution to the ECDLP. It is replaced with an alternative that is found through an evolutionary process. This alternative consistently and significantly decreases the number of iterations required by Pollard’s Rho Algorithm to successfully find the sought-after solution
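To make the role of the iterating function concrete, here is a minimal sketch of Pollard's rho on a toy curve. It is not the evolved function from the study: it uses a standard r-adding walk (X -> X + M[j], with j chosen from X) and Floyd cycle detection. The curve y^2 = x^3 + 2x + 2 over F_17 with base point (5, 1) of prime order 19, and all function names, are assumptions chosen for illustration. The code requires Python 3.8 or later for `pow(x, -1, m)`.

```python
import random

P_MOD, A = 17, 2            # toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative only)
G, N = (5, 1), 19           # base point and its prime order


def add(P, Q):
    """Affine point addition; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)


def mul(k, P):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = add(R, P)
        P, k = add(P, P), k >> 1
    return R


def pollard_rho(Q, r=4, seed=1):
    """Return k with Q = k*G, using an r-adding walk and Floyd cycle detection."""
    rng = random.Random(seed)
    while True:
        # Iteration function: X -> X + M[j], where j depends only on X.
        table = []
        for _ in range(r):
            u, v = rng.randrange(N), rng.randrange(N)
            table.append((u, v, add(mul(u, G), mul(v, Q))))

        def step(state):
            X, a, b = state
            u, v, M = table[0 if X is None else X[0] % r]
            return add(X, M), (a + u) % N, (b + v) % N

        a0, b0 = rng.randrange(N), rng.randrange(N)
        start = (add(mul(a0, G), mul(b0, Q)), a0, b0)
        slow, fast = step(start), step(step(start))
        while slow[0] != fast[0]:
            slow, fast = step(slow), step(step(fast))
        # Collision: a_s*G + b_s*Q == a_f*G + b_f*Q,
        # hence k = (a_s - a_f) / (b_f - b_s) mod N.
        db = (fast[2] - slow[2]) % N
        if db:
            return (slow[1] - fast[1]) * pow(db, -1, N) % N
        # Degenerate collision (b_f == b_s): retry with a fresh random walk.


print(pollard_rho(mul(7, G)))
```

Swapping in a different `step` function is the only change needed to experiment with alternative iterating functions, as long as each step keeps track of the coefficients a and b with X = a*G + b*Q.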
Faster elliptic-curve discrete logarithms on FPGAs
This paper accelerates FPGA computations of discrete logarithms
on elliptic curves over binary fields. As a toy example,
this paper successfully attacks the SECG standard curve sect113r2,
a binary elliptic curve that was not removed from the SECG standard until 2010 and was not disabled in OpenSSL until June 2015.
This is a new size record for completed ECDL computations,
using a prime order very slightly larger than the previous record holder. More importantly, this paper uses FPGAs much more efficiently,
saving a factor close to 3/2 in the size of each high-speed ECDL core. This paper squeezes 3 cores into a low-cost Spartan-6 FPGA
and many more cores into larger FPGAs. The paper also
benchmarks many smaller-size attacks to demonstrate reliability of the estimates, and covers a much larger curve over a 127-bit field to demonstrate scalability
Applications of Frobenius Expansions in Elliptic Curve Cryptography
Recent developments in elliptic curve cryptography have
heightened the need for fast scalar point multiplication,
especially when working in environments with limited
computational power. It is well known that point multiplication
on elliptic curves over F_{q^m} (with m > 1) can be accelerated
using Frobenius expansions. In practice, the computation is
much faster than the standard double-and-add scalar
multiplication.
An efficient implementation of elliptic curve cryptosystems can
use a Koblitz curve and convert integers into Frobenius
expansions to perform fast scalar multiplications. However,
this conversion of integers to Frobenius expansions would lead
to extra code on the device (i.e., silicon area) and extra
computational cost.
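As an illustration of the integer-to-Frobenius conversion step discussed above, the following sketch computes a tau-adic non-adjacent form (TNAF) of an integer. It rests on assumptions that are not stated in the abstract: it targets binary Koblitz curves, where the Frobenius map tau satisfies tau^2 = mu*tau - 2 with mu = +1 or -1, and it follows Solinas's well-known digit-by-digit algorithm. The function names `tnaf` and `evaluate` are hypothetical. Elements of Z[tau] are represented as integer pairs (x, y) meaning x + y*tau.

```python
def tnaf(n, mu):
    """tau-adic NAF digits of n, least significant first, for tau^2 = mu*tau - 2."""
    c0, c1, digits = n, 0, []
    while c0 or c1:
        if c0 % 2:
            u = 2 - ((c0 - 2 * c1) % 4)   # u is +1 or -1
            c0 -= u
        else:
            u = 0
        digits.append(u)
        # Exact division of c0 + c1*tau by tau (c0 is even at this point).
        c0, c1 = c1 + mu * (c0 // 2), -(c0 // 2)
    return digits


def evaluate(digits, mu):
    """Horner evaluation of sum(u_i * tau^i) in Z[tau]; returns (x, y) for x + y*tau."""
    x, y = 0, 0
    for u in reversed(digits):
        # Multiply by tau, then add the digit: (x + y*tau)*tau = -2y + (x + mu*y)*tau.
        x, y = -2 * y + u, x + mu * y
    return x, y


digits = tnaf(2, 1)
print(digits, evaluate(digits, 1))
```

With such an expansion, a scalar multiple can be computed from Frobenius applications and point additions or subtractions instead of point doublings; generating the digits directly at random would skip the conversion loop in `tnaf` altogether.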
According to N. Koblitz, H. Lenstra suggested that rather than
choosing a random integer n and then converting to a Frobenius
expansion n(\tau), in certain cryptosystems it might be more
efficient to generate a random Frobenius expansion directly.
The temptation then is to choose a relatively short and/or
sparse value for n(\tau). If this is done then we must
re-evaluate the difficulty of the discrete logarithm problem
(and other computational problems). A further issue is that the
existing security proofs may not directly apply. For some
systems it may be necessary to develop bespoke security proofs
for the Frobenius expansion case.
In this thesis, we analyse the Frobenius expansion DLP and
present algorithms to solve it. Furthermore, we propose a
variant of a well-known identification scheme designed for
public key cryptography on very restricted devices. More
precisely, we construct the Girault-Poupard-Stern (GPS)
identification scheme for Koblitz elliptic curves using
Frobenius expansions. The idea is to use Frobenius expansions
throughout the protocol, so there is no need to convert between
integers and Frobenius expansions. We also give a security
analysis of the proposed scheme
Constructing cluster of simple FPGA boards for cryptologic computations
In this thesis, we propose an FPGA cluster infrastructure which can be utilized in implementing cryptanalytic attacks and accelerating cryptographic operations. The cluster can be formed using simple and inexpensive, off-the-shelf FPGA boards featuring an FPGA device, local storage, CPLD, and network connection. Forming the cluster is simple and no effort for the hardware development is needed except for the hardware design for the actual computation. Using a softcore processor on FPGA, we are able to configure FPGA devices dynamically and change their configuration on the fly from a remote computer. The softcore on FPGA can execute relatively complicated programs for mundane tasks unworthy of FPGA resources. Finally, we propose and implement a fast and efficient dynamic configuration switch technique that is shown to be useful especially in cryptanalytic applications. Our infrastructure provides a cost-effective alternative for formerly proposed cryptanalytic engines based on FPGA devices