106 research outputs found
On the Cryptanalysis of Public-Key Cryptography
Nowadays, the most popular public-key cryptosystems are based on either the integer factorization or the discrete logarithm problem. The feasibility of solving these mathematical problems in practice is studied and techniques are presented to speed-up the underlying arithmetic on parallel architectures. The fastest known approach to solve the discrete logarithm problem in groups of elliptic curves over finite fields is the Pollard rho method. The negation map can be used to speed up this calculation by a factor √2. It is well known that the random walks used by Pollard rho when combined with the negation map get trapped in fruitless cycles. We show that previously published approaches to deal with this problem are plagued by recurring cycles, and we propose effective alternative countermeasures. Furthermore, fast modular arithmetic is introduced which can take advantage of prime moduli of a special form using efficient "sloppy reduction." The effectiveness of these techniques is demonstrated by solving a 112-bit elliptic curve discrete logarithm problem using a cluster of PlayStation 3 game consoles: breaking a public-key standard and setting a new world record. The elliptic curve method (ECM) for integer factorization is the asymptotically fastest method to find relatively small factors of large integers. From a cryptanalytic point of view the performance of ECM gives information about secure parameter choices of some cryptographic protocols. We optimize ECM by proposing carry-free arithmetic modulo Mersenne numbers (numbers of the form 2M – 1) especially suitable for parallel architectures. Our implementation of these techniques on a cluster of PlayStation 3 game consoles set a new record by finding a 241-bit prime factor of 21181 – 1. A normal form for elliptic curves introduced by Edwards results in the fastest elliptic curve arithmetic in practice. Techniques to reduce the temporary storage and enhance the performance even further in the setting of ECM are presented. Our results enable one to run ECM efficiently on resource-constrained platforms such as graphics processing units
On the Analysis of Public-Key Cryptologic Algorithms
The RSA cryptosystem introduced in 1977 by Ron Rivest, Adi Shamir and Len Adleman is the most commonly deployed public-key cryptosystem. Elliptic curve cryptography (ECC) introduced in the mid 80's by Neal Koblitz and Victor Miller is becoming an increasingly popular alternative to RSA offering competitive performance due the use of smaller key sizes. Most recently hyperelliptic curve cryptography (HECC) has been demonstrated to have comparable and in some cases better performance than ECC. The security of RSA relies on the integer factorization problem whereas the security of (H)ECC is based on the (hyper)elliptic curve discrete logarithm problem ((H)ECDLP). In this thesis the practical performance of the best methods to solve these problems is analyzed and a method to generate secure ephemeral ECC parameters is presented. The best publicly known algorithm to solve the integer factorization problem is the number field sieve (NFS). Its most time consuming step is the relation collection step. We investigate the use of graphics processing units (GPUs) as accelerators for this step. In this context, methods to efficiently implement modular arithmetic and several factoring algorithms on GPUs are presented and their performance is analyzed in practice. In conclusion, it is shown that integrating state-of-the-art NFS software packages with our GPU software can lead to a speed-up of 50%. In the case of elliptic and hyperelliptic curves for cryptographic use, the best published method to solve the (H)ECDLP is the Pollard rho algorithm. This method can be made faster using classes of equivalence induced by curve automorphisms like the negation map. We present a practical analysis of their use to speed up Pollard rho for elliptic curves and genus 2 hyperelliptic curves defined over prime fields. As a case study, 4 curves at the 128-bit theoretical security level are analyzed in our software framework for Pollard rho to estimate their practical security level. In addition, we present a novel many-core architecture to solve the ECDLP using the Pollard rho algorithm with the negation map on FPGAs. This architecture is used to estimate the cost of solving the Certicom ECCp-131 challenge with a cluster of FPGAs. Our design achieves a speed-up factor of about 4 compared to the state-of-the-art. Finally, we present an efficient method to generate unique, secure and unpredictable ephemeral ECC parameters to be shared by a pair of authenticated users for a single communication. It provides an alternative to the customary use of fixed ECC parameters obtained from publicly available standards designed by untrusted third parties. The effectiveness of our method is demonstrated with a portable implementation for regular PCs and Android smartphones. On a Samsung Galaxy S4 smartphone our implementation generates unique 128-bit secure ECC parameters in 50 milliseconds on average
Fast Modular Arithmetic on the Kalray MPPA-256 Processor for an Energy-Efficient Implementation of ECM
International audienceThe Kalray MPPA-256 processor is based on a recent low-energy manycore architecture. In this article, we investigate its performance in multiprecision arithmetic for number-theoretic applications. We have developed a library for modular arithmetic that takes full advantage of the particularities of this architecture. This is in turn used in an implementation of the ECM, an algorithm for integer factorization using elliptic curves. For parameters corresponding to a cryptanalytic context, our implementation compares well to state-of-the-art implementations on GPU, while using much less energy
Integer factorization and discrete logarithm problems
Notes d'un cours donné aux Journées Nationales de Calcul FormelThese are notes for a lecture given at CIRM in 2014, for the Journées Nationales du Calcul Formel. We explain the basic algorithms based on combining congruences for solving the integer factorization and the discrete logarithm problems. We highlight two particular situations where the interaction with symbolic computation is visible: the use of Gröbner basis in Joux's algorithm for discrete logarithm in nite eld of small characteristic, and the exact sparse linear algebra tools that occur in the Number Field Sieve algorithm for discrete logarithm in large characteristic
Measuring And Securing Cryptographic Deployments
This dissertation examines security vulnerabilities that arise due to communication failures and incentive mismatches along the path from cryptographic algorithm design to eventual deployment. I present six case studies demonstrating vulnerabilities in real-world cryptographic deployments. I also provide a framework with which to analyze the root cause of cryptographic vulnerabilities by characterizing them as failures in four key stages of the deployment process: algorithm design and cryptanalysis, standardization, implementation, and endpoint deployment. Each stage of this process is error-prone and influenced by various external factors, the incentives of which are not always aligned with security. I validate the framework by applying it to the six presented case studies, tracing each vulnerability back to communication failures or incentive mismatches in the deployment process.
To curate these case studies, I develop novel techniques to measure both existing and new cryptographic attacks, and demonstrate the widespread impact of these attacks on real-world systems through measurement and cryptanalysis. While I do not claim that all cryptographic vulnerabilities can be described with this framework, I present a non-trivial (in fact substantial) number of case studies demonstrating that this framework characterizes the root cause of failures in a diverse set of cryptographic deployments
Polyhedral+Dataflow Graphs
This research presents an intermediate compiler representation that is designed for optimization, and emphasizes the temporary storage requirements and execution schedule of a given computation to guide optimization decisions. The representation is expressed as a dataflow graph that describes computational statements and data mappings within the polyhedral compilation model. The targeted applications include both the regular and irregular scientific domains.
The intermediate representation can be integrated into existing compiler infrastructures. A specification language implemented as a domain specific language in C++ describes the graph components and the transformations that can be applied. The visual representation allows users to reason about optimizations. Graph variants can be translated into source code or other representation. The language, intermediate representation, and associated transformations have been applied to improve the performance of differential equation solvers, or sparse matrix operations, tensor decomposition, and structured multigrid methods
Breaking ECC2K-130
Elliptic-curve cryptography is becoming the standard public-key
primitive not only for mobile devices but also for high-security
applications.
Advantages are the higher cryptographic
strength per bit in comparison with RSA and the higher speed in
implementations.
To improve understanding of the exact strength of the elliptic-curve
discrete-logarithm problem, Certicom has published a series of
challenges. This paper describes breaking the ECC2K-130 challenge
using a parallelized version of Pollard\u27s rho method.
This is a major computation bringing together the contributions of
several clusters of conventional computers, PlayStation~3 clusters,
computers with powerful graphics cards and FPGAs. We also give
/preseestimates for an ASIC design. In particular we present * our choice and analysis of the iteration function for the rho method; * our choice of finite field arithmetic and representation;
* detailed descriptions of the implementations on a multitude of
platforms: CPUs, Cells, GPUs, FPGAs, and ASICs; * details about running the attack
Spinfoams and high performance computing
Numerical methods are a powerful tool for doing calculations in spinfoam
theory. We review the major frameworks available, their definition, and various
applications. We start from , the state-of-the-art
library to efficiently compute EPRL spin foam amplitudes based on the booster
decomposition. We also review two alternative approaches based on the
integration representation of the spinfoam amplitude: Firstly, the numerical
computations of the complex critical points discover the curved geometries from
the spinfoam amplitude and provides important evidence of resolving the
flatness problem in the spinfoam theory. Lastly, we review the numerical
estimation of observable expectation values based on the Lefschetz thimble and
Markov-Chain Monte Carlo method, with the EPRL spinfoam propagator as an
example.Comment: 33 pages, 11 figures. Invited chapter for the book "Handbook of
Quantum Gravity" (Eds. C. Bambi, L. Modesto and I.L. Shapiro, Springer
Singapore, expected in 2023
Recommended from our members
A Study of High Performance Multiple Precision Arithmetic on Graphics Processing Units
Multiple precision (MP) arithmetic is a core building block of a wide variety of algorithms in computational mathematics and computer science. In mathematics MP is used in computational number theory, geometric computation, experimental mathematics, and in some random matrix problems. In computer science, MP arithmetic is primarily used in cryptographic algorithms: securing communications, digital signatures, and code breaking. In most of these application areas, the factor that limits performance is the MP arithmetic. The focus of our research is to build and analyze highly optimized libraries that allow the MP operations to be offloaded from the CPU to the GPU. Our goal is to achieve an order of magnitude improvement over the CPU in three key metrics: operations per second per socket, operations per watt, and operation per second per dollar. What we find is that the SIMD design and balance of compute, cache, and bandwidth resources on the GPU is quite different from the CPU, so libraries such as GMP cannot simply be ported to the GPU. New approaches and algorithms are required to achieve high performance and high utilization of GPU resources. Further, we find that low-level ISA differences between GPU generations means that an approach that works well on one generation might not run well on the next.
Here we report on our progress towards MP arithmetic libraries on the GPU in four areas: (1) large integer addition, subtraction, and multiplication; (2) high performance modular multiplication and modular exponentiation (the key operations for cryptographic algorithms) across generations of GPUs; (3) high precision floating point addition, subtraction, multiplication, division, and square root; (4) parallel short division, which we prove is asymptotically optimal on EREW and CREW PRAMs
- …