13 research outputs found

    Parallel cryptanalysis

    Most of today’s cryptographic primitives are based on computations that are hard to perform for a potential attacker but easy to perform for somebody who is in possession of some secret information, the key, that opens a back door in these hard computations and allows them to be solved in a small amount of time. To estimate the strength of a cryptographic primitive it is important to know how hard it is to perform the computation without knowledge of the secret back door and to get an understanding of how much money or time the attacker has to spend. Usually a cryptographic primitive allows the cryptographer to choose parameters that make an attack harder at the cost of making the computations using the secret key harder as well. Therefore designing a cryptographic primitive imposes the dilemma of choosing the parameters strong enough to resist an attack up to a certain cost while choosing them small enough to allow usage of the primitive in the real world, e.g. on small computing devices like smart phones. This thesis investigates three different attacks on particular cryptographic systems: Wagner’s generalized birthday attack is applied to the compression function of the hash function FSB. Pollard’s rho algorithm is used for attacking Certicom’s ECC Challenge ECC2K-130. The implementation of the XL algorithm has not been specialized for an attack on a specific cryptographic primitive but can be used for attacking some cryptographic primitives by solving multivariate quadratic systems. All three attacks are general attacks, i.e. they apply to various cryptographic systems; the implementations of Wagner’s generalized birthday attack and Pollard’s rho algorithm can be adapted for attacking other primitives than those given in this thesis. The three attacks have been implemented on different parallel architectures. XL has been parallelized using the Block Wiedemann algorithm on a NUMA system using OpenMP and on an Infiniband cluster using MPI. 
    Wagner’s attack was performed on a distributed system of 8 multi-core nodes connected by an Ethernet network. The work on Pollard’s rho algorithm is part of a large research collaboration with several research groups; the computations are embarrassingly parallel and are executed in a distributed fashion in several facilities with almost negligible communication cost. This dissertation presents implementations of the iteration function of Pollard’s rho algorithm on Graphics Processing Units and on the Cell Broadband Engine.

    ECC2K-130 on Cell CPUs

    This paper describes an implementation of Pollard’s rho algorithm to compute the elliptic curve discrete logarithm for the Synergistic Processor Elements of the Cell Broadband Engine Architecture. Our implementation targets the elliptic curve discrete logarithm problem defined in the Certicom ECC2K-130 challenge. We compare a bitsliced implementation to a non-bitsliced implementation and describe several optimization techniques for both approaches. In particular, we address the question of whether normal-basis or polynomial-basis representation of field elements leads to better performance. Using our software, the ECC2K-130 challenge can be solved in one year using the Synergistic Processor Units of fewer than 2700 Sony PlayStation 3 gaming consoles. Keywords: Cell Broadband Engine Architecture, elliptic curve discrete logarithm problem, binary-field arithmetic, parallel Pollard rho
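
    To illustrate the kind of iteration function that Pollard’s rho method repeats, here is a minimal sketch. It is not the paper’s implementation. As simplifying assumptions, it works in a small prime-order subgroup of the integers modulo a prime instead of the ECC2K-130 binary curve, and it runs a single walk with Floyd cycle detection rather than the parallel variant named in the keywords. The function name `pollard_rho_dlog`, the three-way partition, and the toy parameters are hypothetical choices made for this example.

```python
import random


def pollard_rho_dlog(g, h, p, q, seed=0):
    """Return k with g^k == h (mod p), assuming g has prime order q.

    Toy illustration only: multiplicative group mod p, single walk,
    Floyd cycle detection. All names here are hypothetical.
    """
    rng = random.Random(seed)

    def step(x, a, b):
        # One iteration of the walk. Invariant: x == g^a * h^b (mod p).
        r = x % 3
        if r == 0:
            return x * x % p, 2 * a % q, 2 * b % q
        if r == 1:
            return x * g % p, (a + 1) % q, b
        return x * h % p, a, (b + 1) % q

    while True:
        # Random starting point of the form g^a0 * h^b0.
        a0, b0 = rng.randrange(q), rng.randrange(q)
        x0 = pow(g, a0, p) * pow(h, b0, p) % p
        tortoise = hare = (x0, a0, b0)
        while True:
            tortoise = step(*tortoise)
            hare = step(*step(*hare))
            if tortoise[0] == hare[0]:
                break
        _, a1, b1 = tortoise
        _, a2, b2 = hare
        diff = (b1 - b2) % q
        if diff != 0:
            # g^a1 h^b1 == g^a2 h^b2  implies  k == (a2 - a1) / (b1 - b2) mod q.
            return (a2 - a1) * pow(diff, -1, q) % q
        # Useless collision (b1 == b2): retry from a fresh starting point.
```

    With the toy parameters p = 1019, q = 509 and g = 4, the call `pollard_rho_dlog(4, pow(4, 123, 1019), 1019, 509)` recovers the exponent 123. The modular inverse via `pow(diff, -1, q)` needs Python 3.8 or later.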

    Fast exhaustive search for quadratic systems in $\mathbb{F}_2$ on FPGAs : extended version

    In 2010, Bouillaguet et al. proposed an efficient solver for polynomial systems over $\mathbb{F}_2$ that trades memory for speed. As a result, 48 quadratic equations in 48 variables can be solved on a graphics card (GPU) in 21 minutes. The research question that we would like to answer in this paper is how specifically designed hardware performs on this task. We approach the answer by solving multivariate quadratic systems on reconfigurable hardware, namely Field-Programmable Gate Arrays (FPGAs). We show that, although the algorithm proposed by Bouillaguet et al. has a better asymptotic time complexity than traditional enumeration algorithms, it does not have a better asymptotic complexity in terms of silicon area. Nevertheless, our FPGA implementation consumes 25 times less energy than their GPU implementation. This is a significant improvement, not to mention that the monetary cost per unit of computational power is generally much lower for FPGAs than for GPUs.
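
    As a point of reference for what "exhaustive search" means here, the following is a minimal sketch of plain enumeration over all assignments. It corresponds to the traditional enumeration baseline the abstract mentions, not to the solver of Bouillaguet et al. and not to the FPGA design. The polynomial encoding (quadratic index pairs, linear indices, constant bit) and the names `evaluate`, `solve_exhaustive` and `SYSTEM` are assumptions made for this example.

```python
from itertools import product


def evaluate(poly, x):
    """Evaluate one quadratic polynomial over F_2 at the bit tuple x.

    poly is (quad, lin, const): quad lists index pairs (i, j) for terms
    x_i * x_j, lin lists indices of linear terms, const is 0 or 1.
    """
    quad, lin, const = poly
    value = const
    for i, j in quad:
        value ^= x[i] & x[j]
    for i in lin:
        value ^= x[i]
    return value


def solve_exhaustive(polys, n):
    """Return every assignment in {0,1}^n on which all polynomials vanish."""
    return [x for x in product((0, 1), repeat=n)
            if all(evaluate(p, x) == 0 for p in polys)]


# Hypothetical toy system in three variables:
#   x0*x1 + x2 + 1 = 0
#   x1*x2 + x0 + 1 = 0
#   x0*x2 + x1 + 1 = 0
SYSTEM = [
    ([(0, 1)], [2], 1),
    ([(1, 2)], [0], 1),
    ([(0, 2)], [1], 1),
]

if __name__ == "__main__":
    print(solve_exhaustive(SYSTEM, 3))
```

    This naive loop re-evaluates every polynomial from scratch at each of the 2^n points, so it is only practical for very small n.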

    Fast exhaustive search for quadratic systems in F_2 on FPGAs

    In 2010, Bouillaguet et al. proposed an efficient solver for polynomial systems over F_2 that trades memory for speed [BCC+10]. As a result, 48 quadratic equations in 48 variables can be solved on a graphics processing unit (GPU) in 21 min. The research question that we would like to answer in this paper is how specifically designed hardware performs on this task. We approach the answer by solving multivariate quadratic systems on reconfigurable hardware, namely Field-Programmable Gate Arrays (FPGAs). We show that, although the algorithm proposed in [BCC+10] has a better asymptotic time complexity than traditional enumeration algorithms, it does not have a better asymptotic complexity in terms of silicon area. Nevertheless, our FPGA implementation consumes 20–25 times less energy than its GPU counterpart. This is a significant improvement, not to mention that the monetary cost per unit of computational power is generally much lower for FPGAs than for GPUs.

    New software speed records for cryptographic pairings

    This paper presents new software speed records for the computation of cryptographic pairings. More specifically, we present details of an implementation which computes the optimal ate pairing on a 257-bit Barreto-Naehrig curve in only 4,470,408 cycles on one core of an Intel Core 2 Quad Q6600 processor. This speed is achieved by combining (1) state-of-the-art high-level optimization techniques, (2) a new representation of elements in the underlying finite fields which makes use of the special modulus arising from the Barreto-Naehrig curve construction, and (3) implementing arithmetic in this representation using the double-precision floating-point SIMD instructions of the AMD64 architecture. Keywords: Pairings, Barreto-Naehrig curves, ate pairing, AMD64 architecture, modular arithmetic, SIMD floating-point instructions

    FSBday : Implementing Wagner's generalized birthday attack against the SHA-3 round-1 candidate FSB

    The hash function FSB is one of the candidates submitted to NIST’s competition to find the new standard hash function, SHA-3. The compression function of FSB is based on error-correcting codes. In this paper we show how to use Wagner’s generalized birthday attack to find collisions in FSB’s compression function. In particular, we present details on our implementation attacking FSB48, a toy version of FSB which was proposed by the FSB submitters as a training case for FSB. Our attack does not make use of any properties of the particular linear code used within FSB. FSB48 was chosen as a target where generalized birthday attacks would be one of the strongest attacks and which could be attacked in practice. We show how to adapt this attack so that it runs on our computer cluster of only 10 PCs, which provides far less memory than the usual implementation of generalized birthday attacks would require. This situation is very interesting for estimating the security of systems against distributed attacks using contributed off-the-shelf PCs. For the SHA-3 competition this result is meaningful in that it allows one to assess the security of FSB against the strongest non-structural attack; it does not provide any insight into the security of this particular choice of linear code.
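
    The basic idea behind Wagner’s generalized birthday attack can be sketched with four lists: merge them pairwise on a few low bits, then merge the two results on all remaining bits, so that the four chosen values XOR to zero. The sketch below is only an illustration on random integers. It does not model FSB’s compression function, FSB48, or the memory-saving adaptations described in the abstract, and the names `merge`, `wagner_four_lists` and `random_lists` as well as all parameters are assumptions made for this example.

```python
import random


def merge(left, right, mask):
    """Pair entries of left and right that agree on the bits in mask.

    Entries are (value, index_tuple); each output entry carries the XOR
    of the two values and the concatenated index tuples.
    """
    buckets = {}
    for value, idx in left:
        buckets.setdefault(value & mask, []).append((value, idx))
    out = []
    for value, idx in right:
        for other, other_idx in buckets.get(value & mask, ()):
            out.append((other ^ value, other_idx + idx))
    return out


def wagner_four_lists(lists, n_bits, low_bits):
    """Find index tuples (i, j, k, l) whose four list values XOR to zero."""
    tagged = [[(v, (i,)) for i, v in enumerate(lst)] for lst in lists]
    low = (1 << low_bits) - 1
    full = (1 << n_bits) - 1
    # Level 1: each merged value has its low_bits lowest bits equal to zero.
    left = merge(tagged[0], tagged[1], low)
    right = merge(tagged[2], tagged[3], low)
    # Level 2: equal merged values mean the XOR of all four is zero.
    return [idx for _, idx in merge(left, right, full)]


def random_lists(n_bits, size, seed=1):
    """Hypothetical helper: four lists of random n_bits-bit integers."""
    rng = random.Random(seed)
    return [[rng.getrandbits(n_bits) for _ in range(size)] for _ in range(4)]
```

    For example, `wagner_four_lists(random_lists(12, 64), 12, 4)` returns index tuples into the four lists; each tuple selects four 12-bit values whose XOR is zero.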

    How to manipulate curve standards : a white paper for the black hat

    This paper analyzes the cost of breaking ECC under the following assumptions: (1) ECC is using a standardized elliptic curve that was actually chosen by an attacker; (2) the attacker is aware of a vulnerability in some curves that are not publicly known to be vulnerable. This cost includes the cost of exploiting the vulnerability, but also the initial cost of computing a curve suitable for sabotaging the standard. This initial cost depends upon the acceptability criteria used by the public to decide whether to allow a curve as a standard, and (in most cases) also upon the chance of a curve being vulnerable. This paper shows the importance of accurately modeling the actual acceptability criteria: i.e., figuring out what the public can be fooled into accepting. For example, this paper shows that plausible models of the "Brainpool acceptability criteria" allow the attacker to target a one-in-a-million vulnerability. Keywords: Elliptic-curve cryptography, verifiably random curves, verifiably pseudorandom curves, nothing-up-my-sleeve numbers, sabotaging standards, fighting terrorism, protecting the childre
