128 research outputs found

    Efficient SIMD arithmetic modulo a Mersenne number

    Get PDF
    This paper describes carry-less arithmetic operations modulo an integer 2^M − 1 in the thousand-bit range, targeted at single instruction multiple data platforms and applications where overall throughput is the main performance criterion. Using an implementation on a cluster of PlayStation 3 game consoles a new record was set for the elliptic curve method for integer factorization

    Regular and almost universal hashing: an efficient implementation

    Get PDF
    Random hashing can provide guarantees regarding the performance of data structures such as hash tables---even in an adversarial setting. Many existing families of hash functions are universal: given two data objects, the probability that they have the same hash value is low given that we pick hash functions at random. However, universality fails to ensure that all hash functions are well behaved. We further require regularity: when picking data objects at random they should have a low probability of having the same hash value, for any fixed hash function. We present the efficient implementation of a family of non-cryptographic hash functions (PM+) offering good running times, good memory usage as well as distinguishing theoretical guarantees: almost universality and component-wise regularity. On a variety of platforms, our implementations are comparable to the state of the art in performance. On recent Intel processors, PM+ achieves a speed of 4.7 bytes per cycle for 32-bit outputs and 3.3 bytes per cycle for 64-bit outputs. We review vectorization through SIMD instructions (e.g., AVX2) and optimizations for superscalar execution.Comment: accepted for publication in Software: Practice and Experience in September 201

    Network Coding Over The 232:5 Prime Field

    Get PDF

    A Search for Good Pseudo-random Number Generators : Survey and Empirical Studies

    Full text link
    In today's world, several applications demand numbers which appear random but are generated by a background algorithm; that is, pseudo-random numbers. Since late 19th19^{th} century, researchers have been working on pseudo-random number generators (PRNGs). Several PRNGs continue to develop, each one demanding to be better than the previous ones. In this scenario, this paper targets to verify the claim of so-called good generators and rank the existing generators based on strong empirical tests in same platforms. To do this, the genre of PRNGs developed so far has been explored and classified into three groups -- linear congruential generator based, linear feedback shift register based and cellular automata based. From each group, well-known generators have been chosen for empirical testing. Two types of empirical testing has been done on each PRNG -- blind statistical tests with Diehard battery of tests, TestU01 library and NIST statistical test-suite and graphical tests (lattice test and space-time diagram test). Finally, the selected 2929 PRNGs are divided into 2424 groups and are ranked according to their overall performance in all empirical tests

    Efficient Modular Multiplication

    Get PDF
    This paper is concerned with one of the fundamental building blocks used in modern public-key cryptography: modular multiplication. Speed-ups applied to the modular multiplication algorithm or implementation directly translate in a faster modular exponentiation for RSA or a faster realization of the group law when using elliptic curve cryptography

    Fast implementation of Curve25519 using AVX2

    Get PDF
    AVX2 is the newest instruction set on the Intel Haswell processor that provides simultaneous execution of operations over vectors of 256 bits. This work presents the advances on the applicability of AVX2 on the development of an efficient software implementation of the elliptic curve Diffie-Hellman protocol using the Curve25519 elliptic curve. Also, we will discuss some advantages that vector instructions offer as an alternative method to accelerate prime field and elliptic curve arithmetic. The performance of our implementation shows a slight improvement against the fastest state-of-the-art implementations.AVX2 is the newest instruction set on the Intel Haswell processor that provides simultaneous execution of operations over vectors of 256 bits. This work presents the advances on the applicability of AVX2 on the development of an efficient software impleme9230329345FAPESP - FUNDAÇÃO DE AMPARO À PESQUISA DO ESTADO DE SÃO PAULOSEM INFOMAÇÃO4th International Conference on Cryptology and Information Security in Latin AmericaThe authors would like to thank the anonymous reviewers for their helpful suggestions and comments. Additionally, they would like to show their gratitude to J´er´emie Detrey for his valuable comments on an earlier version of the manuscrip

    On the Cryptanalysis of Public-Key Cryptography

    Get PDF
    Nowadays, the most popular public-key cryptosystems are based on either the integer factorization or the discrete logarithm problem. The feasibility of solving these mathematical problems in practice is studied and techniques are presented to speed-up the underlying arithmetic on parallel architectures. The fastest known approach to solve the discrete logarithm problem in groups of elliptic curves over finite fields is the Pollard rho method. The negation map can be used to speed up this calculation by a factor √2. It is well known that the random walks used by Pollard rho when combined with the negation map get trapped in fruitless cycles. We show that previously published approaches to deal with this problem are plagued by recurring cycles, and we propose effective alternative countermeasures. Furthermore, fast modular arithmetic is introduced which can take advantage of prime moduli of a special form using efficient "sloppy reduction." The effectiveness of these techniques is demonstrated by solving a 112-bit elliptic curve discrete logarithm problem using a cluster of PlayStation 3 game consoles: breaking a public-key standard and setting a new world record. The elliptic curve method (ECM) for integer factorization is the asymptotically fastest method to find relatively small factors of large integers. From a cryptanalytic point of view the performance of ECM gives information about secure parameter choices of some cryptographic protocols. We optimize ECM by proposing carry-free arithmetic modulo Mersenne numbers (numbers of the form 2M – 1) especially suitable for parallel architectures. Our implementation of these techniques on a cluster of PlayStation 3 game consoles set a new record by finding a 241-bit prime factor of 21181 – 1. A normal form for elliptic curves introduced by Edwards results in the fastest elliptic curve arithmetic in practice. Techniques to reduce the temporary storage and enhance the performance even further in the setting of ECM are presented. Our results enable one to run ECM efficiently on resource-constrained platforms such as graphics processing units

    Pseudo-Random Number Generators for Vector Processors and Multicore Processors

    Get PDF
    Large scale Monte Carlo applications need a good pseudo-random number generator capable of utilizing both the vector processing capabilities and multiprocessing capabilities of modern computers in order to get the maximum performance. The requirements for such a generator are discussed. New ways of avoiding overlapping subsequences by combining two generators are proposed. Some fundamental philosophical problems in proving independence of random streams are discussed. Remedies for hitherto ignored quantization errors are offered. An open source C++ implementation is provided for a generator that meets these needs
    • …
    corecore