Search CORE

10 research outputs found

Approaches for the Parallelization of Software Implementation of Integer Multiplication

Authors: Andrew Okhrimenko, Vladislav Kovtun
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 22/08/2012

This paper considers several approaches to increasing the performance of a software implementation of the integer multiplication algorithm on 32-bit and 64-bit platforms via parallelization. The main idea of the parallelization is the delayed carry mechanism that the authors proposed earlier [11]. The delayed carry removes the dependency between loop iterations when sums of products are accumulated, which allows the loop iterations to execute in parallel in separate threads. Once the sum accumulation threads complete, the final result must be corrected by assimilating the carries. The first approach optimizes the parallelization for two execution threads; the second evolves from the first and targets three or more execution threads. The proposed approaches increase the total computational complexity of the algorithm relative to a single execution thread, but decrease the total execution time on a multi-core CPU
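The delayed carry idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word size `W`, the helper names, and the way columns are dealt out to threads are all assumptions made for illustration. Python integers are unbounded, so the per-column accumulators here never overflow; fixed-width code would need a wider accumulator or a separate carry counter. Python threads also do not run this loop truly in parallel, so the sketch shows only the structure of the computation.

```python
from concurrent.futures import ThreadPoolExecutor

W = 32                    # word size in bits (assumption for this sketch)
MASK = (1 << W) - 1

def to_words(x, n):
    """Split a non-negative integer into n little-endian W-bit words."""
    return [(x >> (W * i)) & MASK for i in range(n)]

def column_sums(a, b, cols):
    """Accumulate raw partial-product sums for the given columns.

    No carry is propagated between columns, so each column is
    independent of the others and can be handled by any thread.
    """
    n = len(a)
    out = {}
    for k in cols:
        s = 0
        for i in range(max(0, k - n + 1), min(k, n - 1) + 1):
            s += a[i] * b[k - i]
        out[k] = s
    return out

def multiply_delayed_carry(x, y, n, threads=2):
    """Multiply two n-word integers, assimilating carries only at the end."""
    a, b = to_words(x, n), to_words(y, n)
    cols = list(range(2 * n - 1))
    # Hypothetical work split: deal the columns out round-robin.
    chunks = [cols[t::threads] for t in range(threads)]
    sums = {}
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for part in pool.map(lambda c: column_sums(a, b, c), chunks):
            sums.update(part)
    # Carry assimilation: a single sequential pass over the column sums.
    words, carry = [], 0
    for k in cols:
        t = sums[k] + carry
        words.append(t & MASK)
        carry = t >> W
    words.append(carry)
    return sum(w << (W * i) for i, w in enumerate(words))
```

For example, `multiply_delayed_carry(x, y, 4)` should agree with `x * y` for any `x` and `y` below `2**128`.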

Cryptology ePrint Archive

Approaches to Increasing the Performance of a Software Implementation of Multiplication in the Field of Integers

Authors: Ковтун В.Ю., Нечипорук В.В., Охрименко А.А.
Publication venue: National Aviation University
Publication date: 15/03/2012

The authors propose an approach to increasing the performance of a software implementation of the multiplication algorithm in a number field for 32-bit and 64-bit platforms. The approach uses a mechanism that delays accounting for the carry out of the most significant digit during sum accumulation, which avoids having to account for that carry at every iteration of the sum accumulation loop. The delayed carry makes it possible to reduce the total number of additions and to apply existing parallelization technologies effectively

Crossref

Наукові журнали Національного Авіаційного Університету

Approaches for the performance increasing of software implementation of integer multiplication in prime fields

Authors: Andrew Okhrimenko, Vladislav Kovtun
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 31/03/2012

The authors propose an approach to increasing the performance of a software implementation of the finite field multiplication algorithm for 32-bit and 64-bit platforms. The approach is based on a delayed carry mechanism for the significant bit during sum accumulation. This avoids having to account for the significant bit carry at each iteration of the sum accumulation loop. The delayed carry mechanism reduces the total number of additions and makes it possible to apply modern parallelization technologies

Cryptology ePrint Archive

Performance Increasing Approaches For Binary Field Inversion

Authors: Maria Bulakh, Vladislav Kovtun
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 18/07/2014

The authors propose several approaches for increasing the performance of the multiplicative inversion algorithm in binary fields based on the Extended Euclidean Algorithm (EEA). The first approach exploits a specific property of the EEA: the invariant polynomial u either remains intact or is swapped with the invariant polynomial v. This makes it possible to avoid computing the degree of the polynomial v. The second approach concerns the search for the next matching index when calculating the degree of a polynomial: since the degree of the invariant polynomial u decreases by at least 1, the current value can be reused in the subsequent degree calculation
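The two ideas in this abstract can be sketched on top of the textbook EEA inversion for binary fields. This is an illustrative reading of the abstract, not the authors' code: polynomials over GF(2) are stored as Python integers (bit i is the coefficient of x^i), and the names `degree_from`, `gf2_inverse`, and `gf2_mulmod` are hypothetical. The sketch assumes `f` is irreducible and `a` is nonzero with degree below that of `f`.

```python
def degree_from(p, start):
    """Scan downward from bit index `start` for the top set bit of p.

    Returns -1 if no set bit is found at or below `start`.
    """
    d = start
    while d >= 0 and not (p >> d) & 1:
        d -= 1
    return d

def gf2_inverse(a, f):
    """Invert a modulo f over GF(2) with the extended Euclidean algorithm."""
    m = f.bit_length() - 1
    if a == 0 or a.bit_length() > m:
        raise ValueError("a must be nonzero and reduced modulo f")
    u, v = a, f
    g1, g2 = 1, 0
    du, dv = degree_from(u, m - 1), m
    while du != 0:
        j = du - dv
        if j < 0:
            # Swapping u and v also swaps their known degrees, so the
            # degree of v never has to be recomputed.
            u, v = v, u
            g1, g2 = g2, g1
            du, dv = dv, du
            j = -j
        u ^= v << j
        g1 ^= g2 << j
        # The degree of u dropped by at least 1, so the search for the
        # new top bit can start just below the previous degree.
        du = degree_from(u, du - 1)
        if du < 0:
            raise ValueError("a is not invertible modulo f")
    return g1

def gf2_mulmod(a, b, f):
    """Multiply a and b modulo f over GF(2); used here only as a check."""
    m = f.bit_length() - 1
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= f
    return r
```

A quick check is to take `f = 0x11B` and confirm that `gf2_mulmod(a, gf2_inverse(a, f), f)` equals 1 for every nonzero `a` below 256.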

Cryptology ePrint Archive

Techniques for Performance Improvement of Integer Multiplication in Cryptographic Applications

Authors: Andrew Okhrimenko, Robert Brumnik, Sergii Kavun, Vladislav Kovtun
Publication venue: Hindawi Limited
Publication date: 01/01/2014

The performance of arithmetic operations in number fields is actively researched by many scientists, as evidenced by significant publications in this field. In this work, we offer some techniques to increase the performance of a software implementation of the finite field multiplication algorithm for both 32-bit and 64-bit platforms. The developed technique, called the “delayed carry mechanism,” removes the need to consider a significant bit carry at each iteration of the sum accumulation loop. This mechanism reduces the total number of additions and allows modern parallelization technologies to be applied effectively

Crossref

Directory of Open Access Journals

Vectorizing and distributing number-theoretic transform to count Goldbach partitions on Arm-based supercomputers

Authors: Jesus Ricardo, Oliveira e Silva Tomas, Weiland Michele
Publication date: 14/08/2023

In this article, we explore the usage of scalable vector extension (SVE) to vectorize number-theoretic transforms (NTTs). In particular, we show that 64-bit modular arithmetic operations, including modular multiplication, can be efficiently implemented with SVE instructions. The vectorization of NTT loops and kernels involving 64-bit modular operations was not possible in previous Arm-based single instruction multiple data architectures since these architectures lacked crucial instructions to efficiently implement modular multiplication. We test and evaluate our SVE implementation on the A64FX processor in an HPE Apollo 80 system. Furthermore, we implement a distributed NTT for the computation of large-scale exact integer convolutions. We evaluate this transform on HPE Apollo 70, Cray XC50, HPE Apollo 80, and HPE Cray EX systems, where we demonstrate good scalability to thousands of cores. Finally, we describe how these methods can be utilized to count the number of Goldbach partitions of all even numbers to large limits. We present some preliminary results concerning this problem, in particular a histogram of the number of Goldbach partitions of the even numbers up to 2^40.
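The link between NTTs, exact integer convolutions, and Goldbach partition counts can be shown at toy scale. The sketch below is plain scalar Python, not the SVE or distributed implementation described in the abstract. The modulus 998244353 with primitive root 3 is a common NTT-friendly choice assumed here, not the authors' 64-bit modulus, and counting unordered partitions (p ≤ q) is also an assumption of this sketch. The function names are hypothetical.

```python
MOD = 998244353   # NTT-friendly prime, 119 * 2**23 + 1 (assumption)
ROOT = 3          # a primitive root modulo MOD

def ntt(a, invert=False):
    """In-place iterative radix-2 NTT; len(a) must be a power of two."""
    n = len(a)
    j = 0
    for i in range(1, n):              # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % MOD   # modular multiplication
                a[k] = (u + v) % MOD
                a[k + half] = (u - v) % MOD
                w = w * w_len % MOD
        length <<= 1
    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        for i in range(n):
            a[i] = a[i] * n_inv % MOD
    return a

def goldbach_counts(limit):
    """Count unordered Goldbach partitions of even n in [4, limit]."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    size = 1
    while size < 2 * limit + 1:
        size <<= 1
    a = [int(x) for x in sieve] + [0] * (size - limit - 1)
    ntt(a)
    a = [x * x % MOD for x in a]       # pointwise square = self-convolution
    ntt(a, invert=True)
    # a[n] now counts ordered prime pairs (p, q) with p + q = n.
    return {n: (a[n] + sieve[n // 2]) // 2 for n in range(4, limit + 1, 2)}
```

With `limit = 100`, the entry for 100 should be 6, matching the partitions 3+97, 11+89, 17+83, 29+71, 41+59, and 47+53.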

Edinburgh Research Explorer

Modular SIMD arithmetic in Mathemagix

Authors: Lecerf Grégoire, Quintin Guillaume, van der Hoeven Joris
Publication date: 29/06/2014

Modular integer arithmetic occurs in many algorithms for computer algebra, cryptography, and error correcting codes. Although recent microprocessors typically offer a wide range of highly optimized arithmetic functions, modular integer operations still require dedicated implementations. In this article, we survey existing algorithms for modular integer arithmetic, and present detailed vectorized counterparts. We also present several applications, such as fast modular Fourier transforms and multiplication of integer polynomials and matrices. The vectorized algorithms have been implemented in C++ inside the free computer algebra and analysis system Mathemagix. The performance of our implementation is illustrated by various benchmarks

arXiv.org e-Print Archive

HAL-UNILIM

HAL-Polytechnique

Comparison of Modular Arithmetic Algorithms on GPUs

Authors: Giorgi Pascal, Izard Thomas, Tisserand Arnaud
Publication venue: HAL CCSD
Publication date: 09/11/2009

We present our first implementation results for a modular arithmetic library on GPUs for cryptography. Our library, in C++ for CUDA, provides modular arithmetic, finite field arithmetic and some ECC support. Several algorithms and memory coding styles have been compared: local, shared and register. For moderate sizes, we report a speedup of up to 2.6 compared to a state-of-the-art library

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1