41 research outputs found
Modular SIMD arithmetic in Mathemagix
Modular integer arithmetic occurs in many algorithms for computer algebra,
cryptography, and error correcting codes. Although recent microprocessors
typically offer a wide range of highly optimized arithmetic functions, modular
integer operations still require dedicated implementations. In this article, we
survey existing algorithms for modular integer arithmetic, and present detailed
vectorized counterparts. We also present several applications, such as fast
modular Fourier transforms and multiplication of integer polynomials and
matrices. The vectorized algorithms have been implemented in C++ inside the
free computer algebra and analysis system Mathemagix. The performance of our
implementation is illustrated by various benchmarks.
Hardware Aspects of Montgomery Modular Multiplication
This chapter compares Peter Montgomery's modular multiplication method
with traditional techniques for suitability on hardware platforms. It also covers systolic array implementations and side channel leakage.
Organization of parallel execution of modular multiplication to speed up the computational implementation of public-key cryptography
The article theoretically substantiates, investigates, and develops a method for parallel
execution of the basic operation of public-key cryptography: modular multiplication of numbers with
a high bit count. The method is based on a special organization that divides the components of modular
multiplication into independent computational processes. To implement this, it is proposed to use
Montgomery modular reduction. The described solution is illustrated with numerical examples. It has
been theoretically and experimentally proven that the proposed approach to parallelizing the
arithmetic of modular multiplication speeds up this operation, which is important for
cryptographic tasks, by a factor of 5 to 6.
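For orientation, Montgomery modular reduction can be sketched as below. This is a minimal single-threaded illustration, not the parallel method of the article above. The function names (`montgomery_setup`, `redc`, `mont_mul`) and the small operands in the usage line are assumptions chosen for the example, and the modular inverse via `pow(n, -1, r)` assumes Python 3.8 or later.

```python
def montgomery_setup(n: int, k: int):
    """Precompute constants for an odd modulus n < R, where R = 2**k."""
    if n % 2 == 0 or n >= (1 << k):
        raise ValueError("need an odd modulus n < 2**k")
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # n * n_prime ≡ -1 (mod R)
    r2 = (r * r) % n                # R^2 mod n, used to enter Montgomery form
    return n_prime, r2


def redc(t: int, n: int, k: int, n_prime: int) -> int:
    """Return t * R^(-1) mod n for 0 <= t < n * R, using only masks and shifts."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask  # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> k               # exact division by R
    return u - n if u >= n else u      # one conditional subtraction suffices


def mont_mul(a: int, b: int, n: int, k: int) -> int:
    """Compute a * b mod n for 0 <= a, b < n by way of Montgomery form."""
    n_prime, r2 = montgomery_setup(n, k)
    a_bar = redc(a * r2, n, k, n_prime)         # a * R mod n
    b_bar = redc(b * r2, n, k, n_prime)         # b * R mod n
    c_bar = redc(a_bar * b_bar, n, k, n_prime)  # a * b * R mod n
    return redc(c_bar, n, k, n_prime)           # leave Montgomery form


# Hypothetical small operands; the result agrees with the direct computation.
assert mont_mul(23, 45, 97, 8) == (23 * 45) % 97
```

The conversions into and out of Montgomery form cost extra reductions, so the technique pays off when many multiplications share one modulus, as in modular exponentiation.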
Comparison of Scalable Montgomery Modular Multiplication Implementations Embedded in Reconfigurable Hardware
This paper presents a comparison of possible approaches for an efficient implementation of Multiple-word radix-2 Montgomery Modular Multiplication (MM) on modern Field Programmable Gate Arrays (FPGAs). The hardware implementation of the MM coprocessor is fully scalable, which means that it can be reused to generate long-precision results independently of the word length of the originally proposed coprocessor. The first of the analyzed implementations uses a data path based on the traditionally used redundant carry-save adders; the second exploits standard carry-propagate adders with fast carry chain logic, which have not yet been applied in scalable designs. An embedded soft-core processor, Altera NIOS, is employed as a control unit and as a platform for a purely software implementation. All implementations use large embedded memory blocks available in recent FPGAs. Speed and logic requirements are compared for the optimized software and combined hardware-software designs in Altera FPGAs. The issues of targeting a design specifically for an FPGA are considered, taking into account the underlying architecture imposed by the target FPGA technology. It is shown that the coprocessors based on carry-save adders and carry-propagate adders provide comparable results in constrained FPGA implementations, but in the case of carry-propagate logic, the solution requires less embedded memory and provides some additional implementation advantages presented in the paper.
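The radix-2 recurrence that underlies such designs can be sketched in software as follows. This is the plain bit-serial form on whole integers, not the multiple-word variant or either adder architecture compared in the paper. The function name `mont_mul_radix2` and the test operands are assumptions made for illustration.

```python
def mont_mul_radix2(x: int, y: int, n: int, k: int) -> int:
    """Return x * y * 2**(-k) mod n for odd n < 2**k and 0 <= x, y < n.

    One bit of x is consumed per iteration; each step is an addition
    followed by an exact halving, which is what makes the recurrence
    attractive for adder-based data paths.
    """
    if n % 2 == 0 or n >= (1 << k):
        raise ValueError("need an odd modulus n < 2**k")
    s = 0
    for i in range(k):
        x_i = (x >> i) & 1  # i-th bit of x, least significant first
        s += x_i * y        # conditionally add the multiplicand
        if s & 1:           # make the partial sum even by adding n
            s += n
        s >>= 1             # exact division by 2
    # The partial sum stays below n + y < 2n, so one subtraction is enough.
    return s - n if s >= n else s


# Hypothetical operands; multiplying back by 2**k recovers x * y mod n.
x, y, n, k = 23, 45, 97, 7
assert (mont_mul_radix2(x, y, n, k) * (1 << k)) % n == (x * y) % n
```

Python's arbitrary-precision integers hide the word-by-word scanning that a scalable coprocessor performs, so the sketch shows only the arithmetic, not the hardware scheduling.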
Customisable arithmetic hardware designs