27,012 research outputs found
Analysis of Parallel Montgomery Multiplication in CUDA
For a given level of security, elliptic curve cryptography (ECC) offers improved efficiency over classic public key implementations. Point multiplication is the most common operation in ECC and, consequently, any significant improvement in perfor- mance will likely require accelerating point multiplication. In ECC, the Montgomery algorithm is widely used for point multiplication. The primary purpose of this project is to implement and analyze a parallel implementation of the Montgomery algorithm as it is used in ECC. Specifically, the performance of CPU-based Montgomery multiplication and a GPU-based implementation in CUDA are compared
Composite Cyclotomic Fourier Transforms with Reduced Complexities
Discrete Fourier transforms~(DFTs) over finite fields have widespread
applications in digital communication and storage systems. Hence, reducing the
computational complexities of DFTs is of great significance. Recently proposed
cyclotomic fast Fourier transforms (CFFTs) are promising due to their low
multiplicative complexities. Unfortunately, there are two issues with CFFTs:
(1) they rely on efficient short cyclic convolution algorithms, which has not
been investigated thoroughly yet, and (2) they have very high additive
complexities when directly implemented. In this paper, we address both issues.
One of the main contributions of this paper is efficient bilinear 11-point
cyclic convolution algorithms, which allow us to construct CFFTs over
GF. The other main contribution of this paper is that we propose
composite cyclotomic Fourier transforms (CCFTs). In comparison to previously
proposed fast Fourier transforms, our CCFTs achieve lower overall complexities
for moderate to long lengths, and the improvement significantly increases as
the length grows. Our 2047-point and 4095-point CCFTs are also first efficient
DFTs of such lengths to the best of our knowledge. Finally, our CCFTs are also
advantageous for hardware implementations due to their regular and modular
structure.Comment: submitted to IEEE trans on Signal Processin
- …