On Polynomial Multiplication in Chebyshev Basis
In a recent paper, Lima, Panario and Wang provided a new method to
multiply polynomials in Chebyshev basis which aims at reducing the total number
of multiplications when the polynomials have small degree. Their idea is to use
Karatsuba's multiplication scheme to improve upon the naive method, but without
being able to get rid of its quadratic complexity. In this paper, we extend
their result by providing a reduction scheme that allows polynomials in
Chebyshev basis to be multiplied using algorithms from the monomial basis case,
and therefore achieves the same asymptotic complexity estimate. Our reduction
allows any of these algorithms to be used without converting the input
polynomials to monomial basis, which provides a more direct reduction scheme
than the one using conversions. We also demonstrate that our reduction is
efficient in practice, and even outperforms the best known algorithm for
Chebyshev basis when polynomials have large degree. Finally, we demonstrate a
linear time equivalence between the polynomial multiplication problem under
monomial basis and under Chebyshev basis.
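For orientation, the naive quadratic method that the abstract takes as its baseline can be sketched with the product identity T_i * T_j = (T_{i+j} + T_{|i-j|}) / 2. This is a minimal illustration, not the reduction scheme of the paper; the function names `cheb_mul_naive` and `cheb_eval` are hypothetical.

```python
import math


def cheb_mul_naive(a, b):
    """Multiply two Chebyshev series given as coefficient lists.

    a[i] is the coefficient of T_i. Uses the identity
    T_i * T_j = (T_{i+j} + T_{|i-j|}) / 2, so the cost is quadratic.
    """
    if not a or not b:
        return []
    c = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            half = 0.5 * ai * bj
            c[i + j] += half       # contribution to T_{i+j}
            c[abs(i - j)] += half  # contribution to T_{|i-j|}
    return c


def cheb_eval(coeffs, x):
    """Evaluate a Chebyshev series at x in [-1, 1] via T_k(x) = cos(k * acos(x))."""
    theta = math.acos(x)
    return sum(ck * math.cos(k * theta) for k, ck in enumerate(coeffs))
```

For example, `cheb_mul_naive([1, 2, 3], [0, 1])` multiplies 1 + 2 T_1 + 3 T_2 by T_1; evaluating both sides with `cheb_eval` at a sample point is a quick sanity check.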
Parallel Integer Polynomial Multiplication
We propose a new algorithm for multiplying dense polynomials with integer
coefficients in a parallel fashion, targeting multi-core processor
architectures. Complexity estimates and experimental comparisons demonstrate
the advantages of this new approach.
A low multiplicative complexity fast recursive DCT-2 algorithm
A fast Discrete Cosine Transform (DCT) algorithm is introduced that can be of
particular interest in image processing. The main features of the algorithm are
regularity of the graph and very low arithmetic complexity. The 16-point
version of the algorithm requires only 32 multiplications and 81 additions. The
computational core of the algorithm consists of only 17 nontrivial
multiplications; the remaining 15 are scaling factors that can be compensated in the
post-processing. The derivation of the algorithm is based on the algebraic
signal processing theory (ASP).
Comment: 4 pages, 2 figures
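To put the multiplication counts in context, a direct evaluation of the DCT-2 definition can be sketched as follows. This is only a reference implementation of the unnormalized transform, not the fast recursive algorithm of the paper; the name `dct2_naive` is hypothetical. Evaluating it directly performs one multiplication per (input, output) pair, i.e. 16 * 16 = 256 for a 16-point transform.

```python
import math


def dct2_naive(x):
    """Unnormalized DCT-2 computed directly from its definition.

    X[k] = sum_j x[j] * cos(pi * (2j + 1) * k / (2n)),  k = 0..n-1.
    Cost is n*n multiplications; fast algorithms reduce this substantially.
    """
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi * (2 * j + 1) * k / (2 * n)) for j in range(n))
        for k in range(n)
    ]
```

A constant input is a convenient check: all of its energy lands in the k = 0 output, and the remaining outputs vanish up to rounding.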
Exact Sparse Matrix-Vector Multiplication on GPU's and Multicore Architectures
We propose different implementations of the sparse matrix-dense vector
multiplication (SpMV) for finite fields and rings Z/mZ. We take
advantage of graphics card processors (GPUs) and multi-core architectures. Our
aim is to improve the speed of SpMV in the LinBox library, and hence
the speed of its black box algorithms. In addition, we use this and a new
parallelization of the sigma-basis algorithm in a parallel block Wiedemann rank
implementation over finite fields.
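As a point of reference, an exact sparse matrix-dense vector product over Z/mZ can be sketched sequentially as below. The compressed sparse row (CSR) layout and the name `spmv_mod` are assumptions made for illustration; the abstract does not state which storage formats the implementations use, and this sketch does not use the LinBox API.

```python
def spmv_mod(indptr, indices, data, x, m):
    """Compute y = A * x over Z/mZ for a matrix A stored in CSR form.

    indptr[r]:indptr[r + 1] delimits the nonzeros of row r,
    indices[k] is the column of the k-th nonzero, data[k] its value.
    """
    y = []
    for row in range(len(indptr) - 1):
        acc = 0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        # Python integers do not overflow, so one reduction per row suffices.
        # With fixed-width machine words the sum must be reduced before it overflows.
        y.append(acc % m)
    return y
```

Each row is independent of the others, which is what makes the operation a natural candidate for GPU and multi-core parallelization.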
Analysis of Parallel Montgomery Multiplication in CUDA
For a given level of security, elliptic curve cryptography (ECC) offers improved efficiency over classic public key implementations. Point multiplication is the most common operation in ECC and, consequently, any significant improvement in performance will likely require accelerating point multiplication. In ECC, the Montgomery algorithm is widely used for point multiplication. The primary purpose of this project is to implement and analyze a parallel implementation of the Montgomery algorithm as it is used in ECC. Specifically, the performance of CPU-based Montgomery multiplication and a GPU-based implementation in CUDA are compared.
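Assuming the title refers to Montgomery modular multiplication (the abstract itself does not spell out the algorithm), a sequential reference sketch looks like this. The function names are hypothetical, the modulus `n` must be odd and smaller than R = 2**k, and `pow(n, -1, r)` requires Python 3.8 or later. This is not the CUDA implementation analyzed in the project.

```python
def montgomery_setup(n, k):
    """Return (R, n_prime) with R = 2**k and n * n_prime == -1 (mod R)."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # modular inverse exists because n is odd
    return r, n_prime


def redc(t, n, k, n_prime):
    """Montgomery reduction: return t * R^{-1} mod n for 0 <= t < R * n."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask  # m = (t mod R) * n_prime mod R
    u = (t + m * n) >> k               # t + m*n is divisible by R
    return u - n if u >= n else u


def mont_mul(a, b, n, k):
    """Compute a * b mod n by passing through Montgomery form."""
    r, n_prime = montgomery_setup(n, k)
    a_bar = (a * r) % n                # convert operands to Montgomery form
    b_bar = (b * r) % n
    prod_bar = redc(a_bar * b_bar, n, k, n_prime)  # equals a*b*R mod n
    return redc(prod_bar, n, k, n_prime)           # convert back out of Montgomery form
```

In practice the conversions into and out of Montgomery form are paid once around a long chain of multiplications, so that each step only needs shifts and masks instead of a division by `n`.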