3,625 research outputs found
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many
high-performance graph algorithms as well as for some linear solvers, such as
algebraic multigrid. The scaling of existing parallel implementations of SpGEMM
is heavily bound by communication. Even though 3D (or 2.5D) algorithms have
been proposed and theoretically analyzed in the flat MPI model on Erdős-Rényi
matrices, those algorithms had not been implemented in practice and their
complexities had not been analyzed for the general case. In this work, we
present the first ever implementation of the 3D SpGEMM formulation that also
exploits multiple (intra-node and inter-node) levels of parallelism, achieving
significant speedups over the state-of-the-art publicly available codes at all
levels of concurrency. We extensively evaluate our implementation and
identify bottlenecks that should be subject to further research.
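For readers unfamiliar with the primitive, the following is a minimal sequential sketch of SpGEMM using the row-wise (Gustavson-style) formulation. It is not the 3D distributed algorithm the abstract describes; the function name `spgemm` and the dict-of-dicts storage format are assumptions chosen only for illustration.

```python
def spgemm(a, b):
    """Multiply two sparse matrices stored as {row: {col: value}} dicts.

    Row-wise formulation: row i of C is the sum of the rows of B selected
    by the nonzeros of row i of A, each scaled by the matching A entry.
    """
    c = {}
    for i, a_row in a.items():
        acc = {}  # sparse accumulator for row i of the result
        for k, a_ik in a_row.items():
            # Only rows of B that meet a nonzero of A contribute.
            for j, b_kj in b.get(k, {}).items():
                acc[j] = acc.get(j, 0) + a_ik * b_kj
        if acc:
            c[i] = acc
    return c


# A is 2x3 and B is 3x2; only nonzero entries are stored.
a = {0: {0: 1, 2: 2}, 1: {1: 3}}
b = {0: {1: 4}, 1: {0: 5}, 2: {1: 6}}
c = spgemm(a, b)
```

The work done is proportional to the number of nonzero products rather than to the dense dimensions, which is why distributing those products and their communication is the central concern in parallel versions.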
Efficient Big Integer Multiplication and Squaring Algorithms for Cryptographic Applications
Public-key cryptosystems are broadly employed to provide security for digital information. Improving the efficiency of public-key
cryptosystems by speeding up calculation and using fewer resources is among the main goals of cryptography research. In this
paper, we introduce new symbols, called Big-ones, extracted from the binary representation of integers. We present a modified version
of the classical multiplication and squaring algorithms based on the Big-ones to improve the efficiency of big integer multiplication
and squaring in number-theory-based cryptosystems. Compared to the adopted classical and Karatsuba multiplication algorithms
for squaring, the proposed squaring algorithm is 2 to 3.7 and 7.9 to 2.5 times faster for squaring 32-bit and 8-Kbit numbers,
respectively. The proposed multiplication algorithm is also 2.3 to 3.9 and 7 to 2.4 times faster for multiplying 32-bit and 8-Kbit
numbers, respectively. Number-theory-based cryptosystems, which operate in the range of 1-Kbit to 4-Kbit integers,
benefit directly from the proposed method, since multiplication and squaring are the main operations in most of these systems.
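As context for the Karatsuba baseline named in the abstract, here is a minimal sketch of Karatsuba multiplication on Python integers. It does not implement the Big-ones method, which the abstract does not specify; the function name `karatsuba` and the `cutoff_bits` threshold are assumptions made for illustration.

```python
def karatsuba(x, y, cutoff_bits=64):
    """Multiply non-negative integers using Karatsuba's three-product recursion."""
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y  # base case: operands are small enough to multiply directly

    # Split both operands at the same bit position.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three recursive products instead of the four used by the classical method.
    hi = karatsuba(x_hi, y_hi, cutoff_bits)
    lo = karatsuba(x_lo, y_lo, cutoff_bits)
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo, cutoff_bits) - hi - lo

    return (hi << (2 * half)) + (mid << half) + lo


# Two operands in the 8-Kbit range mentioned in the abstract.
x = (1 << 8191) + 12345
y = (1 << 8000) + 987654321
product = karatsuba(x, y)
square = karatsuba(x, x)  # squaring is the special case y == x
```

Replacing four half-size products with three is what gives Karatsuba its sub-quadratic cost, so it is the natural reference point for any modified classical algorithm.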
New Structured Matrix Methods for Real and Complex Polynomial Root-finding
We combine the known methods for univariate polynomial root-finding and for
computations in the Frobenius matrix algebra with our novel techniques to
advance the numerical solution of a univariate polynomial equation, and in
particular the numerical approximation of the real roots of a polynomial. Our
analysis and experiments show the efficiency of the resulting algorithms. Comment: 18 pages
Analysis of Internally Bandlimited Multistage Cubic-Term Generators for RF Receivers
Adaptive feedforward error cancellation applied to correct distortion arising from third-order nonlinearities in RF receivers requires low-noise, low-power reference cubic nonidealities. Multistage cubic-term generators utilizing cascaded nonlinear operations are ideal in this regard, but the frequency response of the interstage circuitry can introduce errors into the cubing operation. In this paper, an overview of the use of cubic-term generators in receivers relative to other applications is presented. An interstage frequency response plan is presented for a receiver cubic-term generator and is shown to function for arbitrary three-signal third-order intermodulation generation. The noise of such circuits is also considered and is shown to depend on the total incoming signal power across a particular frequency band. Finally, the effects of the interstage group delay are quantified in the context of a relevant communication standard requirement.