3,625 research outputs found

    Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication

    Full text link
    Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdos-Renyi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first ever implementation of the 3D SpGEMM formulation that also exploits multiple (intra-node and inter-node) levels of parallelism, achieving significant speedups over the state-of-the-art publicly available codes at all levels of concurrencies. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research

    Efficient Big Integer Multiplication and Squaring Algorithms for Cryptographic Applications

    Get PDF
    Public-key cryptosystems are broadly employed to provide security for digital information. Improving the efficiency of public-key cryptosystem through speeding up calculation and using fewer resources are among themain goals of cryptography research. In this paper, we introduce new symbols extracted from binary representation of integers called Big-ones.We present a modified version of the classicalmultiplication and squaring algorithms based on the Big-ones to improve the efficiency of big integermultiplication and squaring in number theory based cryptosystems. Compared to the adopted classical and Karatsuba multiplication algorithms for squaring, the proposed squaring algorithm is 2 to 3.7 and 7.9 to 2.5 times faster for squaring 32-bit and 8-Kbit numbers, respectively. The proposed multiplication algorithm is also 2.3 to 3.9 and 7 to 2.4 times faster for multiplying 32-bit and 8-Kbit numbers, respectively.The number theory based cryptosystems, which are operating in the range of 1-Kbit to 4-Kbit integers, are directly benefited from the proposed method since multiplication and squaring are the main operations in most of these systems

    New Structured Matrix Methods for Real and Complex Polynomial Root-finding

    Full text link
    We combine the known methods for univariate polynomial root-finding and for computations in the Frobenius matrix algebra with our novel techniques to advance numerical solution of a univariate polynomial equation, and in particular numerical approximation of the real roots of a polynomial. Our analysis and experiments show efficiency of the resulting algorithms.Comment: 18 page

    Analysis of Internally Bandlimited Multistage Cubic-Term Generators for RF Receivers

    Get PDF
    Adaptive feedforward error cancellation applied to correct distortion arising from third-order nonlinearities in RF receivers requires low-noise low-power reference cubic nonidealities. Multistage cubic-term generators utilizing cascaded nonlinear operations are ideal in this regard, but the frequency response of the interstage circuitry can introduce errors into the cubing operation. In this paper, an overview of the use of cubic-term generators in receivers relative to other applications is presented. An interstage frequency response plan is presented for a receiver cubic-term generator and is shown to function for arbitrary three-signal third-order intermodulation generation. The noise of such circuits is also considered and is shown to depend on the total incoming signal power across a particular frequency band. Finally, the effects of the interstage group delay are quantified in the context of a relevant communication standard requirement
    corecore