
    Communication-Optimal Parallel Standard and Karatsuba Integer Multiplication in the Distributed Memory Model

    We present COPSIM, a parallel implementation of standard integer multiplication for the distributed memory setting, and COPK, a parallel implementation of Karatsuba's fast integer multiplication algorithm for the same setting. When $\mathcal{P}$ processors, each equipped with a local non-shared memory, are used to compute the product of two $n$-digit integers, our algorithms achieve, under mild conditions, optimal speedup of the computational time: $\mathcal{O}\left(n^2/\mathcal{P}\right)$ for COPSIM and $\mathcal{O}\left(n^{\log_2 3}/\mathcal{P}\right)$ for COPK. The total amount of memory required across the processors is $\mathcal{O}\left(n\right)$, that is, within a constant factor of the minimum space required to store the input values. We rigorously analyze the Input/Output (I/O) cost of the proposed algorithms. We show that their bandwidth cost (i.e., the number of memory words sent or received by at least one processor) asymptotically matches the corresponding known I/O lower bounds, and that their latency (i.e., the number of messages sent or received in the algorithm's critical execution path) is asymptotically within a multiplicative factor $\mathcal{O}\left(\log^2_2 \mathcal{P}\right)$ of the corresponding known I/O lower bounds. Hence, our algorithms are asymptotically optimal with respect to the bandwidth cost and almost asymptotically optimal with respect to the latency cost.
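    For orientation, the $n^{\log_2 3}$ operation count quoted above comes from Karatsuba's recursion, which replaces the four half-size products of the standard method with three. The following is a minimal sequential sketch of that recursion only; it is not COPK and involves no distribution across processors. The function name `karatsuba`, the constant `CUTOFF_BITS`, and the binary (rather than digit-based) splitting are illustrative assumptions that do not come from the paper.

```python
# Illustrative sequential Karatsuba multiplication (NOT the COPK algorithm).
# CUTOFF_BITS is a hypothetical threshold below which we fall back to the
# built-in product; the value 64 is an arbitrary choice for this sketch.
CUTOFF_BITS = 64


def karatsuba(x: int, y: int) -> int:
    """Return x * y for non-negative integers using Karatsuba's recursion."""
    if x < 0 or y < 0:
        raise ValueError("this sketch only handles non-negative integers")

    # Base case: at least one operand is small, so multiply directly.
    if x.bit_length() <= CUTOFF_BITS or y.bit_length() <= CUTOFF_BITS:
        return x * y

    # Split both operands at the same bit position:
    #   x = x_hi * 2^half + x_lo,  y = y_hi * 2^half + y_lo
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three recursive products instead of four.
    low = karatsuba(x_lo, y_lo)
    high = karatsuba(x_hi, y_hi)
    # (x_lo + x_hi)(y_lo + y_hi) - low - high == x_lo*y_hi + x_hi*y_lo
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi) - low - high

    # Recombine: x*y = high * 2^(2*half) + cross * 2^half + low
    return (high << (2 * half)) + (cross << half) + low
```

    Each call on $n$-bit operands makes three calls on operands of roughly $n/2$ bits plus linear-time additions and shifts, so the work satisfies $T(n) = 3\,T(n/2) + \mathcal{O}(n)$, which solves to $\mathcal{O}\left(n^{\log_2 3}\right)$. The abstract's claim is that COPK divides this sequential cost by $\mathcal{P}$.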