
    Parallel Spectral Division Via The Generalized Matrix Sign Function

    In this paper we demonstrate the parallelism of spectral division via the matrix sign function for the generalized nonsymmetric eigenproblem. We employ the so-called generalized Newton iterative scheme to compute the sign function of a matrix pair. A recent study has allowed a considerable reduction (by 75%) in the computational cost of this iteration, making this approach competitive with the traditional QZ algorithm. The matrix sign function is thus revealed as an efficient and reliable spectral division method for applications that require only partial information about the eigenspectrum. For applications that require complete information about the eigendistribution, the matrix sign function can be used as an initial divide-and-conquer method, combined with the QZ algorithm for the last stages. The experimental results on an IBM SP2 multicomputer demonstrate the parallel performance (efficiency around 60-80%) and scalability of this approach.
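    The abstract does not spell out the iteration, so the following is only a minimal sketch of the textbook, unscaled generalized Newton scheme for a matrix pair (A, B): Z_0 = A, Z_{k+1} = (Z_k + B Z_k^{-1} B) / 2, which (when it converges) tends to B sign(B^{-1} A). It is restricted to 2x2 matrices in plain Python for readability; the function names are hypothetical, and neither the paper's reduced-cost variant nor its parallel implementation is reproduced here.

```python
# Sketch under stated assumptions: textbook unscaled generalized Newton iteration
# for a 2x2 matrix pair (A, B). Not the paper's reduced-cost or parallel variant.

def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(x):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    if det == 0:
        raise ZeroDivisionError("singular iterate")
    return [[x[1][1] / det, -x[0][1] / det],
            [-x[1][0] / det, x[0][0] / det]]


def generalized_sign(a, b, tol=1e-12, max_iter=100):
    """Iterate Z <- (Z + B Z^{-1} B) / 2 starting from Z = A."""
    z = [row[:] for row in a]
    for _ in range(max_iter):
        correction = matmul(matmul(b, inverse(z)), b)
        z_next = [[(z[i][j] + correction[i][j]) / 2 for j in range(2)]
                  for i in range(2)]
        change = max(abs(z_next[i][j] - z[i][j])
                     for i in range(2) for j in range(2))
        z = z_next
        if change < tol:  # stop once successive iterates agree
            break
    return z


# With B = I this reduces to the ordinary matrix sign function.
identity = [[1.0, 0.0], [0.0, 1.0]]
a = [[2.0, 0.0], [0.0, -3.0]]
sign_a = generalized_sign(a, identity)

# A non-identity B: the limit is B * sign(B^{-1} A).
b = [[2.0, 0.0], [0.0, 2.0]]
sign_pair = generalized_sign(a, b)
```

    In the B = I case the diagonal of the limit separates the eigenvalue in the right half-plane (+1) from the one in the left half-plane (-1), which is the property spectral division relies on.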

    Using Recursion to Boost ATLAS’s Performance


    MPI-2: Extending the message-passing interface

    This paper describes current activities of the MPI-2 Forum. The MPI-2 Forum is a group of parallel computer vendors, library writers, and application specialists working together to define a set of extensions to MPI (Message Passing Interface). MPI was defined by the same process and now has many implementations, both vendor-proprietary and publicly available, for a wide variety of parallel computing environments. In this paper we present the salient aspects of the evolving MPI-2 document as it now stands. We discuss proposed extensions and enhancements to MPI in the areas of dynamic process management, one-sided operations, collective operations, new language binding, real-time computing, external interfaces, and miscellaneous topics.

    Wisconsin Wind Tunnel II: A Fast and Portable Parallel Architecture Simulator

    The design of future parallel computers requires rapid simulation of target designs running realistic workloads. These simulations have been accelerated using two techniques: direct execution and the use of a parallel host. Historically, these techniques have been considered to have poor portability. This paper identifies and describes the implementation of four key operations necessary to make such simulation portable across a variety of parallel computers. These four operations are: calculation of target execution time, simulation of features of interest, communication of target messages, and synchronization of host processors. Portable implementations of these four operations have allowed us to easily run the Wisconsin Wind Tunnel II (WWT II), a parallel, discrete-event, direct-execution simulator, across a wide range of platforms, such as desktop workstations, a SUN Enterprise server, a cluster of workstations, and a cluster of symmetric multiprocessing nodes. We plan to release WWT II in August 1997. We also plan to port WWT II to the IBM SP2. We find that for two benchmarks, WWT II demonstrates both good performance and good scalability. Uniprocessor WWT II simulates one target cycle of a 32-node target machine in 114 and 166 host cycles respectively for the two benchmarks on a SUN UltraSPARC. Parallel WWT II achieves speedups between 4.1 and 5.4 on 8 host processors in our three parallel machine configurations.
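    To put the reported speedups in context, the short sketch below converts them to parallel efficiency using the standard definition (speedup divided by processor count). The function and variable names are hypothetical; only the speedup range and the host processor count come from the abstract.

```python
# Parallel efficiency = speedup / number of host processors (standard definition).
def parallel_efficiency(speedup, processors):
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


HOST_PROCESSORS = 8              # host processors quoted in the abstract
reported_speedups = (4.1, 5.4)   # speedup range quoted in the abstract

efficiencies = [parallel_efficiency(s, HOST_PROCESSORS) for s in reported_speedups]
for speedup, eff in zip(reported_speedups, efficiencies):
    print(f"speedup {speedup} on {HOST_PROCESSORS} processors: {eff:.1%} efficiency")
```

    Under this definition the reported range corresponds to roughly 51% to 68% efficiency on 8 host processors.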