24,344 research outputs found

    QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment

    Get PDF
    Previous studies have reported that common dense linear algebra operations do not achieve speed up by using multiple geographical sites of a computational grid. Because such operations are the building blocks of most scientific applications, conventional supercomputers are still strongly predominant in high-performance computing and the use of grids for speeding up large-scale scientific problems is limited to applications exhibiting parallelism at a higher level. We have identified two performance bottlenecks in the distributed memory algorithms implemented in ScaLAPACK, a state-of-the-art dense linear algebra library. First, because ScaLAPACK assumes a homogeneous communication network, the implementations of ScaLAPACK algorithms lack locality in their communication pattern. Second, the number of messages sent in the ScaLAPACK algorithms is significantly greater than other algorithms that trade flops for communication. In this paper, we present a new approach for computing a QR factorization -- one of the main dense linear algebra kernels -- of tall and skinny matrices in a grid computing environment that overcomes these two bottlenecks. Our contribution is to articulate a recently proposed algorithm (Communication Avoiding QR) with a topology-aware middleware (QCG-OMPI) in order to confine intensive communications (ScaLAPACK calls) within the different geographical sites. An experimental study conducted on the Grid'5000 platform shows that the resulting performance increases linearly with the number of geographical sites on large-scale problems (and is in particular consistently higher than ScaLAPACK's).Comment: Accepted at IPDPS10. (IEEE International Parallel & Distributed Processing Symposium 2010 in Atlanta, GA, USA.

    Minimizing Communication for Eigenproblems and the Singular Value Decomposition

    Full text link
    Algorithms have two costs: arithmetic and communication. The latter represents the cost of moving data, either between levels of a memory hierarchy, or between processors over a network. Communication often dominates arithmetic and represents a rapidly increasing proportion of the total cost, so we seek algorithms that minimize communication. In \cite{BDHS10} lower bounds were presented on the amount of communication required for essentially all O(n3)O(n^3)-like algorithms for linear algebra, including eigenvalue problems and the SVD. Conventional algorithms, including those currently implemented in (Sca)LAPACK, perform asymptotically more communication than these lower bounds require. In this paper we present parallel and sequential eigenvalue algorithms (for pencils, nonsymmetric matrices, and symmetric matrices) and SVD algorithms that do attain these lower bounds, and analyze their convergence and communication costs.Comment: 43 pages, 11 figure

    Optical-inertia space sextant for an advanced space navigation system, phase B

    Get PDF
    Optical-inertia space sextant for advanced space navigation syste

    Optimizing local protocols implementing nonlocal quantum gates

    Full text link
    We present a method of optimizing recently designed protocols for implementing an arbitrary nonlocal unitary gate acting on a bipartite system. These protocols use only local operations and classical communication with the assistance of entanglement, and are deterministic while also being "one-shot", in that they use only one copy of an entangled resource state. The optimization is in the sense of minimizing the amount of entanglement used, and it is often the case that less entanglement is needed than with an alternative protocol using two-way teleportation.Comment: 11 pages, 1 figure. This is a companion paper to arXiv:1001.546

    Implementing the SCAN language by neural networks

    Get PDF
    The real-time encryption of pictures is an important subject for many applications, e.g. television broadcast stations, network security, etc. The paper shows how the previously introduced SCAN encryption method can be easily implemented using binary neural network autoassociative memory

    Optimizing local protocols implementing nonlocal quantum gates

    Get PDF
    We present a method of optimizing recently designed protocols for implementing an arbitrary nonlocal unitary gate acting on a bipartite system. These protocols use only local operations and classical communication with the assistance of entanglement, and are deterministic while also being "one-shot", in that they use only one copy of an entangled resource state. The optimization is in the sense of minimizing the amount of entanglement used, and it is often the case that less entanglement is needed than with an alternative protocol using two-way teleportation.Comment: 11 pages, 1 figure. This is a companion paper to arXiv:1001.546
    • …
    corecore