
700 research outputs found

Wildcard dimensions, coding theory and fault-tolerant meshes and hypercubes

Author: Bruck Jehoshua
Cypher Robert
Ho Ching-Tien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1995

Hypercubes, meshes and tori are well-known interconnection networks for parallel computers. The sets of edges in those graphs can be partitioned into dimensions. It is well known that the hypercube can be extended by adding a wildcard dimension, resulting in a folded hypercube that has better fault-tolerance and communication capabilities. First, we prove that the folded hypercube is optimal in the sense that only a single wildcard dimension can be added to the hypercube. We then investigate the idea of adding wildcard dimensions to d-dimensional meshes and tori. Using techniques from error-correcting codes, we construct d-dimensional meshes and tori with wildcard dimensions. Finally, we show how these constructions can be used to tolerate edge and node faults in mesh and torus networks.

CiteSeerX

Caltech Authors

Communication aspects of parallel processing

Author
Publication venue: Massachusetts Institute of Technology, Laboratory for Information and Decision Systems
Publication date: 01/01/1987

Cover title. Includes bibliographical references. Supported in part by the Air Force Office of Scientific Research, AFOSR-88-0032. Cüneyt Özveren.

DSpace@MIT

A bibliography on parallel and vector numerical algorithms

Author: Ortega J. M.
Voigt R. G.

This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

NASA Technical Reports Server

A study of the communication cost of the FFT on torus multicomputers

Author: Díaz de Cerio Ripalda Luis Manuel
González Colás Antonio María
Valero García Miguel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1995

The computation of a one-dimensional FFT on a c-dimensional torus multicomputer is analyzed. Different approaches are proposed which differ in the way they use the interconnection network. The first approach is based on the multidimensional index mapping technique for the FFT computation. The second approach starts from a hypercube algorithm and then embeds the hypercube onto the torus. The third approach reduces the communication cost of the hypercube algorithm by pipelining the communication operations. A novel methodology to pipeline the communication operations on a torus is proposed. Analytical models are presented to compare the different approaches. This comparison study shows that the best approach depends on the number of dimensions of the torus and the communication start-up and transfer times. The analytical models allow us to select the most efficient approach for the available machine. Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Scalable Task-Based Algorithm for Multiplication of Block-Rank-Sparse Matrices

Author: Baruch E.
Cannon L. E.
Choi J
Choi J.
Choi J.
Solomonik E.
Szabo A.
van de Geijn R. A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 09/10/2015

A task-based formulation of the Scalable Universal Matrix Multiplication Algorithm (SUMMA), a popular algorithm for matrix multiplication (MM), is applied to the multiplication of hierarchy-free, rank-structured matrices that appear in the domain of quantum chemistry (QC). The novel features of our formulation are: (1) concurrent scheduling of multiple SUMMA iterations, and (2) fine-grained task-based composition. These features make it tolerant of the load imbalance due to the irregular matrix structure and eliminate all artifactual sources of global synchronization. Scalability of iterative computation of the square-root inverse of block-rank-sparse QC matrices is demonstrated; for full-rank (dense) matrices the performance of our SUMMA formulation usually exceeds that of the state-of-the-art dense MM implementations (ScaLAPACK and Cyclops Tensor Framework). Comment: 8 pages, 6 figures, accepted to IA3 2015. arXiv admin note: text overlap with arXiv:1504.0504

arXiv.org e-Print Archive

Crossref

Adapting the interior point method for the solution of linear programs on high performance computers

Author: Ashcroft
Bixby
Chen
Duff
Forrest
Gay
George
Golub
Karmarkar
Lai
Liu
Megiddo
Monteiro
Publication venue: Brunel University
Publication date: 01/01/1991

In this paper we describe a unified algorithmic framework for the interior point method (IPM) of solving Linear Programs (LPs), which allows us to adapt it over a range of high performance computer architectures. We set out the reasons why IPM makes better use of high performance computer architecture than the sparse simplex method. In the inner iteration of the IPM, a search direction is computed using Newton or higher order methods. Computationally, this involves solving a sparse symmetric positive definite (SSPD) system of equations. The choice of direct and indirect methods for the solution of this system and the design of data structures to take advantage of coarse grain parallel and massively parallel computer architectures are considered in detail. Finally, we present experimental results of solving NETLIB test problems on examples of these architectures and put forward arguments for why integration of the system within sparse simplex is beneficial.

CiteSeerX

Crossref

Brunel University Research Archive