Fault-tolerant meshes and hypercubes with minimal numbers of spares

Bruck, Jehoshua; Cypher, Robert; Ho, Ching-Tien

Publication date: 1 September 1993
Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Abstract

Many parallel computers consist of processors connected in the form of a d-dimensional mesh or hypercube. Two- and three-dimensional meshes have been shown to be efficient in manipulating images and dense matrices, whereas hypercubes have been shown to be well suited to divide-and-conquer algorithms requiring global communication. However, even a single faulty processor or communication link can seriously affect the performance of these machines. This paper presents several techniques for tolerating faults in d-dimensional mesh and hypercube architectures. Our approach consists of adding spare processors and communication links so that the resulting architecture will contain a fault-free mesh or hypercube in the presence of faults. We optimize the cost of the fault-tolerant architecture by adding exactly k spare processors (while tolerating up to k processor and/or link faults) and minimizing the maximum number of links per processor. For example, when the desired architecture is a d-dimensional mesh and k = 1, we present a fault-tolerant architecture that has the same maximum degree as the desired architecture (namely, 2d) and has only one spare processor. We also present efficient layouts for fault-tolerant two- and three-dimensional meshes, and show how multiplexers and buses can be used to reduce the degree of fault-tolerant architectures. Finally, we give constructions for fault-tolerant tori, eight-connected meshes, and hexagonal meshes.
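
The notion of "containing a fault-free mesh in the presence of faults" can be illustrated in the simplest case, a one-dimensional mesh (a linear array of n processors). The sketch below is not the construction from the paper; it is a small brute-force check of two elementary graphs that are assumed here purely for illustration. A ring of n + 1 nodes has one spare and maximum degree 2 (which equals 2d for d = 1) and still contains an n-node array after any single node fault. A "banded" array of n + k nodes, with links between nodes at distance at most k + 1, survives any k node faults at the price of degree 2(k + 1). All function names are hypothetical, and only processor faults are modelled.

```python
from itertools import combinations


def ring(m):
    """Ring of m nodes: node i is linked to i - 1 and i + 1 (mod m)."""
    return {i: {(i - 1) % m, (i + 1) % m} for i in range(m)}


def banded_path(n, k):
    """n + k nodes in a row; i and j are linked when 0 < |i - j| <= k + 1."""
    total = n + k
    return {
        i: {j for j in range(total) if 0 < abs(i - j) <= k + 1}
        for i in range(total)
    }


def survivors_form_path(adj, order, faulty):
    """True if the healthy nodes, taken in `order`, are consecutively linked."""
    healthy = [v for v in order if v not in faulty]
    return all(b in adj[a] for a, b in zip(healthy, healthy[1:]))


def ring_tolerates_one_fault(n):
    """Check every single-node fault in a ring with n + 1 nodes (one spare)."""
    m = n + 1
    adj = ring(m)
    for f in range(m):
        # Walk once around the ring, starting just after the faulty node.
        order = [(f + step) % m for step in range(1, m)]
        if not survivors_form_path(adj, order, {f}):
            return False
    return True


def band_tolerates_k_faults(n, k):
    """Check every set of k node faults in the banded array with k spares."""
    adj = banded_path(n, k)
    order = list(range(n + k))
    return all(
        survivors_form_path(adj, order, set(faulty))
        for faulty in combinations(order, k)
    )


def max_degree(adj):
    """Maximum number of links at any node."""
    return max(len(neighbours) for neighbours in adj.values())
```

Calling `ring_tolerates_one_fault(5)` or `band_tolerates_k_faults(6, 2)` enumerates every fault pattern, so the check is only practical for small n and k. The comparison between the two graphs shows the trade-off the abstract refers to: the number of spares is fixed at k, and the quantity left to minimize is the maximum degree.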
