
    Fault-tolerant meshes with minimal numbers of spares

    This paper presents several techniques for adding fault-tolerance to distributed memory parallel computers. More formally, given a target graph with n nodes, we create a fault-tolerant graph with n + k nodes such that given any set of k or fewer faulty nodes, the remaining graph is guaranteed to contain the target graph as a fault-free subgraph. As a result, any algorithm designed for the target graph will run with no slowdown in the presence of k or fewer node faults, regardless of their distribution. We present fault-tolerant graphs for target graphs that are 2-dimensional meshes, tori, eight-connected meshes and hexagonal meshes. In all cases our fault-tolerant graphs have smaller degree than any previously known graphs with the same properties.
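The definition in this abstract can be checked by brute force on very small graphs. The sketch below is only an illustration and is not taken from the paper: the function names `contains_subgraph` and `is_k_fault_tolerant` are hypothetical, and the toy example (a cycle on n + 1 nodes as a 1-fault-tolerant graph for a path on n nodes) is an assumed textbook-style case rather than one of the mesh constructions. It checks only fault sets of size exactly k, since removing fewer nodes leaves a larger graph that still contains the target.

```python
from itertools import combinations, permutations


def contains_subgraph(host_nodes, host_edges, target_n, target_edges):
    """Brute force: does the host contain the target (nodes 0..target_n-1)
    as a subgraph, not necessarily induced? Only feasible for tiny graphs."""
    host_edge_set = {frozenset(e) for e in host_edges}
    for image in permutations(host_nodes, target_n):
        # image[i] is the host node that plays the role of target node i.
        if all(frozenset((image[u], image[v])) in host_edge_set
               for u, v in target_edges):
            return True
    return False


def is_k_fault_tolerant(ft_nodes, ft_edges, target_n, target_edges, k):
    """Check that removing any k nodes still leaves the target as a subgraph."""
    for faulty in combinations(ft_nodes, k):
        bad = set(faulty)
        alive = [v for v in ft_nodes if v not in bad]
        alive_edges = [(u, v) for u, v in ft_edges
                       if u not in bad and v not in bad]
        if not contains_subgraph(alive, alive_edges, target_n, target_edges):
            return False
    return True


# Toy example (assumption, not from the paper): target is a path on n nodes.
n = 4
path_edges = [(i, i + 1) for i in range(n - 1)]

# Candidate 1: a cycle on n + 1 nodes (one spare).
cycle_nodes = list(range(n + 1))
cycle_edges = [(i, (i + 1) % (n + 1)) for i in range(n + 1)]

# Candidate 2: a path on n + 1 nodes (one spare, but no wrap-around edge).
longer_path_edges = [(i, i + 1) for i in range(n)]

cycle_ok = is_k_fault_tolerant(cycle_nodes, cycle_edges, n, path_edges, 1)
path_ok = is_k_fault_tolerant(cycle_nodes, longer_path_edges, n, path_edges, 1)
print(cycle_ok, path_ok)
```

The cycle passes because deleting any single node leaves a path on n nodes, while the longer path fails as soon as an interior node is removed. The permutation search is exponential, so this is useful only for sanity-checking small cases.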

    Fault-tolerant meshes and hypercubes with minimal numbers of spares

    Many parallel computers consist of processors connected in the form of a d-dimensional mesh or hypercube. Two- and three-dimensional meshes have been shown to be efficient in manipulating images and dense matrices, whereas hypercubes have been shown to be well suited to divide-and-conquer algorithms requiring global communication. However, even a single faulty processor or communication link can seriously affect the performance of these machines. This paper presents several techniques for tolerating faults in d-dimensional mesh and hypercube architectures. Our approach consists of adding spare processors and communication links so that the resulting architecture will contain a fault-free mesh or hypercube in the presence of faults. We optimize the cost of the fault-tolerant architecture by adding exactly k spare processors (while tolerating up to k processor and/or link faults) and minimizing the maximum number of links per processor. For example, when the desired architecture is a d-dimensional mesh and k = 1, we present a fault-tolerant architecture that has the same maximum degree as the desired architecture (namely, 2d) and has only one spare processor. We also present efficient layouts for fault-tolerant two- and three-dimensional meshes, and show how multiplexers and buses can be used to reduce the degree of fault-tolerant architectures. Finally, we give constructions for fault-tolerant tori, eight-connected meshes, and hexagonal meshes.

    Optimal Vertex Fault Tolerant Spanners (for fixed stretch)

    A $k$-spanner of a graph $G$ is a sparse subgraph $H$ whose shortest path distances match those of $G$ up to a multiplicative error $k$. In this paper we study spanners that are resistant to faults. A subgraph $H \subseteq G$ is an $f$ vertex fault tolerant (VFT) $k$-spanner if $H \setminus F$ is a $k$-spanner of $G \setminus F$ for any small set $F$ of $f$ vertices that might "fail." One of the main questions in the area is: what is the minimum size of an $f$ fault tolerant $k$-spanner that holds for all $n$ node graphs (as a function of $f$, $k$ and $n$)? This question was first studied in the context of geometric graphs [Levcopoulos et al. STOC '98, Czumaj and Zhao SoCG '03] and has more recently been considered in general undirected graphs [Chechik et al. STOC '09, Dinitz and Krauthgamer PODC '11]. In this paper, we settle the question of the optimal size of a VFT spanner, in the setting where the stretch factor $k$ is fixed. Specifically, we prove that every (undirected, possibly weighted) $n$-node graph $G$ has a $(2k-1)$-spanner resilient to $f$ vertex faults with $O_k(f^{1 - 1/k} n^{1 + 1/k})$ edges, and this is fully optimal (unless the famous Erdős Girth Conjecture is false). Our lower bound even generalizes to imply that no data structure capable of approximating $dist_{G \setminus F}(s, t)$ similarly can beat the space usage of our spanner in the worst case. We also consider the edge fault tolerant (EFT) model, defined analogously with edge failures rather than vertex failures. We show that the same spanner upper bound applies in this setting. Our data structure lower bound extends to the case $k=2$ (and hence we close the EFT problem for $3$-approximations), but it falls to $\Omega(f^{1/2 - 1/(2k)} \cdot n^{1 + 1/k})$ for $k \ge 3$. We leave it as an open problem to close this gap. Comment: To appear in SODA 201
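The VFT spanner definition can be made concrete with a small brute-force checker. This is a minimal sketch under stated assumptions and not the paper's construction: it handles unweighted graphs only, the names `bfs_dist` and `is_vft_spanner` are hypothetical, and it reads "any small set of f vertices" as every fault set of size at most f. The example graphs (the complete graph on four vertices with a 4-cycle and a star as candidate spanners) are chosen purely for illustration.

```python
from collections import deque
from itertools import combinations


def bfs_dist(nodes, edges, src):
    """Hop distances from src in an unweighted, undirected graph."""
    adj = {v: [] for v in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist  # unreachable nodes are simply absent


def is_vft_spanner(nodes, g_edges, h_edges, stretch, f):
    """Check that H minus F is a stretch-spanner of G minus F for every
    vertex set F with at most f vertices (assumed reading of the definition)."""
    for size in range(f + 1):
        for faults in combinations(nodes, size):
            bad = set(faults)
            alive = [v for v in nodes if v not in bad]
            g = [(u, v) for u, v in g_edges if u not in bad and v not in bad]
            h = [(u, v) for u, v in h_edges if u not in bad and v not in bad]
            for s in alive:
                dg = bfs_dist(alive, g, s)
                dh = bfs_dist(alive, h, s)
                # Pairs disconnected in G minus F impose no constraint.
                for t, d in dg.items():
                    if dh.get(t, float("inf")) > stretch * d:
                        return False
    return True


# Illustrative graphs (assumptions): G is the complete graph on 4 vertices.
nodes = [0, 1, 2, 3]
k4_edges = list(combinations(nodes, 2))
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
star_edges = [(0, 1), (0, 2), (0, 3)]

cycle_is_vft = is_vft_spanner(nodes, k4_edges, cycle_edges, stretch=3, f=1)
star_plain = is_vft_spanner(nodes, k4_edges, star_edges, stretch=3, f=0)
star_is_vft = is_vft_spanner(nodes, k4_edges, star_edges, stretch=3, f=1)
print(cycle_is_vft, star_plain, star_is_vft)
```

The star is a valid 3-spanner without faults but stops being one once its center fails, whereas the 4-cycle survives any single vertex failure. That gap is what the fault-tolerant definition is meant to capture.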

    Minimum survivable graphs with bounded distance increase

    We study graph properties related to fault-tolerance when a node fails. A graph G is k-self-repairing, where k is a non-negative integer, if after the removal of any vertex no distance in the surviving graph increases by more than k. In the design of interconnection networks such graphs guarantee good fault-tolerance properties. We give upper and lower bounds on the minimum number of edges of a k-self-repairing graph for prescribed k and n, where n is the order of the graph. We prove that the problem of finding, in a k-self-repairing graph, a spanning k-self-repairing subgraph of minimum size is NP-Hard.
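The k-self-repairing property can be tested directly from its definition on small graphs. The following is a rough sketch rather than anything from the paper: the names `hop_distances` and `is_self_repairing` are hypothetical, graphs are assumed unweighted and undirected, and a pair of vertices that becomes disconnected after a removal is treated as an unbounded increase (an assumption about how the definition handles disconnection).

```python
from collections import deque


def hop_distances(nodes, edges, src):
    """Breadth-first search distances from src; unreachable nodes are omitted."""
    adj = {v: [] for v in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def is_self_repairing(nodes, edges, k):
    """True if removing any single vertex increases no surviving distance
    by more than k. Disconnection counts as a violation (assumption)."""
    before = {s: hop_distances(nodes, edges, s) for s in nodes}
    for removed in nodes:
        alive = [v for v in nodes if v != removed]
        alive_edges = [(u, v) for u, v in edges
                       if u != removed and v != removed]
        for s in alive:
            after = hop_distances(alive, alive_edges, s)
            for t in alive:
                if t not in before[s]:
                    continue  # already disconnected in the original graph
                if after.get(t, float("inf")) > before[s][t] + k:
                    return False
    return True


# Illustrative graphs (assumptions): a 5-cycle and a 3-node path.
c5_nodes = list(range(5))
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
p3_nodes = [0, 1, 2]
p3_edges = [(0, 1), (1, 2)]

c5_k1 = is_self_repairing(c5_nodes, c5_edges, 1)
c5_k0 = is_self_repairing(c5_nodes, c5_edges, 0)
p3_k5 = is_self_repairing(p3_nodes, p3_edges, 5)
print(c5_k1, c5_k0, p3_k5)
```

In the 5-cycle, removing a vertex lengthens the route between its two neighbours from 2 hops to 3, so the cycle is 1-self-repairing but not 0-self-repairing. The path fails for every k because removing its middle vertex disconnects the endpoints.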