11 research outputs found
Tolerating faults in hypercubes using subcube partitioning
We examine the issue of running algorithms on a hypercube which has both node and edge faults, and we assume a worst case distribution of the faults. We prove that for any constant c, an n-dimensional hypercube (n-cube) with n^c faulty components contains a fault-free subgraph that can implement a large class of hypercube algorithms with only a constant factor slowdown. In addition, our approach yields practical implementations for small numbers of faults. For example, we show that any regular algorithm can be implemented on an n-cube that has at most n-1 faults with slowdowns of at most 2 for computation and at most 4 for communication.
To the best of our knowledge this is the first result showing that an n-cube can tolerate more than O(n) arbitrarily placed faults with a constant factor slowdown
Partitioning and broadcasting in hypercubes in the presence of faulty communication links
The problem of broadcasting in faulty hypercubes is considered, based upon a strategy of partitioning the faulty hypercube into subcubes in which currently known algorithms can be implemented. Three similar partitioning and broadcasting algorithms for an n-dimensional hypercube in the presence of up to (n² + 2n - c) / 4 faulty communication links are presented, where c = 4 if n is an even number or c = 3 if n is an odd number. The most efficient algorithm is implemented in 1.3n + 6log(n) + 9 time units. To the best of our knowledge, this algorithm is the most efficient one for an n-dimensional hypercube in the presence of O(n²) faults
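As a quick check on the figures quoted in this abstract, the sketch below evaluates the fault bound (n² + 2n - c) / 4 and the stated running time for a given n. The function names are hypothetical, and the base of the logarithm is not stated in the abstract; base 2 is assumed here.

```python
import math


def max_faulty_links(n: int) -> int:
    """Fault bound (n^2 + 2n - c) / 4, with c = 4 for even n and c = 3 for odd n."""
    c = 4 if n % 2 == 0 else 3
    numerator = n * n + 2 * n - c
    # The choice of c makes the numerator divisible by 4 for every n.
    assert numerator % 4 == 0
    return numerator // 4


def total_links(n: int) -> int:
    """Number of links in an n-dimensional hypercube: n * 2^(n-1)."""
    return n * 2 ** (n - 1)


def broadcast_time_bound(n: int) -> float:
    """Stated running time 1.3n + 6 log(n) + 9 (ASSUMPTION: log is base 2)."""
    return 1.3 * n + 6 * math.log2(n) + 9


if __name__ == "__main__":
    for n in (4, 5, 8):
        print(n, max_faulty_links(n), total_links(n), broadcast_time_bound(n))
```

For n = 4 the bound allows 5 faulty links out of 32.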
Embedding cube-connected cycles graphs into faulty hypercubes
We consider the problem of embedding a cube-connected cycles graph (CCC) into a hypercube with edge faults. Our main result is an algorithm that, given a list of faulty edges, computes an embedding of the CCC that spans all of the nodes and avoids all of the faulty edges. The algorithm has optimal running time and tolerates the maximum number of faults (in a worst-case setting). Because ascend-descend algorithms can be implemented efficiently on a CCC, this embedding enables the implementation of ascend-descend algorithms, such as bitonic sort, on hypercubes with edge faults. We also present a number of related results, including an algorithm for embedding a CCC into a hypercube with edge and node faults and an algorithm for embedding a spanning torus into a hypercube with edge faults
Fault-tolerance embedding of rings and arrays in star and pancake graphs
The star and pancake graphs are useful interconnection networks for connecting processors in a parallel and distributed computing environment. The star network has been widely studied and has been shown to possess attractive features such as sublogarithmic diameter, node and edge symmetry, and high resilience. The star/pancake interconnection graphs $S_n/P_n$ of dimension $n$ have $n!$ nodes connected by $\frac{(n-1)\,n!}{2}$ edges. Due to their large number of nodes and interconnections, they are prone to failure of one or more nodes or edges. In this thesis, we present methods to embed Hamiltonian paths (H-paths) and Hamiltonian cycles (H-cycles) in a star graph $S_n$ and pancake graph $P_n$ in a faulty environment. Such embeddings are important for solving computational problems, formulated for array and ring topologies, on star and pancake graphs. The models considered include single-processor failure, double-processor failure, and multiple-processor failures. All the models are applied to an H-cycle which is formed by visiting all the ($\frac{n!}{4!}$) $S_4/P_4$s in an $S_n/P_n$ in a particular order. Each $S_4/P_4$ has an entry node where the cycle/path enters that particular $S_4/P_4$ and an exit node where the path leaves it. Distributed algorithms for embedding a Hamiltonian cycle in the presence of multiple faults are also presented for both $S_n$ and $P_n$
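The node and edge counts quoted above can be checked on small instances. The sketch below builds both graphs from their standard definitions (star graph: swap the first symbol with the symbol at position i; pancake graph: reverse a prefix of length 2 to n). The function names are illustrative and are not taken from the thesis.

```python
from itertools import permutations
from math import factorial


def star_graph(n):
    """Adjacency lists of S_n: neighbors swap position 0 with position i (1 <= i < n)."""
    adj = {}
    for p in permutations(range(n)):
        nbrs = []
        for i in range(1, n):
            q = list(p)
            q[0], q[i] = q[i], q[0]
            nbrs.append(tuple(q))
        adj[p] = nbrs
    return adj


def pancake_graph(n):
    """Adjacency lists of P_n: neighbors reverse a prefix of length 2..n."""
    return {
        p: [p[:i][::-1] + p[i:] for i in range(2, n + 1)]
        for p in permutations(range(n))
    }


def edge_count(adj):
    """Each undirected edge appears in two adjacency lists."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


if __name__ == "__main__":
    for n in (3, 4, 5):
        expected_edges = (n - 1) * factorial(n) // 2
        for build in (star_graph, pancake_graph):
            g = build(n)
            print(build.__name__, n, len(g), edge_count(g), expected_edges)
```

For n = 5 this gives 120 nodes and 240 edges in each graph, and 5!/4! = 5 copies of the 4-dimensional subgraph.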
Deadlock-free routing in a faulty hypercube
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 41-42). By Eric Lehman.
On strong fault tolerance (or strong Menger-connectivity) of multicomputer networks
As the size of networks increases continuously, dealing with networks with faulty nodes becomes unavoidable. In this dissertation, we introduce a new measure for network fault tolerance, the strong fault tolerance (or strong Menger-connectivity) in multicomputer networks, and study the strong fault tolerance of popular multicomputer network structures. Let $G$ be a network in which all nodes have degree $d$. We say that $G$ is strongly fault tolerant if it has the following property: let $G_f$ be a copy of $G$ with at most $d - 2$ faulty nodes. Then for any pair of non-faulty nodes $u$ and $v$ in $G_f$, there are $\min\{\deg_f(u), \deg_f(v)\}$ node-disjoint paths in $G_f$ from $u$ to $v$, where $\deg_f(u)$ and $\deg_f(v)$ are the degrees of the nodes $u$ and $v$ in $G_f$, respectively.
First we study the strong fault tolerance of popular network structures such as star networks and hypercube networks. We show that the star networks and the hypercube networks are strongly fault tolerant and develop efficient algorithms that construct the maximum number of node-disjoint paths of nearly optimal or optimal length in these networks when they contain faulty nodes. Our algorithms are optimal in terms of their time complexity. In addition to studying the strong fault tolerance, we also investigate a more realistic concept to describe the ability of networks to tolerate faults. The traditional definition of fault tolerance, sustaining at most $d - 1$ faulty nodes for a regular graph $G$ of degree $d$, reflects a very rare situation. In many cases, there is a chance that a routing path between two given nodes can be constructed even though the network may have more faulty nodes than its degree. In this dissertation, we study the fault tolerance of hypercube networks under a probability model. When each node of the n-dimensional hypercube network has an independent failure probability p, we develop algorithms that, with very high probability, can construct a fault-free path when the hypercube network can sustain up to 2np faulty nodes
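The definition of strong fault tolerance can be tested by brute force on very small networks. The sketch below is not the dissertation's algorithm: it counts node-disjoint paths with a generic unit-capacity max-flow on a node-split graph and enumerates every fault set of size at most d - 2. All function names are hypothetical, and the approach is only practical for tiny instances such as the 3-cube.

```python
from collections import deque
from itertools import combinations


def hypercube(n):
    """Adjacency lists of the n-cube; nodes are integers, neighbors differ in one bit."""
    return {u: [u ^ (1 << i) for i in range(n)] for u in range(1 << n)}


def remove_nodes(adj, faulty):
    """Copy of the graph with the faulty nodes and their incident edges deleted."""
    return {u: [v for v in nbrs if v not in faulty]
            for u, nbrs in adj.items() if u not in faulty}


def disjoint_paths(adj, s, t):
    """Maximum number of internally node-disjoint s-t paths (max flow, node splitting)."""
    cap = {}

    def add_arc(a, b, c):
        cap.setdefault(a, {})
        cap.setdefault(b, {})
        cap[a][b] = cap[a].get(b, 0) + c
        cap[b].setdefault(a, 0)  # residual arc

    # Node x becomes (x, 0) -> (x, 1) with capacity 1; s and t are unrestricted.
    for x, nbrs in adj.items():
        add_arc((x, 0), (x, 1), len(adj) if x in (s, t) else 1)
        for y in nbrs:
            add_arc((x, 1), (y, 0), 1)

    src, dst, flow = (s, 0), (t, 1), 0
    while True:
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:  # BFS for an augmenting path
            a = queue.popleft()
            for b, c in cap[a].items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if dst not in parent:
            return flow
        b = dst
        while parent[b] is not None:  # push one unit of flow along the path
            a = parent[b]
            cap[a][b] -= 1
            cap[b][a] += 1
            b = a
        flow += 1


def is_strongly_fault_tolerant(adj, d):
    """Check the definition for every fault set of size 0..d-2 and every node pair."""
    for k in range(d - 1):
        for faulty in combinations(list(adj), k):
            g = remove_nodes(adj, set(faulty))
            for u, v in combinations(g, 2):
                if disjoint_paths(g, u, v) < min(len(g[u]), len(g[v])):
                    return False
    return True


if __name__ == "__main__":
    print(is_strongly_fault_tolerant(hypercube(3), 3))
```

The cost grows combinatorially with the number of nodes and faults, so this is a way to check the definition on toy cases, not a substitute for the constructive algorithms the abstract describes.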
Resource placement, data rearrangement, and Hamiltonian cycles in torus networks
Many parallel machines, both commercial and experimental, have been or are being designed with toroidal interconnection networks. For a given number of nodes, the torus has a relatively larger diameter but better cost/performance tradeoffs, such as higher channel bandwidth and lower node degree, when compared to the hypercube. Thus, the torus is becoming a popular topology for the interconnection network of high performance parallel computers.
In a multicomputer, the resources, such as I/O devices or software packages, are distributed over the network. The first part of the thesis investigates efficient methods of distributing resources in a torus network. Three classes of placement problems are studied: (1) the distance-t placement problem, in which any non-resource node is at a distance of at most t from some resource node; (2) the j-adjacency problem, in which a non-resource node is adjacent to at least j resource nodes; and (3) the generalized placement problem, in which a non-resource node must be at a distance of at most t from at least j resource nodes.
This resource placement technique can be applied to allocating spare processors to provide fault tolerance in the case of processor failures. Some efficient spare processor placement methods and reconfiguration schemes in the case of processor failures are also described.
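The placement conditions listed above can be written as simple predicates. The sketch below is illustrative only: it assumes a k-ary n-dimensional torus with the same size k in every dimension, measures distance as Lee distance (shortest wrap-around distance per coordinate), and uses hypothetical function names. It checks a given placement and does not construct one.

```python
from itertools import product


def lee_distance(a, b, k):
    """Sum over coordinates of the shorter way around a ring of size k."""
    return sum(min((x - y) % k, (y - x) % k) for x, y in zip(a, b))


def is_generalized_placement(resources, k, n, t, j):
    """Every non-resource node is within distance t of at least j resource nodes."""
    rs = set(resources)
    return all(
        sum(lee_distance(node, r, k) <= t for r in rs) >= j
        for node in product(range(k), repeat=n)
        if node not in rs
    )


def is_distance_t_placement(resources, k, n, t):
    """Special case j = 1."""
    return is_generalized_placement(resources, k, n, t, 1)


def is_j_adjacency_placement(resources, k, n, j):
    """Special case t = 1."""
    return is_generalized_placement(resources, k, n, 1, j)


if __name__ == "__main__":
    # 5 x 5 torus with resources on the nodes (x, y) satisfying x + 2y = 0 (mod 5).
    placement = [(i, (2 * i) % 5) for i in range(5)]
    print(is_distance_t_placement(placement, 5, 2, 1))
    print(is_j_adjacency_placement(placement, 5, 2, 2))
```

In this example each of the 5 resource nodes covers itself and its 4 neighbors, so the 25 nodes are covered at distance 1, while the stricter 2-adjacency condition fails.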
In a torus based parallel system, some algorithms give best performance if the data are distributed to processors numbered in Cartesian order; in some other cases, it is better to distribute the data to processors numbered in Gray code order. Since the placement patterns may be changed dynamically, it is essential to find efficient methods of rearranging the data from Gray code order to Cartesian order and vice versa. In the second part of the thesis, some efficient methods for data transfer from Cartesian order to radix order and vice versa are developed.
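To make the two numbering schemes concrete, the sketch below rearranges a list between Cartesian (natural index) order and Gray code order. It uses the binary reflected Gray code as a stand-in; the thesis targets torus networks and may use a different (for example k-ary) Gray code, so the encoding, the function names, and the single-list data layout are all assumptions. It models only the permutation, not the communication steps on the network.

```python
def to_gray(i: int) -> int:
    """Binary reflected Gray code of i."""
    return i ^ (i >> 1)


def from_gray(g: int) -> int:
    """Inverse of to_gray: XOR together all right shifts of g."""
    i = 0
    while g:
        i ^= g
        g >>= 1
    return i


def cartesian_to_gray_order(data):
    """Item i starts on processor i and ends on processor to_gray(i).

    Assumes len(data) is a power of two so that to_gray is a permutation.
    """
    out = [None] * len(data)
    for i, item in enumerate(data):
        out[to_gray(i)] = item
    return out


def gray_to_cartesian_order(data):
    """Inverse rearrangement: fetch item i back from processor to_gray(i)."""
    return [data[to_gray(i)] for i in range(len(data))]


if __name__ == "__main__":
    items = list("abcdefgh")
    moved = cartesian_to_gray_order(items)
    print(moved)
    print(gray_to_cartesian_order(moved) == items)
```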
The last part of the thesis gives results on generating edge disjoint Hamiltonian cycles in k-ary n-cubes, hypercubes, and 2D tori. These edge disjoint cycles are quite useful for many communication algorithms
Reliable low latency I/O in torus-based interconnection networks
In today's high performance computing environment, I/O remains the main bottleneck in achieving the optimal performance expected of the ever improving processor and memory technologies. Interconnection networks therefore combine processing units, system I/O and high speed switch network fabric into a new paradigm of I/O based network. This paradigm decouples the system into computational and I/O interconnections, each allowing "any-to-any" communications among processors and I/O devices, unlike the shared model in bus architecture. The computational interconnection, a network of processing units (compute-nodes), is used for inter-processor communication in carrying out computation tasks, while the I/O interconnection manages the transfer of I/O requests between the compute-nodes and the I/O or storage media through some dedicated I/O processing units (I/O-nodes). Considering the special functions performed by the I/O-nodes, their placement and reliability become important issues in improving the overall performance of the interconnection system.
This thesis focuses on the design and topological placement of I/O-nodes in torus based interconnection networks, with the aim of reducing I/O communication latency between compute-nodes and I/O-nodes even in the presence of faulty I/O-nodes. We propose an efficient and scalable relaxed quasi-perfect placement scheme using Lee distance error correction code such that compute-nodes are at distance-t or at most distance-t+1 from an I/O-node for a given t. This scheme provides a better and optimal alternative placement than quasi-perfect placement when perfect placement cannot be found for a particular torus. Furthermore, when I/O-nodes are faulty, the placement scheme is also used to determine alternative I/O-nodes for rerouting I/O traffic from affected compute-nodes with minimal slowdown. In order to guarantee the quality of service required of inter-processor communication, a scheduling algorithm was developed at the router level to prioritize message forwarding according to inter-process and I/O messages, with the former given higher priority.
Our simulation results show that relaxed quasi-perfect placement outperforms quasi-perfect placement and the conventional I/O placement (where I/O-nodes are concentrated at the base of the torus interconnection) with little degradation in inter-process communication performance. Also, the fault tolerant redirection scheme provides a minimal slowdown, especially when the number of faulty I/O-nodes is less than half of the initial available I/O-nodes
Diameter vulnerability of certain families of graphs
In this work we have carried out a complete study of the diameter vulnerability of two families of graphs: the odd graphs and the folded n-cubes. For the odd graphs, we have proved that removing any set of vertices or edges of cardinality k smaller than the degree increases the diameter of the resulting subgraphs by at most two. We have also studied how the parameters d'k and d'k' vary when k vertices or edges are removed from the graph. Similarly, for the folded cube graphs we have studied how these parameters vary when k vertices or edges are removed from the graph, for values of k smaller than the degree of the graph. From the results obtained, we can state that both families of graphs are suitable for implementing fault-tolerant interconnection networks. Another study carried out in this thesis concerns the design of reliable dense networks. We have obtained four (A,D,D,1) graphs that improve five bounds given in the table of large (A,D,D,1) graphs