Search CORE

11 research outputs found

Tolerating faults in hypercubes using subcube partitioning

Author: Bruck Jehoshua
Cypher Robert
Soroker Danny
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/1992
Field of study

We examine the issue of running algorithms on a hypercube which has both node and edge faults, and we assume a worst case distribution of the faults. We prove that for any constant c, an n-dimensional hypercube (n-cube) with n^c faulty components contains a fault-free subgraph that can implement a large class of hypercube algorithms with only a constant factor slowdown. In addition, our approach yields practical implementations for small numbers of faults. For example, we show that any regular algorithm can be implemented on an n-cube that has at most n-1 faults with slowdowns of at most 2 for computation and at most 4 for communication. To the best of our knowledge this is the first result showing that an n-cube can tolerate more than O(n) arbitrarily placed faults with a constant factor slowdown

Caltech Authors

Recommended from our members

Partitioning and broadcasting in hypercubes in the presence of faulty communication links

Author: Lee Suckjun
Publication venue: Oregon State University. Department of Computer Science
Publication date
Field of study

The problem of broadcasting in faulty hypercubes is considered, based upon a strategy of partitioning the faulty hypercube into subcubes in which currently known algorithms can be implemented. Three similar partitioning and broadcasting algorithms for an n-dimensional hypercube in the presence of up to (n² + 2n - c) / 4 faulty communication links are presented, where c = 4 if n is an even number or c = 3 if n is an odd number. The most efficient algorithm is implemented in 1.3n + 6log(n) + 9 time units. To the best of our knowledge, this algorithm is the most efficient one for an n-dimensional hypercube in the presence of O(n²) faults

ScholarsArchive@OSU

Embedding cube-connected cycles graphs into faulty hypercubes

Author: Bruck Jehoshua
Cypher Robert
Soroker Danny
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/1994
Field of study

We consider the problem of embedding a cube-connected cycles graph (CCC) into a hypercube with edge faults. Our main result is an algorithm that, given a list of faulty edges, computes an embedding of the CCC that spans all of the nodes and avoids all of the faulty edges. The algorithm has optimal running time and tolerates the maximum number of faults (in a worst-case setting). Because ascend-descend algorithms can be implemented efficiently on a CCC, this embedding enables the implementation of ascend-descend algorithms, such as bitonic sort, on hypercubes with edge faults. We also present a number of related results, including an algorithm for embedding a CCC into a hypercube with edge and node faults and an algorithm for embedding a spanning torus into a hypercube with edge faults

Caltech Authors

Fault-tolerance embedding of rings and arrays in star and pancake graphs

Author: Gajjala Ramesh Reddy
Publication venue: Digital Scholarship@UNLV
Publication date: 01/01/1995
Field of study

The star and pancake graphs are useful interconnection networks for connecting processors in a parallel and distributed computing environment. The star network has been widely studied and is shown to possess attactive features like sublogarithmic diameter, node and edge symmetry and high resilience. The star/pancake interconnection graphs, {dollar}S\sb{n}/P\sb{n}{dollar} of dimension n have n! nodes connected by {dollar}{(n-1).n!\over2}{dollar} edges. Due to their large number of nodes and interconnections, they are prone to failure of one or more nodes/edges; In this thesis, we present methods to embed Hamiltonian paths (H-path) and Hamiltonian cycles (H-cycle) in a star graph {dollar}S\sb{n}{dollar} and pancake graph {dollar}P\sb{n}{dollar} in a faulty environment. Such embeddings are important for solving computational problems, formulated for array and ring topologies, on star and pancake graphs. The models considered include single-processor failure, double-processor failure, and multiple-processor failures. All the models are applied to an H-cycle which is formed by visiting all the ({dollar}{n!\over4!})\ S\sb4/P\sb4{dollar}s in an {dollar}S\sb{n}/P\sb{n}{dollar} in a particular order. Each {dollar}S\sb4/P\sb4{dollar} has an entry node where the cycle/path enters that particular {dollar}S\sb4/P\sb4{dollar} and an exit node where the path leaves it. Distributed algorithms for embedding hamiltonian cycle in the presence of multiple faults, are also presented for both {dollar}S\sb{n}{dollar} and {dollar}P\sb{n}{dollar}

University of Nevada, Las Vegas Repository

Deadlock-free routing in a faulty hypercube

Author: Lehman Eric (Eric Allen), 1970-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1998
Field of study

Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998.Includes bibliographical references (p. 41-42).by Eric Lehman.M.S

DSpace@MIT

On strong fault tolerance (or strong Menger-connectivity) of multicomputer networks

Author: Oh Eunseuk
Publication venue: Texas A&M University
Publication date: 01/01/2004
Field of study

As the size of networks increases continuously, dealing with networks with faulty nodes becomes unavoidable. In this dissertation, we introduce a new measure for network fault tolerance, the strong fault tolerance (or strong Menger-connectivity)in multicomputer networks, and study the strong fault tolerance for popular multicomputer network structures. Let G be a network in which all nodes have degree d. We say that G is strongly fault tolerant if it has the following property: Let Gf be a copy of G with at most d - 2 faulty nodes. Then for any pair of non-faulty nodes u and v in Gf , there are min{degf (u), degf (v)} node-disjoint paths in Gf from u to v, where degf (u) and degf (v) are the degrees of the nodes u and v in Gf, respectively. First we study the strong fault tolerance for the popular network structures such as star networks and hypercube networks. We show that the star networks and the hypercube networks are strongly fault tolerant and develop efficient algorithms that construct the maximum number of node-disjoint paths of nearly optimal or optimal length in these networks when they contain faulty nodes. Our algorithms are optimal in terms of their time complexity. In addition to studying the strong fault tolerance, we also investigate a more realistic concept to describe the ability of networks for tolerating faults. The traditional definition of fault tolerance, sustaining at most d - 1faulty nodes for a regular graph G of degree d, reflects a very rare situation. In many cases, there is a chance that a routing path between two given nodes can be constructed though the network may have more faulty nodes than its degree. In this dissertation, we study the fault tolerance of hypercube networks under a probability model. When each node of the n-dimensional hypercube network has an independent failure probability p, we develop algorithms that, with very high probability, can construct a fault-free path when the hypercube network can sustain up to 2np faulty nodes

CiteSeerX

Texas A&M Repository

Recommended from our members

Resource placement, data rearrangement, and Hamiltonian cycles in torus networks

Author: Bae Myung Mun
Publication venue: 'Oregon State University'
Publication date
Field of study

Many parallel machines, both commercial and experimental, have been/are being designed with toroidal interconnection networks. For a given number of nodes, the torus has a relatively larger diameter, but better cost/performance tradeoffs, such as higher channel bandwidth, and lower node degree, when compared to the hypercube. Thus, the torus is becoming a popular topology for the interconnection network of a high performance parallel computers. In a multicomputer, the resources, such as I/O devices or software packages, are distributed over the networks. The first part of the thesis investigates efficient methods of distributing resources in a torus network. Three classes of placement methods are studied. They are (1) distant-t placement problem: in this case, any non-resource node is at a distance of at most t from some resource nodes, (2) j-adjacency problem: here, a non-resource node is adjacent to at least j resource nodes, and (3) generalized placement problem: a non-resource node must be a distance of at most t from at least j resource nodes. This resource placement technique can be applied to allocating spare processors to provide fault-tolerance in the case of the processor failures. Some efficient spare processor placement methods and reconfiguration schemes in the case of processor failures are also described. In a torus based parallel system, some algorithms give best performance if the data are distributed to processors numbered in Cartesian order; in some other cases, it is better to distribute the data to processors numbered in Gray code order. Since the placement patterns may be changed dynamically, it is essential to find efficient methods of rearranging the data from Gray code order to Cartesian order and vice versa. In the second part of the thesis, some efficient methods for data transfer from Cartesian order to radix order and vice versa are developed. The last part of the thesis gives results on generating edge disjoint Hamiltonian cycles in k-ary n-cubes, hypercubes, and 2D tori. These edge disjoint cycles are quite useful for many communication algorithms

ScholarsArchive@OSU

Reliable low latency I/O in torus-based interconnection networks

Author: Azeez Babatunde
Publication venue: Texas A&M University
Publication date: 25/04/2007
Field of study

In today's high performance computing environment I/O remains the main bottleneck in achieving the optimal performance expected of the ever improving processor and memory technologies. Interconnection networks therefore combines processing units, system I/O and high speed switch network fabric into a new paradigm of I/O based network. It decouples the system into computational and I/O interconnections each allowing "any-to-any" communications among processors and I/O devices unlike the shared model in bus architecture. The computational interconnection, a network of processing units (compute-nodes), is used for inter-processor communication in carrying out computation tasks, while the I/O interconnection manages the transfer of I/O requests between the compute-nodes and the I/O or storage media through some dedicated I/O processing units (I /O-nodes). Considering the special functions performed by the I/O nodes, their placement and reliability become important issues in improving the overall performance of the interconnection system. This thesis focuses on design and topological placement of I/O-nodes in torus based interconnection networks, with the aim of reducing I/O communication latency between compute-nodes and I/O-nodes even in the presence of faulty I/O-nodes. We propose an efficient and scalable relaxed quasi-perfect placement scheme using Lee distance error correction code such that compute-nodes are at distance-t or at most distance-t+1 from an I/O-node for a given t. This scheme provides a better and optimal alternative placement than quasi perfect placement when perfect placement cannot be found for a particular torus. Furthermore, in the occurrence of faulty I/O-nodes, the placement scheme is also used in determining other alternative I/O-nodes for rerouting I/O traffic from affected compute-nodes with minimal slowdown. In order to guarantee the quality of service required of inter-processor communication, a scheduling algorithm was developed at the router level to prioritize message forwarding according to inter-process and I/O messages with the former given higher priority. Our simulation results show that relaxed quasi-perfect outperforms quasi-perfect and the conventional I/O placement (where I/O nodes are concentrated at the base of the torus interconnection) with little degradation in inter-process communication performance. Also the fault tolerant redirection scheme provides a minimal slowdown, especially when the number of faulty I/O nodes is less than half of the initial available I/O nodes

Texas A&M Repository

Vulnerabilidad del diámetro de ciertas familias de grafos

Author: Simó Mezquita Ester
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/1995
Field of study

En este trabajo hemos realizado un estudio completo sobre la vulnerabilidad del diámetro de dos familias de grafos:Los grafos impares y los n-cubo plegados. En el caso de los grafos impares, hemos probado que la eliminación de cualquier conjunto de vértices o ramas de cardinalidad k menor que el grado incrementa el diámetro de los subgrafos resultantes a lo sumo en dos unidades.Asimismo, hemos estudiado como varían los parámetros d'k y d'k' cuando eliminamos k vértices o ramas del grafo.Análogamente, para los grafos cubo plegado hemos estudiado como varían estos parámetros cuando eliminamos k vértices o ramas del grafo, para valores de k inferiores al grado del grafo. Por los resultados obtenidos podemos afirmar que ambas familias de grafos son adecuadas para la implementación de redes de interconexión tolerantes a fallos.Otro estudio que hemos realizado en esta tesis trata sobre el diseño de redes densas fiables. Y hemos obtenido cuatro grafos (A,D,D,1) que mejoran cinco cotas presentadas en la tabla de grandes grafos (A,D,D,1)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Secretaría de Estado de Cultura