Fault-tolerant routing algorithm in meshes with solid faults

Author: Bose Bella
Oregon State University. Dept. of Computer Science
Park Seungjin
Youn Jong-Hoon
Publication venue: Corvallis, OR : Oregon State University, Dept. of Computer Science

A fault-tolerant routing method that tolerates solid faults using only two virtual channels is presented. The proposed routing algorithm not only uses fewer virtual channels but also tolerates f-chains in meshes. The algorithm is shown to be deadlock-free and livelock-free in meshes with multiple nonoverlapping f-regions.
Keywords: wormhole routing, fault-tolerant, mesh networks, solid fault

ScholarsArchive@OSU

On Constructing the Minimum Orthogonal Convex Polygon in 2-D Faulty Meshes

Author: Jiang Zhen
Wu Jie
Publication venue: Digital Commons @ West Chester University
Publication date: 01/01/2004

Digital Commons @ West Chester University

Fault-tolerant meshes and hypercubes with minimal numbers of spares

Author: Bruck Jehoshua
Cypher Robert
Ho Ching-Tien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/1993

Many parallel computers consist of processors connected in the form of a d-dimensional mesh or hypercube. Two- and three-dimensional meshes have been shown to be efficient in manipulating images and dense matrices, whereas hypercubes have been shown to be well suited to divide-and-conquer algorithms requiring global communication. However, even a single faulty processor or communication link can seriously affect the performance of these machines. This paper presents several techniques for tolerating faults in d-dimensional mesh and hypercube architectures. Our approach consists of adding spare processors and communication links so that the resulting architecture will contain a fault-free mesh or hypercube in the presence of faults. We optimize the cost of the fault-tolerant architecture by adding exactly k spare processors (while tolerating up to k processor and/or link faults) and minimizing the maximum number of links per processor. For example, when the desired architecture is a d-dimensional mesh and k = 1, we present a fault-tolerant architecture that has the same maximum degree as the desired architecture (namely, 2d) and has only one spare processor. We also present efficient layouts for fault-tolerant two- and three-dimensional meshes, and show how multiplexers and buses can be used to reduce the degree of fault-tolerant architectures. Finally, we give constructions for fault-tolerant tori, eight-connected meshes, and hexagonal meshes.
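The degree figure quoted in the abstract above (a d-dimensional mesh has maximum degree 2d) can be checked with a small sketch. It does not reproduce the paper's spare-processor constructions, which the abstract does not specify. The helper names `mesh_neighbors` and `max_degree` are hypothetical, and the mesh is assumed to have no wraparound links.

```python
from itertools import product


def mesh_neighbors(node, sides):
    """Return the neighbors of `node` in a mesh with the given side lengths.

    `node` is a tuple of coordinates; `sides` gives the length of each
    dimension. Assumption: no wraparound links (a mesh, not a torus).
    """
    neighbors = []
    for axis, side in enumerate(sides):
        for step in (-1, 1):
            coord = node[axis] + step
            if 0 <= coord < side:
                neighbors.append(node[:axis] + (coord,) + node[axis + 1:])
    return neighbors


def max_degree(sides):
    """Maximum number of links per processor over all nodes of the mesh."""
    nodes = product(*(range(side) for side in sides))
    return max(len(mesh_neighbors(node, sides)) for node in nodes)


if __name__ == "__main__":
    for sides in [(4, 4), (3, 3, 3), (2, 2, 2, 2)]:
        print(sides, "max degree:", max_degree(sides))
```

An interior node reaches degree 2d only when every side has length at least 3. With all sides equal to 2 the mesh is a d-dimensional hypercube, and every node has degree d.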

Caltech Authors

New Fault Tolerant Multicast Routing Techniques to Enhance Distributed-Memory Systems Performance

Author: Shaheen Masoud Esmail Masoud
Publication venue: The Aquila Digital Community
Publication date: 01/12/2013

Distributed-memory systems are key to achieving high-performance computing and are the architectures most favored for advanced research problems. Mesh-connected multicomputers are among the most popular architectures and have been implemented in many distributed-memory systems. These systems must support communication operations efficiently to achieve good performance. The wormhole switching technique, in which a packet is divided into small flits, has been widely used in the design of distributed-memory systems. Multicast communication, in which one source node sends the same message to several destination nodes, has also been widely used in distributed-memory systems. Fault tolerance refers to the ability of a system to operate correctly in the presence of faults. Developing fault-tolerant multicast routing algorithms for 2D mesh networks is an important issue. This dissertation presents new fault-tolerant multicast routing algorithms to improve the performance of distributed-memory systems that use wormhole-routed 2D meshes. The algorithms are described for fault-tolerant routing in 2D mesh networks, but they can also be extended to other topologies. They combine a unicast-based multicast algorithm with tree-based multicast algorithms. They work effectively for the most commonly encountered faults in mesh networks: f-rings, f-chains, and concave fault regions. The proposed routing algorithms are shown to be effective even in the presence of many fault regions and large fault regions. The algorithms are proved to be deadlock-free, and the problem of overlapping fault regions is solved. Four essential performance metrics in mesh networks are considered and calculated. The algorithms use limited-global-information-based multicasting, a compromise between the local-information-based and global-information-based approaches. Data mining is used to validate the results and to enlarge the sample.
The proposed new multicast routing techniques are used to enhance the performance of distributed-memory systems. Simulation results are presented to demonstrate the efficiency of the proposed algorithms.

Aquila Digital Community

Data recovery in wormhole routing networks in hypercubes and meshes

Author: Alowayed Mohammad S.
Publication date: 01/12/1997

SHAREOK repository

Reliability-aware and energy-efficient system level design for networks-on-chip

Author: Zou Yong
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2015

2015 Spring.
Includes bibliographical references.
With CMOS technology aggressively scaling into the ultra-deep sub-micron (UDSM) regime and application complexity growing rapidly in recent years, processors today are being driven to integrate multiple cores on a chip. Such chip multiprocessor (CMP) architectures offer unprecedented levels of computing performance for highly parallel emerging applications in the era of digital convergence. However, a major challenge facing the designers of these emerging multicore architectures is the increased likelihood of failure due to the rise in transient, permanent, and intermittent faults caused by a variety of factors that are becoming more and more prevalent with technology scaling. On-chip interconnect architectures are particularly susceptible to faults that can corrupt transmitted data or prevent it from reaching its destination. Reliability concerns in UDSM nodes have in part contributed to the shift from traditional bus-based communication fabrics to network-on-chip (NoC) architectures that provide better scalability, performance, and utilization than buses. In this thesis, to overcome potential faults in NoCs, my research began by exploring fault-tolerant routing algorithms. Under the constraint of deadlock freedom, we make use of the inherent redundancy in NoCs due to multiple paths between packet sources and sinks and propose different fault-tolerant routing schemes to achieve much better fault tolerance capabilities than possible with traditional routing schemes. The proposed schemes also use replication opportunistically to optimize the balance between energy overhead and arrival rate. As 3D integrated circuit (3D-IC) technology with wafer-to-wafer bonding has been recently proposed as a promising candidate for future CMPs, we also propose a fault-tolerant routing scheme for 3D NoCs which outperforms the existing popular routing schemes in terms of energy consumption, performance and reliability.
To quantify reliability and provide different levels of intelligent protection, we propose, for the first time, the network vulnerability factor (NVF) metric to characterize the vulnerability of NoC components to faults. NVF determines the probabilities that faults in NoC components manifest as errors in the final program output of the CMP system. With NVF-aware partial protection for NoC components, almost 50% of the energy cost can be saved compared to the traditional approach of comprehensively protecting all NoC components. Lastly, we focus on the problem of fault-tolerant NoC design, which involves many NP-hard sub-problems such as core mapping, fault-tolerant routing, and fault-tolerant router configuration. We propose a novel design-time (RESYN) and a hybrid design and runtime (HEFT) synthesis framework to trade off energy consumption and reliability in the NoC fabric at the system level for CMPs. Together, our research in fault-tolerant NoC routing, reliability modeling, and reliability-aware NoC synthesis substantially enhances NoC reliability and energy-efficiency beyond what is possible with traditional approaches and state-of-the-art strategies from prior work.

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Constructing Two Edge-Disjoint Hamiltonian Cycles in Locally Twisted Cubes

Author: Hung Ruo-Wei
Publication date: 01/01/2010

The $n$-dimensional hypercube network $Q_n$ is one of the most popular interconnection networks since it has simple structure and is easy to implement. The $n$-dimensional locally twisted cube, denoted by $LTQ_n$, an important variation of the hypercube, has the same number of nodes and the same number of connections per node as $Q_n$. One advantage of $LTQ_n$ is that the diameter is only about half of the diameter of $Q_n$. Recently, some interesting properties of $LTQ_n$ were investigated. In this paper, we construct two edge-disjoint Hamiltonian cycles in the locally twisted cube $LTQ_n$, for any integer $n \geqslant 4$. The presence of two edge-disjoint Hamiltonian cycles provides an advantage when implementing algorithms that require a ring structure by allowing message traffic to be spread evenly across the locally twisted cube.
Comment: 7 pages, 4 figures
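The structural claims in this abstract (same node count and node degree as the hypercube, but a smaller diameter) can be explored with a short sketch. The abstract does not define the locally twisted cube, so the code assumes the recursive definition commonly given in the literature: LTQ_2 is the 4-cycle, and LTQ_k joins two copies of LTQ_{k-1} by linking node 0 x_{k-2} ... x_0 to node 1 (x_{k-2} XOR x_0) x_{k-3} ... x_0. The function names are hypothetical. The sketch does not construct the paper's Hamiltonian cycles.

```python
from collections import deque


def hypercube(n):
    """Adjacency sets of Q_n: nodes are ints, neighbors differ in one bit."""
    return {v: {v ^ (1 << i) for i in range(n)} for v in range(1 << n)}


def locally_twisted_cube(n):
    """Adjacency sets of LTQ_n (n >= 2) under the ASSUMED recursive definition."""
    if n < 2:
        raise ValueError("LTQ_n is defined here only for n >= 2")
    adj = hypercube(2)  # LTQ_2 is the 4-cycle, identical to Q_2
    for k in range(3, n + 1):
        top = 1 << (k - 1)    # new leading bit x_{k-1}
        twist = 1 << (k - 2)  # bit x_{k-2}, flipped when x_0 = 1
        grown = {}
        for v, nbrs in adj.items():
            grown[v] = set(nbrs)                     # copy with leading 0
            grown[v | top] = {u | top for u in nbrs}  # copy with leading 1
        for v in adj:
            # Cross edge: 0 x_{k-2} ... x_0 <-> 1 (x_{k-2} XOR x_0) ... x_0
            w = (v | top) ^ (twist if v & 1 else 0)
            grown[v].add(w)
            grown[w].add(v)
        adj = grown
    return adj


def diameter(adj):
    """Largest shortest-path distance, via BFS from every node.

    Assumes the graph is connected.
    """
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        best = max(best, max(dist.values()))
    return best


if __name__ == "__main__":
    for n in range(3, 8):
        q, ltq = hypercube(n), locally_twisted_cube(n)
        print(n, "nodes:", len(ltq), "diam Q_n:", diameter(q),
              "diam LTQ_n:", diameter(ltq))
```

Running the loop prints the two diameters side by side for small n, so the "about half" claim can be inspected directly. The all-pairs BFS is only practical for small n because the node count grows as 2^n.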

arXiv.org e-Print Archive

CiteSeerX