2,644 research outputs found

Tolerating multiple faults in multistage interconnection networks with minimal extra stages

Author: Bruck Jehoshua; Fan Chenggong Charles
Publication date: 01/09/2000

Adams and Siegel (1982) proposed an extra stage cube interconnection network that tolerates one switch failure with one extra stage. We extend their results and discover a class of extra stage interconnection networks that tolerate multiple switch failures with a minimal number of extra stages. Adopting the same fault model as Adams and Siegel, the faulty switches can be bypassed by a pair of demultiplexer/multiplexer combinations. It is easy to show that, to maintain point-to-point and broadcast connectivities, there must be at least f extra stages to tolerate f switch failures. We present the first known construction of an extra stage interconnection network that meets this lower bound. This n-dimensional multistage interconnection network has n+f stages and tolerates f switch failures. An n-bit label called a mask is used for each stage; it indicates the bit differences between the two inputs coming into a common switch. We designed the fault-tolerant construction such that it repeatedly uses the singleton basis of the n-dimensional vector space as the stage mask vectors. This construction is further generalized, and we prove that an n-dimensional multistage interconnection network is optimally fault-tolerant if and only if the mask vectors of every n consecutive stages span the n-dimensional vector space.
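
The spanning condition stated at the end of this abstract can be checked mechanically. The following is a minimal sketch, not the authors' code: it assumes each stage mask is encoded as a Python integer whose bits are the n mask bits, and that "repeatedly uses the singleton basis" means cycling through the unit vectors in a fixed order. The function names `gf2_rank`, `every_window_spans` and `singleton_masks` are hypothetical.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of bit masks given as non-negative ints."""
    basis = []  # independent vectors with pairwise distinct leading bits
    for v in vectors:
        for b in basis:
            # XOR with b only when that clears b's leading bit in v.
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def every_window_spans(masks, n):
    """True if every n consecutive stage masks span GF(2)^n."""
    return all(gf2_rank(masks[i:i + n]) == n
               for i in range(len(masks) - n + 1))


def singleton_masks(n, f):
    """Assumed layout: n + f stages cycling through the unit vectors."""
    return [1 << (i % n) for i in range(n + f)]
```

For example, `every_window_spans(singleton_masks(3, 2), 3)` checks a 3-dimensional network with two extra stages, while a mask sequence such as `[1, 2, 4, 4, 1]` fails because the window `[2, 4, 4]` has rank 2.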

Caltech Authors

Generalized hypercube structures and hyperswitch communication network

Author: Young Steven D.

This paper discusses an ongoing study that uses a recent development in communication control technology to implement hybrid hypercube structures. These architectures are similar to binary hypercubes, but they also provide added connectivity between the processors. This added connectivity increases communication reliability while decreasing the latency of interprocessor message passing. Because these factors directly determine the speed that can be obtained by multiprocessor systems, these architectures are attractive for applications such as remote exploration and experimentation, where high performance and ultrareliability are required. This paper describes and enumerates these architectures and discusses how they can be implemented with a modified version of the hyperswitch communication network (HCN). The HCN is analyzed because it has three attractive features that enable these architectures to be effective: speed, fault tolerance, and the ability to pass multiple messages simultaneously through the same hyperswitch controller

NASA Technical Reports Server

Fault-tolerant interconnection networks for multiprocessor systems

Author: Nassar Hamed Mohamed
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/1989

Interconnection networks represent the backbone of multiprocessor systems. A failure in the network, therefore, could seriously degrade the system performance. For this reason, fault tolerance has been regarded as a major consideration in interconnection network design. This thesis presents two novel techniques to provide fault tolerance capabilities to three major networks: the Baseline network, the Benes network and the Clos network. First, the Simple Fault Tolerance Technique (SFT) is presented. The SFT technique is in fact the result of merging two widely known interconnection mechanisms: a normal interconnection network and a shared bus. This technique is most suitable for networks with small switches, such as the Baseline network and the Benes network. For the Clos network, whose switches may be large for the SFT, another technique is developed to produce the Fault-Tolerant Clos (FTC) network. In the FTC, one switch is added to each stage. The two techniques are described and thoroughly analyzed

Digital Commons @ New Jersey Institute of Technology (NJIT)

Correcting soft errors online in fast fourier transform

Author: Chen Jieyang; Chen Zizhong; Li Hongbo; Li Sihuan; Liang Xin; Liu Yuanlai; Ouyang Kaiming; Song Fengguang; Tao Dingwen; Wu Panruo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

While many algorithm-based fault tolerance (ABFT) schemes have been proposed to detect soft errors offline in the fast Fourier transform (FFT) after computation finishes, none of the existing ABFT schemes detect soft errors online before the computation finishes. This paper presents an online ABFT scheme for FFT so that soft errors can be detected online and the corrupted computation can be terminated in a much more timely manner. We also extend our scheme to tolerate both arithmetic errors and memory errors, develop strategies to reduce its fault tolerance overhead and improve its numerical stability and fault coverage, and finally incorporate it into the widely used FFTW library, one of today's fastest FFT software implementations. Experimental results demonstrate that: (1) the proposed online ABFT scheme introduces much lower overhead than the existing offline ABFT schemes; (2) it detects errors in a much more timely manner; and (3) it also has higher numerical stability and better fault coverage.
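
To illustrate the general idea of checking an FFT while it is still running, here is a small sketch. It is not the scheme from the paper and does not use FFTW: it assumes a textbook recursive radix-2 FFT on power-of-two lengths and uses the simple identity that the outputs of a DFT sum to N times the first input. The names `dft_checksum_ok` and `checked_fft`, and the tolerance value, are hypothetical choices.

```python
import cmath


def dft_checksum_ok(x, y, tol=1e-9):
    """Return True if y passes the checksum sum(y) == len(x) * x[0]."""
    n = len(x)
    scale = max(1.0, n * max(abs(v) for v in x))
    return abs(sum(y) - n * x[0]) <= tol * scale


def checked_fft(x):
    """Radix-2 FFT (len(x) a power of two) that checks each sub-transform."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = checked_fft(x[0::2])   # transform of even-indexed samples
    odd = checked_fft(x[1::2])    # transform of odd-indexed samples
    y = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        y[k] = even[k] + t
        y[k + n // 2] = even[k] - t
    # Verify this level before returning, so a corrupted sub-transform
    # is reported before the full computation finishes.
    if not dft_checksum_ok(x, y):
        raise ArithmeticError(f"checksum mismatch at length {n}")
    return y
```

This single checksum only catches errors that change the sum of the outputs, and it does not protect the input data in memory; it is meant to show where an online check can sit, not to match the coverage claimed in the abstract.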

IUPUIScholarWorks

Evaluation of Two Terminal Reliability of Fault-tolerant Multistage Interconnection Networks

Author: Barpanda Dr. N. K.; Dash Dr. R. K.
Publication venue: Institute for Project Management Pvt. Ltd
Publication date: 21/08/2020

This paper introduces a new method based on multi-decomposition for predicting the two-terminal reliability of fault-tolerant multistage interconnection networks. The method is supported by an efficient algorithm that runs in polynomial time. It is illustrated using a network consisting of eight nodes and twelve links as an example. The proposed method is found to be simple, general and efficient, and is thus applicable to all types of fault-tolerant multistage interconnection networks. The results show that the method provides more accurate probabilities when applied to fault-tolerant multistage interconnection networks. The reliability of two important MINs is evaluated using the proposed method.
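
For readers unfamiliar with the quantity being predicted, two-terminal reliability is the probability that a source node can still reach a target node when links fail independently. The sketch below computes it by brute-force enumeration of link states. This is an exponential-time baseline, not the multi-decomposition method of the paper. It assumes undirected links, perfectly reliable nodes, and a single working probability `p` for every link; the function names are hypothetical.

```python
from itertools import product


def connected(links, source, target):
    """Depth-first search over undirected links."""
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for a, b in links:
            for v, w in ((a, b), (b, a)):
                if v == u and w not in seen:
                    seen.add(w)
                    stack.append(w)
    return target in seen


def two_terminal_reliability(links, source, target, p):
    """Sum the probability of every link state that keeps source-target connected."""
    total = 0.0
    for state in product([True, False], repeat=len(links)):
        up = [link for link, ok in zip(links, state) if ok]
        if connected(up, source, target):
            k = len(up)
            total += p ** k * (1 - p) ** (len(links) - k)
    return total
```

A five-link bridge network such as `[("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]` is a convenient small test case, since its reliability has the well-known closed form 2p^2 + 2p^3 - 5p^4 + 2p^5.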

Interscience Research Network

Algorithms in fault-tolerant CLOS networks

Author: Lee Hyunyeop
Publication venue: Digital Commons @ NJIT
Publication date: 31/10/1994

Digital Commons @ New Jersey Institute of Technology (NJIT)