436 research outputs found

Parallel Architectures for Planetary Exploration Requirements (PAPER)

Author: Cezzar Ruknet
Sen Ranjan K.

The Parallel Architectures for Planetary Exploration Requirements (PAPER) project is research oriented towards technology insertion issues for NASA's unmanned planetary probes. It was initiated to complement and augment the long-term efforts for space exploration, with particular reference to NASA/LaRC's (NASA Langley Research Center) research needs for planetary exploration missions of the mid and late 1990s. The requirements for space missions as given in the somewhat dated Advanced Information Processing Systems (AIPS) requirements document are contrasted with the new requirements from JPL/Caltech involving sensor data capture and scene analysis. It is shown that more stringent requirements have arisen as a result of technological advancements. Two possible architectures, the AIPS Proof of Concept (POC) configuration and the MAX Fault-tolerant dataflow multiprocessor, were evaluated. The main observation was that the AIPS design is biased towards fault tolerance and may not be an ideal architecture for planetary and deep space probes due to high cost and complexity. The MAX concept appears to be a promising candidate, except that more detailed information is required. The feasibility of adding neural computation capability to this architecture needs to be studied. Key impact issues for architectural design of computing systems meant for planetary missions were also identified.

NASA Technical Reports Server

Approaches to multiprocessor error recovery using an on-chip interconnect subsystem

Author: Vadlamani Ramakrishna P
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2010

For future multicores, a dedicated interconnect subsystem for on-chip monitors was found to be highly beneficial in terms of scalability, performance and area. In this thesis, such a monitor network (MNoC) is used for multicores to support selective error identification and recovery and to maintain target chip reliability in the context of dynamic voltage and frequency scaling (DVFS). A selective shared memory multiprocessor recovery is performed using MNoC in which, when an error is detected, only the group of processors sharing an application with the affected processors is recovered. Although the use of DVFS in contemporary multicores provides significant protection from unpredictable thermal events, a potential side effect can be an increased processor exposure to soft errors. To address this issue, a flexible fault prevention and recovery mechanism has been developed to selectively enable a small amount of per-core dual modular redundancy (DMR) in response to increased vulnerability, as measured by the processor architectural vulnerability factor (AVF). Our new algorithm for DMR deployment aims to provide a stable effective soft error rate (SER) by using DMR in response to DVFS caused by thermal events. The algorithm is implemented in real time on the multicore using MNoC and a controller that evaluates thermal information and multicore performance statistics in addition to error information. DVFS experiments with a multicore simulator using standard benchmarks show an average 6% improvement in overall power consumption and a stable SER by using selective DMR versus continuous DMR deployment.
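The selective DMR idea in this abstract can be illustrated with a toy policy. This is only a sketch under stated assumptions, not the thesis's algorithm: the effective soft error rate is modeled as the raw rate derated by AVF, and the `CoreSample` type, the `select_dmr_cores` function, the sample values and the threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CoreSample:
    core_id: int
    avf: float      # architectural vulnerability factor, in [0, 1]
    raw_ser: float  # hypothetical raw soft error rate at the current DVFS setting


def select_dmr_cores(samples, target_ser):
    """Return ids of cores whose effective SER (raw_ser * avf) exceeds the target.

    Assumption: effective SER is modeled as the raw rate derated by AVF.
    The thesis's controller also weighs thermal and performance data,
    which this sketch ignores.
    """
    return [s.core_id for s in samples if s.raw_ser * s.avf > target_ser]


# Hypothetical readings: core 2 runs at a setting with a higher raw error rate.
samples = [
    CoreSample(core_id=0, avf=0.10, raw_ser=100.0),
    CoreSample(core_id=1, avf=0.40, raw_ser=100.0),
    CoreSample(core_id=2, avf=0.40, raw_ser=300.0),
]
dmr_cores = select_dmr_cores(samples, target_ser=50.0)
print(dmr_cores)
```

In this sketch only the core whose derated error rate crosses the threshold would get redundancy, which is the contrast with continuous DMR deployment that the abstract draws.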

CiteSeerX

ScholarWorks@UMass Amherst

Modeling and Analysis of Fault Tolerant Multistage Interconnection Networks

Author: Choi Minsu
Lombardi Fabrizio
Park Nohpill
Publication venue: Scholars' Mine
Publication date: 01/10/2003

Performance and reliability are two of the most crucial issues in today's high-performance instrumentation and measurement systems. High speed and compact density multistage interconnection networks (MINs) are widely used subsystems in different applications. New performance models are proposed to evaluate a novel fault tolerant MIN arrangement, thereby assuring performance and reliability with a high confidence level. A concurrent fault detection and recovery scheme for MINs is considered by rerouting over redundant interconnection links under stringent real-time constraints for digital instrumentation as sensor networks. A switch architecture for concurrent testing and diagnosis is proposed. New performance models are developed and used to evaluate the compound effect of fault tolerant operation (inclusive of testing, diagnosis, and recovery) on the overall throughput and delay. Results are shown for single transient and permanent stuck-at faults on links and storage units in the switching elements. It is shown that performance degradation due to fault tolerance is graceful while performance degradation without fault recovery is unacceptable.

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

CGIN : a modified gamma interconnection network with multiple disjoint paths

Author: Chuang Po-jen
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

To ensure high terminal reliability for the Gamma interconnection network (GIN), we propose a new modified GIN, referred to as CGIN (cyclic Gamma interconnection network), as its connecting patterns between stages exhibit a cyclic feature. The fact that there exist multiple disjoint paths between any communication pair for all types of CGINs makes it possible to tolerate any arbitrary single fault and to accomplish enhanced terminal reliability accordingly. The performance of the CGIN is also evaluated through simulation.

Conference type: International
Conference date: 19941219~1994122
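The link between disjoint paths and single-fault tolerance can be checked mechanically on a small graph. The sketch below is a generic illustration, not the CGIN topology or the authors' method: the toy networks, the node names and the functions `reachable` and `tolerates_single_fault` are hypothetical, and only faults in intermediate nodes (not links) are modeled.

```python
from collections import deque


def reachable(adj, src, dst, failed=frozenset()):
    """Breadth-first search from src to dst that skips failed nodes."""
    if src in failed or dst in failed:
        return False
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def tolerates_single_fault(adj, src, dst):
    """True if src still reaches dst after any one intermediate node fails."""
    nodes = set(adj) | {n for targets in adj.values() for n in targets}
    intermediates = nodes - {src, dst}
    return all(reachable(adj, src, dst, frozenset({f})) for f in intermediates)


# Toy networks: two node-disjoint paths versus a single path.
two_paths = {"S": ["A", "B"], "A": ["D"], "B": ["D"]}
one_path = {"S": ["A"], "A": ["D"]}

print(tolerates_single_fault(two_paths, "S", "D"))
print(tolerates_single_fault(one_path, "S", "D"))
```

With two node-disjoint paths, no single intermediate failure can cut both, so the first network passes the check while the single-path network does not.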

Tamkang University Institutional Repository

Fault-tolerant vertical link design for effective 3D stacking

Author: Duato Marín José Francisco
Flich Cardo José
Hernández Luz Carles
Roca Pérez Antoni
Silla Jiménez Federico
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/12/2011

Recently, 3D stacking has been proposed to alleviate the memory bandwidth limitation arising in chip multiprocessors (CMPs). As the number of integrated cores in the chip increases, access to external memory becomes the bottleneck, thus demanding larger amounts of memory inside the chip. The most accepted solution for implementing vertical links between stacked dies is to use Through Silicon Vias (TSVs). However, TSVs are exposed to misalignment and random defects, compromising the yield of the manufactured 3D chip. A common solution to this problem is over-provisioning, which impacts area and cost. In this paper, we propose a fault-tolerant vertical link design. With its adoption, fault-tolerant vertical links can be implemented in a 3D chip design at low cost without the need to add redundant TSVs (no over-provisioning). Preliminary results are very promising, as the fault-tolerant vertical link design increases switch area by only 6.69% while the achieved interconnect yield tends to 100%.

This work was supported by the Spanish MEC and MICINN, as well as European Commission FEDER funds, under Grants CSD2006-00046 and TIN2009-14475-C04. It was also partly supported by the project NaNoC (project label 248972), which is funded by the European Commission within the Research Programme FP7.

Hernández Luz, C.; Roca Pérez, A.; Flich Cardo, J.; Silla Jiménez, F.; Duato Marín, JF. (2011). Fault-tolerant vertical link design for effective 3D stacking. IEEE Computer Architecture Letters. 10(2):41-44. https://doi.org/10.1109/L-CA.2011.17

Crossref

RiuNet

Simulation of Meshes in a Faulty Supercube with Unbounded Expansion

Author: Wu Shih-Jung (corresponding author)
Lin Jen-Chih
Publication venue: AICIT

Reconfiguring meshes in a faulty Supercube is investigated in this paper. The result can readily be used in the optimal embedding of a mesh (or a torus) of processors in a faulty Supercube with unbounded expansion. Embedding algorithms are proposed which show that a mesh with any number of nodes can be embedded into a faulty Supercube with load 1, congestion 1, and dilation 3 such that O(n^2 - w^2) faults can be tolerated, where n is the dimension of the Supercube and 2^w is the number of nodes of the mesh. Meshes and hypercubes are widely used interconnection architectures in parallel computing, grid computing, sensor networks, and cloud computing. In addition, Supercubes are superior to hypercubes in terms of embedding a mesh or torus under faults. Therefore, parallel or distributed algorithms developed for mesh and torus structures can easily be ported to the Supercube.

Journal type: Foreign
Citation index: EI
Peer reviewed: Yes
Format: Print
Country code: KO
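The embedding metrics named in this abstract (load and dilation) can be made concrete with a textbook case. The sketch below is not the paper's algorithm and does not model a Supercube or faults: it embeds a 2^a x 2^b mesh into a fault-free hypercube using reflected Gray codes, and the function names `gray`, `embed_mesh`, `dilation` and `load` are hypothetical.

```python
from collections import Counter


def gray(i):
    """Reflected binary Gray code of i; consecutive values differ in one bit."""
    return i ^ (i >> 1)


def embed_mesh(row_bits, col_bits):
    """Map each node (r, c) of a 2^row_bits x 2^col_bits mesh to a hypercube label."""
    return {
        (r, c): (gray(r) << col_bits) | gray(c)
        for r in range(1 << row_bits)
        for c in range(1 << col_bits)
    }


def dilation(mapping):
    """Largest Hamming distance between the images of two adjacent mesh nodes."""
    worst = 0
    for (r, c), label in mapping.items():
        for neighbor in ((r + 1, c), (r, c + 1)):
            if neighbor in mapping:
                worst = max(worst, bin(label ^ mapping[neighbor]).count("1"))
    return worst


def load(mapping):
    """Largest number of mesh nodes assigned to one hypercube node."""
    return max(Counter(mapping.values()).values())


mapping = embed_mesh(2, 3)  # 4 x 8 mesh into a 5-dimensional hypercube
print(dilation(mapping), load(mapping))
```

In the fault-free case this embedding keeps every mesh edge on a single hypercube edge. The abstract's result accepts dilation 3 in exchange for tolerating faults and arbitrary mesh sizes.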

Tamkang University Institutional Repository

Multiprocessor system design tutor : expert system approach

Author: Kamdar Rakesh
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/1990

To increase computational bandwidth and system resilience, integration of several microprocessors in a single system becomes necessary. The overall throughput and efficiency of such a system depend directly on the hardware and software interconnection supported by the basic microprocessor chip. It can be difficult to bring together all the design criteria and design-related formulas. The approach taken here is to continuously update the hardware and software information in a database related to a given microprocessor, so that it can be accessed at any time for an efficient design solution. The Intel 80386 and Motorola 68020 microprocessors are reviewed in detail and all the information is stored in the database. This approach has been implemented in the Multiprocessor System Design - Tutor (MSDT) using the Informix relational database management system. MSDT is a menu-driven system implemented to help system design engineers. It stores and maintains information related to multiprocessor system design, including multiprocessor system requirements, microprocessor characteristics, the role of the microprocessor in multiprocessor system design, and interconnection network configurations and their performance factors. This information is presented to the user via the screen building utility of Informix-4GL; the user can also get a hard copy of all the information in the database by running the report generation utility. MSDT also has password protection and a help facility for the design process. At any time the user can update the data in the tables using the menu-driven system. The system is intended to grow into a complete evaluation system based on Informix-4GL, a fourth-generation language that has a screen building utility, a menu building utility, a report writer and a window manager. This system will suggest the candidate microprocessor, suitable support chips and interconnection techniques for different applications.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Fault-tolerant interconnection networks for multiprocessor systems

Author: Nassar Hamed Mohamed
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/1989

Interconnection networks represent the backbone of multiprocessor systems. A failure in the network, therefore, could seriously degrade system performance. For this reason, fault tolerance has been regarded as a major consideration in interconnection network design. This thesis presents two novel techniques to provide fault tolerance capabilities to three major networks: the Baseline network, the Benes network and the Clos network. First, the Simple Fault Tolerance Technique (SFT) is presented. The SFT technique is the result of merging two widely known interconnection mechanisms: a normal interconnection network and a shared bus. This technique is most suitable for networks with small switches, such as the Baseline network and the Benes network. For the Clos network, whose switches may be too large for the SFT, another technique is developed to produce the Fault-Tolerant Clos (FTC) network. In the FTC, one switch is added to each stage. The two techniques are described and thoroughly analyzed.

Digital Commons @ New Jersey Institute of Technology (NJIT)