Search CORE

2,169 research outputs found

Fault-tolerant computer study

Author: Avizienis A. A.
Ercegovac M. D.
Rennels D. A.
Publication venue
Publication date
Field of study

A set of building block circuits is described which can be used with commercially available microprocessors and memories to implement fault tolerant distributed computer systems. Each building block circuit is intended for VLSI implementation as a single chip. Several building blocks and associated processor and memory chips form a self checking computer module with self contained input output and interfaces to redundant communications buses. Fault tolerance is achieved by connecting self checking computer modules into a redundant network in which backup buses and computer modules are provided to circumvent failures. The requirements and design methodology which led to the definition of the building block circuits are discussed

NASA Technical Reports Server

Design of a fault tolerant airborne digital computer. Volume 1: Architecture

Author: Goldberg J.
Green M. W.
Levitt K. N.
Neumann P. G.
Wensley J. H.
Publication venue
Publication date
Field of study

This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable. contractible, and (4) The design is to appropriate to post 1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive

NASA Technical Reports Server

Fault-tolerant building-block computer study

Author: Rennels D. A.
Publication venue
Publication date
Field of study

Ultra-reliable core computers are required for improving the reliability of complex military systems. Such computers can provide reliable fault diagnosis, failure circumvention, and, in some cases serve as an automated repairman for their host systems. A small set of building-block circuits which can be implemented as single very large integration devices, and which can be used with off-the-shelf microprocessors and memories to build self checking computer modules (SCCM) is described. Each SCCM is a microcomputer which is capable of detecting its own faults during normal operation and is described to communicate with other identical modules over one or more Mil Standard 1553A buses. Several SCCMs can be connected into a network with backup spares to provide fault-tolerant operation, i.e. automated recovery from faults. Alternative fault-tolerant SCCM configurations are discussed along with the cost and reliability associated with their implementation

NASA Technical Reports Server

Ultrafast Codes for Multiple Adjacent Error Correction and Double Error Detection

Author: Baraza Calvo Juan Carlos
Gil Tomás Daniel Antonio
Gil Pedro
Gracia-Morán Joaquín
Saiz-Adalid Luis-J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

(c) 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.[EN] Reliable computer systems employ error control codes (ECCs) to protect information from errors. For example, memories are frequently protected using single error correction-double error detection (SEC-DED) codes. ECCs are traditionally designed to minimize the number of redundant bits, as they are added to each word in the whole memory. Nevertheless, using an ECC introduces encoding and decoding latencies, silicon area usage and power consumption. In other computer units, these parameters should be optimized, and redundancy would be less important. For example, protecting registers against errors remains a major concern for deep sub-micron systems due to technology scaling. In this case, an important requirement for register protection is to keep encoding and decoding latencies as short as possible. Ultrafast error control codes achieve very low delays, independently of the word length, increasing the redundancy. This paper summarizes previous works on Ultrafast codes (SEC and SEC-DED), and proposes new codes combining double error detection and adjacent error correction. We have implemented, synthesized and compared different Ultrafast codes with other state-of-the-art fast codes. The results show the validity of the approach, achieving low latencies and a good balance with silicon area and power consumption.This work was supported in part by the Spanish Government under Project TIN2016-81075-R, and in part by the Primeros Proyectos de Investigacion, Vicerrectorado de Investigacion, Innovacion y Transferencia de la Universitat Politecnica de Valencia (UPV), Valencia, Spain, under Project PAID-06-18 20190032.Saiz-Adalid, L.; Gracia-Morán, J.; Gil Tomás, DA.; Baraza Calvo, JC.; Gil, P. (2019). Ultrafast Codes for Multiple Adjacent Error Correction and Double Error Detection. IEEE Access. 7:151131-151143. https://doi.org/10.1109/ACCESS.2019.2947315S151131151143

Crossref

RiuNet

DeSyRe: on-Demand System Reliability

Author: Armato Antonino
Bouganis Christos-Savvas
Falsafi Babak
Gaydadjiev Georgi
Isaza Sebastian
Malek Alirad
Mariani Riccardo
Pnevmatikatos Dionisios N
Pradhan Dhiraj K
Rauwerda Gerard
Seepers Robert
Shafik Rishad Ahmed
Sourdis Ioannis
Strydis Christos
Sunesen Kim
Theodoropoulos Dimitris
Tzilis Stavros
Vavouras Michail
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013
Field of study

The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect and fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints

Southampton (e-Prints Soton)

Crossref

EUR Research Repository

Chalmers Research

Chalmers Publication Library

Explore Bristol Research