Search CORE

1,297 research outputs found

Advanced information processing system: Local system services

Author: Alger Linda
Burkhardt Laura
Stasiowski Peter
Whittredge Roy
Publication venue
Publication date
Field of study

The Advanced Information Processing System (AIPS) is a multi-computer architecture composed of hardware and software building blocks that can be configured to meet a broad range of application requirements. The hardware building blocks are fault-tolerant, general-purpose computers, fault-and damage-tolerant networks (both computer and input/output), and interfaces between the networks and the computers. The software building blocks are the major software functions: local system services, input/output, system services, inter-computer system services, and the system manager. The foundation of the local system services is an operating system with the functions required for a traditional real-time multi-tasking computer, such as task scheduling, inter-task communication, memory management, interrupt handling, and time maintenance. Resting on this foundation are the redundancy management functions necessary in a redundant computer and the status reporting functions required for an operator interface. The functional requirements, functional design and detailed specifications for all the local system services are documented

NASA Technical Reports Server

DeSyRe: on-Demand System Reliability

Author: Armato Antonino
Bouganis Christos-Savvas
Falsafi Babak
Gaydadjiev Georgi
Isaza Sebastian
Malek Alirad
Mariani Riccardo
Pnevmatikatos Dionisios N
Pradhan Dhiraj K
Rauwerda Gerard
Seepers Robert
Shafik Rishad Ahmed
Sourdis Ioannis
Strydis Christos
Sunesen Kim
Theodoropoulos Dimitris
Tzilis Stavros
Vavouras Michail
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013
Field of study

The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect and fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints

Southampton (e-Prints Soton)

EUR Research Repository

Chalmers Research

Chalmers Publication Library

Explore Bristol Research

On the use of embedded debug features for permanent and transient fault resilience in microprocessors

Author: Entrena Arrontes Luis Alfonso
Gallardo Campos Margarita
García Valderas Mario
Grosso M.
López Ongil Celia
Portela García Marta
Sonza Reorda Matteo
Publication venue: 'Elsevier BV'
Publication date: 01/07/2012
Field of study

Microprocessor-based systems are employed in an increasing number of applications where dependability is a major constraint. For this reason detecting faults arising during normal operation while introducing the least possible penalties is a main concern. Different forms of redundancy have been employed to ensure error-free behavior, while error detection mechanisms can be employed where some detection latency is tolerated. However, the high complexity and the low observability of microprocessors internal resources make the identification of adequate on-line error detection strategies a very challenging task, which can be tackled at circuit or system level. Concerning system-level strategies, a common limitation is in the mechanism used to monitor program execution and then detect errors as soon as possible, so as to reduce their impact on the application. In this work, an on-line error detection approach based on the reuse of available debugging infrastructures is proposed. The approach can be applied to different system architectures profiting from the debug trace port available in most of current microprocessors to observe possible misbehaviors. Two microprocessors have been used to study the applicability of the solution. LEON3 and ARM7TDMI. Results show that the presented fault detection technique enhances observability and thus error detection abilities in microprocessor-based systems without requiring modifications on the core architecture

Universidad Carlos III de Madrid e-Archivo

A New Hybrid Nonintrusive Error-Detection Technique Using Dual Control-Flow Monitoring

Author: Du B.
Entrena Arrontes Luis Alfonso
Lindoso Muñoz Almudena
Parra Avellaneda Luis Isaías
Portela García Marta
Sonza Reorda Matteo
Sterpone Luca
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2014
Field of study

Hybrid error-detection techniques combine software techniques with an external hardware module that monitors the execution of a microprocessor. The external hardware module typically observes the control flow at the input or at the output of the microprocessor and compares it with the expected one. This paper proposes a new hybrid technique that monitors the control flow at both points and compares them to detect possible errors. The proposed approach does not require any software modification to detect control-flow errors. Fault-injection campaigns have been performed on an LEON3 microprocessor. The results show full control-flow error detection with no performance degradation and small area overhead. A complete solution can be obtained by complementing the proposed approach with software fault-tolerance techniques for data errors.This work was supported in part by the Spanish Government under Contract TEC2010-22095-C03-03

Universidad Carlos III de Madrid e-Archivo

Recommended from our members

Efficient Memory-Protected Integration of Add-On Software Subsystems in Small Embedded Automotive Applications

Author: Khan A
Schäfer A
Zetlmeisl M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/02/2007
Field of study

Current innovations in the automotive industry evolve mainly in the electronics and software domain. This leads to an increasing integration of additional software subsystems into already existing electronic control units (ECUs) to cope with the raised amount and complexity of present ECUs in modern high-end vehicles. This paper discusses different approaches which are required to integrate such add-on software subsystems in an isolated memory domain, and considers particularly the special needs of small embedded systems—including the limited hardware support. Special focus is brought to the efficient detection of malicious memory accesses, as well as the benefits of a thereupon possible and adaptable failure-handling strategy. All investigations are based on a developed memory-protection framework which has been tailored to the special needs of a sample vehicle dynamics control system. Its usage allows the combination of. integrating additional subsystems without reducing the main application’s availability

Brunel University Research Archive

Study of fault-tolerant software technology

Author: Broglio C.
Goldberg J.
Hitt E.
Levitt K.
Slivinski T.
Webb J.
Wild C.
Publication venue
Publication date
Field of study

Presented is an overview of the current state of the art of fault-tolerant software and an analysis of quantitative techniques and models developed to assess its impact. It examines research efforts as well as experience gained from commercial application of these techniques. The paper also addresses the computer architecture and design implications on hardware, operating systems and programming languages (including Ada) of using fault-tolerant software in real-time aerospace applications. It concludes that fault-tolerant software has progressed beyond the pure research state. The paper also finds that, although not perfectly matched, newer architectural and language capabilities provide many of the notations and functions needed to effectively and efficiently implement software fault-tolerance

NASA Technical Reports Server

Airborne Advanced Reconfigurable Computer System (ARCS)

Author: Bjurman B. E.
Jenkins G. M.
Masreliez C. J.
Mcclellan K. L.
Templeman J. E.
Publication venue
Publication date
Field of study

A digital computer subsystem fault-tolerant concept was defined, and the potential benefits and costs of such a subsystem were assessed when used as the central element of a new transport's flight control system. The derived advanced reconfigurable computer system (ARCS) is a triple-redundant computer subsystem that automatically reconfigures, under multiple fault conditions, from triplex to duplex to simplex operation, with redundancy recovery if the fault condition is transient. The study included criteria development covering factors at the aircraft's operation level that would influence the design of a fault-tolerant system for commercial airline use. A new reliability analysis tool was developed for evaluating redundant, fault-tolerant system availability and survivability; and a stringent digital system software design methodology was used to achieve design/implementation visibility

NASA Technical Reports Server