
Immunotronics - novel finite-state-machine architectures with built-in self-test using self-nonself differentiation

Author: Bradley D.W.
Tyrrell A.M.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2002

A novel approach to hardware fault tolerance is demonstrated that takes inspiration from the human immune system as a method of fault detection. The human immune system is a remarkable system of interacting cells and organs that protects the body from invasion and maintains reliable operation even in the presence of invading bacteria or viruses. This paper seeks to address the field of electronic hardware fault tolerance from an immunological perspective with the aim of showing how novel methods based upon the operation of the immune system can both complement and create new approaches to the development of fault detection mechanisms for reliable hardware systems. In particular, it is shown that by use of partial matching, as prevalent in biological systems, high fault coverage can be achieved with the added advantage of reducing memory requirements. The development of a generic finite-state-machine immunization procedure is discussed that allows any system that can be represented in such a manner to be "immunized" against the occurrence of faulty operation. This is demonstrated by the creation of an immunized decade counter that can detect the presence of faults in real time.

CiteSeerX

Crossref

White Rose Research Online
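
To make the self/nonself idea in the Immunotronics abstract above concrete, here is a minimal Python sketch of a transition monitor for a decade counter. It is an illustration under stated assumptions, not the authors' architecture: the function names, the (current, next) transition encoding, and the use of an explicit "self" set are all hypothetical, and the partial-matching scheme that the abstract credits with reducing memory requirements is not modelled.

```python
# Hypothetical sketch: names and the transition encoding are assumptions,
# not taken from the paper. Partial matching is deliberately omitted.

def decade_counter_next(state: int) -> int:
    """Fault-free next-state function of a decade (modulo-10) counter."""
    return (state + 1) % 10


# "Self": every valid (current, next) transition of the healthy counter.
SELF = {(s, decade_counter_next(s)) for s in range(10)}


def is_nonself(current: int, nxt: int) -> bool:
    """Flag a transition as faulty when it is not part of the self set."""
    return (current, nxt) not in SELF


def monitor(trace):
    """Return the indices of transitions in a state trace that look faulty."""
    return [
        i for i in range(len(trace) - 1)
        if is_nonself(trace[i], trace[i + 1])
    ]
```

Calling `monitor` on a trace such as `[0, 1, 3, 4]` would report the transition at index 1, because a healthy decade counter never steps from 1 to 3.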

A fault-tolerant intelligent robotic control system

Author: Marzwell Neville I.
Tso Kam Sing

This paper describes the concept, design, and features of a fault-tolerant intelligent robotic control system being developed for space and commercial applications that require high dependability. The comprehensive strategy integrates system-level hardware/software fault tolerance with task-level handling of uncertainties and unexpected events for robotic control. The underlying architecture for system-level fault tolerance is the distributed recovery block, which protects against application software, system software, hardware, and network failures. Task-level fault tolerance provisions are implemented in a knowledge-based system which utilizes advanced automation techniques such as rule-based and model-based reasoning to monitor, diagnose, and recover from unexpected events. The two-level design provides tolerance of two or more faults occurring serially at any level of command, control, sensing, or actuation. The potential benefits of such a fault-tolerant robotic control system include: (1) a minimized potential for damage to humans, the work site, and the robot itself; (2) continuous operation with a minimum of uncommanded motion in the presence of failures; and (3) more reliable autonomous operation providing increased efficiency in the execution of robotic tasks and decreased demand on human operators for controlling and monitoring the robotic servicing routines.

NASA Technical Reports Server
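
The recovery block named in the abstract above follows a general pattern: run a primary routine, check its result with an acceptance test, and fall back to an alternate if the check fails. The Python sketch below shows only that pattern. It is a simplified, sequential, single-node illustration, whereas a distributed recovery block runs its variants on separate nodes; every name in it (`recovery_block`, `AllVariantsFailed`, the example variants) is hypothetical and not taken from the paper.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllVariantsFailed(Exception):
    """Raised when no variant produces a result that passes the test."""


def recovery_block(
    variants: Sequence[Callable[[], T]],
    acceptance_test: Callable[[T], bool],
) -> T:
    """Try each variant in order; return the first acceptable result."""
    for variant in variants:
        try:
            result = variant()
        except Exception:
            # A crashing variant is treated like a failed acceptance test.
            continue
        if acceptance_test(result):
            return result
    raise AllVariantsFailed("no variant passed the acceptance test")


# Hypothetical example: the primary returns an implausible negative range
# reading, so the alternate's value is used instead.
def primary() -> float:
    return -4.0


def alternate() -> float:
    return 2.0


reading = recovery_block([primary, alternate], lambda x: x >= 0.0)
```

The acceptance test is the key design choice: it has to be cheap and independent of the variants, otherwise a shared defect can pass unnoticed.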

FaulTM: Fault-tolerance using hardware transactional memory

Author: Cristal Kestelman Adrián
Hur Ibrahim
Unsal Osman Sabri
Valero Cortés Mateo
Yalcin Gulay
Publication date: 01/01/2010

Fault tolerance has become an essential concern for processor designers due to increasing soft-error rates. In this study, we are motivated by the fact that Transactional Memory (TM) hardware provides an ideal base upon which to build a fault-tolerant system. We show how it is possible to provide low-cost fault tolerance for serial programs by using a minimally modified Hardware Transactional Memory (HTM) that features lazy conflict detection and lazy data versioning. This scheme, called FaulTM, employs a hybrid hardware-software fault-tolerance technique. On the software side, the FaulTM programming model gives programmers the flexibility to trade off performance against reliability. Our experimental results indicate that FaulTM incurs relatively little performance overhead by reducing the number of comparisons and by leveraging already proposed TM hardware. We also conduct experiments which indicate that the baseline FaulTM design has good error coverage. To the best of our knowledge, this is the first architectural fault-tolerance proposal using Hardware Transactional Memory.

UPCommons. Portal del coneixement obert de la UPC

A FPGA-Based Reconfigurable Software Architecture for Highly Dependable Systems

Author: Di Carlo Stefano
Prinetto Paolo Ernesto
Scionti A.
Publication venue: IEEE Computer Society
Publication date: 01/01/2009

Nowadays, systems-on-chip are commonly equipped with reconfigurable hardware. The use of hybrid architectures based on a mixture of general-purpose processors and reconfigurable components has gained importance across the scientific community, allowing a significant improvement in computational performance. Along with the demand for performance, the great sensitivity of reconfigurable hardware devices to physical defects leads to demand for highly dependable and fault-tolerant systems. This paper proposes an FPGA-based reconfigurable software architecture able to abstract the underlying hardware platform, giving a homogeneous view of it. The abstraction mechanism is used to implement fault tolerance mechanisms with a minimum impact on system performance.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Fault Injection for Embedded Microprocessor-based Systems

Author: Benso Alfredo
Rebaudengo Maurizio
Sonza Reorda Matteo
Publication venue: Verlag der Technischen Universität Graz
Publication date: 01/01/1999

Microprocessor-based embedded systems are increasingly used to control safety-critical systems (e.g., air and railway traffic control, nuclear plant control, aircraft and car control). In this case, fault tolerance mechanisms are introduced at the hardware and software level. Debugging and verifying the correct design and implementation of these mechanisms call for effective environments, and Fault Injection represents a viable solution for their implementation. In this paper we present a Fault Injection environment, named FlexFI, suitable to assess the correctness of the design and implementation of the hardware and software mechanisms existing in embedded microprocessor-based systems, and to compute the fault coverage they provide. The paper describes and analyzes different solutions for implementing the most critical modules, which differ in terms of cost, speed, and intrusiveness in the original system behavior.

ZENODO

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

ARPHA OAI-PMH Endpoint

ARPHA Preprints

PORTO Publications Open Repository TOrino
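
As a rough illustration of what "computing fault coverage" by fault injection means in the abstract above, the following Python sketch injects bit-flip faults into a data word and counts how many a simple parity check detects. This is not FlexFI and makes no claim about its design: the parity detector, the XOR fault masks, and all function names are assumptions chosen to keep the example small.

```python
# Hypothetical sketch of a fault-injection campaign; the parity detector
# and the XOR fault model are illustrative assumptions, not FlexFI.

def parity(word: int) -> int:
    """Even/odd parity bit of a non-negative integer word."""
    return bin(word).count("1") % 2


def fault_coverage(word: int, fault_masks) -> float:
    """Fraction of injected faults that the parity check detects.

    Each fault is an XOR mask: a set bit in the mask flips that bit of
    the word, modelling a transient bit flip.
    """
    if not fault_masks:
        raise ValueError("at least one fault must be injected")
    golden = parity(word)  # parity recorded before any fault occurs
    detected = sum(1 for mask in fault_masks if parity(word ^ mask) != golden)
    return detected / len(fault_masks)


WORD = 0b10110010
single_bit_faults = [1 << bit for bit in range(8)]
single_coverage = fault_coverage(WORD, single_bit_faults)

# A double bit flip leaves parity unchanged, so it escapes detection.
mixed_coverage = fault_coverage(WORD, [0b00000001, 0b00000011])
```

Under these assumptions every single-bit flip is caught, while the double-bit mask is missed, which is the kind of gap a fault-injection campaign is meant to expose.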

Compiler-Injected SIHFT for Embedded Operating Systems

Author: Davide Baroffio
Federico Reghenzani
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2023

Random hardware faults are a major concern for critical systems, especially when they are employed in high-radiation environments such as aerospace applications. While specialised hardware already exists for implementing fault tolerance, software solutions, named Software-Implemented Hardware Fault Tolerance (SIHFT), offer higher flexibility at a lower cost. This work describes a compiler-based approach for inserting instruction-level fault detection mechanisms in both the application code and the operating system. An experimental evaluation on an STM32 board running FreeRTOS shows the effectiveness of the proposed approach in detecting faults.

Archivio istituzionale della ricerca - Politecnico di Milano
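
One common way to realise instruction-level fault detection in software is to duplicate a computation and compare the two results. The Python sketch below shows that general idea only; the abstract above does not state which mechanisms the compiler inserts, so the duplication scheme, the `fault_mask` parameter used to simulate a bit flip, and all names here are assumptions for illustration.

```python
# Hypothetical sketch of duplicate-and-compare fault detection.
# The fault_mask argument exists only to simulate a transient bit flip.

class FaultDetected(Exception):
    """Raised when the primary and shadow results disagree."""


def checked_add(a: int, b: int, fault_mask: int = 0) -> int:
    """Add two integers twice and compare the results.

    fault_mask is XORed into the primary result to emulate a hardware
    fault corrupting one copy of the computation.
    """
    primary = (a + b) ^ fault_mask  # copy that may be corrupted
    shadow = a + b                  # redundant copy of the same operation
    if primary != shadow:
        raise FaultDetected(f"mismatch: {primary} != {shadow}")
    return primary
```

In a real SIHFT pass the duplication is inserted by the compiler at the instruction level rather than written by hand, and the cost is the extra execution time of the shadow operations and comparisons.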

Aspect-oriented fault tolerance for real-time embedded systems

Author: Afonso Francisco
Brito Nuno
Montenegro Sérgio
Silva Carlos A.
Tavares Adriano
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2008

Real-time embedded systems for safety-critical applications have to introduce fault tolerance mechanisms in order to cope with hardware and software errors. Fault tolerance is usually applied by means of redundancy and diversity. Redundant hardware implies the establishment of a distributed system executing a set of fault tolerance strategies by software, and may also employ some form of diversity, by using different variants or versions for the same processing. This paper describes our approach to introducing fault tolerance in distributed embedded systems applications, using aspect-oriented programming (AOP). A real-time operating system supporting middleware thread communication was integrated into a fault-tolerant framework. The introduction of fault tolerance in the system is performed by AOP at the application thread level. The advantages of this approach include higher modularization, less effort for legacy systems evolution, and better configurability for testing and product line development. This work has been tested and evaluated successfully in several fault-tolerant configurations and presented no significant performance or memory footprint costs. Funding: Fundação para a Ciência e a Tecnologia (FCT).

CiteSeerX

Universidade do Minho: RepositoriUM

Crossref

Software Fault Tolerance in Real-Time Systems: Identifying the Future Research Questions

Author: Federico Reghenzani
William Fornaciari
Zhishan Guo
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2023

Tolerating hardware faults in modern architectures is becoming a prominent problem due to the miniaturization of the hardware components, their increasing complexity, and the necessity to reduce costs. Software-Implemented Hardware Fault Tolerance approaches have been developed to improve system dependability against hardware faults without resorting to custom hardware solutions. However, these come at the expense of making the timing constraints of the applications/activities harder to satisfy from a scheduling standpoint. This paper surveys the current state of the art of fault tolerance approaches when used in the context of real-time systems, identifying the main challenges and the cross-links between these two topics. We propose a joint scheduling-failure analysis model that highlights the formal interactions among software fault tolerance mechanisms and timing properties. This model allows us to present and discuss many open research questions with the final aim of spurring future research activities.

Archivio istituzionale della ricerca - Politecnico di Milano

Application-level fault tolerance in real-time embedded systems

Author: Afonso Francisco
Montenegro Sérgio
Silva Carlos A.
Tavares Adriano
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/06/2008

Critical real-time embedded systems need to make use of fault tolerance techniques to cope with operation-time errors, either in hardware or software. Fault tolerance is usually applied by means of redundancy and diversity. Redundant hardware implies the establishment of a distributed system executing a set of fault tolerance strategies by software, and may also employ some form of diversity, by using different variants or versions for the same processing. This work proposes and evaluates a fault tolerance framework for supporting the development of dependable applications. This framework is built upon basic operating system services and middleware communications and brings flexible and transparent support for application threads. A case study involving radar filtering is described and the framework's advantages and drawbacks are discussed. Funding: Fundação para a Ciência e a Tecnologia (FCT).

Universidade do Minho: RepositoriUM

Crossref

Fault diagnostic instrumentation design for environmental control and life support systems

Author: Powell J. D., Jr.
Wynveen R. A.
Yang P. Y.
You K. C.

As a development phase moves toward flight hardware, system availability becomes an important design aspect which requires high reliability and maintainability. As part of continuous development efforts, a program to evaluate, design, and demonstrate advanced instrumentation fault diagnostics was successfully completed. Fault tolerance designs for reliability and other instrumentation capabilities to increase maintainability were evaluated and studied.

NASA Technical Reports Server