30,762 research outputs found

Copilot: Monitoring Embedded Systems

Author: Goodloe Alwyn
Niller Sebastian
Pike Lee
Wegmann Nis
Publication date: 01/01/2013

Runtime verification (RV) is a natural fit for ultra-critical systems, where correctness is imperative. In ultra-critical systems, even if the software is fault-free, the inherent unreliability of commodity hardware and the adversity of operational environments mean that processing units (and their hosted software) are replicated, and fault-tolerant algorithms are used to compare the outputs. We investigate both software monitoring in distributed fault-tolerant systems and the implementation of fault-tolerance mechanisms using RV techniques. We describe the Copilot language and compiler, specifically designed for generating monitors for distributed, hard real-time systems. We also describe two case studies in which we generated Copilot monitors in avionics systems.

Copenhagen University Research Information System

NASA Technical Reports Server

The implementation and use of Ada on distributed systems with high reliability requirements

Author: Gregory S. T.
Knight J. C.
Urquhart J. I. A.

The use and implementation of Ada in distributed environments in which reliability is the primary concern were investigated. In particular, the study examined the concept that a distributed system may be programmed entirely in Ada, so that the individual tasks of the system are unconcerned with which processors they execute on, and that failures may occur in the software or the underlying hardware. Progress is discussed in the following areas: continued development and testing of the fault-tolerant Ada testbed; development of suggested changes to Ada so that it might more easily cope with the failures of interest; and design of new approaches to fault-tolerant software in real-time systems, and integration of these ideas into Ada.

NASA Technical Reports Server

Aspect-oriented fault tolerance for real-time embedded systems

Author: Afonso Francisco
Brito Nuno
Montenegro Sérgio
Silva Carlos A.
Tavares Adriano
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2008

Real-time embedded systems for safety-critical applications have to introduce fault tolerance mechanisms in order to cope with hardware and software errors. Fault tolerance is usually applied by means of redundancy and diversity. Redundant hardware implies the establishment of a distributed system executing a set of fault tolerance strategies in software, and may also employ some form of diversity by using different variants or versions for the same processing. This paper describes our approach to introducing fault tolerance in distributed embedded systems applications using aspect-oriented programming (AOP). A real-time operating system supporting middleware thread communication was integrated into a fault-tolerant framework. The introduction of fault tolerance in the system is performed by AOP at the application thread level. The advantages of this approach include higher modularization, less effort for legacy systems evolution, and better configurability for testing and product line development. This work has been tested and evaluated successfully in several fault-tolerant configurations and presented no significant performance or memory footprint costs. Fundação para a Ciência e a Tecnologia (FCT)

CiteSeerX

Universidade do Minho: RepositoriUM

Crossref

Implications of VLSI Fault Models and Distributed Systems Failure Models -- A hardware designer's view

Author: Fuchs Gottfried
Publication venue: Dagstuhl Seminar Proceedings. 08371 - Fault-Tolerant Distributed Algorithms on VLSI Chips
Publication date: 01/01/2009

The fault and failure models, as well as their semantics, differ considerably between the VLSI community and the distributed systems/algorithms community. Pointing out the mismatch between those fault and failure models is the main part of this work. The impact of the implemented failure model, in terms of hardware effort and system complexity, is shown on different VLSI implementations of distributed algorithms. Many open questions remain, however, mostly related to the coverage analysis of hardware-implemented fault-tolerant algorithms.

Dagstuhl Research Online Publication Server

Fault-tolerant Algorithms for Tick-Generation in Asynchronous Logic: Robust Pulse Generation

Author: Dolev Danny
Fuegger Matthias
Lenzen Christoph
Schmid Ulrich
Publication date: 14/10/2011

Today's hardware technology presents a new challenge in designing robust systems. Deep submicron VLSI technology introduced transient and permanent faults that were never considered in low-level system designs in the past. Still, robustness of that part of the system is crucial and needs to be guaranteed for any successful product. Distributed systems, on the other hand, have been dealing with similar issues for decades. However, neither the basic abstractions nor the complexity of contemporary fault-tolerant distributed algorithms match the peculiarities of hardware implementations. This paper is intended to be part of an attempt to overcome this gap between theory and practice for the clock synchronization problem. Solving this task sufficiently well will allow building a very robust high-precision clocking system for hardware designs such as systems-on-chips in critical applications. As our first building block, we describe and prove correct a novel Byzantine fault-tolerant self-stabilizing pulse synchronization protocol, which can be implemented using standard asynchronous digital logic. Despite the strict limitations introduced by hardware designs, it offers optimal resilience and lower complexity than all existing protocols.
Comment: 52 pages, 7 figures, extended abstract published at SSS 201

arXiv.org e-Print Archive

CiteSeerX

Classes of Byzantine Fault-Tolerant Algorithms for Dependable Distributed Systems.

Author: Postma André
Publication venue: Universiteit Twente
Publication date: 01/01/1998

This thesis concentrates on the design of new algorithms for fault-tolerant systems based on system-level hardware masking redundancy. It is argued that any system in which a reliability improvement of at least a factor of 100 is required should be based on system-level hardware masking redundancy. The technique is applicable in a redundant system consisting of a number of processors, in which the system services are replicated on the different processors, and it provides resilience to a limited number of faulty processors in the system. The technique is most effective in a distributed system, since the autonomous nature and geographical distribution of the processors in such a system largely contribute to achieving independence between failures of different processors, which improves the reliability of the system.

CiteSeerX

University of Twente Research Information

How to speedup fault-tolerant clock generation in VLSI systems-on-chip via pipelining

Author: Andreas Dielacher
Matthias Függer
Ulrich Schmid
Publication date: 01/01/2009

Fault-tolerant clocking schemes become inevitable when it comes to highly reliable chip designs. Because of the additional hardware overhead, existing solutions are considerably slower than their non-reliable counterparts. In this paper, we demonstrate that pipelining is a viable approach to speed up the distributed fault-tolerant DARTS clock generation approach introduced in (Függer, Schmid, Fuchs, Kempf, EDCC'06), where a distributed Byzantine fault-tolerant tick generation algorithm has been used to replace the traditional quartz oscillator and highly balanced clock tree in VLSI Systems-on-Chip (SoCs). We provide a pipelined version of the original DARTS algorithm, termed pDARTS, together with a novel modeling and analysis framework for hardware-implemented asynchronous fault-tolerant distributed algorithms, which is employed for rigorously analyzing its correctness and performance. Our results, which have also been confirmed by the experimental evaluation of an FPGA prototype implementation, reveal that pipelining indeed makes it possible to entirely remove the adverse effect of large interconnect delays on the achievable clock frequency, and demonstrate again that methods and results from distributed algorithms research can successfully be applied in the VLSI context.

CiteSeerX

Fault Tolerant Real Time Dynamic Scheduling Algorithm For Heterogeneous Distributed System

Author: Ekka A A
Publication date: 01/01/2007

Fault tolerance is an important key to establishing dependability in Real Time Distributed Systems (RTDS). In fault-tolerant real-time distributed systems, the detection of a fault and its recovery should be executed in a timely manner so that, in spite of fault occurrences, the intended output of real-time computations always takes place on time. Hardware and software redundancy are well-known effective methods for fault tolerance, where extra hardware (e.g., processors, communication links) and software (e.g., tasks, messages) are added to the system to deal with faults. The performance of an RTDS is mostly governed by the efficiency of its scheduling algorithm, and schedulability analysis is performed on the system to ensure that the timing constraints are met. This thesis examines the scenarios where a real-time system requires very little redundant hardware to tolerate failures in heterogeneous real-time distributed systems with point-to-point communication links. Fault tolerance can be achieved by..

ethesis@nitr