99,728 research outputs found
ARIANE 5 ON BOARD SOFTWARE : REDUNDANCY MANAGEMENT
International audienceAriane 5 software, embedded both in the OBC (On Board Computer) and in electrical equipments, deal with Flight control sub-system management, Flight Management activities but also with Telemetry and Safeguard sub-system.Flight Program Software (FPS) embedded in each OBC (nominal and back-up), deals as well with FDIR (Fault Detection Isolation and Recovery) management. This, by using avionics redundancies and taking system and electrical architecture into account in order to reach the high level of reliability required for the Ariane 5 launcher. The running of FPS in nominal and back-up OBC is not dynamically similar due to the fact that, at one time, only one OBC controls both electrical chains through two MIL-STD-1553B buses. Therefore, the FDIR and the FPS redundancy management in each OBC (nominal and back-up) are quite different. Regarding to equipments, even if most of them are redundant (except those used for telemetry functions), their associated embedded software running are identical for both the nominal and the redundant. All equipments are remote terminals on the two redundant MIL-STD-1553B buses (except those used for telemetry which are only spies) and so, do not manage any system or avionics reconfiguration.This paper presents the algorithms and mechanisms implemented in the A5 embedded software (mainly in the FPS running in OBC's) which deal with FDIR and redundancy management
Microprocessor fault-tolerance via on-the-fly partial reconfiguration
This paper presents a novel approach to exploit FPGA dynamic partial reconfiguration to improve the fault tolerance of complex microprocessor-based systems, with no need to statically reserve area to host redundant components. The proposed method not only improves the survivability of the system by allowing the online replacement of defective key parts of the processor, but also provides performance graceful degradation by executing in software the tasks that were executed in hardware before a fault and the subsequent reconfiguration happened. The advantage of the proposed approach is that thanks to a hardware hypervisor, the CPU is totally unaware of the reconfiguration happening in real-time, and there's no dependency on the CPU to perform it. As proof of concept a design using this idea has been developed, using the LEON3 open-source processor, synthesized on a Virtex 4 FPG
Real-time and fault tolerance in distributed control software
Closed loop control systems typically contain multitude of spatially distributed sensors and actuators operated simultaneously. So those systems are parallel and distributed in their essence. But mapping this parallelism onto the given distributed hardware architecture, brings in some additional requirements: safe multithreading, optimal process allocation, real-time scheduling of bus and network resources. Nowadays, fault tolerance methods and fast even online reconfiguration are becoming increasingly important. All those often conflicting requirements, make design and implementation of real-time distributed control systems an extremely difficult task, that requires substantial knowledge in several areas of control and computer science. Although many design methods have been proposed so far, none of them had succeeded to cover all important aspects of the problem at hand. [1] Continuous increase of production in embedded market, makes a simple and natural design methodology for real-time systems needed more then ever
Restart-Based Fault-Tolerance: System Design and Schedulability Analysis
Embedded systems in safety-critical environments are continuously required to
deliver more performance and functionality, while expected to provide verified
safety guarantees. Nonetheless, platform-wide software verification (required
for safety) is often expensive. Therefore, design methods that enable
utilization of components such as real-time operating systems (RTOS), without
requiring their correctness to guarantee safety, is necessary.
In this paper, we propose a design approach to deploy safe-by-design embedded
systems. To attain this goal, we rely on a small core of verified software to
handle faults in applications and RTOS and recover from them while ensuring
that timing constraints of safety-critical tasks are always satisfied. Faults
are detected by monitoring the application timing and fault-recovery is
achieved via full platform restart and software reload, enabled by the short
restart time of embedded systems. Schedulability analysis is used to ensure
that the timing constraints of critical plant control tasks are always
satisfied in spite of faults and consequent restarts. We derive schedulability
results for four restart-tolerant task models. We use a simulator to evaluate
and compare the performance of the considered scheduling models
Prototype of Fault Adaptive Embedded Software for Large-Scale Real-Time Systems
This paper describes a comprehensive prototype of large-scale fault adaptive
embedded software developed for the proposed Fermilab BTeV high energy physics
experiment. Lightweight self-optimizing agents embedded within Level 1 of the
prototype are responsible for proactive and reactive monitoring and mitigation
based on specified layers of competence. The agents are self-protecting,
detecting cascading failures using a distributed approach. Adaptive,
reconfigurable, and mobile objects for reliablility are designed to be
self-configuring to adapt automatically to dynamically changing environments.
These objects provide a self-healing layer with the ability to discover,
diagnose, and react to discontinuities in real-time processing. A generic
modeling environment was developed to facilitate design and implementation of
hardware resource specifications, application data flow, and failure mitigation
strategies. Level 1 of the planned BTeV trigger system alone will consist of
2500 DSPs, so the number of components and intractable fault scenarios involved
make it impossible to design an `expert system' that applies traditional
centralized mitigative strategies based on rules capturing every possible
system state. Instead, a distributed reactive approach is implemented using the
tools and methodologies developed by the Real-Time Embedded Systems group.Comment: 2nd Workshop on Engineering of Autonomic Systems (EASe), in the 12th
Annual IEEE International Conference and Workshop on the Engineering of
Computer Based Systems (ECBS), Washington, DC, April, 200
A FPGA-Based Reconfigurable Software Architecture for Highly Dependable Systems
Nowadays, systems-on-chip are commonly equipped with reconfigurable hardware. The use of hybrid architectures based on a mixture of general purpose processors and reconfigurable components has gained importance across the scientific community allowing a significant improvement of computational performance. Along with the demand for performance, the great sensitivity of reconfigurable hardware devices to physical defects lead to the request of highly dependable and fault tolerant systems. This paper proposes an FPGA-based reconfigurable software architecture able to abstract the underlying hardware platform giving an homogeneous view of it. The abstraction mechanism is used to implement fault tolerance mechanisms with a minimum impact on the system performanc
- …