Search CORE

48,252 research outputs found

A fault-tolerant intelligent robotic control system

Author: Marzwell Neville I.
Tso Kam Sing
Publication venue
Publication date
Field of study

This paper describes the concept, design, and features of a fault-tolerant intelligent robotic control system being developed for space and commercial applications that require high dependability. The comprehensive strategy integrates system level hardware/software fault tolerance with task level handling of uncertainties and unexpected events for robotic control. The underlying architecture for system level fault tolerance is the distributed recovery block which protects against application software, system software, hardware, and network failures. Task level fault tolerance provisions are implemented in a knowledge-based system which utilizes advanced automation techniques such as rule-based and model-based reasoning to monitor, diagnose, and recover from unexpected events. The two level design provides tolerance of two or more faults occurring serially at any level of command, control, sensing, or actuation. The potential benefits of such a fault tolerant robotic control system include: (1) a minimized potential for damage to humans, the work site, and the robot itself; (2) continuous operation with a minimum of uncommanded motion in the presence of failures; and (3) more reliable autonomous operation providing increased efficiency in the execution of robotic tasks and decreased demand on human operators for controlling and monitoring the robotic servicing routines

Development and analysis of the Software Implemented Fault-Tolerance (SIFT) computer

Author: Goldberg J.
Green M. W.
Kautz W. H.
Levitt K. N.
Melliar-Smith P. M.
Schwartz R. L.
Weinstock C. B.
Publication venue
Publication date
Field of study

SIFT (Software Implemented Fault Tolerance) is an experimental, fault-tolerant computer system designed to meet the extreme reliability requirements for safety-critical functions in advanced aircraft. Errors are masked by performing a majority voting operation over the results of identical computations, and faulty processors are removed from service by reassigning computations to the nonfaulty processors. This scheme has been implemented in a special architecture using a set of standard Bendix BDX930 processors, augmented by a special asynchronous-broadcast communication interface that provides direct, processor to processor communication among all processors. Fault isolation is accomplished in hardware; all other fault-tolerance functions, together with scheduling and synchronization are implemented exclusively by executive system software. The system reliability is predicted by a Markov model. Mathematical consistency of the system software with respect to the reliability model has been partially verified, using recently developed tools for machine-aided proof of program correctness

Architecting fault-tolerant software systems

Author: Sözer Hasan
Publication venue: University of Twente
Publication date: 01/01/2009
Field of study

The increasing size and complexity of software systems makes it hard to prevent or remove all possible faults. Faults that remain in the system can eventually lead to a system failure. Fault tolerance techniques are introduced for enabling systems to recover and continue operation when they are subject to faults. Many fault tolerance techniques are available but incorporating them in a system is not always trivial. We consider the following problems in designing a fault-tolerant system. First, existing reliability analysis techniques generally do not prioritize potential failures from the end-user perspective and accordingly do not identify sensitivity points of a system. \ud Second, existing architecture styles are not well-suited for specifying, communicating and analyzing design decisions that are particularly related to the fault-tolerant aspects of a system. Third, there are no adequate analysis techniques that evaluate the impact of fault tolerance techniques on the functional decomposition of software architecture. Fourth, realizing a fault-tolerant design usually requires a substantial development and maintenance effort. \ud To tackle the first problem, we propose a scenario-based software architecture reliability analysis method, called SARAH that benefits from mature reliability engineering techniques (i.e. FMEA, FTA) to provide an early reliability analysis of the software architecture design. SARAH evaluates potential failures from the end-user perspective to identify sensitive points of a system without requiring an implementation. \ud As a new architectural style, we introduce Recovery Style for specifying fault-tolerant aspects of software architecture. Recovery Style is used for communicating and analyzing architectural design decisions and for supporting detailed design with respect to recovery. \ud As a solution for the third problem, we propose a systematic method for optimizing the decomposition of software architecture for local recovery, which is an effective fault tolerance technique to attain high system availability. To support the method, we have developed an integrated set of tools that employ optimization techniques, state-based analytical models (i.e. CTMCs) and dynamic analysis on the system. The method enables the following: i ) modeling the design space of the possible decomposition alternatives, ii ) reducing the design space with respect to domain and stakeholder constraints and iii ) making the desired trade-off between availability and performance metrics. \ud To reduce the development and maintenance effort, we propose a framework, FLORA that supports the decomposition and implementation of software architecture for local recovery. The framework provides reusable abstractions for defining recoverable units and for incorporating the necessary coordination and communication protocols for recovery

CiteSeerX

University of Twente Research Information

A metaobject architecture for fault-tolerant distributed systems : the FRIENDS approach

Author: Fabre Jean-Charles
Pérennou Tanguy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1998
Field of study

The FRIENDS system developed at LAAS-CNRS is a metalevel architecture providing libraries of metaobjects for fault tolerance, secure communication, and group-based distributed applications. The use of metaobjects provides a nice separation of concerns between mechanisms and applications. Metaobjects can be used transparently by applications and can be composed according to the needs of a given application, a given architecture, and its underlying properties. In FRIENDS, metaobjects are used recursively to add new properties to applications. They are designed using an object oriented design method and implemented on top of basic system services. This paper describes the FRIENDS software-based architecture, the object-oriented development of metaobjects, the experiments that we have done, and summarizes the advantages and drawbacks of a metaobject approach for building fault-tolerant system

CiteSeerX

Database System Architecture for Fault tolerance and Disaster Recovery

Author: Nguyen Anthony
Publication venue: ePublications at Regis University
Publication date: 23/10/2009
Field of study

Application systems being used today rely heavily on the availability of the database system. Disruption of database system can be damaging and catastrophic to the organization that depends on the availability of the database system for its business and service operations. To ensure business continuity under foreseeable and unforeseeable man-made or natural disasters, the database system has to be designed and built with fault tolerance and disaster recovery capabilities. This project explored existing technologies and solutions to design, build, and implement database system architecture for fault tolerance and disaster recovery using Oracle database software products. The project goal was to implement database system architecture for migrating multiple web applications and databases onto a consolidated system architecture providing high availability database application systems

Conversion and verification procedure for goal-based control programs

Author: Braman J. M. B.
Murray R. M.
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/2007
Field of study

Fault tolerance and safety verification of control systems are essential for the success of autonomous robotic systems. A control architecture called Mission Data System, developed at the Jet Propulsion Laboratory, takes a goal-based control approach. In this paper, a method for converting goal network control programs into linear hybrid systems is developed. The linear hybrid system can then be verified for safety in the presence of failures using existing symbolic model checkers. An example task is developed and successfully verified using HyTech, a symbolic model checking software for linear hybrid systems

CiteSeerX

Caltech Authors

Safety verification of a fault tolerant reconfigurable autonomous goal-based robotic control system

Author: Braman Julia M. B.
Murray Richard M.
Wagner David A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/10/2007
Field of study

Fault tolerance and safety verification of control systems are essential for the success of autonomous robotic systems. A control architecture called Mission Data System (MDS), developed at the Jet Propulsion Laboratory, takes a goal-based control approach. In this paper, a method for converting goal network control programs into linear hybrid systems is developed. The linear hybrid system can then be verified for safety in the presence of failures using existing symbolic model checkers. An example task is simulated in MDS and successfully verified using HyTech, a symbolic model checking software for linear hybrid systems

Caltech Authors

A FPGA-Based Reconfigurable Software Architecture for Highly Dependable Systems

Author: Di Carlo Stefano
Prinetto Paolo Ernesto
Scionti A.
Publication venue: IEEE Computer Society
Publication date: 01/01/2009
Field of study

Nowadays, systems-on-chip are commonly equipped with reconfigurable hardware. The use of hybrid architectures based on a mixture of general purpose processors and reconfigurable components has gained importance across the scientific community allowing a significant improvement of computational performance. Along with the demand for performance, the great sensitivity of reconfigurable hardware devices to physical defects lead to the request of highly dependable and fault tolerant systems. This paper proposes an FPGA-based reconfigurable software architecture able to abstract the underlying hardware platform giving an homogeneous view of it. The abstraction mechanism is used to implement fault tolerance mechanisms with a minimum impact on the system performanc

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Safety Verification of Fault Tolerant Goal-based Control Programs with Estimation Uncertainty

Author: Braman Julia M. B.
Murray Richard M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2008
Field of study

Fault tolerance and safety verification of control systems that have state variable estimation uncertainty are essential for the success of autonomous robotic systems. A software control architecture called mission data system, developed at the Jet Propulsion Laboratory, uses goal networks as the control program for autonomous systems. Certain types of goal networks can be converted into linear hybrid systems and verified for safety using existing symbolic model checking software. A process for calculating the probability of failure of certain classes of verifiable goal networks due to state estimation uncertainty is presented. A verifiable example task is presented and the failure probability of the control program based on estimation uncertainty is found

Caltech Authors