86,012 research outputs found

    Aspect-oriented fault tolerance for real-time embedded systems

    Real-time embedded systems for safety-critical applications have to introduce fault tolerance mechanisms in order to cope with hardware and software errors. Fault tolerance is usually applied by means of redundancy and diversity. Redundant hardware implies the establishment of a distributed system executing a set of fault tolerance strategies in software, and may also employ some form of diversity, by using different variants or versions for the same processing. This paper describes our approach to introducing fault tolerance in distributed embedded systems applications, using aspect-oriented programming (AOP). A real-time operating system supporting middleware thread communication was integrated with a fault-tolerant framework. Fault tolerance is introduced into the system by AOP at the application thread level. The advantages of this approach include higher modularization, less effort for legacy system evolution, and better configurability for testing and product line development. This work has been tested and evaluated successfully in several fault-tolerant configurations and showed no significant performance or memory footprint costs. Fundação para a Ciência e a Tecnologia (FCT)

    A Novel Technique for Task Re-Allocation in Distributed Computing System

    A distributed computing system is a software system in which components located on different networked computers communicate and coordinate their actions by passing messages. A task executed on a distributed system must be reliable and feasible. Distributed systems such as grid networks, robotics, and air traffic control systems depend heavily on time. If not detected accurately and recovered in time, a single error in a real-time distributed system can cause a whole-system failure. Fault tolerance is the key method used to provide continuous reliability in these systems. Distributed computing systems face challenges such as resource sharing, transparency, dependability, complex mappings, concurrency, and fault tolerance. In this paper, we focus on fault tolerance, since faults are responsible for the degradation of the system. A novel reliability-based technique is proposed to address the fault tolerance problem and re-allocate tasks. DOI: 10.17762/ijritcc2321-8169.15080

    Real-time and fault tolerance in distributed control software

    Closed-loop control systems typically contain a multitude of spatially distributed sensors and actuators operated simultaneously, so those systems are parallel and distributed in their essence. But mapping this parallelism onto a given distributed hardware architecture brings in some additional requirements: safe multithreading, optimal process allocation, and real-time scheduling of bus and network resources. Nowadays, fault tolerance methods and fast, even online, reconfiguration are becoming increasingly important. All those often conflicting requirements make the design and implementation of real-time distributed control systems an extremely difficult task that requires substantial knowledge in several areas of control and computer science. Although many design methods have been proposed so far, none of them has succeeded in covering all important aspects of the problem at hand. [1] The continuous increase of production in the embedded market makes a simple and natural design methodology for real-time systems needed more than ever

    Fault Tolerant Real Time Dynamic Scheduling Algorithm For Heterogeneous Distributed System

    Fault tolerance is an important key to establishing dependability in Real Time Distributed Systems (RTDS). In fault-tolerant real-time distributed systems, the detection of a fault and its recovery should be executed in a timely manner so that, in spite of fault occurrences, the intended output of real-time computations always takes place on time. Hardware and software redundancy are well-known effective methods for fault tolerance, where extra hardware (e.g., processors, communication links) and software (e.g., tasks, messages) are added into the system to deal with faults. The performance of an RTDS is largely determined by the efficiency of its scheduling algorithm, and schedulability analysis is performed on the system to ensure the timing constraints are met. This thesis examines the scenarios where a real-time system requires very little redundant hardware resources to tolerate failures in heterogeneous real-time distributed systems with point-to-point communication links. Fault tolerance can be achieved by..

    Copilot: Monitoring Embedded Systems

    Runtime verification (RV) is a natural fit for ultra-critical systems, where correctness is imperative. In ultra-critical systems, even if the software is fault-free, because of the inherent unreliability of commodity hardware and the adversity of operational environments, processing units (and their hosted software) are replicated, and fault-tolerant algorithms are used to compare the outputs. We investigate both software monitoring in distributed fault-tolerant systems and the implementation of fault-tolerance mechanisms using RV techniques. We describe the Copilot language and compiler, specifically designed for generating monitors for distributed, hard real-time systems. We also describe two case studies in which we generated Copilot monitors in avionics systems

    Towards Middleware for Fault-tolerance in Distributed Real-time and Embedded Systems

    Distributed real-time and embedded (DRE) systems often require support for multiple simultaneous quality of service (QoS) properties, such as real-timeliness and fault tolerance, that operate within resource-constrained environments. These resource constraints motivate the need for a lightweight middleware infrastructure, while the need for simultaneous QoS properties requires the middleware to provide fault tolerance capabilities that respect the time-critical needs of DRE systems. Conventional middleware solutions, such as Fault-tolerant CORBA (FT-CORBA) and Continuous Availability API for J2EE, have limited utility for DRE systems because they are heavyweight (e.g., the complexity of their feature-rich fault tolerance capabilities consumes excessive runtime resources), yet incomplete (e.g., they lack mechanisms that enable fault tolerance while maintaining real-time predictability). This paper provides three contributions to the development and standardization of lightweight real-time and fault-tolerant middleware for DRE systems. First, we discuss the challenges in realizing real-time fault-tolerant solutions for DRE systems using contemporary middleware. Second, we describe recent progress towards standardizing a CORBA lightweight fault-tolerance specification for DRE systems. Third, we present the architecture of FLARe, which is a prototype based on the OMG real-time fault-tolerant CORBA middleware standardization efforts that is lightweight (e.g., leverages only those server- and client-side mechanisms required for real-time systems) and predictable (e.g., provides fault-tolerant mechanisms that respect time-critical performance needs of DRE systems)

    Incorporating application-transparent node-crash tolerance to a soft real-time self-planned agent framework

    Besides correctness and timeliness, fault tolerance is essential to any soft real-time distributed system. Traditionally, system designers are required to consider both real-time and fault-tolerance requirements while building real-time applications. This is a complex task for a designer. In general distributed systems, fault tolerance has been researched well. However, significantly less work has been done in the field of fault tolerance in soft real-time systems. This thesis focuses on achieving application-transparent fault tolerance in a soft real-time system framework and addresses the issue of redundancy management in the presence of deadlines. Specifically, the thesis focuses on incorporating application-transparent node-crash tolerance in a soft real-time self-planned agent framework (SPAF). A SPAF application is decomposed into several missions, and each mission is completed by successfully completing multi-agent tasks through a sequence of phases. Each task in a mission can have many solutions, and the choice of solution depends on the remaining time and available resources. Fault tolerance is achieved by using the conventional primary-backup approach in conjunction with the dynamic task planning feature of SPAF. A cold backup and a hot backup are used to accomplish application and system recovery, respectively, during a node crash. The model and the design of the fault tolerance solution are presented in detail. The functionality and efficiency of the fault tolerance design are illustrated through an implementation and through simulations using a custom-built application, respectively. The test results are very encouraging, and the application performance is almost the same even after inclusion of the fault tolerance mechanisms

    Design and Performance of a Fault-Tolerant Real-Time CORBA Event Service

    Developing distributed real-time and embedded (DRE) systems in which multiple quality-of-service (QoS) dimensions must be managed is an important and challenging R&D problem. This paper makes three contributions to research on multi-dimensional QoS for DRE systems. First, it describes the design and implementation of a fault-tolerant real-time CORBA event service for The ACE ORB (TAO). Second, it describes our enhancements and extensions to features in TAO, to integrate real-time and fault tolerance properties. Third, it presents an empirical evaluation of our approach. Our results show that with some refinements, real-time and fault-tolerance features can be integrated effectively and efficiently in a CORBA event service