77,086 research outputs found

    Pinwheel Scheduling for Fault-tolerant Broadcast Disks in Real-time Database Systems

    Full text link
    The design of programs for broadcast disks which incorporate real-time and fault-tolerance requirements is considered. A generalized model for real-time fault-tolerant broadcast disks is defined. It is shown that designing programs for broadcast disks specified in this model is closely related to the scheduling of pinwheel task systems. Some new results in pinwheel scheduling theory are derived, which facilitate the efficient generation of real-time fault-tolerant broadcast disk programs.National Science Foundation (CCR-9308344, CCR-9596282

    Study of fault-tolerant software technology

    Get PDF
    Presented is an overview of the current state of the art of fault-tolerant software and an analysis of quantitative techniques and models developed to assess its impact. It examines research efforts as well as experience gained from commercial application of these techniques. The paper also addresses the computer architecture and design implications on hardware, operating systems and programming languages (including Ada) of using fault-tolerant software in real-time aerospace applications. It concludes that fault-tolerant software has progressed beyond the pure research state. The paper also finds that, although not perfectly matched, newer architectural and language capabilities provide many of the notations and functions needed to effectively and efficiently implement software fault-tolerance

    Energy-Efficient Fault-Tolerant Scheduling Algorithm for Real-Time Tasks in Cloud-Based 5G Networks

    Full text link
    © 2013 IEEE. Green computing has become a hot issue for both academia and industry. The fifth-generation (5G) mobile networks put forward a high request for energy efficiency and low latency. The cloud radio access network provides efficient resource use, high performance, and high availability for 5G systems. However, hardware and software faults of cloud systems may lead to failure in providing real-time services. Developing fault tolerance technique can efficiently enhance the reliability and availability of real-time cloud services. The core idea of fault-tolerant scheduling algorithm is introducing redundancy to ensure that the tasks can be finished in the case of permanent or transient system failure. Nevertheless, the redundancy incurs extra overhead for cloud systems, which results in considerable energy consumption. In this paper, we focus on the problem of how to reduce the energy consumption when providing fault tolerance. We first propose a novel primary-backup-based fault-tolerant scheduling architecture for real-time tasks in the cloud environment. Based on the architecture, we present an energy-efficient fault-tolerant scheduling algorithm for real-time tasks (EFTR). EFTR adopts a proactive strategy to increase the system processing capacity and employs a rearrangement mechanism to improve the resource utilization. Simulation experiments are conducted on the CloudSim platform to evaluate the feasibility and effectiveness of EFTR. Compared with the existing fault-tolerant scheduling algorithms, EFTR shows excellent performance in energy conservation and task schedulability

    Towards Middleware for Fault-tolerance in Distributed Real-time and Embedded Systems

    Get PDF
    Abstract. Distributed real-time and embedded (DRE) systems often require support for multiple simultaneous quality of service (QoS) properties, such as real-timeliness and fault tolerance, that operate within resource constrained environments. These resource constraints motivate the need for a lightweight middleware infrastructure, while the need for simultaneous QoS properties require the middleware to provide fault tolerance capabilities that respect time-critical needs of DRE systems. Conventional middleware solutions, such as Fault-tolerant CORBA (FT-CORBA) and Continuous Availability API for J2EE, have limited utility for DRE systems because they are heavyweight (e.g., the complexity of their feature-rich fault tolerance capabilities consumes excessive runtime resources), yet incomplete (e.g., they lack mechanisms that enable fault tolerance while maintaining real-time predictability). This paper provides three contributions to the development and standardization of lightweight real-time and fault-tolerant middleware for DRE systems. First, we discuss the challenges in realizing real-time faulttolerant solutions for DRE systems using contemporary middleware. Second, we describe recent progress towards standardizing a CORBA lightweight fault-tolerance specification for DRE systems. Third, we present the architecture of FLARe, which is a prototype based on the OMG real-time fault-tolerant CORBA middleware standardization efforts that is lightweight (e.g., leverages only those server-and client-side mechanisms required for real-time systems) and predictable (e.g., provides fault-tolerant mechanisms that respect time-critical performance needs of DRE systems)

    Fault-free performance validation of fault-tolerant multiprocessors

    Get PDF
    A validation methodology for testing the performance of fault-tolerant computer systems was developed and applied to the Fault-Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB facility. This methodology was claimed to be general enough to apply to any ultrareliable computer system. The goal of this research was to extend the validation methodology and to demonstrate the robustness of the validation methodology by its more extensive application to NASA's Fault-Tolerant Multiprocessor System (FTMP) and to the Software Implemented Fault-Tolerance (SIFT) Computer System. Furthermore, the performance of these two multiprocessors was compared by conducting similar experiments. An analysis of the results shows high level language instruction execution times for both SIFT and FTMP were consistent and predictable, with SIFT having greater throughput. At the operating system level, FTMP consumes 60% of the throughput for its real-time dispatcher and 5% on fault-handling tasks. In contrast, SIFT consumes 16% of its throughput for the dispatcher, but consumes 66% in fault-handling software overhead

    Problems related to the integration of fault tolerant aircraft electronic systems

    Get PDF
    Problems related to the design of the hardware for an integrated aircraft electronic system are considered. Taxonomies of concurrent systems are reviewed and a new taxonomy is proposed. An informal methodology intended to identify feasible regions of the taxonomic design space is described. Specific tools are recommended for use in the methodology. Based on the methodology, a preliminary strawman integrated fault tolerant aircraft electronic system is proposed. Next, problems related to the programming and control of inegrated aircraft electronic systems are discussed. Issues of system resource management, including the scheduling and allocation of real time periodic tasks in a multiprocessor environment, are treated in detail. The role of software design in integrated fault tolerant aircraft electronic systems is discussed. Conclusions and recommendations for further work are included

    Copilot: Monitoring Embedded Systems

    Get PDF
    Runtime verification (RV) is a natural fit for ultra-critical systems, where correctness is imperative. In ultra-critical systems, even if the software is fault-free, because of the inherent unreliability of commodity hardware and the adversity of operational environments, processing units (and their hosted software) are replicated, and fault-tolerant algorithms are used to compare the outputs. We investigate both software monitoring in distributed fault-tolerant systems, as well as implementing fault-tolerance mechanisms using RV techniques. We describe the Copilot language and compiler, specifically designed for generating monitors for distributed, hard real-time systems. We also describe two case-studies in which we generated Copilot monitors in avionics systems

    Comparative Analysis Of Fault-Tolerance Techniques For Space Applications

    Get PDF
    Fault-tolerance technique enables a system or application to continue working even if some fault /error occurs in a system. Therefore, it is vital to choose appropriate fault tolerant technique best suited to our application. In case of real-time embedded systems in a space project, the importance of such techniques becomes more critical. In space applications, there is minor or no possibility of maintenance and faults occurrence may lead to serious consequences in terms of partial or complete mission failure. This paper describes the comparison of various fault tolerant techniques for space applications. This also suggests the suitability of these techniques in particular scenario.  The study of fault tolerance techniques relevant to real-time embedded systems and on-board space applications (satellites) is given due importance. This study will not only summarize fault tolerant techniques but also describe their strengths. The paper describes the future trends of faults-tolerance techniques in space applications. This effort may help space system engineers and scientists to select suitable fault-tolerance technique for their mission.

    Aspect-oriented fault tolerance for real-time embedded systems

    Get PDF
    Real-time embedded systems for safety-critical applications have to introduce fault tolerance mechanisms in order to cope with hardware and software errors. Fault tolerance is usually applied by means of redundancy and diversity. Redundant hardware implies the establishment of a distributed system executing a set of fault tolerance strategies by software, and may also employ some form of diversity, by using different variants or versions for the same processing. This paper describes our approach to introduce fault tolerance in distributed embedded systems applications, using aspect-oriented programming (AOP). A real-time operating system sup-porting middleware thread communication was integrated to a fault tolerant framework. The introduction of fault tolerance in the system is performed by AOP at the application thread level. The advantages of this approach include higher modularization, less efforts for legacy systems evolution and better configurability for testing and product line development. This work has been tested and evaluated successfully in several fault tolerant configurations and presented no significant performance or memory footprint costs.Fundação para a Ciência e a Tecnologia (FCT
    corecore