9,123 research outputs found

    Service-based Fault Tolerance for Cyber-Physical Systems: A Systems Engineering Approach

    Get PDF
    Cyber-physical systems (CPSs) comprise networked computing units that monitor and control physical processes in feedback loops. CPSs have potential to change the ways people and computers interact with the physical world by enabling new ways to control and optimize systems through improved connectivity and computing capabilities. Compared to classical control theory, these systems involve greater unpredictability which may affect the stability and dynamics of the physical subsystems. Further uncertainty is introduced by the dynamic and open computing environments with rapidly changing connections and system configurations. However, due to interactions with the physical world, the dependable operation and tolerance of failures in both cyber and physical components are essential requirements for these systems.The problem of achieving dependable operations for open and networked control systems is approached using a systems engineering process to gain an understanding of the problem domain, since fault tolerance cannot be solved only as a software problem due to the nature of CPSs, which includes close coordination among hardware, software and physical objects. The research methodology consists of developing a concept design, implementing prototypes, and empirically testing the prototypes. Even though modularity has been acknowledged as a key element of fault tolerance, the fault tolerance of highly modular service-oriented architectures (SOAs) has been sparsely researched, especially in distributed real-time systems. This thesis proposes and implements an approach based on using loosely coupled real-time SOA to implement fault tolerance for a teleoperation system.Based on empirical experiments, modularity on a service level can be used to support fault tolerance (i.e., the isolation and recovery of faults). Fault recovery can be achieved for certain categories of faults (i.e., non-deterministic and aging-related) based on loose coupling and diverse operation modes. The proposed architecture also supports the straightforward integration of fault tolerance patterns, such as FAIL-SAFE, HEARTBEAT, ESCALATION and SERVICE MANAGER, which are used in the prototype systems to support dependability requirements. For service failures, systems rely on fail-safe behaviours, diverse modes of operation and fault escalation to backup services. Instead of using time-bounded reconfiguration, services operate in best-effort capabilities, providing resilience for the system. This enables, for example, on-the-fly service changes, smooth recoveries from service failures and adaptations to new computing environments, which are essential requirements for CPSs.The results are combined into a systems engineering approach to dependability, which includes an analysis of the role of safety-critical requirements for control system software architecture design, architectural design, a dependability-case development approach for CPSs and domain-specific fault taxonomies, which support dependability case development and system reliability analyses. Other contributions of this work include three new patterns for fault tolerance in CPSs: DATA-CENTRIC ARCHITECTURE, LET IT CRASH and SERVICE MANAGER. These are presented together with a pattern language that shows how they relate to other patterns available for the domain

    Robust fault detection for networked systems with distributed sensors

    Get PDF
    Copyright [2011] IEEE. This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of Brunel University's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to [email protected]. By choosing to view this document, you agree to all provisions of the copyright laws protecting it.This paper is concerned with the robust fault detection problem for a class of discrete-time networked systems with distributed sensors. Since the bandwidth of the communication channel is limited, packets from different sensors may be dropped with different missing rates during the transmission. Therefore, a diagonal matrix is introduced to describe the multiple packet dropout phenomenon and the parameter uncertainties are supposed to reside in a polytope. The aim is to design a robust fault detection filter such that, for all probabilistic packet dropouts, all unknown inputs and admissible uncertain parameters, the error between the residual (generated by the fault detection filter) and the fault signal is made as small as possible. Two parameter-dependent approaches are proposed to obtain less conservative results. The existence of the desired fault detection filter can be determined from the feasibility of a set of linear matrix inequalities that can be easily solved by the efficient convex optimization method. A simulation example on a networked three-tank system is provided to illustrate the effectiveness and applicability of the proposed techniques.This work was supported by national 973 project under Grants 2009CB320602 and 2010CB731800, and the NSFC under Grants 60721003 and 60736026

    Distributed Operating Systems

    Get PDF
    • …
    corecore