1,210 research outputs found

    Restart-Based Fault-Tolerance: System Design and Schedulability Analysis

    Embedded systems in safety-critical environments are continuously required to deliver more performance and functionality while still providing verified safety guarantees. Nonetheless, platform-wide software verification (required for safety) is often expensive. Therefore, design methods that enable the use of components such as real-time operating systems (RTOS), without requiring their correctness to guarantee safety, are necessary. In this paper, we propose a design approach to deploy safe-by-design embedded systems. To attain this goal, we rely on a small core of verified software to handle faults in applications and the RTOS and to recover from them while ensuring that the timing constraints of safety-critical tasks are always satisfied. Faults are detected by monitoring the application timing, and fault recovery is achieved via full platform restart and software reload, enabled by the short restart time of embedded systems. Schedulability analysis is used to ensure that the timing constraints of critical plant control tasks are always satisfied in spite of faults and consequent restarts. We derive schedulability results for four restart-tolerant task models. We use a simulator to evaluate and compare the performance of the considered scheduling models.

    REMIND: A Framework for the Resilient Design of Automotive Systems

    In recent years, great effort has been spent on enhancing the security and safety of vehicular systems. Current advances in information and communication technology have increased the complexity of these systems and led to extended functionalities towards self-driving and more connectivity. Unfortunately, these advances open the door for diverse and newly emerging attacks that hamper the security and, thus, the safety of vehicular systems. In this paper, we contribute to supporting the design of resilient automotive systems. We review and analyze scientific literature on resilience techniques, fault tolerance, and dependability. As a result, we present the REMIND resilience framework, which provides techniques for attack detection, mitigation, recovery, and resilience endurance. Moreover, we provide guidelines on how the REMIND framework can be used against common security threats and attacks, and we discuss the trade-offs when applying these guidelines.

    A Multi-Agent Systems Approach for Analysis of Stepping Stone Attacks

    Stepping stone attacks are among the most sophisticated cyber-attacks: attackers build a chain of compromised hosts to reach a victim target. In this dissertation, an analytic model based on a multi-agent systems approach is proposed to analyze the propagation of stepping stone attacks in dynamic vulnerability graphs. Because the vulnerability configuration in a network is inherently dynamic, a biased min-consensus technique for dynamic graphs with fixed and switching topology is proposed as a distributed technique to calculate the most vulnerable path for stepping stone attacks in dynamic vulnerability graphs. We use min-plus algebra to analyze and provide necessary and sufficient conditions for convergence to the shortest path in the fixed topology case. A necessary condition for the switching topology case is provided. Most cyber-attacks involve an attacker launching a multi-stage attack by exploiting a sequence of hosts. This multi-stage attack generates a chain of “stepping stones” from the origin to the target. The choice of stepping stones is a function of the degree of exploitability, the impact, the attacker’s capability, the masking of the origin location, and intent. In this dissertation, we model and analyze scenarios wherein an attacker employs multiple strategies to choose stepping stones. The problem is modeled as an Adjacency Quadratic Shortest Path using dynamic vulnerability graphs with a multi-agent dynamic system approach. With this approach, the shortest stepping stone path with maximum node degree and the shortest stepping stone path with maximum impact are modeled and analyzed. Because embedded controllers are omnipresent in networks, a cyber-attack-tolerant control strategy for embedded controllers is proposed as a risk mitigation strategy. A dual redundant control architecture is proposed that combines two identical controllers that are switched periodically between active and restart modes. The strategy aims to mitigate the impact of corruption of the controller software by an adversary. We analyze the impact of resetting and restarting the controller software and the performance of the switching process. The minimum control-design requirements for effective mitigation of cyber-attacks on the control software, which imply a “fast” switching period, are provided. The simulation results demonstrate the effectiveness of the proposed strategy when the time to fully reset and restart the controller is shorter than the time taken by an adversary to compromise the controller. The results also provide insights into the stability and safety regions and the factors that determine the effectiveness of the proposed strategy.
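
    The min-consensus idea mentioned in this abstract can be illustrated with a generic sketch. The code below is not the dissertation's formulation: it is a textbook synchronous min-plus update on a fixed topology, in which every node repeatedly takes the minimum over its out-neighbours of (edge weight + neighbour value) while the target is held at zero. The function name `biased_min_consensus`, the dictionary-based graph encoding, and the reading of edge weights as exploit costs are all assumptions made for illustration.

```python
import math


def biased_min_consensus(weights, target, max_iters=None):
    """Synchronous min-plus iteration toward shortest-path costs (illustrative).

    weights: dict mapping node -> {out_neighbour: edge_weight}; hypothetical
             encoding where a lower weight means an easier hop for the attacker.
    target:  node whose value is pinned to 0 (the "bias").
    Returns a dict node -> cost of the cheapest path to the target.
    """
    nodes = set(weights) | {v for nbrs in weights.values() for v in nbrs}
    nodes.add(target)

    # Every node starts with "no known path", except the target.
    x = {n: math.inf for n in nodes}
    x[target] = 0.0

    # On a fixed topology, len(nodes) rounds are enough to converge.
    limit = max_iters if max_iters is not None else len(nodes)
    for _ in range(limit):
        new_x = {}
        for i in nodes:
            if i == target:
                new_x[i] = 0.0
            else:
                # Min-plus update: x_i <- min_j (w_ij + x_j).
                new_x[i] = min(
                    (w + x[j] for j, w in weights.get(i, {}).items()),
                    default=math.inf,
                )
        if new_x == x:  # fixed point reached
            break
        x = new_x
    return x


# Hypothetical four-host vulnerability graph; "t" is the victim.
graph = {
    "a": {"b": 1.0, "c": 4.0},
    "b": {"c": 1.0, "t": 5.0},
    "c": {"t": 1.0},
}
costs = biased_min_consensus(graph, "t")
print(costs["a"])  # 3.0, via a -> b -> c -> t
```

    Each node only needs its neighbours' current values, which is what makes the scheme distributed. The switching-topology case discussed in the abstract would require re-evaluating `weights` between rounds and is not covered by this sketch.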

    A Pattern-Language for Self-Healing Internet-of-Things Systems

    Internet-of-Things systems are assemblies of highly distributed and heterogeneous parts that, in orchestration, work to provide valuable services to end-users in many scenarios. These systems depend on the correct operation of sensors, actuators, and third-party services, and the failure of a single one can hinder the proper functioning of the whole system. Error detection and recovery are therefore of paramount importance, yet they are often overlooked. By drawing inspiration from other research areas, such as cloud, embedded, and mission-critical systems, we present a set of patterns for self-healing IoT systems. We discuss how their implementation can improve system reliability by providing error detection, error recovery, and maintenance of health mechanisms. (c) 2020 ACM

    Service-based Fault Tolerance for Cyber-Physical Systems: A Systems Engineering Approach

    Cyber-physical systems (CPSs) comprise networked computing units that monitor and control physical processes in feedback loops. CPSs have the potential to change the ways people and computers interact with the physical world by enabling new ways to control and optimize systems through improved connectivity and computing capabilities. Compared to classical control theory, these systems involve greater unpredictability, which may affect the stability and dynamics of the physical subsystems. Further uncertainty is introduced by the dynamic and open computing environments with rapidly changing connections and system configurations. However, due to interactions with the physical world, the dependable operation and tolerance of failures in both cyber and physical components are essential requirements for these systems. The problem of achieving dependable operations for open and networked control systems is approached using a systems engineering process to gain an understanding of the problem domain, since fault tolerance cannot be solved only as a software problem due to the nature of CPSs, which includes close coordination among hardware, software and physical objects. The research methodology consists of developing a concept design, implementing prototypes, and empirically testing the prototypes. Even though modularity has been acknowledged as a key element of fault tolerance, the fault tolerance of highly modular service-oriented architectures (SOAs) has been sparsely researched, especially in distributed real-time systems. This thesis proposes and implements an approach based on using loosely coupled real-time SOA to implement fault tolerance for a teleoperation system. Based on empirical experiments, modularity on a service level can be used to support fault tolerance (i.e., the isolation and recovery of faults). Fault recovery can be achieved for certain categories of faults (i.e., non-deterministic and aging-related) based on loose coupling and diverse operation modes. The proposed architecture also supports the straightforward integration of fault tolerance patterns, such as FAIL-SAFE, HEARTBEAT, ESCALATION and SERVICE MANAGER, which are used in the prototype systems to support dependability requirements. For service failures, systems rely on fail-safe behaviours, diverse modes of operation and fault escalation to backup services. Instead of using time-bounded reconfiguration, services operate with best-effort capabilities, providing resilience for the system. This enables, for example, on-the-fly service changes, smooth recoveries from service failures and adaptations to new computing environments, which are essential requirements for CPSs. The results are combined into a systems engineering approach to dependability, which includes an analysis of the role of safety-critical requirements for control system software architecture design, architectural design, a dependability-case development approach for CPSs and domain-specific fault taxonomies, which support dependability case development and system reliability analyses. Other contributions of this work include three new patterns for fault tolerance in CPSs: DATA-CENTRIC ARCHITECTURE, LET IT CRASH and SERVICE MANAGER. These are presented together with a pattern language that shows how they relate to other patterns available for the domain.

    Towards a Secure and Resilient Vehicle Design: Methodologies, Principles and Guidelines

    The advent of autonomous and connected vehicles has brought new cyber security challenges to the automotive industry. It requires vehicles to be designed to remain dependable in the event of cyber-attacks. A modern vehicle can contain over 150 computers, over 100 million lines of code, and various connection interfaces such as USB ports, WiFi, Bluetooth, and 4G/5G. The continuous technological advancements within the automotive industry allow safety enhancements due to increased control of, e.g., brakes, steering, and the engine. Although the technology is beneficial, its complexity has the side effect of giving rise to a multitude of vulnerabilities that might increase the potential for cyber-attacks. Consequently, there is an increase in regulations that demand compliance with vehicle cyber security and resilience requirements, which state that vehicles should be designed to be resilient to cyber-attacks, with the capability to detect and appropriately respond to these attacks. Moreover, increasing requirements for automotive digital forensic capabilities are beginning to emerge. Failures in automated driving functions can be caused by hardware and software failures as well as cyber security issues. It is imperative to investigate the cause of these failures. However, there is currently no clear guidance on how to comply with these regulations from a technical perspective. In this thesis, we propose a methodology to predict and mitigate vulnerabilities in vehicles using a systematic approach for security analysis; this methodology is further used to develop a framework ensuring a resilient and secure vehicle design with respect to a multitude of analyzed vehicle cyber-attacks. Moreover, we review and analyze scientific literature on resilience techniques, fault tolerance, and dependability for attack detection, mitigation, recovery, and resilience endurance. These techniques are then incorporated into the above-mentioned framework. Finally, to meet requirements to quickly and securely patch the increasing number of bugs in vehicle software, we propose a versatile framework for vehicle software updates.