2,384 research outputs found

    A Survey of Research into Mixed Criticality Systems

    Get PDF
    This survey covers research into mixed criticality systems that has been published since Vestal’s seminal paper in 2007, up until the end of 2016. The survey is organised along the lines of the major research areas within this topic. These include single processor analysis (including fixed priority and EDF scheduling, shared resources and static and synchronous scheduling), multiprocessor analysis, realistic models, and systems issues. The survey also explores the relationship between research into mixed criticality systems and other topics such as hard and soft time constraints, fault tolerant scheduling, hierarchical scheduling, cyber physical systems, probabilistic real-time systems, and industrial safety standards

    FANTOM: Fault Tolerant Task-Drop Aware Scheduling for Mixed-Criticality Systems

    Get PDF
    Mixed-Criticality (MC) systems have emerged as an effective solution in various industries, where multiple tasks with various real-time and safety requirements (different levels of criticality) are integrated onto a common hardware platform. In these systems, a fault may occur due to different reasons, e.g., hardware defects, software errors or the arrival of unexpected events. In order to tolerate faults in MC systems, the re-execution technique is typically employed, which may lead to overrun of high-criticality tasks (HCTs), which necessitates the drop of low-criticality tasks (LCTs) or degrading their quality. However, frequent drops or relatively long execution times of LCTs (especially mission-critical tasks) are not always desirable and it may impose a negative impact on the performance, or the functionality of MC systems. In this regard, this article proposes a realistic MC task model and develops a design-time task-drop aware schedulability analysis based on the Earliest Deadline First with Virtual Deadline (EDF-VD) algorithm. According to this analysis and the proposed scheduling policy based on the new MC task model, in the high-criticality (HI) mode, when an HCT overruns and the system switches to the HI mode, the number of drops per LCT is prohibited from passing a predefined threshold. In addition, to guarantee the real-time constraints and safety requirements of MC tasks in the presence of faults (assuming transient faults in this article), a corresponding scheduling mechanism has been developed. According to the obtained results from an extensive set of simulations, which have been validated through a realistic avionic application, the proposed method improves the acceptance ratio by up to 43.9% compared to state-of-the-art

    Distributed real-time fault tolerance in a virtualized separation kernel

    Full text link
    Computers are increasingly being placed in scenarios where a computer error could result in the loss of human life or significant financial loss. Fault tolerant techniques must be employed to prevent an error from resulting in a fault causing such losses. Two types of errors that are common in real-time and embedded system are soft errors, i.e. data bit corruption, and timing errors, such as missed deadlines. Purely software based techniques to address these types of errors have the advantage of not requiring specialized hardware and are able to use more readily available commercial off-the-shelf hardware. Timing errors are addressed using Adaptive Mixed-Criticality, a scheduling technique where higher criticality tasks are given precedence over those of lower criticality when it is impossible to guarantee the schedulability of all tasks. While mixed-criticality scheduling has gained attention in recent years, most approaches assume a periodic task model and that the system has a single criticality level which dictates the available budget to all tasks. In practice these assumptions do not hold: different types of tasks are better served by different scheduling approaches and only a subset of high critical tasks might require additional capacity to meet deadlines. In the latter case, this occurs when a process has experienced a fault and requires additional capacity to perform the recovery. In this thesis, soft errors are addressed using a novel real-time fault tolerance method based on a virtualized separation kernel. Instead of executing redundant copies of an application on separate machines, the applications are consolidated onto one multi-core processor and use hardware virtualization extensions to partition the applications. This allows new recovery schemes to be explored. In addition, the maximum recovery time is sufficiently bounded to ensure recovery occurs in a timely manner without affecting the normal execution of the application. A virtualized separation kernel in combination with Adaptive Mixed-Criticality techniques creates a fault tolerant system that predictably detects and recovers from timing and soft errors

    A Survey of Fault-Tolerance Techniques for Embedded Systems from the Perspective of Power, Energy, and Thermal Issues

    Get PDF
    The relentless technology scaling has provided a significant increase in processor performance, but on the other hand, it has led to adverse impacts on system reliability. In particular, technology scaling increases the processor susceptibility to radiation-induced transient faults. Moreover, technology scaling with the discontinuation of Dennard scaling increases the power densities, thereby temperatures, on the chip. High temperature, in turn, accelerates transistor aging mechanisms, which may ultimately lead to permanent faults on the chip. To assure a reliable system operation, despite these potential reliability concerns, fault-tolerance techniques have emerged. Specifically, fault-tolerance techniques employ some kind of redundancies to satisfy specific reliability requirements. However, the integration of fault-tolerance techniques into real-time embedded systems complicates preserving timing constraints. As a remedy, many task mapping/scheduling policies have been proposed to consider the integration of fault-tolerance techniques and enforce both timing and reliability guarantees for real-time embedded systems. More advanced techniques aim additionally at minimizing power and energy while at the same time satisfying timing and reliability constraints. Recently, some scheduling techniques have started to tackle a new challenge, which is the temperature increase induced by employing fault-tolerance techniques. These emerging techniques aim at satisfying temperature constraints besides timing and reliability constraints. This paper provides an in-depth survey of the emerging research efforts that exploit fault-tolerance techniques while considering timing, power/energy, and temperature from the real-time embedded systems’ design perspective. In particular, the task mapping/scheduling policies for fault-tolerance real-time embedded systems are reviewed and classified according to their considered goals and constraints. Moreover, the employed fault-tolerance techniques, application models, and hardware models are considered as additional dimensions of the presented classification. Lastly, this survey gives deep insights into the main achievements and shortcomings of the existing approaches and highlights the most promising ones

    Fault Tolerance and the Five-Second Rule

    Get PDF
    We propose a new approach to fault tolerance that we call bounded-time recovery (BTR). BTR is intended for systems that need strong timeliness guarantees during normal operation but can tolerate short outages in an emergency, e.g., when they are under attack. We argue that BTR could be a good fit for many cyber-physical systems. We also sketch a technical approach to providing BTR, and we discuss some challenges that still remain

    Timing Predictability in Future Multi-Core Avionics Systems

    Full text link
    • …
    corecore