
    Restart-Based Fault-Tolerance: System Design and Schedulability Analysis

    Embedded systems in safety-critical environments are continuously required to deliver more performance and functionality, while being expected to provide verified safety guarantees. Nonetheless, platform-wide software verification (required for safety) is often expensive. Therefore, design methods that enable the use of components such as real-time operating systems (RTOS) without requiring their correctness to guarantee safety are necessary. In this paper, we propose a design approach to deploy safe-by-design embedded systems. To attain this goal, we rely on a small core of verified software to handle faults in the applications and the RTOS and to recover from them while ensuring that the timing constraints of safety-critical tasks are always satisfied. Faults are detected by monitoring application timing, and fault recovery is achieved via a full platform restart and software reload, enabled by the short restart time of embedded systems. Schedulability analysis is used to ensure that the timing constraints of critical plant-control tasks are always satisfied in spite of faults and the consequent restarts. We derive schedulability results for four restart-tolerant task models and use a simulator to evaluate and compare the performance of the considered scheduling models.
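
    The four restart-tolerant task models are not reproduced in this listing, but the flavor of the analysis can be sketched: classical fixed-priority response-time analysis in which each task's worst case additionally absorbs one full platform restart. The task parameters, the single-restart assumption, and the implicit-deadline check below are illustrative, not the paper's exact models.

        # Hedged sketch: fixed-priority response-time analysis where each task's
        # worst case absorbs one platform restart of length `restart`. Parameters
        # and the single-restart assumption are illustrative only.
        from math import ceil

        def response_time(tasks, i, restart=0.0):
            """Worst-case response time of tasks[i]; tasks sorted highest priority first."""
            C, T = tasks[i]
            r = C + restart
            while True:
                r_new = C + restart + sum(ceil(r / Tj) * Cj for Cj, Tj in tasks[:i])
                if r_new == r:
                    return r          # fixed point reached: converged response time
                if r_new > T:
                    return None       # exceeds its (implicit) deadline: unschedulable
                r = r_new

        tasks = [(1, 5), (2, 10), (3, 20)]   # (WCET, period), rate-monotonic order
        print([response_time(tasks, i, restart=0.5) for i in range(len(tasks))])
        # -> [1.5, 3.5, 7.5]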

    Parallel Real-Time Scheduling for Latency-Critical Applications

    In order to provide safety guarantees or quality-of-service guarantees, many of today's systems consist of latency-critical applications, i.e., applications with timing constraints. The problem of scheduling multiple latency-critical jobs on a multiprocessor or multicore machine has been extensively studied for sequential (non-parallelizable) jobs, and different system models and objectives have been considered. However, the computational requirement of a single job is still limited by the capacity of a single core. To provide increasingly complex application functionality and to complete higher computational demands within the same or even more stringent timing constraints, we must exploit the internal parallelism of jobs, where individual jobs are parallel programs that can potentially utilize more than one core in parallel. However, there is little work considering the scheduling of multiple parallel jobs that are latency-critical. This dissertation focuses on developing new scheduling strategies, analysis tools, and practical platform design techniques to enable efficient and scalable parallel real-time scheduling for latency-critical applications on multicore systems. In particular, the research is focused on two types of systems: (1) static real-time systems for tasks with deadlines, where the temporal properties of the tasks that need to execute are known a priori and the goal is to guarantee the temporal correctness of the tasks prior to their execution; and (2) online systems for latency-critical jobs, where multiple jobs arrive over time and the goal is to optimize a performance objective of the jobs during execution. For static real-time systems with parallel tasks, several scheduling strategies, including global earliest deadline first, global rate monotonic, and a novel federated scheduling, are proposed, analyzed, and implemented. These scheduling strategies have the best known theoretical performance for parallel real-time tasks under any global strategy, any fixed-priority scheduling, and any scheduling strategy, respectively. In addition, federated scheduling is generalized to systems with multiple criticality levels and systems with stochastic tasks. Both numerical and empirical experiments show that federated scheduling and its variations have good schedulability performance and are efficient in practice. For online systems with multiple latency-critical jobs, different online scheduling strategies are proposed and analyzed for different objectives, including maximizing the number of jobs meeting a target latency, maximizing the profit of jobs, minimizing the maximum latency, and minimizing the average latency. For example, a simple first-in-first-out scheduler is proven to be scalable for minimizing the maximum latency. Based on this theoretical intuition, a more practical work-stealing scheduler is developed, analyzed, and implemented. Empirical evaluations indicate that, on both real-world and synthetic workloads, this work-stealing implementation performs almost as well as an optimal scheduler.
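
    Of the strategies above, federated scheduling has a particularly compact core-allocation rule, well known for parallel tasks characterized by total work C, critical-path length L, and deadline D. The sketch below states it under those standard parameters; the numbers are illustrative.

        # Hedged sketch of the federated core-allocation rule: a high-utilization
        # parallel task (C > D) gets ceil((C - L) / (D - L)) dedicated cores;
        # low-utilization tasks are served from a shared pool of leftover cores.
        from math import ceil

        def federated_cores(C, L, D):
            """Dedicated cores for a parallel task: work C, critical path L, deadline D."""
            if C <= D:
                return 0          # low-utilization: handled by the shared-core pool
            assert L < D, "infeasible: the critical path must fit within the deadline"
            return ceil((C - L) / (D - L))

        print(federated_cores(C=100, L=10, D=40))   # -> 3 dedicated cores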

    Timing analysis in existing and emerging cyber physical systems

    A main mission of safety-critical cyber-physical systems is to guarantee timing correctness. Examples of safety-critical systems are avionic, automotive, or medical systems, in which timing violations could have disastrous effects, from loss of human life to damage to machines and/or the environment. Over the past decade, multicore processors have become increasingly common for their potential efficiency, which has made new single-core processors relatively scarce and created a pressing need to transition to multicore processors. However, existing safety-critical software that has been certified on single-core processors is not allowed to be fielded on a multicore system as is. The issue stems from serious inter-core interference on shared resources in current multicore processors, which creates non-deterministic timing behavior. Since meeting timing constraints is the crucial requirement of safety-critical real-time systems, the use of more than one core in a multicore chip is not yet certified by the authorities. Academia has paid relatively little attention to non-determinism due to uncoordinated I/O communication, as compared with other resources such as cache or memory, although industry considers it one of the most troublesome challenges. Hence, we focused on I/O synchronization, requiring no worst-case execution time (WCET) information, which can be impacted by other interference sources. Traditionally, two-level scheduling, as in Integrated Modular Avionics (IMA) systems, has been used to provide temporal isolation. However, such hierarchical approaches introduce significant priority inversions across applications, especially in multicore systems, ultimately leading to lower system utilization. To address these issues, we have proposed a novel scheduling mechanism called budgeted generalized rate-monotonic analysis (Budgeted GRMS), in which different applications' tasks are globally scheduled to avoid unnecessary priority inversions, yet the CPU resource is still partitioned for temporal isolation among applications. By accommodating the lack of WCET information and the need for I/O synchronization, this new scheduling paradigm enables the “safe” use of multicore processors in safety-critical real-time systems.

    Recently, emerging Internet of Things (IoT) and Smart City applications are becoming part of cyber-physical systems, as the need arises and the feasibility becomes visible. The promises and challenges arising from IoT and Smart City applications provide new research landscapes and opportunities and fundamentally transform real-time scheduling. As mentioned earlier, in traditional real-time systems an instance of a program execution (a process) is the scheduling entity, whereas in the emerging applications the fundamental schedulable units are chunks of data transported over communication media. Another transformation is that IoT and Smart City applications offer multiple options and combinations to utilize and schedule, since massively deployed, heterogeneous kinds of sensing devices are available; this is contrary to existing real-time work, which is given a fixed task set to analyze. For that reason, these applications also suggest variants of performance or quality optimization problems. Consider a disaster-response infrastructure in a troubled area meant to ensure the safety of humanitarian missions. Cameras and other sensors are deployed along key routes to monitor local conditions, but they are turned off by default and turned on on demand to save limited battery life. To determine a safe route for delivering humanitarian shipments, a decision-maker must collect reconnaissance information and schedule the retrieval of data items to support timely decision-making. Such data items, acquired from the time-evolving physical world, are in general time-sensitive: a retrieved item may become stale and no longer accurate or relevant as conditions in the physical environment change. Therefore, “when to acquire” affects the performance and correctness of such applications, and thus overall system safety and data timeliness should be carefully considered. For the addressed problem, we explored various algorithmic options for maximizing quality of information and developed the optimal algorithm for ordering the retrievals of data items to make multiple decisions. I believe this is a significant initial step toward expanding timing-safety research landscapes and opportunities in the emerging CPS area.
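
    Budgeted GRMS itself is not reproduced here; as background, the classical rate-monotonic utilization-bound test that GRMS-style analyses generalize fits in a few lines (a sufficient test for implicit-deadline periodic tasks; the task set below is illustrative).

        # Background sketch only: the Liu-Layland sufficient test for rate-monotonic
        # scheduling. A set of n implicit-deadline periodic tasks is schedulable
        # under RM if total utilization does not exceed n * (2**(1/n) - 1).
        def rm_utilization_test(tasks):            # tasks: list of (WCET, period)
            n = len(tasks)
            U = sum(c / t for c, t in tasks)
            bound = n * (2 ** (1 / n) - 1)
            return U <= bound, U, bound

        ok, U, bound = rm_utilization_test([(1, 4), (1, 5), (2, 10)])
        print(ok, round(U, 3), round(bound, 3))    # -> True 0.65 0.78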

    Energy-Aware Real-Time Scheduling on Heterogeneous and Homogeneous Platforms in the Era of Parallel Computing

    Multi-core processors increasingly appear as an enabling platform for embedded systems, e.g., mobile phones, tablets, and computerized numerical controls. The parallel task model, where a task can execute on multiple cores simultaneously, can efficiently exploit the multi-core platform's computational ability. Many computation-intensive systems (e.g., self-driving cars) that demand stringent timing requirements often evolve in the form of parallel tasks. Several real-time embedded system applications must provide predictable timing behavior while satisfying other system constraints, such as energy consumption. Motivated by these facts, this thesis studies how to integrate a dynamic voltage and frequency scaling (DVFS) policy with an application's internal parallelism to reduce the worst-case energy consumption (WCEC), an essential requirement for energy-constrained systems. First, we propose an energy-sub-optimal scheduler, assuming a per-core speed-tuning feature for each processor. Then we extend our solution to clustered multi-core platforms, where at any given time all the processors in the same cluster run at the same speed. We also present an analysis that exploits a task's probabilistic information to improve the average-case energy consumption (ACEC), a common non-functional requirement of embedded systems. Due to the strict requirement of temporal correctness, the majority of real-time system analyses consider the worst-case scenario, leading to resource over-provisioning and cost. The mixed-criticality (MC) framework was proposed to minimize energy consumption and resource over-provisioning. MC scheduling has received considerable attention from the real-time systems research community, as it is crucial to designing safety-critical real-time systems. This thesis further addresses energy-aware scheduling of real-time tasks on an MC platform, where tasks with varying criticality levels (i.e., importance) are integrated into a common platform. We propose an algorithm, GEDF-VD, for scheduling MC tasks with internal parallelism on a multiprocessor platform. We prove the correctness of GEDF-VD, provide a detailed quantitative evaluation, and report extensive experimental results. Finally, we present an analysis that exploits a task's probabilistic information at the respective criticality levels. Our proposed approach reduces the average-case energy consumption while satisfying the worst-case timing requirement.
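
    The thesis's WCEC analysis is not reproduced here. A toy sketch of the underlying DVFS trade-off, assuming a conventional cubic dynamic-power model (power ∝ f³, so a job of C cycles at frequency f costs energy ∝ C·f²), shows why the slowest deadline-meeting frequency minimizes energy; the model, constant k, and numbers are illustrative.

        # Toy DVFS sketch: with dynamic power ~ k * f**3, a job of C cycles run at
        # frequency f takes C/f seconds and consumes k * f**3 * (C/f) = k*C*f**2
        # energy, so the energy-minimal choice is the slowest feasible frequency.
        def pick_frequency(C, deadline, freqs, k=1.0):
            feasible = [f for f in freqs if C / f <= deadline]
            if not feasible:
                return None                     # no frequency meets the deadline
            f = min(feasible)                   # slowest feasible = least energy
            return f, k * C * f ** 2            # (frequency, energy for the job)

        print(pick_frequency(C=2e9, deadline=1.5, freqs=[1.0e9, 1.5e9, 2.0e9]))
        # -> (1500000000.0, ...): 1.0 GHz misses the 1.5 s deadline, so 1.5 GHz wins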

    Modeling and checking real-time system designs

    Real-time systems are found in an increasing variety of application fields. Usually, they are embedded systems controlling devices that may risk lives or damage property: they are safety-critical systems. Hard real-time requirements (late means wrong) make the development of such systems a formidable and daunting task. The need to predict the temporal behavior of critical real-time systems has encouraged the development of a useful collection of models, results, and tools for analyzing the schedulability of applications (e.g., [log]). However, there is no general analytical support for verifying other kinds of high-level timing requirements on complex software architectures. On the other hand, the verification of specifications and designs of real-time systems has been considered an interesting application field for automatic analysis techniques such as model checking. Unfortunately, there is a natural trade-off between the sophistication of supported features and the practicality of formal analysis. To cope with the challenges of formally analyzing real-time system designs, we focus on three aspects that, we believe, are fundamental to obtaining practical tools: model generation, model reduction, and model checking. Firstly, we extend our ideas presented in [30] and develop an automatic approach to model and verify designs of real-time systems against complex timing requirements, based on scheduling theory and timed automata theory [7] (a well-known and well-studied formalism to model and verify timed systems). That is, to enhance the practicality of formal analysis, we focus our analysis on designs adhering to fixed-priority scheduling. In essence, we exploit known scheduling theory to automatically derive simple and compositional formal models. To the best of our knowledge, this is the first proposal to integrate scheduling theory into the framework of automatic formal verification. To model such systems, we present I/O Timed Components, a notion and discipline for building non-blocking, live timed systems. I/O Timed Components, which are built on top of timed automata, provide other important methodological advantages such as influence detection and compositional reasoning. Secondly, we provide a battery of automatic and rather generic abstraction techniques that, given a requirement to be analyzed, reduce the model while preserving the behaviors relevant to checking it. Thus, we do not feed the verification tools the whole model, as previous formal approaches do. To argue the correctness of those abstractions, we present a notion of Continuous Observational Bisimulation that is weaker than strong timed bisimulation yet preserves many well-known logics for timed systems, such as TCTL [3]. Finally, since we choose timed automata as the formal kernel, we adapt and apply their deeply studied and developed analysis theory, as well as their practical tools. Moreover, we also describe from scratch an algorithm to model-check duration properties, a feature not addressed by available tools. That algorithm extends the one presented in [28].
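
    As a flavor of the base formalism only (plain timed automata, not the paper's I/O Timed Components), the sketch below runs one clock against interval guards with resets; the locations, actions, and bounds are illustrative.

        # Flavor-only sketch of a timed-automaton run: one clock x, transitions
        # guarded by clock intervals, and a clock reset on firing.
        transitions = {
            # (source, action): (guard_lo, guard_hi, reset, target)
            ("idle", "start"): (0.0, float("inf"), True, "busy"),
            ("busy", "done"):  (1.0, 3.0,          True, "idle"),  # done within [1,3]
        }

        def step(loc, x, action, delay):
            x += delay                              # time elapses in the location
            lo, hi, reset, target = transitions[(loc, action)]
            assert lo <= x <= hi, f"guard violated: x={x} not in [{lo},{hi}]"
            return target, 0.0 if reset else x

        loc, x = "idle", 0.0
        loc, x = step(loc, x, "start", delay=0.4)   # -> busy, clock reset to 0
        loc, x = step(loc, x, "done",  delay=2.1)   # fires since 1 <= 2.1 <= 3
        print(loc, x)                               # -> idle 0.0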

    Securing Real-Time Internet-of-Things

    Modern embedded and cyber-physical systems are ubiquitous. A large number of critical cyber-physical systems have real-time requirements (e.g., avionics, automobiles, power grids, manufacturing systems, and industrial control systems). Recent developments and new functionality require real-time embedded devices to be connected to the Internet. This gives rise to the real-time Internet-of-Things (RT-IoT), which promises a better user experience through stronger connectivity and efficient use of next-generation embedded devices. However, RT-IoT systems are also increasingly becoming targets for cyber-attacks, a risk exacerbated by this increased connectivity. This paper gives an introduction to RT-IoT systems, an outlook on current approaches, and possible research challenges toward secure RT-IoT frameworks.

    Reasoning About the Reliability of Multi-version, Diverse Real-Time Systems

    This paper is concerned with the development of reliable real-time systems for use in high-integrity applications. It advocates the use of diverse replicated channels but does not require the dependencies between the channels to be evaluated. Rather, it develops and extends the approach of Littlewood and Rushby (for general systems) by investigating a two-channel system in which one channel, A, is produced to a high level of reliability (i.e., has a very low failure rate), while the other, B, employs various forms of static analysis to sustain an argument that it is perfect (i.e., it will never miss a deadline). The first channel is fully functional; the second implements a more restricted computational model and contains only the critical computations. Potential dependencies between the channels (and their verification) are evaluated in terms of aleatory and epistemic uncertainty. At the aleatory level, the events “A fails” and “B is imperfect” are independent. Moreover, unlike the general case, independence at the epistemic level is also proposed for common forms of implementation and analysis for real-time systems and their temporal requirements (deadlines). As a result, a systematic approach is advocated that can be applied in a real engineering context to produce highly reliable real-time systems and to support numerical claims about the level of reliability achieved.
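
    The quantitative payoff of aleatory independence is simple arithmetic: the probability that both channels go wrong on the same demand is bounded by the product of the two assessed probabilities. The figures below are made-up placeholders, not values from the paper.

        # Illustrative arithmetic only: under the aleatory-independence argument,
        # P(system failure) <= P(A fails) * P(B is imperfect).
        p_A_fails     = 1e-4  # assessed failure rate of the fully functional channel A
        p_B_imperfect = 1e-3  # assessed probability the verified channel B is imperfect
        print(f"system failure bound <= {p_A_fails * p_B_imperfect:.0e}")  # -> 1e-07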

    Scheduling policies and system software architectures for mixed-criticality computing

    The mixed-criticality model of computation is being increasingly adopted in timing-sensitive systems. The model not only ensures that the most critical tasks in a system never fail, but also aims for better system resource utilization under normal conditions. In this report, we describe the widely used mixed-criticality task model and fixed-priority scheduling algorithms for the model on uniprocessors. Because the mixed-criticality task model and its scheduling policies require it, isolation among tasks, both temporal and spatial, is one of the main requirements from the system design point of view. Different virtualization techniques have been used to design system software architectures with the goal of isolation. We discuss a few such system software architectures that are being used, or could be used, for the mixed-criticality model of computation.
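
    A minimal sketch of the widely used (Vestal-style) task model referred to above: each task carries a budget per criticality level, and a job overrunning its LO budget triggers the switch to HI mode, after which LO-criticality tasks are dropped. Field names and the drop policy follow the textbook model, simplified.

        # Minimal sketch of the mixed-criticality (Vestal) task model: per-level
        # budgets, and a mode switch that sheds LO-criticality tasks on overrun.
        from dataclasses import dataclass

        @dataclass
        class MCTask:
            name: str
            period: float
            c_lo: float          # budget assumed in LO (normal) mode
            c_hi: float          # pessimistic budget honored in HI mode
            criticality: str     # "LO" or "HI"

        def on_budget_overrun(tasks, mode):
            """A job exceeded its LO budget: switch modes and shed LO tasks."""
            if mode == "LO":
                mode = "HI"
                tasks = [t for t in tasks if t.criticality == "HI"]
            return tasks, mode

        taskset = [MCTask("ctrl", 10, 2, 4, "HI"), MCTask("log", 20, 3, 3, "LO")]
        taskset, mode = on_budget_overrun(taskset, "LO")
        print(mode, [t.name for t in taskset])   # -> HI ['ctrl']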