28 research outputs found

    An Analysis of Lazy and Eager Limited Preemption Approaches under DAG-Based Global Fixed Priority Scheduling

    Get PDF
    DAG-based scheduling models have been shown to effectively express the parallel execution of current many-core heterogeneous architectures. However, their applicability to real-time settings is limited by the difficulties to find tight estimations of the worst-case timing parameters of tasks that may arbitrarily be preempted/migrated at any instruction. An efficient approach to increase the system predictability is to limit task preemptions to a set of pre-defined points. This limited preemption model supports two different preemption approaches, eager and lazy, which have been analyzed only for sequential task-sets. This paper proposes a new response time analysis that computes an upper bound on the lower priority blocking that each task may incur with eager and lazy preemptions. We evaluate our analysis with both, synthetic DAG-based task-sets and a real case-study from the automotive domain. Results from the analysis demonstrate that, despite the eager approach generates a higher number of priority inversions, the blocking impact is generally smaller than in the lazy approach, leading to a better schedulability performance.This work was funded by the EU projects P-SOCRATES (FP7-ICT-2013-10) and HERCULES (H2020/ICT/2015/688860), and the Spanish Ministry of Science and Innovation under contract TIN2015-65316-P.Peer ReviewedPostprint (author's final draft

    Scheduling and locking in multiprocessor real-time operating systems

    Get PDF
    With the widespread adoption of multicore architectures, multiprocessors are now a standard deployment platform for (soft) real-time applications. This dissertation addresses two questions fundamental to the design of multicore-ready real-time operating systems: (1) Which scheduling policies offer the greatest flexibility in satisfying temporal constraints; and (2) which locking algorithms should be used to avoid unpredictable delays? With regard to Question 1, LITMUSRT, a real-time extension of the Linux kernel, is presented and its design is discussed in detail. Notably, LITMUSRT implements link-based scheduling, a novel approach to controlling blocking due to non-preemptive sections. Each implemented scheduler (22 configurations in total) is evaluated under consideration of overheads on a 24-core Intel Xeon platform. The experiments show that partitioned earliest-deadline first (EDF) scheduling is generally preferable in a hard real-time setting, whereas global and clustered EDF scheduling are effective in a soft real-time setting. With regard to Question 2, real-time locking protocols are required to ensure that the maximum delay due to priority inversion can be bounded a priori. Several spinlock- and semaphore-based multiprocessor real-time locking protocols for mutual exclusion (mutex), reader-writer (RW) exclusion, and k-exclusion are proposed and analyzed. A new category of RW locks suited to worst-case analysis, termed phase-fair locks, is proposed and three efficient phase-fair spinlock implementations are provided (one with few atomic operations, one with low space requirements, and one with constant RMR complexity). Maximum priority-inversion blocking is proposed as a natural complexity measure for semaphore protocols. It is shown that there are two classes of schedulability analysis, namely suspension-oblivious and suspension-aware analysis, that yield two different lower bounds on blocking. Five asymptotically optimal locking protocols are designed and analyzed: a family of mutex, RW, and k-exclusion protocols for global, partitioned, and clustered scheduling that are asymptotically optimal in the suspension-oblivious case, and a mutex protocol for partitioned scheduling that is asymptotically optimal in the suspension-aware case. A LITMUSRT-based empirical evaluation is presented that shows these protocols to be practical

    A Methodology for Transforming Java Applications Towards Real-Time Performance

    Get PDF
    The development of real-time systems has traditionally been based on low-level programming languages, such as C and C++, as these provide a fine-grained control of the applications temporal behavior. However, the usage of such programming languages suffers from increased complexity and high error rates compared to high-level languages such as Java. The Java programming language provides many benefits to software development such as automatic memory management and platform independence. However, Java is unable to provide any real-time guarantees, as the high-level benefits come at the cost of unpredictable temporal behavior.This thesis investigates the temporal characteristics of the Java language and analyses several possibilities for introducing real-time guarantees, including official language extensions and commercial runtime environments. Based on this analysis a new methodology is proposed for Transforming Java Applications towards Real-time Performance (TJARP). This method motivates a clear definition of timing requirements, followed by an analysis of the system through use of the formal modeling languageVDM-RT. Finally, the method provides a set of structured guidelines to facilitate the choice of strategy for obtaining real-time performance using Java. To further support this choice, an analysis is presented of available solutions, supported by a simple case study and a series of benchmarks.Furthermore, this thesis applies the TJARP method to a complex industrialcase study provided by a leading supplier of mission critical systems. Thecase study proves how the TJARP method is able to analyze an existing and complex system, and successfully introduce hard real-time guaranteesin critical sub-components

    Scratchpad Memory Management For Multicore Real-Time Embedded Systems

    Get PDF
    Multicore systems will continue to spread in the domain of real-time embedded systems due to the increasing need for high-performance applications. This research discusses some of the challenges associated with employing multicore systems for safety-critical real-time applications. Mainly, this work is concerned with providing: 1) efficient inter-core timing isolation for independent tasks, and 2) predictable task communication for communicating tasks. Principally, we introduce a new task execution model, based on the 3-phase execution model, that exploits the Direct Memory Access (DMA) controllers available in modern embedded platforms along with ScratchPad Memories (SPMs) to enforce strong timing isolation between tasks. The DMA and the SPMs are explicitly managed to pre-load tasks from main memory into the local (private) scratchpad memories. Tasks are then executed from the local SPMs without accessing main memory. This model allows CPU execution to be overlapped with DMA loading/unloading operations from and to main memory. We show that by co-scheduling task execution on CPUs and using DMA to access memory and I/O, we can efficiently hide access latency to physical resources. In turn, this leads to significant improvements in system schedulability, compared to both the case of unregulated contention for access to physical resources and to previous cache and SPM management techniques for real-time systems. The presented SPM-centric scheduling algorithms and analyses cover single-core, partitioned, and global real-time systems. The proposed scheme is also extended to support large tasks that do not fit entirely into the local SPM. Moreover, the schedulability analysis considers the case of recovering from transient soft errors (bit flips caused by a single event upset) in several levels of memories, that cannot be automatically corrected in hardware by the ECC unit. The proposed SPM-centric scheduling is integrated at the OS level; thus it is transparent to applications. The proposed scheme is implemented and evaluated on an FPGA platform and a Commercial-Off-The-Shelf (COTS) platform. In regards to real-time task communication, two types of communication are considered. 1) Asynchronous inter-task communication, between either sequential tasks (single-threaded) or parallel tasks (multi-threaded). 2) Intra-task communication, where parallel threads of the same application exchange data. A new task scheduling model for parallel tasks (Bundled Scheduling) is proposed to facilitate intra-task communication and reduce synchronization overheads. We show that the proposed bundled scheduling model can be applied to several parallel programming models, such as fork-join and DAG-based applications, leading to improved system schedulability. Finally, intra-task communication is governed by a predictable inter-core communication platform. Specifically, we propose HopliteRT, a lean and predictable Network-on-Chip that connects the private SPMs

    Sharing GPUs for Real-Time Autonomous-Driving Systems

    Get PDF
    Autonomous vehicles at mass-market scales are on the horizon. Cameras are the least expensive among common sensor types and can preserve features such as color and texture that other sensors cannot. Therefore, realizing full autonomy in vehicles at a reasonable cost is expected to entail computer-vision techniques. These computer-vision applications require massive parallelism provided by the underlying shared accelerators, such as graphics processing units, or GPUs, to function “in real time.” However, when computer-vision researchers and GPU vendors refer to “real time,” they usually mean “real fast”; in contrast, certifiable automotive systems must be “real time” in the sense of being predictable. This dissertation addresses the challenging problem of how GPUs can be shared predictably and efficiently for real-time autonomous-driving systems. We tackle this challenge in four steps. First, we investigate NVIDIA GPUs with respect to scheduling, synchronization, and execution. We conduct an extensive set of experiments to infer NVIDIA GPU scheduling rules, which are unfortunately undisclosed by NVIDIA and are beyond access owing to their closed-source software stack. We also expose a list of pitfalls pertaining to CPU-GPU synchronization that can result in unbounded response times of GPU-using applications. Lastly, we examine a fundamental trade-off for designing real-time tasks under different execution options. Overall, our investigation provides an essential understanding of NVIDIA GPUs, allowing us to further model and analyze GPU tasks. Second, we develop a new model and conduct schedulability analysis for GPU tasks. We extend the well-studied sporadic task model with additional parameters that characterize the parallel execution of GPU tasks. We show that NVIDIA scheduling rules are subject to fundamental capacity loss, which implies a necessary total utilization bound. We derive response-time bounds for GPU task systems that satisfy our schedulability conditions. Third, we address an industrial challenge of supplying the throughput performance of computer-vision frameworks to support adequate coverage and redundancy offered by an array of cameras. We re-think the design of convolution neural network (CNN) software to better utilize hardware resources and achieve increased throughput (number of simultaneous camera streams) without any appreciable increase in per-frame latency (camera to CNN output) or reduction of per-stream accuracy. Fourth, we apply our analysis to a finer-grained graph scheduling of a computer-vision standard, OpenVX, which explicitly targets embedded and real-time systems. We evaluate both the analytical and empirical real-time performance of our approach.Doctor of Philosoph

    Predicting software performance in symmetric multi-core and multiprocessor Environments

    Get PDF
    With today\u27s rise of multi-core processors, concurrency becomes a ubiquitous challenge in software development.Performance prediction methods have to reflect the influence of multiprocessing environments on software performance in order to help software architects to find potential performance problems during early development phases. In this thesis, we address the influence of the operating system scheduler on software performance in symmetric multiprocessing environments

    Distributed Real-time Systems - Deterministic Protocols for Wireless Networks and Model-Driven Development with SDL

    Get PDF
    In a networked system, the communication system is indispensable but often the weakest link w.r.t. performance and reliability. This, particularly, holds for wireless communication systems, where the error- and interference-prone medium and the character of network topologies implicate special challenges. However, there are many scenarios of wireless networks, in which a certain quality-of-service has to be provided despite these conditions. In this regard, distributed real-time systems, whose realization by wireless multi-hop networks becomes increasingly popular, are a particular challenge. For such systems, it is of crucial importance that communication protocols are deterministic and come with the required amount of efficiency and predictability, while additionally considering scarce hardware resources that are a major limiting factor of wireless sensor nodes. This, in turn, does not only place demands on the behavior of a protocol but also on its implementation, which has to comply with timing and resource constraints. The first part of this thesis presents a deterministic protocol for wireless multi-hop networks with time-critical behavior. The protocol is referred to as Arbitrating and Cooperative Transfer Protocol (ACTP), and is an instance of a binary countdown protocol. It enables the reliable transfer of bit sequences of adjustable length and deterministically resolves contest among nodes based on a flexible priority assignment, with constant delays, and within configurable arbitration radii. The protocol's key requirement is the collision-resistant encoding of bits, which is achieved by the incorporation of black bursts. Besides revisiting black bursts and proposing measures to optimize their detection, robustness, and implementation on wireless sensor nodes, the first part of this thesis presents the mode of operation and time behavior of ACTP. In addition, possible applications of ACTP are illustrated, presenting solutions to well-known problems of distributed systems like leader election and data dissemination. Furthermore, results of experimental evaluations with customary wireless transceivers are outlined to provide evidence of the protocol's implementability and benefits. In the second part of this thesis, the focus is shifted from concrete deterministic protocols to their model-driven development with the Specification and Description Language (SDL). Though SDL is well-established in the domain of telecommunication and distributed systems, the predictability of its implementations is often insufficient as previous projects have shown. To increase this predictability and to improve SDL's applicability to time-critical systems, real-time tasks, an approved concept in the design of real-time systems, are transferred to SDL and extended to cover node-spanning system tasks. In this regard, a priority-based execution and suspension model is introduced in SDL, which enables task-specific priority assignments in the SDL specification that are orthogonal to the static structure of SDL systems and control transition execution orders on design as well as on implementation level. Both the formal incorporation of real-time tasks into SDL and their implementation in a novel scheduling strategy are discussed in this context. By means of evaluations on wireless sensor nodes, evidence is provided that these extensions reduce worst-case execution times substantially, and improve the predictability of SDL implementations and the language's applicability to real-time systems
    corecore