Search CORE

11 research outputs found

Deterministic Scheduling of Real-Time Tasks on Heterogeneous Multicore Platforms

Author: Ali Waqar
Publication venue: 'Paleontological Institute at The University of Kansas'
Publication date: 01/01/2020
Field of study

In recent years, the problem of real-time scheduling has increasingly become more important as well as more complicated. The former is due to the proliferation of safety critical systems into our day-to-day life; such as autonomous vehicles, fueled by the recent advances in artificial intelligence. The latter is caused by the increasing demand for high performance which is driving the adoption of highly integrated complex heterogeneous system-on-chip (SoC) processors to deliver the performance while meeting strict size, weight, power (SWaP) and cost constraints. Motivated by these trends, this dissertation tackles the following main question: how can we guarantee predictable real-time execution on heterogeneous multicore SoCs while preserving high utilization? The fundamental problem in preserving the determinism of the real-time system realized on a heterogeneous multicore SoC is ensuring that the worst-case execution time (WCET) of each task, measured in isolation, will stay within a reasonable bound during the actual execution of the system. The primary challenge in achieving this goal---tightly bounding task WCETs---is that the execution time of a task can be highly non-deterministic, often varying significantly depending on which tasks are co-scheduled and how they contend on various shared hardware resources in the memory hierarchy. The particular scheduling requirements (e.g., non-preemption) of the different computing resources (e.g., integrated GPU) in the heterogeneous SoC and the possible cross-contention among their workloads can also exacerbate this problem. In light of these considerations, this dissertation presents new real-time scheduling techniques for predictable and efficient scheduling of mixed criticality workloads on heterogeneous SoCs. The contributions of this dissertation include the following: 1) A novel CPU-GPU scheduling framework that ensures predictable execution of critical GPU kernels on integrated CPU-GPU platforms. 2) A novel gang scheduling framework which guarantees deterministic execution of parallel real-time tasks on the multicore CPU cluster of a heterogeneous SoC. 3) Optimal and heuristic algorithms for gang formation that increase real-time schedulability under the RT-Gang framework and their extension to incorporate scheduling on accelerators in a heterogeneous SoC. 4) Concrete evaluation results using simulated tasksets as well as real-world workloads that demonstrate the analytical and practical benefits of the proposed techniques

DuoMC: Tight DRAM Latency Bounds with Shared Banks and Near-COTS Performance

Author: Hassan Mohamed
Mirosanlou Reza
Pellizzoni Rodolfo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/09/2021
Field of study

© {Reza Mirosanlou, Mohamed Hassan, and Rodolfo Pellizzoni | ACM} {2021}. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in {In The International Symposium on Memory Systems }, https://doi.org/10.1145/3488423.3488431DRAM memory controllers (MCs) in COTS systems are designed primarily for average performance, offering no worst-case guarantees, while real-time MCs provide timing guarantees at the cost of a significant average performance degradation. For this reason, hardware vendors have been reluctant to integrate real-time solutions in high-performance platforms. In this paper, we overcome this performance-predictability trade-off by introducing DuoMC, a novel memory controller that promotes to augment COTS MCs with a real-time scheduler and run-time monitoring to provide predictability guarantees. Leveraging the fact that the resource is barely overloaded, DuoMC allows the system to enjoy the high performance of the conventional MC most of the time, while switching to the real-time scheduler only when timing guarantees risk being violated, which rarely occurs. In addition, unlike most existing real-time MCs, DuoMC enables the utilization of both private and shared DRAM banks among cores to facilitate communication among tasks. We evaluate DuoMC using a cycle-accurate multi-core simulator. Results show that DuoMC can provide better or comparable latency guarantees than state-of-the-art real-time MCs with limited performance loss (only 8% in the worst scenario) compared to the COTS MC

University of Waterloo's Institutional Repository

Timing Predictable and High-Performance Hardware Cache Coherence Mechanisms for Real-Time Multi-Core Platforms

Author: Kaushik Anirudh Mohan
Publication venue: 'University of Waterloo'
Publication date: 14/06/2021
Field of study

Multi-core platforms are becoming primary compute platforms for real-time systems such as avionics and autonomous vehicles. This adoption is primarily driven by the increasing application demands deployed in real-time systems, and the cost and performance benefits of multi-core platforms. For real-time applications, satisfying safety properties in the form of timing predictability, is the paramount consideration. Providing such guarantees on safety properties requires applying some timing analysis on the application executing on the compute platform. The timing analysis computes an upper bound on the application’s execution time on the compute platform, which is referred to as the worst-case execution time (WCET). However, multi-core platforms pose challenges that complicate the timing analysis. Among these challenges are timing challenges caused due to simultaneous accesses from multiple cores to shared hardware resources such as shared caches, interconnects, and off-chip memories. Supporting timing predictable shared data communication between real-time applications further compounds this challenge as a core’s access to shared data is dependent on the simultaneous memory activity from other cores on the shared data. Although hardware cache coherence mechanisms are the primary high-performance data communication mechanisms in current multi-core platforms, there has been very little use of these mechanisms to support timing predictable shared data communication in real-time multi-core platforms. Rather, current state-of-the-art approaches to timing predictable shared data communication sidestep hardware cache coherence. These approaches enforce memory and execution constraints on the shared data to simplify the timing analysis at the expense of application performance. This thesis makes the case for timing predictable hardware cache coherence mechanisms as viable shared data communication mechanisms for real-time multi-core platforms. A key takeaway from the contributions in this thesis is that timing predictable hardware cache coherence mechanisms offer significant application performance over prior state-of-the-art data communication approaches while guaranteeing timing predictability. This thesis has three main contributions. First, this thesis shows how a hardware cache coherence mechanism can be designed to be timing predictable by defining design invariants that guarantee timing predictability. We apply these design invariants and design timing predictable variants of existing conventional cache coherence mechanisms. Evaluation of these timing predictable cache coherence mechanisms show that they provide significant application performance over state-of-the-art approaches while delivering timing predictability. Second, we observe that the large worst-case memory access latency under timing predictable hardware cache coherence mechanisms questions their applicability as a data communication mechanism in real-time multi-core platforms. To this end, we present a systematic framework to design better timing predictable cache coherence mechanisms that balance high application performance and low worst-case memory access latency. Our systematic framework concisely captures the design features of timing predictable cache coherence mechanisms that impacts their WCET, and identifies a spectrum of approaches to reduce the worst-case memory access latency. We describe one approach and show that this approach reduces the worst-case memory access latency of timing predictable cache coherence mechanisms to be the same as alternative approaches while trading away minimal performance in the original cache coherence mechanisms. Third, we design a timing predictable hardware cache coherence mechanism for multi-core platforms used in mixed-critical real-time systems (MCS). Applications in MCS have varying performance and timing predictability requirements. We design a timing predictable cache coherence mechanism that considers these differing requirements and ensures that applications with no timing predictability requirements do not impact applications with strict predictability requirements

University of Waterloo's Institutional Repository

Proceedings of the 7th Junior Researcher Workshop on Real-Time Computing: JRWRTC 2013: Sophia Antipolis, France, October 16-18, 2013

Author: Altmeyer S.
Publication venue: Faculty of Science, University of Amsterdam
Publication date: 01/01/2013
Field of study