917 research outputs found

    MARACAS: a real-time multicore VCPU scheduling framework

    Full text link
    This paper describes a multicore scheduling and load-balancing framework called MARACAS, to address shared cache and memory bus contention. It builds upon prior work centered around the concept of virtual CPU (VCPU) scheduling. Threads are associated with VCPUs that have periodically replenished time budgets. VCPUs are guaranteed to receive their periodic budgets even if they are migrated between cores. A load balancing algorithm ensures VCPUs are mapped to cores to fairly distribute surplus CPU cycles, after ensuring VCPU timing guarantees. MARACAS uses surplus cycles to throttle the execution of threads running on specific cores when memory contention exceeds a certain threshold. This enables threads on other cores to make better progress without interference from co-runners. Our scheduling framework features a novel memory-aware scheduling approach that uses performance counters to derive an average memory request latency. We show that latency-based memory throttling is more effective than rate-based memory access control in reducing bus contention. MARACAS also supports cache-aware scheduling and migration using page recoloring to improve performance isolation amongst VCPUs. Experiments show how MARACAS reduces multicore resource contention, leading to improved task progress.http://www.cs.bu.edu/fac/richwest/papers/rtss_2016.pdfAccepted manuscrip

    Analysis of Dynamic Memory Bandwidth Regulation in Multi-core Real-Time Systems

    Full text link
    One of the primary sources of unpredictability in modern multi-core embedded systems is contention over shared memory resources, such as caches, interconnects, and DRAM. Despite significant achievements in the design and analysis of multi-core systems, there is a need for a theoretical framework that can be used to reason on the worst-case behavior of real-time workload when both processors and memory resources are subject to scheduling decisions. In this paper, we focus our attention on dynamic allocation of main memory bandwidth. In particular, we study how to determine the worst-case response time of tasks spanning through a sequence of time intervals, each with a different bandwidth-to-core assignment. We show that the response time computation can be reduced to a maximization problem over assignment of memory requests to different time intervals, and we provide an efficient way to solve such problem. As a case study, we then demonstrate how our proposed analysis can be used to improve the schedulability of Integrated Modular Avionics systems in the presence of memory-intensive workload.Comment: Accepted for publication in the IEEE Real-Time Systems Symposium (RTSS) 2018 conferenc

    Bounding Worst-Case Data Cache Behavior by Analytically Deriving Cache Reference Patterns

    Get PDF
    While caches have become invaluable for higher-end architectures due to their ability to hide, in part, the gap between processor speed and memory access times, caches (and particularly data caches) limit the timing predictability for data accesses that may reside in memory or in cache. This is a significant problem for real-time systems. The objective our work is to provide accurate predictions of data cache behavior of scalar and non-scalar references whose reference patterns are known at compile time. Such knowledge about cache behavior provides the basis for significant improvements in bounding the worst-case execution time (WCET) of real-time programs, particularly for hard-to-analyze data caches. We exploit the power of the Cache Miss Equations (CME) framework but lift a number of limitations of traditional CME to generalize the analysis to more arbitrary programs. We further devised a transformation, coined “forced” loop fusion, which facilitates the analysis across sequential loops. Our contributions result in exact data cache reference patterns — in contrast to approximate cache miss behavior of prior work. Experimental results indicate improvements on the accuracy of worst-case data cache behavior up to two orders of magnitude over the original approach. In fact, our results closely bound and sometimes even exactly match those obtained by trace-driven simulation for worst-case inputs. The resulting WCET bounds of timing analysis confirm these findings in terms of providing tight bounds. Overall, our contributions lift analytical approaches to predict data cache behavior to a level suitable for efficient static timing analysis and, subsequently, real-time schedulability of tasks with predictable WCET

    Removing Abstraction Overhead in the Composition of Hierarchical Real-Time System

    Get PDF
    The hierarchical real-time scheduling framework is a widely accepted model to facilitate the design and analysis of the increasingly complex real-time systems. Interface abstraction and composition are the key issues in the hierarchical scheduling framework analysis. Schedulability is essential to guarantee that the timing requirements of all components are satisfied. In order for the design to be resource efficient, the composition must be bandwidth optimal. Associativity is desirable for open systems in which components may be added or deleted at run time. Previous techniques on compositional scheduling are either not resource efficient in some aspects, or cannot achieve optimality and associativity at the same time. In this paper, several important properties regarding the periodic resource model are identified. Based on those properties, we propose a novel interface abstraction and composition framework which achieves schedulability, optimality, and associativity. Our approach eliminates abstraction overhead in the composition

    WCET Derivation under Single Core Equivalence with Explicit Memory Budget Assignment

    Get PDF
    In the last decade there has been a steady uptrend in the popularity of embedded multi-core platforms. This represents a turning point in the theory and implementation of real-time systems. From a real-time standpoint, however, the extensive sharing of hardware resources (e.g. caches, DRAM subsystem, I/O channels) represents a major source of unpredictability. Budget-based memory regulation (throttling) has been extensively studied to enforce a strict partitioning of the DRAM subsystem’s bandwidth. The common approach to analyze a task under memory bandwidth regulation is to consider the budget of the core where the task is executing, and assume the worst-case about the remaining cores' budgets. In this work, we propose a novel analysis strategy to derive the WCET of a task under memory bandwidth regulation that takes into account the exact distribution of memory budgets to cores. In this sense, the proposed analysis represents a generalization of approaches that consider (i) even budget distribution across cores; and (ii) uneven but unknown (except for the core under analysis) budget assignment. By exploiting the additional piece of information, we show that it is possible to derive a more accurate WCET estimation. Our evaluations highlight that the proposed technique can reduce overestimation by 30% in average, and up to 60%, compared to the state of the art.Accepted manuscrip

    Dynamic Memory Bandwidth Allocation for Real-Time GPU-Based SoC Platforms

    Get PDF
    Heterogeneous SoC platforms, comprising both general purpose CPUs and accelerators such as a GPU, are becoming increasingly attractive for real-time and mixed-criticality systems to cope with the computational demand of data parallel applications. However, contention for access to shared main memory can lead to significant performance degradation on both CPU and GPU. Existing work has shown that memory bandwidth throttling is effective in protecting real-time applications from memory-intensive, best-effort ones; however, due to the inherent pessimism involved in worst-case execution time estimation, such approaches can unduly restrict the bandwidth available to best-effort applications. In this work, we propose a novel memory bandwidth allocation scheme where we dynamically monitor the progress of a real-time application and increase the bandwidth share of best-effort ones whenever it is safe to do so. Specifically, we demonstrate our approach by protecting a real-time GPU kernel from best-effort CPU tasks. Based on profiling information, we first build a worst case execution time estimation model for the GPU kernel. Using such model, we then show how to dynamically recompute on-line the maximum memory budget that can be allocated to best-effort tasks without exceeding the kernel’s assigned execution budget. We implement our proposed technique on NVIDIA embedded SoC and demonstrate its effectiveness on a variety of GPU and CPU benchmarks

    Towards a Reconfiguration Service for Distributed Real-Time Java

    Get PDF
    REACTION 2012. 1st International workshop on Real-time and distributed computing in emerging applications. December 4th, 2012, San Juan, Puerto Rico.Ancient monolithic distributed systems were attached to well-known development practices and offline analysis. Current scenarios are more dynamic, and open, plenty of applications and services which appear and disappear dynamically at runtime. Likewise, these scenarios require taking into account actions that were traditionally addressed offline, this time in an online scenario. This paper contributes a reconfiguration service in the context of distributed real-time Java application as a means to include real-time reconfiguration into next generation real-time Java systems. The paper addresses the integration taking into account changes required in the API and the cost of some reconfiguration strategies.This research was partially supported by the European Commission (ARTIST2 NoE, ST-2004-004527; iLAND ARTEMIS-JU Call 1) and by the Spanish national project REM4VSS (TIN-2011-28339)

    IRQ Coloring: Mitigating Interrupt-Generated Interference on ARM Multicore Platforms

    Get PDF
    Mixed-criticality systems, which consolidate workloads with different criticalities, must comply with stringent spatial and temporal isolation requirements imposed by safety-critical standards (e.g., ISO26262). This, per se, has proven to be a challenge with the advent of multicore platforms due to the inner interference created by multiple subsystems while disputing access to shared resources. With this work, we pioneer the concept of Interrupt (IRQ) coloring as a novel mechanism to minimize the interference created by co-existing interrupt-driven workloads. The main idea consists of selectively deactivating specific ("colored") interrupts if the QoS of critical workloads (e.g., Virtual Machines) drops below a well-defined threshold. The IRQ Coloring approach encompasses two artifacts, i.e., the IRQ Coloring Design-Time Tool (IRQ DTT) and the IRQ Coloring Run-Time Mechanism (IRQ RTM). In this paper, we focus on presenting the conceptual IRQ coloring design, describing the first prototype of the IRQ RTM on Bao hypervisor, and providing initial evidence about the effectiveness of the proposed approach on a synthetic use case
    • …
    corecore