4,026 research outputs found

    MARACAS: a real-time multicore VCPU scheduling framework

    Full text link
    This paper describes a multicore scheduling and load-balancing framework called MARACAS, to address shared cache and memory bus contention. It builds upon prior work centered around the concept of virtual CPU (VCPU) scheduling. Threads are associated with VCPUs that have periodically replenished time budgets. VCPUs are guaranteed to receive their periodic budgets even if they are migrated between cores. A load balancing algorithm ensures VCPUs are mapped to cores to fairly distribute surplus CPU cycles, after ensuring VCPU timing guarantees. MARACAS uses surplus cycles to throttle the execution of threads running on specific cores when memory contention exceeds a certain threshold. This enables threads on other cores to make better progress without interference from co-runners. Our scheduling framework features a novel memory-aware scheduling approach that uses performance counters to derive an average memory request latency. We show that latency-based memory throttling is more effective than rate-based memory access control in reducing bus contention. MARACAS also supports cache-aware scheduling and migration using page recoloring to improve performance isolation amongst VCPUs. Experiments show how MARACAS reduces multicore resource contention, leading to improved task progress.http://www.cs.bu.edu/fac/richwest/papers/rtss_2016.pdfAccepted manuscrip

    A survey of techniques for reducing interference in real-time applications on multicore platforms

    Get PDF
    This survey reviews the scientific literature on techniques for reducing interference in real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It also presents proposals that use interference reduction techniques without considering the predictability issue. The survey highlights interference sources and categorizes proposals from the perspective of the shared resource. It covers techniques for reducing contentions in main memory, cache memory, a memory bus, and the integration of interference effects into schedulability analysis. Every section contains an overview of each proposal and an assessment of its advantages and disadvantages.This work was supported in part by the Comunidad de Madrid Government "Nuevas Técnicas de Desarrollo de Software de Tiempo Real Embarcado Para Plataformas. MPSoC de Próxima Generación" under Grant IND2019/TIC-17261

    Software development of reconfigurable real-time systems : from specification to implementation

    Get PDF
    This thesis deals with reconfigurable real-time systems solving real-time tasks scheduling problems in a mono-core and multi-core architectures. The main focus in this thesis is on providing guidelines, methods, and tools for the synthesis of feasible reconfigurable real-time systems in a mono-processor and multi-processor architectures. The development of these systems faces various challenges particularly in terms of stability, energy consumption, response and blocking time. To address this problem, we propose in this work a new strategy of i) placement and scheduling of tasks to execute real-time applications on mono-core and multi-core architectures, ii) optimization step based on Mixed integer linear programming (MILP), and iii) guidance tool that assists designers to implement a feasible multi-core reconfigurable real-time from specification level to implementation level. We apply and simulate the contribution to a case study, and compare the proposed results with related works in order to show the originality of this methodology.Echtzeitsysteme laufen unter harten Bedingungen an ihre Ausführungszeit. Die Einhaltung der Echtzeit-Bedingungen bestimmt die Zuverlässigkeit und Genauigkeit dieser Systeme. Neben den Echtzeit-Bedingungen müssen rekonfigurierbare Echtzeitsysteme zusätzliche Rekonfigurations-Bedingungen erfüllen. Diese Arbeit beschäftigt sich mit rekonfigurierbaren Echtzeitsystemen in Mono- und Multicore-Architekturen. An die Entwicklung dieser Systeme sind verschiedene Anforderungen gestellt. Insbesondere muss die Rekonfigurierbarkeit beachtet werden. Dabei sind aber Echtzeit-Bedingungen und Ressourcenbeschränkungen weiterhin zu beachten. Darüber hinaus werden die Kosten für die Entwicklung dieser Systeme insbesondere durch falsche Designentscheidungen in den frühen Phasen der Entwicklung stark beeinträchtigt. Das Hauptziel in dieser Arbeit liegt deshalb auf der Bereitstellung von Handlungsempfehlungen, Methoden und Werkzeugen für die zielgerichtete Entwicklung von realisierbaren rekonfigurierbaren Echtzeitsystemen in Mono- und Multicore-Architekturen. Um diese Herausforderungen zu adressieren wird eine neue Strategie vorgeschlagen, die 1) die Funktionsallokation, 2) die Platzierung und das Scheduling von Tasks, 3) einen Optimierungsschritt auf der Basis von Mixed Integer Linear Programming (MILP) und 4) eine entscheidungsunterstützende Lösung umfasst, die den Designern hilft, eine realisierbare rekonfigurierbare Echtzeitlösung von der Spezifikationsebene bis zur Implementierungsebene zu entwickeln. Die vorgeschlagene Methodik wird auf eine Fallstudie angewendet und mit verwandten Arbeiten vergliche

    Model-driven Scheduling for Distributed Stream Processing Systems

    Full text link
    Distributed Stream Processing frameworks are being commonly used with the evolution of Internet of Things(IoT). These frameworks are designed to adapt to the dynamic input message rate by scaling in/out.Apache Storm, originally developed by Twitter is a widely used stream processing engine while others includes Flink, Spark streaming. For running the streaming applications successfully there is need to know the optimal resource requirement, as over-estimation of resources adds extra cost.So we need some strategy to come up with the optimal resource requirement for a given streaming application. In this article, we propose a model-driven approach for scheduling streaming applications that effectively utilizes a priori knowledge of the applications to provide predictable scheduling behavior. Specifically, we use application performance models to offer reliable estimates of the resource allocation required. Further, this intuition also drives resource mapping, and helps narrow the estimated and actual dataflow performance and resource utilization. Together, this model-driven scheduling approach gives a predictable application performance and resource utilization behavior for executing a given DSPS application at a target input stream rate on distributed resources.Comment: 54 page

    Fixed-Priority Memory-Centric Scheduler for COTS-Based Multiprocessors

    Get PDF
    Memory-centric scheduling attempts to guarantee temporal predictability on commercial-off-the-shelf (COTS) multiprocessor systems to exploit their high performance for real-time applications. Several solutions proposed in the real-time literature have hardware requirements that are not easily satisfied by modern COTS platforms, like hardware support for strict memory partitioning or the presence of scratchpads. However, even without said hardware support, it is possible to design an efficient memory-centric scheduler. In this article, we design, implement, and analyze a memory-centric scheduler for deterministic memory management on COTS multiprocessor platforms without any hardware support. Our approach uses fixed-priority scheduling and proposes a global "memory preemption" scheme to boost real-time schedulability. The proposed scheduling protocol is implemented in the Jailhouse hypervisor and Erika real-time kernel. Measurements of the scheduler overhead demonstrate the applicability of the proposed approach, and schedulability experiments show a 20% gain in terms of schedulability when compared to contention-based and static fair-share approaches

    Cache Interference-aware Task Partitioning for Non-preemptive Real-time Multi-core Systems

    Get PDF
    Shared caches in multi-core processors introduce serious difficulties in providing guarantees on the real-time properties of embedded software due to the interaction and the resulting contention in the shared caches. Prior work has studied the schedulability analysis of global scheduling for real-time multi-core systems with shared caches. This article considers another common scheduling paradigm: partitioned scheduling in the presence of shared cache interference. To achieve this, we propose CITTA, a cache interference-aware task partitioning algorithm. We first analyze the shared cache interference between two programs for set-associative instruction and data caches. Then, an integer programming formulation is constructed to calculate the upper bound on cache interference exhibited by a task, which is required by CITTA. We conduct schedulability analysis of CITTA and formally prove its correctness. A set of experiments is performed to evaluate the schedulability performance of CITTA against global EDF scheduling and other greedy partition approaches such as First-fit and Worst-fit over randomly generated tasksets and realistic workloads in embedded systems. Our empirical evaluations show that CITTA outperforms global EDF scheduling and greedy partition approaches in terms of task sets deemed schedulable

    Optimisation for Optical Data Centre Switching and Networking with Artificial Intelligence

    Get PDF
    Cloud and cluster computing platforms have become standard across almost every domain of business, and their scale quickly approaches O(106)\mathbf{O}(10^6) servers in a single warehouse. However, the tier-based opto-electronically packet switched network infrastructure that is standard across these systems gives way to several scalability bottlenecks including resource fragmentation and high energy requirements. Experimental results show that optical circuit switched networks pose a promising alternative that could avoid these. However, optimality challenges are encountered at realistic commercial scales. Where exhaustive optimisation techniques are not applicable for problems at the scale of Cloud-scale computer networks, and expert-designed heuristics are performance-limited and typically biased in their design, artificial intelligence can discover more scalable and better performing optimisation strategies. This thesis demonstrates these benefits through experimental and theoretical work spanning all of component, system and commercial optimisation problems which stand in the way of practical Cloud-scale computer network systems. Firstly, optical components are optimised to gate in 500ps\approx 500 ps and are demonstrated in a proof-of-concept switching architecture for optical data centres with better wavelength and component scalability than previous demonstrations. Secondly, network-aware resource allocation schemes for optically composable data centres are learnt end-to-end with deep reinforcement learning and graph neural networks, where 3×3\times less networking resources are required to achieve the same resource efficiency compared to conventional methods. Finally, a deep reinforcement learning based method for optimising PID-control parameters is presented which generates tailored parameters for unseen devices in O(103)s\mathbf{O}(10^{-3}) s. This method is demonstrated on a market leading optical switching product based on piezoelectric actuation, where switching speed is improved >20%>20\% with no compromise to optical loss and the manufacturing yield of actuators is improved. This method was licensed to and integrated within the manufacturing pipeline of this company. As such, crucial public and private infrastructure utilising these products will benefit from this work

    Real-time systems on multicore platforms: managing hardware resources for predictable execution

    Full text link
    Shared hardware resources in commodity multicore processors are subject to contention from co-running threads. The resultant interference can lead to highly-variable performance for individual applications. This is particularly problematic for real-time applications, which require predictable timing guarantees. It also leads to a pessimistic estimate of the Worst Case Execution Time (WCET) for every real-time application. More CPU time needs to be reserved, thus less applications can enter the system. As the average execution time is usually far less than the WCET, a significant amount of reserved CPU resource would be wasted. Previous works have attempted partitioning the shared resources, amongst either CPUs or processes, to improve performance isolation. However, they have not proven to be both efficient and effective. In this thesis, we propose several mechanisms and frameworks that manage the shared caches and memory buses on multicore platforms. Firstly, we introduce a multicore real-time scheduling framework with the foreground/background scheduling model. Combining real-time load balancing with background scheduling, CPU utilization is greatly improved. Besides, a memory bus management mechanism is implemented on top of the background scheduling, making sure bus contention is under control while utilizing unused CPU cycles. Also, cache partitioning is thoroughly studied in this thesis, with a cache-aware load balancing algorithm and a dynamic cache partitioning framework proposed. Lastly, we describe a system architecture to integrate the above solutions all together. It tackles one of the toughest problems in OS innovation, legacy support, by converting existing OSes into libraries in a virtualized environment. Thus, within a single multicore platform, we benefit from the fine-grained resource control of a real-time OS and the richness of functionality of a general-purpose OS
    corecore