188 research outputs found
Arbitration-Induced Preemption Delays
The interactions among concurrent tasks pose a challenge in the design of real-time multi-core systems, where the blocking delays that tasks may experience while accessing shared memory have to be taken into consideration. Various memory arbitration schemes have been devised to address these issues by providing trade-offs between predictability, average-case performance, and analyzability. Time-Division Multiplexing (TDM) is a well-known arbitration scheme due to its simplicity and analyzability. However, it suffers from low resource utilization due to its non-work-conserving nature. In recent work, we proposed dynamic schemes based on TDM that show work-conserving behavior in practice while retaining the guarantees of TDM. These approaches have only been evaluated in a restricted setting. Their applicability in a preemptive setting appears problematic, since they may induce long memory blocking times depending on execution history. These blocking delays may induce significant jitter and consequently increase the tasks' response times.
This work explores means to manage and, finally, bound these blocking delays. Three different schemes are explored and compared with regard to their analyzability, impact on response-time analysis, implementation complexity, and runtime behavior. Experiments show that the various approaches behave virtually identically at runtime. This makes it possible to retain the approach that combines low implementation complexity with analyzability.
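The non-work-conserving nature of TDM mentioned above can be illustrated with a minimal Python sketch. It is not the scheme proposed in the paper: the function names, the workload, and the simple slot-donation rule used for the work-conserving variant are assumptions made purely for illustration, and the sketch says nothing about whether such a variant keeps TDM's guarantees.

```python
def simulate(pending, n_cores, n_slots, work_conserving=False):
    """Count slots that carry a transfer.

    pending: dict mapping core id -> requests queued at time 0
             (a hypothetical workload, not taken from the paper).
    """
    pending = dict(pending)  # work on a copy
    used = 0
    for slot in range(n_slots):
        owner = slot % n_cores  # strict TDM: one slot per core per period
        if pending.get(owner, 0) > 0:
            pending[owner] -= 1
            used += 1
        elif work_conserving:
            # Assumed donation rule: hand the idle slot to the next core
            # in cyclic order that has a request waiting.
            for k in range(1, n_cores):
                cand = (owner + k) % n_cores
                if pending.get(cand, 0) > 0:
                    pending[cand] -= 1
                    used += 1
                    break
    return used


def tdm_worst_case_wait(n_cores, slot_len):
    """Worst-case wait in cycles under strict TDM, assuming a transfer can
    only start at the beginning of the requester's own slot and the request
    arrives one cycle after that slot has started."""
    return n_cores * slot_len - 1


workload = {0: 4, 1: 0, 2: 0, 3: 0}  # only core 0 has requests
strict_used = simulate(workload, n_cores=4, n_slots=8)
wc_used = simulate(workload, n_cores=4, n_slots=8, work_conserving=True)
print(strict_used, wc_used, tdm_worst_case_wait(4, 10))
```

With this workload, strict TDM serves core 0 only in its own two slots and leaves the other six idle, whereas the slot-donating variant serves all four requests within the first four slots.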
A Globally Arbitrated Memory Tree for Mixed-Time-Criticality Systems
Embedded systems are increasingly based on multi-core platforms to accommodate a growing number of applications, some of which have real-time requirements. Resources, such as off-chip DRAM, are typically shared between the applications using memory interconnects with different arbitration policies to cater to diverse bandwidth and latency requirements. However, traditional centralized interconnects do not scale as the number of clients increases. Similarly, current distributed interconnects either cannot satisfy the diverse requirements or have decoupled arbitration stages, resulting in larger area, power and worst-case latency. The four main contributions of this article are: 1) a Globally Arbitrated Memory Tree (GAMT) with a distributed architecture that scales well with the number of cores, 2) an RTL-level implementation that can be configured with five arbitration policies (three distinct and two as special cases), 3) the concept of mixed arbitration policies that allows the policy to be selected individually per core, and 4) a worst-case analysis for a mixed arbitration policy that combines TDM and FBSP arbitration. We compare the performance of GAMT with centralized implementations and show that it can run up to four times faster and achieve reductions of over 51 and 37 percent in area and power consumption, respectively, for a given bandwidth.
Isolation-Aware Timing Analysis and Design Space Exploration for Predictable and Composable Many-Core Systems
Composable many-core systems enable the independent development and analysis of applications which will be executed on a shared platform where the mix of concurrently executed applications may change dynamically at run time. For each individual application, an off-line DSE is performed to compute several mapping alternatives on the platform, offering Pareto-optimal trade-offs in terms of real-time guarantees, resource usage, etc. At run time, one mapping is then chosen to launch the application on demand. In this context, to enable an independent analysis of each individual application at design time, so-called inter-application isolation schemes are applied which specify temporal/spatial isolation policies between applications. State-of-the-art composable many-core systems are developed based on a fixed isolation scheme that is exclusively applied to every resource in every mapping of every application and use a timing analysis tailored to that isolation scheme to derive timing guarantees for each mapping. A fixed isolation scheme, however, heavily restricts the explored space of solutions and can, therefore, lead to suboptimality. Lifting this restriction necessitates a timing analysis that is applicable to mappings with an arbitrary mix of isolation schemes on different resources. To address this issue, in this paper, we (a) present an isolation-aware timing analysis that - unlike existing analyses - can handle multiple isolation schemes in combination within one mapping and delivers safe yet tight timing bounds by identifying and excluding interference scenarios that can never happen under the given combination of isolation schemes. Based on the timing analysis, we (b) present a DSE which explores the choices of isolation scheme per resource within each mapping and uses the proposed timing analysis for timing verification. 
Experimental results demonstrate that, for a variety of real-time applications and many-core platforms, the proposed approach achieves an improvement of up to 67% in the quality of delivered mappings compared to approaches based on a fixed isolation scheme.
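The notion of Pareto-optimal mapping alternatives used in this abstract can be sketched in Python as follows. This is only a minimal illustration of Pareto filtering, not the DSE of the paper: the two objectives (a latency bound and a number of cores used, both to be minimized), the mapping names, and the numbers are hypothetical.

```python
def pareto_front(mappings):
    """Return the names of non-dominated mappings.

    mappings: list of (name, latency_bound, cores_used) tuples;
              lower is better for both objectives (assumed objectives).
    """
    front = []
    for name, lat, res in mappings:
        # A mapping is dominated if another one is at least as good in
        # both objectives and strictly better in at least one.
        dominated = any(
            (l2 <= lat and r2 <= res) and (l2 < lat or r2 < res)
            for _, l2, r2 in mappings
        )
        if not dominated:
            front.append(name)
    return front


candidates = [
    ("m1", 10, 4),
    ("m2", 12, 2),
    ("m3", 12, 4),  # dominated by m1: same cores, worse latency bound
    ("m4", 8, 6),
]
print(pareto_front(candidates))
```

Only the non-dominated mappings would be kept as candidates from which one is chosen at run time.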
Erreichen von Performance in Netzwerken-On-Chip für Echtzeitsysteme (Achieving Performance in Networks-on-Chip for Real-Time Systems)
In many new applications, such as automated driving, high performance requirements have reached safety-critical real-time systems. Consequently, Networks-on-Chip (NoCs) must efficiently host new sets of highly dynamic workloads, e.g., high-resolution sensor fusion and data processing, or autonomous decision making combined with machine learning.
The static platform management used in current safety-critical systems is no longer sufficient to provide the needed level of service. A dynamic platform management could meet the challenge, but it usually suffers from a lack of predictability and of the simplicity necessary for certification of safety and real-time properties. In this work, we propose a novel, global and dynamic arbitration for NoCs with real-time QoS requirements. The mechanism decouples admission control from arbitration in routers, thereby simplifying dynamic adaptation and real-time analysis. Consequently, the proposed solution allows the deployment of sophisticated contract-based QoS provisioning without introducing complicated and hard-to-maintain schemes, known from the frequently applied static arbiters.
The presented work introduces an overlay network to synchronize transmissions using arbitration units called Resource Managers (RMs), which allows global and work-conserving scheduling. The description of resource allocation strategies is supplemented by a protocol design and a verification methodology, bringing adaptive control to NoC communication in setups with different QoS requirements and traffic classes. To this end, a formal worst-case timing analysis for the mechanism is proposed, which demonstrates that this solution not only exhibits higher performance in simulation but, even more importantly, consistently reaches smaller formally guaranteed worst-case latencies than other strategies for realistic levels of system utilization.
The approach is not limited to a specific network architecture or topology, as the mechanism does not require modifications of routers and can therefore be used together with the majority of existing manycore systems. The evaluation used generic performance-optimized router designs as well as two systems-on-chip focused on real-time deployments. The results confirmed that the proposed approach exhibits significantly higher average performance in simulation and execution.
In many new safety-critical applications, such as automated driving, high demands are placed on the performance of real-time systems. Networks-on-Chip (NoCs) must therefore efficiently accommodate new, highly dynamic workloads on one system, such as high-resolution sensor fusion and data processing, or autonomous decision making combined with machine learning. The control of the underlying NoC architecture must protect system safety against faults resulting from the dynamic behavior of the system and, at the same time, provide the required performance.
In this work, we propose a novel, global and dynamic control for NoCs with real-time QoS requirements. The scheme decouples admission control from arbitration in routers. This enables dynamic adaptation and simplifies real-time analysis. It thus becomes possible to use sophisticated contract-based resource allocation without introducing complex and hard-to-maintain mechanisms, which are already known from static platform management.
This work presents an overlay network that synchronizes transmissions with the help of arbitration units, the so-called Resource Managers (RMs). This overlay network enables global and work-conserving control. The description of different resource allocation strategies is complemented by a protocol design and methods for verifying the adaptive NoC control with different QoS requirements and traffic classes. For this purpose, a formal worst-case timing analysis is presented that models the proposed method. The results confirm that the presented solution not only offers higher performance in simulation, but also formally guarantees smaller worst-case latencies than other strategies for realistic system utilizations.
The presented approach is not restricted to a particular network architecture or topology, since the mechanism requires no changes to the underlying routers, and it can therefore be used together with existing manycore systems. The evaluation was based on a performance-optimized router design and two platforms focused on real-time applications. The results confirmed that the proposed approach delivers, on average, significantly higher performance in simulation and execution.
A survey of techniques for reducing interference in real-time applications on multicore platforms
This survey reviews the scientific literature on techniques for reducing interference in real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It also presents proposals that use interference reduction techniques without considering the predictability issue. The survey highlights interference sources and categorizes proposals from the perspective of the shared resource. It covers techniques for reducing contentions in main memory, cache memory, a memory bus, and the integration of interference effects into schedulability analysis. Every section contains an overview of each proposal and an assessment of its advantages and disadvantages. This work was supported in part by the Comunidad de Madrid Government "Nuevas Técnicas de Desarrollo de Software de Tiempo Real Embarcado Para Plataformas. MPSoC de Próxima Generación" under Grant IND2019/TIC-17261.
A Scenario-Aware Dataflow Programming Model
The FSM-SADF model of computation makes it possible to find a tight bound on the throughput of firm real-time applications by capturing dynamic variations in scenarios. We explore an FSM-SADF programming model and propose three different alternatives for scenario switching. The best candidate for our CompSOC platform was implemented, and experiments confirm that the tight throughput bound results in a reduced resource budget. This comes at the cost of a predictable overhead at run-time as well as increased communication and memory budgets. We show that design choices offer interesting trade-offs between run-time cost and resource budgets.