30 research outputs found

Real-Time Application Mapping for Many-Cores Using a Limited Migrative Model

Author: Nikolic Borislav
Petters Stefan M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/01/2015

Many-core platforms are an emerging technology in the real-time embedded domain. These devices offer various options for power savings and cost reductions, and contribute to the overall system flexibility. However, issues such as unpredictability, scalability and analysis pessimism are serious challenges to their integration into the aforementioned area. The focus of this work is on many-core platforms using a limited migrative model (LMM). LMM is an approach based on the fundamental concepts of the multi-kernel paradigm, which is a promising step towards scalable and predictable many-cores. In this work, we formulate the problem of real-time application mapping on a many-core platform using LMM, and propose a three-stage method to solve it. An extended version of the existing analysis is used to assure that derived mappings (i) guarantee the fulfilment of timing constraints posed on worst-case communication delays of individual applications, and (ii) provide an environment to perform load balancing for, e.g., energy/thermal management, fault tolerance and/or performance reasons.

Many-Core Platforms in the Real-Time Embedded Computing Domain

Author: Borislav Nikolic
Publication date: 24/04/2015

Modeling high-performance wormhole NoCs for critical real-time embedded systems

Author: Abella Ferrer Jaume
Cazorla Almeida Francisco Javier
Hernández Carles
Panic Milos
Quiñones Eduardo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

Manycore chips are a promising computing platform to cope with the increasing performance needs of critical real-time embedded systems (CRTES). However, the adoption of manycores by the CRTES industry requires understanding tasks' timing behavior when their requests use the manycore's network-on-chip (NoC) to access shared hardware resources. This paper analyzes the contention in wormhole-based NoC (wNoC) designs, which are widely implemented in the high-performance domain, and for which we introduce a new metric: worst-contention delay (WCD), which captures the wNoC impact on worst-case execution time (WCET) in a tighter manner than the existing metric, worst-case traversal time (WCTT). Moreover, we provide an analytical model of the WCD that requests can suffer in a wNoC and we validate it against wNoC designs resembling those in the Tilera-Gx36 and the Intel-SCC 48-core processors. Building on top of our WCD analytical model, we analyze the impact on WCD of different design parameters, such as the number of virtual channels, and we make a set of recommendations on what wNoC setups to use in the context of CRTES.

Real-time scheduling with resource sharing on heterogeneous multiprocessors

Author: Andersson Björn
Raravi Gurulingesh
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Consider the problem of scheduling a task set $\tau$ of implicit-deadline sporadic tasks to meet all deadlines on a t-type heterogeneous multiprocessor platform where tasks may access multiple shared resources. The multiprocessor platform has $m_k$ processors of type-k, where $k \in \{1, 2, \ldots, t\}$. The execution time of a task depends on the type of processor on which it executes. The set of shared resources is denoted by $R$. For each task $\tau_i$, there is a resource set $R_i \subseteq R$ such that for each job of $\tau_i$, during one phase of its execution, the job requests to hold the resource set $R_i$ exclusively with the interpretation that (i) the job makes a single request to hold all the resources in the resource set $R_i$ and (ii) at all times, when a job of $\tau_i$ holds $R_i$, no other job holds any resource in $R_i$. Each job of task $\tau_i$ may request the resource set $R_i$ at most once during its execution. A job is allowed to migrate when it requests a resource set and when it releases the resource set but a job is not allowed to migrate at other times. Our goal is to design a scheduling algorithm for this problem and prove its performance. We propose an algorithm, LP-EE-vpr, which offers the guarantee that if an implicit-deadline sporadic task set is schedulable on a t-type heterogeneous multiprocessor platform by an optimal scheduling algorithm that allows a job to migrate only when it requests or releases a resource set, then our algorithm also meets the deadlines with the same restriction on job migration, if given processors $4 \times \left(1 + MAXP \times \left\lceil \frac{|P| \times MAXP}{\min\{m_1, m_2, \ldots, m_t\}} \right\rceil\right)$ times as fast. (Here $MAXP$ and $|P|$ are computed based on the resource sets that tasks request.) For the special case that each task requests at most one resource, the bound of LP-EE-vpr collapses to $4 \times \left(1 + \left\lceil \frac{|R|}{\min\{m_1, m_2, \ldots, m_t\}} \right\rceil\right)$. To the best of our knowledge, LP-EE-vpr is the first algorithm with proven performance guarantee for real-time scheduling of sporadic tasks with resource sharing on t-type heterogeneous multiprocessors.
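
As a rough illustration of how the speedup bound in this abstract behaves, the sketch below evaluates both expressions for made-up inputs. The abstract does not say how MAXP and |P| are derived from the resource sets, so they are treated here as plain integer inputs. The function names and the example numbers are hypothetical and do not come from the paper.

```python
def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division for positive integers."""
    return -(-numerator // denominator)


def general_speedup_bound(max_p: int, size_p: int, processors_per_type: list[int]) -> int:
    """Evaluate 4 * (1 + MAXP * ceil(|P| * MAXP / min(m_1..m_t))).

    max_p and size_p stand in for MAXP and |P|; how they are computed
    from the tasks' resource sets is not given in the abstract, so they
    are assumed inputs here.
    """
    m_min = min(processors_per_type)
    return 4 * (1 + max_p * ceil_div(size_p * max_p, m_min))


def single_resource_speedup_bound(num_resources: int, processors_per_type: list[int]) -> int:
    """Evaluate the special case 4 * (1 + ceil(|R| / min(m_1..m_t)))."""
    m_min = min(processors_per_type)
    return 4 * (1 + ceil_div(num_resources, m_min))


if __name__ == "__main__":
    # Hypothetical platform: three processor types with 4, 2 and 8 processors.
    platform = [4, 2, 8]
    print(general_speedup_bound(max_p=2, size_p=3, processors_per_type=platform))
    print(single_resource_speedup_bound(num_resources=3, processors_per_type=platform))
```

With these illustrative inputs the smallest processor count is 2, so the general expression gives 4 × (1 + 2 × ⌈6/2⌉) = 28 and the single-resource expression gives 4 × (1 + ⌈3/2⌉) = 12. Both expressions are driven by the processor type with the fewest processors.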

Contention-Free Execution of Automotive Applications on a Clustered Many-Core Platform

Author: Becker Matthias
Dasari Dakshina
Nelis Vincent
Nikolic Borislav
Nolte Thomas
Åkesson Benny
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2117

28th Euromicro Conference on Real-Time Systems (ECRTS 2016), 5 to 8 July 2016, Toulouse, France.

Next generations of compute-intensive real-time applications in automotive systems will require more powerful computing platforms. One promising power-efficient solution for such applications is to use clustered many-core architectures. However, ensuring that real-time requirements are satisfied in the presence of contention in shared resources, such as memories, remains an open issue. This work presents a novel contention-free execution framework to execute automotive applications on such platforms. Privatization of memory banks together with defined access phases to shared memory resources is the backbone of the framework. An Integer Linear Programming (ILP) formulation is presented to find the optimal time-triggered schedule for the on-core execution as well as for the access to shared memory. Additionally, a heuristic solution is presented that generates the schedule in a fraction of the time required by the ILP. Extensive evaluations show that the proposed heuristic deviates from the optimal solution by only 0.5% while it outperforms a baseline heuristic by 67%. The applicability of the approach to industrially sized problems is demonstrated in a case study of software for Engine Management Systems.

Real-Time Scheduling on Heterogeneous Multiprocessors

Author: Gurulingesh Raravi
Publication date: 01/01/2014

Embedded computing is one of the most important areas in computer science today, witnessed by the fact that 98% of all computers are embedded. Given that many embedded systems have to interact "promptly" with their physical environment, the scientific community has invested significant efforts in developing algorithms for scheduling the workload, which is generally implemented as a set of tasks, at the right time and in proving before run-time that all the timing requirements will be satisfied at run-time. This field of study is referred to as the real-time scheduling theory. The scheduling theory for a unicore processor is well-developed; the scientific results are taught at all major universities world-wide and the results are widely-used in industry. Scheduling theory for multicores is emerging but the focus so far has been on multicores with identical processing units. This is unfortunate because the computer industry is moving towards heterogeneous multicores with a constant number of distinct processor types; AMD Fusion, Intel Atom and NVIDIA Tegra are some of the examples of such multicores. This work deals with the problem of scheduling a set of tasks to meet their deadlines on heterogeneous multiprocessors with a constant number of distinct processor types. On heterogeneou

CiteSeerX

Real-Time Analysis of Priority-Preemptive NoCs with Arbitrary Buffer Sizes and Router Delays

Author: Burns Alan
Ernst Rolf
Nikolic Borislav
Soares Indrusiak Leandro
Tobuschat Sebastian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/01/2019

Currently available multiprocessor platforms predominantly use a network-on-chip (NoC) architecture as an interconnect medium, due to its good scalability and performance. During the last decade, NoCs received a significant amount of attention from the real-time community. One promising category of approaches suggests employing already existing hardware features called virtual channels, and dedicating them exclusively to individual communication traffic flows. In this way, NoCs become more amenable to real-time analysis, which is an essential requirement for providing both safe and tight worst-case analysis methods, and consequently deriving real-time guarantees. In this manuscript, we present an approach that falls into the aforementioned category. Specifically, we propose a novel method for the worst-case analysis of the NoC traffic, assuming the existence of per-flow dedicated virtual channels. Compared to the state-of-the-art techniques, our approach yields substantially tighter upper bounds on the worst-case traversal times (WCTTs) of communication traffic flows. By employing the proposed method, resource over-provisioning can be mitigated to a large extent, and significant design-cost reductions can be achieved. Moreover, we implemented a cycle-accurate simulator of the assumed NoC architecture, and used it to assess the tightness of derived WCTT bounds. Finally, we reached an interesting conclusion that bigger virtual channel buffers do not necessarily lead to better results, and in many cases can be counter-productive, which is a very important finding for system designers.

Fast simulation of networks-on-chip with priority-preemptive arbitration

Author: Dos Santos Osmar Marchi
Harbin James
Indrusiak Leandro Soares
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 28/09/2015

An increasingly time-consuming part of the design flow of on-chip multiprocessors is the simulation of the interconnect architecture. The accurate simulation of state-of-the-art network-on-chip interconnects can take hours, and this process is repeated for each design iteration because it provides valuable insights on communication latencies that can greatly affect the overall performance of the system. In this article, we identify a time-predictable network-on-chip architecture and show that its timing behaviour can be predicted using models which are far less complex than the architecture itself. We then explore such a feature to produce simplified and lightweight simulation models that can produce latency figures with more than 90% accuracy and simulate more than 1,000 times faster when compared to a cycle-accurate model of the same interconnect.