Search CORE

78 research outputs found

Computing Safe Contention Bounds for Multicore Resources with Round-Robin and FIFO Arbitration

Author: Abella Ferrer Jaume
Cazorla Francisco J.
Fernandez Gabriel
Jalle Javier
Quiñones Eduardo
Vardanega Tullio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/10/2016
Field of study

Numerous researchers have studied the contention that arises among tasks running in parallel on a multicore processor. Most of those studies seek to derive a tight and sound upper-bound for the worst-case delay with which a processor resource may serve an incoming request, when its access is arbitrated using time-predictable policies such as round-robin or FIFO. We call this value upper-bound delay ( ubd ). Deriving trustworthy ubd statically is possible when sufficient public information exists on the timing latency incurred on access to the resource of interest. Unfortunately however, that is rarely granted for commercial-of-the-shelf (COTS) processors. Therefore, the users resort to measurement observations on the target processor and thus compute a “measured” ubdm . However, using ubdm to compute worst-case execution time values for programs running on COTS multicore processors requires qualification on the soundness of the result. In this paper, we present a measurement-based methodology to derive a ubdm under round-robin (RoRo) and first-in-first-out (FIFO) arbitration, which accurately approximates ubd from above, without needing latency information from the hardware provider. Experimental results, obtained on multiple processor configurations, demonstrate the robustness of the proposed methodology.The research leading to this work has received funding from: the European Union’s Horizon 2020 research and innovation programme under grant agreement No 644080(SAFURE); the European Space Agency under Contract 789.2013 and NPI Contract 40001102880; and COST Action IC1202, Timing Analysis On Code-Level (TACLe). This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2015-65316-P. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717. The authors would like to thanks Paul Caheny for his help with the proofreading of this document.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

ZENODO

Recommended from our members

A SIMD architecture for hard real-time systems

Author: Spliet Roy
Publication venue: University of Cambridge
Publication date: 31/03/2020
Field of study

Emerging safety-critical systems require high-performance data-parallel architectures and, problematically, ones that can guarantee tight and safe worst-case execution times. Given the complexity of existing architectures like GPUs, it is unlikely that sufficiently accurate models and algorithms for timing analysis will emerge in the foreseeable future. This motivates a clean-slate approach to designing a real-time data-parallel architecture. In this work I present Sim-D: a wide-SIMD architecture for hard real-time systems. Similar to GPUs, Sim-D performs hardware strip-mining to schedule the work for a compute kernel in entities called work-groups. Sim-D schedules the work for each work-group as a sequence of uninterruptible access- and execute program phases, interleaving the phases of two work-groups. By providing performance isolation between the memory- and compute resources, the execution time of each phase can be tightly bound through static analysis. I present a predictable closed-page DRAM controller that processes requests for large 1D- and 2D blocks of data, as well as indirect indexed transfers. These large transfers coalesce the data requests of a whole work-group. For a linear 4KiB transfer over a 64-bit data bus, the utilisation provably exceeds 78% for DDR4-3200AA DRAM. For 2D blocks, a well-chosen tiling configuration can achieve near-similar efficiency. I show that bounds on the execution time of indexed transfers are pessimistic by nature, but propose a novel snoopy indexed transfer mechanism that permits more reasonable bounds when the buffer size is limited. Finally, I present a worst-case execution time calculation algorithm for Sim-D. This algorithm is paired with two hardware work-group scheduling policies that deterministically reduce run-time variance. The worst-case execution time analysis algorithm combines static control flow analysis with a simulation-based cost model for execution and DRAM transfers. Its key novelty is the addition of a stage that considers work-group scheduling effects. I show that the work-group scheduling policies degrade performance on average by 8.9%, but permit the calculation of worst-case execution time bounds that are tight within 14.3% on average for benchmarks that avoid inefficient indexed transfers

Apollo (Cambridge)

Simultaneous Multithreading and Hard Real Time: Can It Be Safe?

Author: Anderson James H.
Osborne Sims Hill
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 32nd Euromicro Conference on Real-Time Systems (ECRTS 2020)
Publication date: 01/01/2020
Field of study

The applicability of Simultaneous Multithreading (SMT) to real-time systems has been hampered by the difficulty of obtaining reliable execution costs in an SMT-enabled system. This problem is addressed by introducing a scheduling framework, called CERT-MT, that combines scheduling-aware timing analysis with a cyclic-executive scheduler in a way that minimizes SMT-related timing variations. The proposed scheduling-aware timing analysis is based on maximum observed execution times and accounts for the uncertainty inherent in measurement-based timing analysis. The timing analysis is found to work for tasks with and without SMT, though some adjustments are required in the former case. A large-scale schedulability study is presented that shows CERT-MT can schedule systems with total utilizations approaching 1.4 times the core count, without sacrificing safety

Dagstuhl Research Online Publication Server

Proceedings Work-In-Progress Session of the 13th Real-Time and Embedded Technology and Applications Symposium

Author: Lu Chenyang
Publication venue: Washington University Open Scholarship
Publication date: 03/04/2007
Field of study

The Work-In-Progress session of the 13th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS\u2707) presents papers describing contributions both to state of the art and state of the practice in the broad field of real-time and embedded systems. The 17 accepted papers were selected from 19 submissions. This proceedings is also available as Washington University in St. Louis Technical Report WUCSE-2007-17, at http://www.cse.seas.wustl.edu/Research/FileDownload.asp?733. Special thanks go to the General Chairs – Steve Goddard and Steve Liu and Program Chairs - Scott Brandt and Frank Mueller for their support and guidance

Washington University St. Louis: Open Scholarship

Semantics-based Program Verification: an Abstract Interpretation Approach

Author: Vítor Gabriel dos Reis Machado Rodrigues
Publication venue
Publication date: 15/05/2013
Field of study

Repositório Aberto da Universidade do Porto

Analyzable dataflow executions with adaptive redundancy

Author: Kühbacher Christoph
Publication venue
Publication date: 04/11/2022
Field of study

Increasing performance requirements in the embedded systems domain have encouraged a drift from singlecore to multicore processors, and thus multicore processors are widely used in embedded systems today. Cars are an example for complex embedded systems in which the use of multicore processors is continuously increasing. A major reason for this is to consolidate different software components on one chip and thus reduce the number of electronic control units. However, the de facto standard in the automotive industry, AUTOSAR (AUTomotive Open System ARchitecture), was originally designed for singlecore processors. Although basic support for multicore processors was added, more complex architectures are currently not compatible with the software stack. Regarding the software components running on the ECUS of modern cars, requirements are diverse. On the one hand, there are safety-critical tasks, like the airbag control, anti-lock braking system, electronic stability control and emergency brake assist, and on the other hand, tasks which do not have any safety-related requirements at all, for example tasks controlling the infotainment system. Trends like autonomous driving lead to even more demanding tasks in the system since such tasks are both safety-critical and data-intensive. As embedded applications, like those in the automotive domain, become more complex, new approaches are necessary. Data-intensive tasks are usually tackled with large-scale computing frameworks. In this thesis, some major concepts of such frameworks are transferred to the high-performance embedded systems domain. For this purpose, the thesis describes a runtime environment (RTE) that is suitable for different kinds of multi- and manycore hardware architectures. The RTE follows a dataflow execution model based on directed acyclic graphs (DAGs). Graphs are divided into sections which are scheduled separately. For each section, the RTE uses a DAG scheduling heuristic to compute multiple schedules covering different redundancy configurations. This allows the RTE to dynamically change the redundancy of parts of the graph at runtime despite the use of fixed schedules. Alternatively, the RTE also provides an online scheduler. To specify suitable graphs, the RTE also provides a programming model which shares similarities with common large-scale computing frameworks, for example Apache Spark. Using this programming model, three common distributed algorithms, namely Cannon's algorithm, the Cooley-Tukey algorithm and bitonic sort, were implemented. With these three programs, the performance of the RTE was evaluated for a variety of configurations on two different hardware architectures. The results show that the proposed RTE is able to reach the performance of established parallel computation frameworks and that for suitable graphs with reasonable sectionings the negative influence on the runtime is either small or non-existent.Aufgrund steigender Anforderungen an die Leistungsfähigkeit von eingebetteten Systemen finden Mehrkernprozessoren mittlerweile auch in eingebetteten Systemen Verwendung. Autos sind ein Beispiel für eingebettete Systeme, in denen die Verbreitung von Mehrkernprozessoren kontinuierlich zunimmt. Ein Hauptgrund ist, dass es dadurch möglich wird, mehrere Applikationen, für die ursprünglich mehrere Electronic Control Units (ECUs) notwendig waren, auf ein und demselben Chip auszuführen und dadurch die Anzahl der ECUs im Gesamtsystem zu verringern. Der De-facto-Standard AUTOSAR (AUTomotive Open System ARchitecture) wurde jedoch ursprünglich nur im Hinblick auf Einkernprozessoren entworfen und, obwohl der Softwarestack um grundlegende Unterstützung für Mehrkernprozessoren erweitert wurde, sind komplexere Architekturen nicht damit kompatibel. Die Anforderungen der Softwarekomponenten von modernen Autos sind vielfältig. Einerseits gibt es hochgradig sicherheitskritische Tasks, die beispielsweise die Airbags, das Antiblockiersystem, die Fahrdynamikregelung oder den Notbremsassistenten steuern und andererseits Tasks, die keinerlei sicherheitskritische Anforderungen aufweisen, wie zum Beispiel Tasks zur Steuerung des Infotainment-Systems. Neue Trends wie autonomes Fahren führen zu weiteren anspruchsvollen Tasks, die sowohl hohe Leistungs- als auch Sicherheitsanforderungen aufweisen. Da die Komplexität eingebetteter Anwendungen, beispielsweise im Automobilbereich, stetig zunimmt, sind neue Ansätze erforderlich. Für komplexe, datenintensive Aufgaben werden in der Regel Cluster-Computing-Frameworks eingesetzt. In dieser Arbeit werden Konzepte solcher Frameworks auf den Bereich der eingebetteten Systeme übertragen. Dazu beschreibt die Arbeit eine Laufzeitumgebung (RTE) für eingebettete Mehrkernarchitekturen. Die RTE folgt einem Datenfluss-Ausführungsmodell, das auf gerichteten azyklischen Graphen basiert. Graphen können in Abschnitte eingeteilt werden, für welche separat mehrere unterschiedlich redundante Schedules mit Hilfe einer Scheduling-Heuristik berechnet werden. Dieser Ansatz erlaubt es, die Redundanz von Teilen der Anwendung zur Laufzeit zu verändern. Alternativ unterstützt die RTE auch Scheduling zur Laufzeit. Zur Erzeugung von Graphen stellt die RTE ein Programmiermodell bereit, welches sich an etablierten Frameworks, insbesondere Apache Spark, orientiert. Damit wurden drei Beispielanwendungen implementiert, die auf gängigen Algorithmen basieren. Konkret handelt es sich um Cannon's Algorithmus, den Cooley-Tukey-Algorithmus und bitonisches Sortieren. Um die Leistungsfähigkeit der RTE zu ermitteln, wurden diese drei Anwendungen mehrfach mit verschiedenen Konfigurationen auf zwei Hardware-Architekturen ausgeführt. Die Ergebnisse zeigen, dass die RTE in ihrer Leistungsfähigkeit mit etablierten Systemen vergleichbar ist und die Laufzeit bei einer sinnvollen Graphaufteilung im besten Fall nur geringfügig beeinflusst wird

OPUS Augsburg

Integrated hardware garbage collection for real-time embedded systems

Author: Amaya Garcia Andres
Publication venue
Publication date: 28/09/2021
Field of study

Explore Bristol Research

Power management techniques for conserving energy in multiple system components

Author: Abou Gazala Neven M.
Publication venue
Publication date: 10/06/2008
Field of study

Energy consumption is a limiting constraint for both embedded and high performance systems. CPU-core, caches and memory contribute a large fraction of energy consumption in most computing systems. As a result, reducing the energy consumption in these components can significantly reduce the system's overall energy consumption. However, applying multiple independent power management policies in the system (one for each component) may interfere with each other and in some occasions increase the combined energy consumption.In this dissertation, I present three power management techniques that target more than a single component in a system. The focus is on reducing the total energy consumption in processors, caches and memory combinations. First, I present a memory-aware processor power management using collaboration between the OS and the compiler. The technique objectives are: (1) finish the application execution before its deadline and (2) minimize the combined energy consumption in processor and memory. Second, I present an Integrated Dynamic Voltage Scaling (IDVS) techniques for processor and on-chip cache power management in multi-voltage domains systems. IDVS co-ordinates power management decisions across voltage domains rather that being applied in isolation in each domain. Third, I present a Power-Aware Cached DRAM (PA-CDRAM) memory organization for reducing the energy consumption in DRAM memory and off-chip caches. PA-CDRAM exploits the high internal memory bandwidth by bringing the off-chip caches "closer" to the memory, which also improves the overall performance.The techniques in my thesis highlight the importance of designing power management schemes that consider multiple components and their interactions (in terms of power and performance) in the system rather than applying multiple isolated power management polices. This study should lay the foundation for further research in the domain of integrated power management, where a single power manager controls many system components

D-Scholarship@Pitt

Proceedings of Junior Researcher Workshop on Real-Time Computing

Author: Cucu Liliana
Publication venue: Impressions et Reliures, Institut National Polytechnique de Lorraine
Publication date: 01/01/2007
Field of study

It is our great pleasure to welcome you to Junior Researcher Workshop on Real-Time Computing 2007, which is held conjointly with the 15th conference on Real-Time and Network Systems (RTNS'07). The first successful edition was held conjointly with the French Summer School on Real-Time Systems 2005 (http://etr05.loria.fr). Its main purpose is to bring together junior researchers (Ph.D. students, postdoc, ...) working on real-time systems. This workshop is a good opportunity to present our works and share ideas with other junior researchers and not only, since we will present our work to the audience of the main conference. In response to the call for papers, 14 papers were submitted and the international Program Committee provided detailed comments to improve these work-in-progress papers. We hope that our remarks will help the authors to submit improved long versions of theirs papers to the next edition of RTNS. JRWRTC'07 would not be possible without the generous contribution of many volunteers and institutions which supported RTNS'07. First, we would like to express our sincere gratitude to our sponsors for their financial support : Conseil Général de Meuthe et Moselle, Conseil Régional de Lorraine, Communauté Urbaine du Grand Nancy, Université Henri Poincaré, Institut National Polytechnique de Lorraine and LORIA and INRIA Lorraine. We are thankful to Pascal Mary for authorizing us to use his nice picture of “place Stanislas” for the proceedings and web site (many others are available at www.laplusbelleplacedumonde.com). Finally, we are most grateful to the local organizing committee that helped to organize the conference

INRIA a CCSD electronic archive server

Un meta-modèle de composants pour la réalisation d'applications temps-réel flexibles et modulaires

Author: Rodrigues Americo Joao Claudio
Publication venue: HAL CCSD
Publication date: 04/11/2013
Field of study

The increase of software complexity along the years has led researchers in the software engineering field to look for approaches for conceiving and designing new systems. For instance, the service-oriented architectures approach is considered nowadays as the most advanced way to develop and integrate fastly modular and flexible applications. One of the software engineering solutions principles is re-usability, and consequently generality, which complicates its appilication in systems where optimizations are often used, like real-time systems. Thus, create real-time systems is expensive, because they must be conceived from scratch. In addition, most real-time systems do not beneficiate of the advantages which comes with software engineering approches, such as modularity and flexibility. This thesis aim to take real time aspects into account on popular and standard SOA solutions, in order to ease the design and development of modular and flexible applications. This will be done by means of a component-based real-time application model, which allows the dynamic reconfiguration of the application architecture. The component model will be an extension to the SCA standard, which integrates quality of service attributs onto the service consumer and provider in order to stablish a real-time specific service level agreement. This model will be executed on the top of a OSGi service platform, the standard de facto for development of modular applications in Java.La croissante complexité du logiciel a mené les chercheurs en génie logiciel à chercher des approcher pour concevoir et projéter des nouveaux systèmes. Par exemple, l'approche des architectures orientées services (SOA) est considérée actuellement comme le moyen le plus avancé pour réaliser et intégrer rapidement des applications modulaires et flexibles. Une des principales préocuppations des solutions en génie logiciel et la réutilisation, et par conséquent, la généralité de la solution, ce qui peut empêcher son application dans des systèmes où des optimisation sont souvent utilisées, tels que les systèmes temps réels. Ainsi, créer un système temps réel est devenu très couteux. De plus, la plupart des systèmes temps réel ne beneficient pas des facilités apportées par le genie logiciel, tels que la modularité et la flexibilité. Le but de cette thèse c'est de prendre en compte ces aspects temps réel dans des solutions populaires et standards SOA pour faciliter la conception et le développement d'applications temps réel flexibles et modulaires. Cela sera fait à l'aide d'un modèle d'applications temps réel orienté composant autorisant des modifications dynamiques dans l'architecture de l'application. Le modèle de composant sera une extension au standard SCA qui intègre des attributs de qualité de service sur le consomateur et le fournisseur de services pour l'établissement d'un accord de niveau de service spécifique au temps réel. Ce modèle sera executé sur une plateforme de services OSGi, le standard de facto pour le developpement d'applications modulaires en Java

Thèses en Ligne

Hal - Université Grenoble Alpes