Search CORE

3,292 research outputs found

An Implementation of a Predictable Cache-coherent Multi-core System

Author: Tegegn Paulos
Publication venue: 'University of Waterloo'
Publication date: 13/05/2019
Field of study

Multi-core platforms have entered the realm of the embedded systems to meet the ever growing performance requirements of the real-time embedded applications. Real-time applications leverage the hardware parallelism from multi-cores while keeping the hardware cost minimum. However, when the real-time tasks are deployed on the multi-core platforms, they experience interference due to sharing of hardware resources such as shared bus, last level cache, and main memory. As a result, it complicates computing the worst-case execution time of the real-time tasks. In this thesis, I present a hardware prototype that implements a predictable cache-coherent real-time multi-core system. The designed hardware follows the design guidelines outlined in the predictable cache coherence protocol. The hardware uses a latency insensitive interfaces to integrate the multi-core components such as the processor, cache controller, and interconnecting bus. The prototyped multi-core hardware is synthesized and implemented in a low-cost and high-performing FPGA board. The hardware is validated and verified on a tethered system that enables the design to run multi-threaded pthread applications

University of Waterloo's Institutional Repository

The potential of programmable logic in the middle: cache bleaching

Author: Mancuso Renato
Roozkhosh Shahin
Publication venue
Publication date: 21/04/2020
Field of study

Consolidating hard real-time systems onto modern multi-core Systems-on-Chip (SoC) is an open challenge. The extensive sharing of hardware resources at the memory hierarchy raises important unpredictability concerns. The problem is exacerbated as more computationally demanding workload is expected to be handled with real-time guarantees in next-generation Cyber-Physical Systems (CPS). A large body of works has approached the problem by proposing novel hardware re-designs, and by proposing software-only solutions to mitigate performance interference. Strong from the observation that unpredictability arises from a lack of fine-grained control over the behavior of shared hardware components, we outline a promising new resource management approach. We demonstrate that it is possible to introduce Programmable Logic In-the-Middle (PLIM) between a traditional multi-core processor and main memory. This provides the unique capability of manipulating individual memory transactions. We propose a proof-of-concept system implementation of PLIM modules on a commercial multi-core SoC. The PLIM approach is then leveraged to solve long-standing issues with cache coloring. Thanks to PLIM, colored sparse addresses can be re-compacted in main memory. This is the base principle behind the technique we call Cache Bleaching. We evaluate our design on real applications and propose hypervisor-level adaptations to showcase the potential of the PLIM approach.Accepted manuscrip

Crossref

Boston University Institutional Repository (OpenBU)

Overview of Swallow --- A Scalable 480-core System for Investigating the Performance and Energy Efficiency of Many-core Applications and Operating Systems

Author: Hollis Simon J.
Kerrison Steve
Publication venue
Publication date: 23/04/2015
Field of study

We present Swallow, a scalable many-core architecture, with a current configuration of 480 x 32-bit processors. Swallow is an open-source architecture, designed from the ground up to deliver scalable increases in usable computational power to allow experimentation with many-core applications and the operating systems that support them. Scalability is enabled by the creation of a tile-able system with a low-latency interconnect, featuring an attractive communication-to-computation ratio and the use of a distributed memory configuration. We analyse the energy and computational and communication performances of Swallow. The system provides 240GIPS with each core consuming 71--193mW, dependent on workload. Power consumption per instruction is lower than almost all systems of comparable scale. We also show how the use of a distributed operating system (nOS) allows the easy creation of scalable software to exploit Swallow's potential. Finally, we show two use case studies: modelling neurons and the overlay of shared memory on a distributed memory system.Comment: An open source release of the Swallow system design and code will follow and references to these will be added at a later dat

arXiv.org e-Print Archive

Explore Bristol Research

parMERASA Multi-Core Execution of Parallelised Hard Real-Time Applications Supporting Analysability

Author: Abella Jaume
Bonenfant Armelle
Bradatsch Christian
Broster Ian
Böddeker Bert
Cassé Hugues
Cazorla Francisco
Fernandes Joao
George David
Gerdes Mike
Hugl Andreas
Jahr Ralf
Kehr Sebastian
Kluge Florian
Lay Nick
Mische Jörg
Ozaktas Haluk
Panic Milos
Petrov Zlatko
Pyka Arthur
Quinones Eduardo
Regler Hans
Rochange Christine
Rohde Mathias
Sainrat Pascal
Uhrig Sascha
Ungerer Theo
Zaykov Pavel G.
Publication venue: HAL CCSD
Publication date: 01/01/2013
Field of study

International audienceEngineers who design hard real-time embedded systems express a need for several times the performance available today while keeping safety as major criterion. A breakthrough in performance is expected by parallelizing hard real-time applications and running them on an embedded multi-core processor, which enables combining the requirements for high-performance with timing-predictable execution. parMERASA will provide a timing analyzable system of parallel hard real-time applications running on a scalable multicore processor. parMERASA goes one step beyond mixed criticality demands: It targets future complex control algorithms by parallelizing hard real-time programs to run on predictable multi-/many-core processors. We aim to achieve a breakthrough in techniques for parallelization of industrial hard real-time programs, provide hard real-time support in system software, WCET analysis and verification tools for multi-cores, and techniques for predictable multi-core designs with up to 64 cores

OPUS Augsburg

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL Descartes

Discriminative Coherence: Balancing Performance and Latency Bounds in Data-Sharing Multi-Core Real-Time Systems

Author: Hassan Mohamed
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 32nd Euromicro Conference on Real-Time Systems (ECRTS 2020)
Publication date: 01/01/2020
Field of study

Dagstuhl Research Online Publication Server

A Time-predictable Object Cache

Author: Schoeberl Martin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011
Field of study

Abstract—Static cache analysis for data allocated on the heap is practically impossible for standard data caches. We propose a distinct object cache for heap allocated data. The cache is highly associative to track symbolic object addresses in the static analysis. Cache lines are organized to hold single objects and individual fields are loaded on a miss. This cache organization is statically analyzable and improves the performance. In this paper we present the design and implementation of the object cache in a uniprocessor and chipmultiprocessor version of the Java processor JOP. Keywords-real-time systems; time-predictable computer architecture; worst-case execution time analysis I

CiteSeerX

Crossref

Online Research Database In Technology

Multi-core devices for safety-critical systems: a survey

Author: Abella Ferrer Jaume
Agirre Irune
Ahmadian Hamidreza
Allende Imanol
Cazorla Almeida Francisco Javier
Grüttner Kim
Obermaisser Roman
Perez Cerrolaza Jon
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/09/2020
Field of study

Multi-core devices are envisioned to support the development of next-generation safety-critical systems, enabling the on-chip integration of functions of different criticality. This integration provides multiple system-level potential benefits such as cost, size, power, and weight reduction. However, safety certification becomes a challenge and several fundamental safety technical requirements must be addressed, such as temporal and spatial independence, reliability, and diagnostic coverage. This survey provides a categorization and overview at different device abstraction levels (nanoscale, component, and device) of selected key research contributions that support the compliance with these fundamental safety requirements.This work has been partially supported by the Spanish Ministry of Economy and Competitiveness under grant TIN2015-65316-P, Basque Government under grant KK-2019-00035 and the HiPEAC Network of Excellence. The Spanish Ministry of Economy and Competitiveness has also partially supported Jaume Abella under Ramon y Cajal postdoctoral fellowship (RYC-2013-14717).Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

A survey of techniques for reducing interference in real-time applications on multicore platforms

Author: Carretero Pérez Jesús
Fernández Muñoz Javier
Lozano Santiago
Lugo Tamara
Publication venue: IEEE
Publication date: 15/02/2022
Field of study

This survey reviews the scientific literature on techniques for reducing interference in real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It also presents proposals that use interference reduction techniques without considering the predictability issue. The survey highlights interference sources and categorizes proposals from the perspective of the shared resource. It covers techniques for reducing contentions in main memory, cache memory, a memory bus, and the integration of interference effects into schedulability analysis. Every section contains an overview of each proposal and an assessment of its advantages and disadvantages.This work was supported in part by the Comunidad de Madrid Government "Nuevas Técnicas de Desarrollo de Software de Tiempo Real Embarcado Para Plataformas. MPSoC de Próxima Generación" under Grant IND2019/TIC-17261

Universidad Carlos III de Madrid e-Archivo

PASoC: A Predictable Accelerator Rich SoC for Safety-Critical Systems

Author: Tadepalli Susmita
Publication venue: 'University of Waterloo'
Publication date: 19/09/2023
Field of study

This thesis presents a model of a Predictable Accelerator-rich System-on-Chip (PASoC) for safety-critical systems, which guarantees timing predictability of a memory access in the system. Earlier adoption of accelerator-rich SoCs was for general-purpose comput ing and thus timing predictability of such systems was not well explored, despite being used in safety-critical systems. This thesis takes initial steps in exploring the predictabil ity of ASoCs by combining CPU clusters with one or more hardware accelerators. The PASoC allows the integration of multiple coherent agents to interact with each other over a shared memory bus and a shared LLC. These agents can be a cluster of cache-coherent homogeneous cores, and fully or one-way coherent hardware accelerators. PASoC ensures the predictability of a memory request through some modifications in hardware architecture and cache coherence protocols. PASoC supports predictable cache coherence within the cluster of cores and across agents. The former uses linear cache coherence, and the latter uses a modified version of predictable Modified Shared Invalid (MSI) cache coherence pro tocol. PASoC analyzes the per-request worst-case latency of a memory request from any of the agents and evaluates the design on the gem5 simulator. Finally, this work presents some observations based on the analysis that can help in future designs of PASoCs

University of Waterloo's Institutional Repository

Contention in multicore hardware shared resources: Understanding of the state of the art

Author: Abella Ferrer Jaume
Cazorla Almeida Francisco Javier
Fernández Gabriel
Quiñones Eduardo
Rochange Christine
Vardanega Tullio
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 01/01/2014
Field of study

The real-time systems community has over the years devoted considerable attention to the impact on execution timing that arises from contention on access to hardware shared resources. The relevance of this problem has been accentuated with the arrival of multicore processors. From the state of the art on the subject, there appears to be considerable diversity in the understanding of the problem and in the “approach” to solve it. This sparseness makes it difficult for any reader to form a coherent picture of the problem and solution space. This paper draws a tentative taxonomy in which each known approach to the problem can be categorised based on its specific goals and assumptions.Postprint (published version

UPCommons. Portal del coneixement obert de la UPC

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Dagstuhl Research Online Publication Server