Search CORE

24,980 research outputs found

parMERASA Multi-Core Execution of Parallelised Hard Real-Time Applications Supporting Analysability

Author: Abella Jaume
Bonenfant Armelle
Bradatsch Christian
Broster Ian
Böddeker Bert
Cassé Hugues
Cazorla Francisco
Fernandes Joao
George David
Gerdes Mike
Hugl Andreas
Jahr Ralf
Kehr Sebastian
Kluge Florian
Lay Nick
Mische Jörg
Ozaktas Haluk
Panic Milos
Petrov Zlatko
Pyka Arthur
Quinones Eduardo
Regler Hans
Rochange Christine
Rohde Mathias
Sainrat Pascal
Uhrig Sascha
Ungerer Theo
Zaykov Pavel G.
Publication venue: HAL CCSD
Publication date: 01/01/2013
Field of study

International audienceEngineers who design hard real-time embedded systems express a need for several times the performance available today while keeping safety as major criterion. A breakthrough in performance is expected by parallelizing hard real-time applications and running them on an embedded multi-core processor, which enables combining the requirements for high-performance with timing-predictable execution. parMERASA will provide a timing analyzable system of parallel hard real-time applications running on a scalable multicore processor. parMERASA goes one step beyond mixed criticality demands: It targets future complex control algorithms by parallelizing hard real-time programs to run on predictable multi-/many-core processors. We aim to achieve a breakthrough in techniques for parallelization of industrial hard real-time programs, provide hard real-time support in system software, WCET analysis and verification tools for multi-cores, and techniques for predictable multi-core designs with up to 64 cores

OPUS Augsburg

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL Descartes

parMERASA – multicore execution of parallelised hard real-time applications supporting analysability

Author: A Bonenfant
A Hugl
A Pyka
B Böddeker
C Bradatsch
C Rochange
D George
E Quiñones
F Cazorla
F Kluge
H Cassé
H Ozaktas
H Regler
I Broster
J Abella
J Fernandes
J Mische
M Gerdes
M Panic
M Rohde
N Lay
P G Zaykov
P Sainrat
R Jahr
S Kehr
S Uhrig
T Ungerer
Z Petrov
Publication venue
Publication date: 11/04/2020
Field of study

Abstract-Engineers who design hard real-time embedded systems express a need for several times the performance available today while keeping safety as major criterion. A breakthrough in performance is expected by parallelizing hard real-time applications and running them on an embedded multi-core processor, which enables combining the requirements for high-performance with timing-predictable execution. parMERASA will provide a timing analyzable system of parallel hard real-time applications running on a scalable multicore processor. parMERASA goes one step beyond mixed criticality demands: It targets future complex control algorithms by parallelizing hard real-time programs to run on predictable multi-/many-core processors. We aim to achieve a breakthrough in techniques for parallelization of industrial hard real-time programs, provide hard real-time support in system software, WCET analysis and verification tools for multi-cores, and techniques for predictable multi-core designs with up to 64 cores

CiteSeerX

Overview of Swallow --- A Scalable 480-core System for Investigating the Performance and Energy Efficiency of Many-core Applications and Operating Systems

Author: Hollis Simon J.
Kerrison Steve
Publication venue
Publication date: 23/04/2015
Field of study

We present Swallow, a scalable many-core architecture, with a current configuration of 480 x 32-bit processors. Swallow is an open-source architecture, designed from the ground up to deliver scalable increases in usable computational power to allow experimentation with many-core applications and the operating systems that support them. Scalability is enabled by the creation of a tile-able system with a low-latency interconnect, featuring an attractive communication-to-computation ratio and the use of a distributed memory configuration. We analyse the energy and computational and communication performances of Swallow. The system provides 240GIPS with each core consuming 71--193mW, dependent on workload. Power consumption per instruction is lower than almost all systems of comparable scale. We also show how the use of a distributed operating system (nOS) allows the easy creation of scalable software to exploit Swallow's potential. Finally, we show two use case studies: modelling neurons and the overlay of shared memory on a distributed memory system.Comment: An open source release of the Swallow system design and code will follow and references to these will be added at a later dat

arXiv.org e-Print Archive

Explore Bristol Research

Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device

Author: Buckley Kevan
Sillitoe Ian
Yang Shufan
Yu Zheqi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Currently, most designers face a daunting task to research different design flows and learn the intricacies of specific software from various manufacturers in hardware/software co-design. An urgent need of creating a scalable hardware/software co-design platform has become a key strategic element for developing hardware/software integrated systems. In this paper, we propose a new design flow for building a scalable co-design platform on FPGA-based system-on-chip. We employ an integrated approach to implement a histogram oriented gradients (HOG) and a support vector machine (SVM) classification on a programmable device for pedestrian tracking. Not only was hardware resource analysis reported, but the precision and success rates of pedestrian tracking on nine open access image data sets are also analysed. Finally, our proposed design flow can be used for any real-time image processingrelated products on programmable ZYNQ-based embedded systems, which benefits from a reduced design time and provide a scalable solution for embedded image processing products

Enlighten

DyPS: Dynamic Processor Switching for Energy-Aware Video Decoding on Multi-core SoCs

Author: Benazzouz Djamel
Benmoussa Yahia
Boukhobza Jalil
Hadjadj-Aoul Yassine
Senn Eric
Publication venue
Publication date: 26/08/2013
Field of study

In addition to General Purpose Processors (GPP), Multicore SoCs equipping modern mobile devices contain specialized Digital Signal Processor designed with the aim to provide better performance and low energy consumption properties. However, the experimental measurements we have achieved revealed that system overhead, in case of DSP video decoding, causes drastic performances drop and energy efficiency as compared to the GPP decoding. This paper describes DyPS, a new approach for energy-aware processor switching (GPP or DSP) according to the video quality . We show the pertinence of our solution in the context of adaptive video decoding and describe an implementation on an embedded Linux operating system with the help of the GStreamer framework. A simple case study showed that DyPS achieves 30% energy saving while sustaining the decoding performanc

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Université de Bretagne Occidentale

HAL-Rennes 1

A case study for NoC based homogeneous MPSoC architectures

Author: Casu Mario Roberto
Macchiarulo Luca
Ruo Roch Massimo
Tota Sergio Vincenzo
Zamboni Maurizio
Publication venue: IEEE
Publication date: 01/01/2009
Field of study

The many-core design paradigm requires flexible and modular hardware and software components to provide the required scalability to next-generation on-chip multiprocessor architectures. A multidisciplinary approach is necessary to consider all the interactions between the different components of the design. In this paper, a complete design methodology that tackles at once the aspects of system level modeling, hardware architecture, and programming model has been successfully used for the implementation of a multiprocessor network-on-chip (NoC)-based system, the NoCRay graphic accelerator. The design, based on 16 processors, after prototyping with field-programmable gate array (FPGA), has been laid out in 90-nm technology. Post-layout results show very low power, area, as well as 500 MHz of clock frequency. Results show that an array of small and simple processors outperform a single high-end general purpose processo

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino