Search CORE

2 research outputs found

The Slowdown or Race-to-idle Question: Workload-Aware Energy Optimization of SMT Multicore Platforms under Process Variation

Author: Al-Hashimi Bashir M.
Das Anup
Merrett Geoff V.
Publication venue
Publication date
Field of study

Two widely used approaches for reducing energy consumption in multithreaded workloads are slowdown (using DVFS) and race-to-idle. In this paper, we first demonstrate that most energy-efficient choice is dependent on (1) workload (memory bound, CPU bound etc.), (2) process variation and (3) support for Simultaneous Multithreading (SMT). We then propose an approach for mapping application threads on SMT multicore systems at run-time, to minimize energy consumption. The proposed approach interfaces with the OS and hardware performance counters to characterize application threads. This characterization captures the effect of process variation on execution time and identifies the break-even operating point, where one strategy (slowdown or race-to-idle) outperforms the other. Thread mapping is performed using these characterized data by iteratively collapsing application threads (SMT) followed by binary programming-based thread mapping. Finally, performance slack is exploited at run-time to select between slowdown and race-to-idle, based upon the break-even operating point calculated for each individual thread. This end-to-end approach is implemented as a run-time manager for the Linux operating system and is validated across a range of high performance applications. Results demonstrate up to 13% energy reduction over all state-of-the-art approaches, with an average of 18% improvement over Linux

Southampton (e-Prints Soton)

Simultaneous multithreading support in embedded distributed memory MPSoCs

Author: Adeniyi-Jones Chris
Busseuil Rémi
Garibotti Rafael
Kourouma Mamady
Ost Luciano
Robert Michel
Sassatelli Gilles
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013
Field of study

International audienceScalability and programmability are important issues in large homogeneous MPSoCs. Such architectures often rely on explicit message-passing among processors, each of which possessing a local private memory. This paper presents a low-overhead hardware/software distributed shared memory approach that makes such architectures multithreading-capable. The proposed solution is implemented into an open-source message-passing MPSoC through developing a POSIX-like thread API, which shows excellent scalability using application kernels used for benchmarking in shared-memory systems. This approach efficiently draws strengths from the on-chip distributed private memory that opens the way to exposing the multithreading programmability/capabilities of that component as a general-purpose accelerator

Crossref

HAL Descartes

Hal-Diderot