When parallel speedups hit the memory wall

Authors: Eder, Kerstin; Furtunato, Alex F. A.; Georgiou, Kyriakos; Xavier-de-Souza, Samuel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 03/05/2019

After Amdahl's trailblazing work, many other authors proposed analytical speedup models but none have considered the limiting effect of the memory wall. These models exploited aspects such as problem-size variation, memory size, communication overhead, and synchronization overhead, but data-access delays are assumed to be constant. Nevertheless, such delays can vary, for example, according to the number of cores used and the ratio between processor and memory frequencies. Given the large number of possible configurations of operating frequency and number of cores that current architectures can offer, suitable speedup models to describe such variations among these configurations are quite desirable for off-line or on-line scheduling decisions. This work proposes new parallel speedup models that account for variations of the average data-access delay to describe the limiting effect of the memory wall on parallel speedups. Analytical results indicate that the proposed modeling can capture the desired behavior while experimental hardware results validate the former. Additionally, we show that when accounting for parameters that reflect the intrinsic characteristics of the applications, such as degree of parallelism and susceptibility to the memory wall, our proposal has significant advantages over machine-learning-based modeling. Moreover, besides being black-box modeling, our experiments show that conventional machine-learning modeling needs about one order of magnitude more measurements to reach the same level of accuracy achieved in our modeling.

Comment: 24 pages
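
For context, the baseline that the abstract starts from is Amdahl's law. The block below states it in its common textbook form. The symbols f (parallelizable fraction of the work) and p (number of cores) are chosen here for illustration and are not taken from the paper. The paper's own memory-wall-aware models are not given in the abstract and are not reproduced here.

```latex
% Amdahl's law: speedup on p cores when a fraction f of the work is parallelizable.
% Data-access delay is implicitly treated as constant, which is the assumption
% the abstract says the new models relax.
S(p) = \frac{1}{(1 - f) + \frac{f}{p}},
\qquad
\lim_{p \to \infty} S(p) = \frac{1}{1 - f}

% Worked example with illustrative values f = 0.9, p = 8:
S(8) = \frac{1}{0.1 + 0.9/8} = \frac{1}{0.2125} \approx 4.71,
\qquad
\lim_{p \to \infty} S(p) = 10
```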

arXiv.org e-Print Archive

Explore Bristol Research

Performance and Energy Trade-Offs for Parallel Applications on Heterogeneous Multi-Processing Systems

Authors: Coutinho, Demetrios A. M.; Eder, Kerstin; Georgiou, Kyriakos; Lorenzon, Arthur Francisco; Nunez-Yanez, Jose; De Sensi, Daniele; Xavier-de-Souza, Samuel
Publication venue: MDPI AG
Publication date: 11/05/2020

This work proposes a methodology to find performance and energy trade-offs for parallel applications running on Heterogeneous Multi-Processing systems with a single instruction-set architecture. These offer flexibility in the form of different core types and voltage and frequency pairings, defining a vast design space to explore. Therefore, for a given application, choosing a configuration that optimizes the performance and energy consumption is not straightforward. Our method proposes novel analytical models for performance and power consumption whose parameters can be fitted using only a few strategically sampled offline measurements. These models are then used to estimate an application’s performance and energy consumption for the whole configuration space. In turn, these offline predictions define the choice of estimated Pareto-optimal configurations of the model, which are used to inform the selection of the configuration that the application should be executed on. The methodology was validated on an ODROID-XU3 board for eight programs from the PARSEC Benchmark, Phoronix Test Suite and Rodinia applications. The generated Pareto-optimal configuration space represented a 99% reduction of the universe of all available configurations. Energy savings of up to 59.77%, 61.38% and 17.7% were observed when compared to the performance, ondemand and powersave Linux governors, respectively, with higher or similar performance.
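
The Pareto-optimal selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each configuration already has a predicted execution time and energy, and it keeps only the configurations that no other configuration beats on both. The function name `pareto_front`, the configuration labels, and all numbers are hypothetical. They are not the paper's models, code, or measurements.

```python
from typing import List, Tuple

# (label, predicted time in seconds, predicted energy in joules); lower is better
Config = Tuple[str, float, float]


def pareto_front(configs: List[Config]) -> List[Config]:
    """Return the configurations that are not dominated in (time, energy)."""
    front = []
    for name, t, e in configs:
        # A configuration is dominated if another one is no worse on both
        # objectives and strictly better on at least one of them.
        dominated = any(
            (t2 <= t and e2 <= e) and (t2 < t or e2 < e)
            for _, t2, e2 in configs
        )
        if not dominated:
            front.append((name, t, e))
    # Sort by time so the trade-off curve reads from fastest to slowest.
    return sorted(front, key=lambda c: c[1])


# Hypothetical predictions for illustration only (not data from the paper).
predicted = [
    ("4xbig@2.0GHz", 10.0, 50.0),
    ("4xbig@1.4GHz", 14.0, 38.0),
    ("4xlittle@1.4GHz", 30.0, 25.0),
    ("2xbig@2.0GHz", 18.0, 45.0),     # dominated by 4xbig@1.4GHz
    ("1xlittle@1.0GHz", 80.0, 40.0),  # dominated by 4xlittle@1.4GHz
]

front = pareto_front(predicted)
for name, t, e in front:
    print(f"{name}: {t:.1f} s, {e:.1f} J")
```

The pairwise check is quadratic in the number of configurations. That is adequate for a sketch, though a sort-based sweep would scale better for very large configuration spaces.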

Multidisciplinary Digital Publishing Institute

Explore Bristol Research