POSTER: Exploiting asymmetric multi-core processors with flexible system software

Author: Ayguadé Parra Eduard
Badia Sala Rosa Maria
Casas Guix Marc
Chronaki Kallia
Labarta Mancho Jesús José
Moreto Planas Miquel
Rico Alejandro
Valero Cortés Mateo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

Energy efficiency has become the main challenge for high performance computing (HPC). The use of mobile asymmetric multi-core architectures to build future multi-core systems is an approach towards energy savings while keeping high performance. However, it is not known yet whether such systems are ready to handle parallel applications. This paper fills this gap by evaluating emerging parallel applications on an asymmetric multi-core. We make use of the PARSEC benchmark suite and a processor that implements the ARM big.LITTLE architecture. We conclude that these applications are not mature enough to run on such systems, as they suffer from load imbalance. Furthermore, we explore the behaviour of dynamic scheduling solutions at either the Operating System (OS) or the runtime level. Comparing these approaches shows us that the most efficient scheduling takes place at the runtime level, influencing future research towards such solutions.

This work has been supported by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), by Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), by the RoMoL ERC Advanced Grant (GA 321253) and the European HiPEAC Network of Excellence. The Mont-Blanc project receives funding from the EU's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 610402 and from the EU's H2020 Framework Programme (H2020/2014-2020) under grant agreement number 671697. M. Moretó has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship number JCI-2012-15047. M. Casas is supported by the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia and the Cofund programme of the Marie Curie Actions of the 7th R&D Framework Programme of the European Union (Contract 2013 BP B 00243).

Peer Reviewed. Postprint (author's final draft)
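The load imbalance mentioned in this abstract can be illustrated with a small model. The following is a minimal sketch, not the authors' experimental setup: it assumes a hypothetical chip with four big cores that run twice as fast as four little cores, and 96 equal-cost tasks. The function names and all numbers are illustrative assumptions.

```python
import heapq

def static_makespan(task_costs, core_speeds):
    """Split tasks round-robin across cores, ignoring core speed."""
    finish = [0.0] * len(core_speeds)
    for i, cost in enumerate(task_costs):
        core = i % len(core_speeds)
        finish[core] += cost / core_speeds[core]
    return max(finish)

def dynamic_makespan(task_costs, core_speeds):
    """Each core pulls the next task from a shared queue once it is idle."""
    # Min-heap of (time at which the core becomes idle, core index).
    idle = [(0.0, c) for c in range(len(core_speeds))]
    heapq.heapify(idle)
    makespan = 0.0
    for cost in task_costs:
        t, core = heapq.heappop(idle)
        t += cost / core_speeds[core]
        makespan = max(makespan, t)
        heapq.heappush(idle, (t, core))
    return makespan

# Assumed configuration: 4 big cores (speed 2.0) and 4 little cores (speed 1.0).
speeds = [2.0] * 4 + [1.0] * 4
tasks = [1.0] * 96

static_time = static_makespan(tasks, speeds)    # 12.0: little cores finish last
dynamic_time = dynamic_makespan(tasks, speeds)  # 8.0: big cores absorb more tasks
print(static_time, dynamic_time)
```

Under these assumptions the static split is bounded by the little cores, while the dynamic queue lets the big cores take on proportionally more work. Real applications add synchronization and migration costs that this model ignores.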

UPCommons. Portal del coneixement obert de la UPC

Fairness-aware scheduling on single-ISA heterogeneous multi-cores

Author: Akram Shoaib
Eeckhout Lieven
Heirman Wim
Jaleel Aamer
Van Craeynest Kenzo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

Single-ISA heterogeneous multi-cores consisting of small (e.g., in-order) and big (e.g., out-of-order) cores dramatically improve energy- and power-efficiency by scheduling workloads on the most appropriate core type. A significant body of recent work has focused on improving system throughput through scheduling. However, none of the prior work has looked into fairness. Yet, guaranteeing that all threads make equal progress on heterogeneous multi-cores is of utmost importance for both multi-threaded and multi-program workloads to improve performance and quality-of-service. Furthermore, modern operating systems affinitize workloads to cores (pinned scheduling), which dramatically affects fairness on heterogeneous multi-cores. In this paper, we propose fairness-aware scheduling for single-ISA heterogeneous multi-cores, and explore two flavors for doing so. Equal-time scheduling runs each thread or workload on each core type for an equal fraction of the time, whereas equal-progress scheduling strives to get equal amounts of work done on each core type. Our experimental results demonstrate an average 14% (and up to 25%) performance improvement over pinned scheduling through fairness-aware scheduling for homogeneous multi-threaded workloads; equal-progress scheduling improves performance by 32% on average for heterogeneous multi-threaded workloads. Further, we report dramatic improvements in fairness over prior scheduling proposals for multi-program workloads, while achieving system throughput comparable to throughput-optimized scheduling, and an average 21% improvement in throughput over pinned scheduling.
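The contrast between pinned and equal-time scheduling in this abstract can be sketched with a toy model. This is an illustration under stated assumptions, not the paper's scheduler: it assumes one thread per core, a fixed per-quantum speed for each core type, and no migration cost. The function names and the numbers are hypothetical, and equal-progress scheduling is not modelled.

```python
def pinned_progress(n_threads, n_big, big_speed, little_speed, quanta):
    """Each thread stays on the core it starts on for every quantum."""
    return [quanta * (big_speed if t < n_big else little_speed)
            for t in range(n_threads)]

def equal_time_progress(n_threads, n_big, big_speed, little_speed, quanta):
    """Rotate the big-core slots across threads, one step per quantum."""
    progress = [0.0] * n_threads
    for q in range(quanta):
        for slot in range(n_threads):
            # Slots 0..n_big-1 are big cores; the thread-to-slot mapping shifts by q.
            thread = (slot + q) % n_threads
            progress[thread] += big_speed if slot < n_big else little_speed
    return progress

# Assumed setup: 4 threads, 1 big core (speed 2.0), 3 little cores (speed 1.0).
pinned = pinned_progress(4, 1, 2.0, 1.0, quanta=8)
rotated = equal_time_progress(4, 1, 2.0, 1.0, quanta=8)
print(pinned)   # one thread runs ahead, the rest lag behind
print(rotated)  # every thread spends the same share of time on the big core
```

For a barrier-synchronized workload the slowest thread sets the pace, so in this model the useful progress is `min(pinned)` versus `min(rotated)`.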

Ghent University Academic Bibliography

Scheduling and performance characterization on heterogeneous computing systems

Author: Γεωργακούδης Γιώργης
Publication date: 01/01/2016

University of Thessaly Institutional Repository

Tendencias en Arquitecturas y Algoritmos Paralelos para HPC

Author: Balladini Javier
Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura
Encinas Diego
Frati Emmanuel
Leibovich Fabiana
Montes de Oca Erica
Montezanti Diego
Naiouf Marcelo
Pousa Adrián
Rodriguez Eguren Sebastián
Rodríguez Ismael Pablo
Romero Fernando
Rucci Enzo
Villagarcía Wanza Horacio Alfredo
Publication date: 01/05/2014

The focus of this R&D line is the study of current trends in parallel architectures and algorithms. Its central topics are:
Many-core architectures (GPUs, MIC processors), hybrid architectures (different combinations of multicores and GPUs), and heterogeneous architectures.
Languages and data structures for new parallel computing architectures.
Development of parallel algorithms on new architectures and evaluation of their performance.
Study of Cloud-type architectures and the development of efficient system software and applications for Cloud Computing, particularly in the area of high-performance parallel computing (HPC).
Energy consumption aspects, particularly in relation to instruction classes and parallel algorithms.
Use of hardware counters, particularly for run-time decision making.
These topics are seen as promising directions for the future of high-performance parallel computing.
Track: Distributed and Parallel Processing

Centro de Servicios en Gestión de Información

Servicio de Difusión de la Creación Intelectual

Planificación consciente de la contención y gestión de recursos en arquitecturas multicore emergentes

Author: García García Adrián
Publication venue: 'Universidad Complutense de Madrid (UCM)'
Publication date: 29/03/2022

Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadores y Automática, defended on 14-12-2021.

Chip multicore processors (CMPs) currently constitute the architecture of choice for most general-purpose computing systems, and they will likely continue to be dominant in the near future. Advances in technology have made it possible to pack an increasing number of cores and bigger caches on the same chip. Nevertheless, contention on shared resources on CMPs, present since the advent of these architectures, still poses a big challenge. Cores in a CMP typically share a last-level cache (LLC) and other memory-related resources with the remaining cores, such as a DRAM controller and an interconnection network. As a result, co-running applications may compete intensively with each other for these shared resources, leading to substantial and uneven performance degradation...

Docta Complutense

Task scheduling techniques for asymmetric multi-core systems

Author: Ayguadé Parra Eduard
Badia Sala Rosa Maria
Casas Marc
Chronaki Kallia
Labarta Mancho Jesús José
Moreto Planas Miquel
Rico Alejandro
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

As performance and energy efficiency have become the main challenges for next-generation high-performance computing, asymmetric multi-core architectures can provide solutions to tackle these issues. Parallel programming models need to be able to suit the needs of such systems and keep on increasing the application's portability and efficiency. This paper proposes two task scheduling approaches that target asymmetric systems. These dynamic scheduling policies reduce total execution time either by detecting the longest or the critical path of the dynamic task dependency graph of the application, or by finding the earliest executor of a task. They use dynamic scheduling and information discoverable during execution, which makes them implementable and functional without the need for off-line profiling. In our evaluation we compare these scheduling approaches with two existing state-of-the-art heterogeneous schedulers and we track their improvement over a FIFO baseline scheduler. We show that the heterogeneous schedulers improve the baseline by up to 1.45× in a real 8-core asymmetric system and up to 2.1× in a simulated 32-core asymmetric chip.

This work has been supported by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), by Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), by the RoMoL ERC Advanced Grant (GA 321253) and the European HiPEAC Network of Excellence. The Mont-Blanc project receives funding from the EU's Seventh Framework Programme (FP7/2007-2013) under grant agreement no 610402 and from the EU's H2020 Framework Programme (H2020/2014-2020) under grant agreement no 671697. M. Moretó has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship number JCI-2012-15047. M. Casas is supported by the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia and the Cofund programme of the Marie Curie Actions of the 7th R&D Framework Programme of the European Union (Contract 2013 BP B 00243).

Peer Reviewed. Postprint (author's final draft)
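The idea of detecting the critical path of a task dependency graph, mentioned in this abstract, can be sketched as a longest-path computation on a DAG. This is a static, simplified illustration and not the authors' runtime implementation, which works on a graph that is discovered dynamically. The task names, costs, and function names below are hypothetical.

```python
from functools import lru_cache

def bottom_levels(costs, successors):
    """Longest cost-weighted path from each task to any sink of the DAG."""
    @lru_cache(maxsize=None)
    def level(task):
        tail = max((level(s) for s in successors.get(task, ())), default=0.0)
        return costs[task] + tail
    return {task: level(task) for task in costs}

def critical_path(costs, successors):
    """Follow the successor with the largest bottom level, starting at the top."""
    levels = bottom_levels(costs, successors)
    task = max(levels, key=levels.get)
    path = [task]
    while successors.get(task):
        task = max(successors[task], key=levels.get)
        path.append(task)
    return path

# Hypothetical task graph: a -> {b, c} -> d, with assumed task costs.
costs = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 2.0}
successors = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}

levels = bottom_levels(costs, successors)
path = critical_path(costs, successors)
print(levels, path)
```

A scheduler built on this idea could give tasks on `path` priority for the fast cores and send the remaining tasks to the slow ones; how the paper's policies actually make that choice is not specified in the abstract.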

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Digital.CSIC

Software Patterns for Asymmetric Multiprocessing Devices on Embedded Systems: a performance assessment

Author: Garrido Alejandra
Martos Pedro Ignacio Domingo
Publication date: 02/10/2020

In embedded systems there is a variant of Multicore System on Chip devices (MSoC devices) where not all the computing elements (processor cores) are equal. The differences in the cores of these devices range from different hardware architectures using the same instruction set to completely different processors working together inside the same device. These SoCs are called "Asymmetric Multi Processing Devices" (AMP Devices). To help developers take advantage of the possibilities that these devices may offer in the context of embedded systems, software design patterns have been defined, describing software architectural solutions with known uses. However, there are still no experimental results showing the benefits of these solutions. In this work we measure the performance of a design pattern called Mini Me, applied on an AMP device configuration, and compare it against two Symmetric Multiprocessing Device (SMP Device) configurations. The evaluations show a better than expected computing performance of the AMP configuration using the design pattern Mini Me.

Laboratorio de Investigación y Formación en Informática Avanzada

Servicio de Difusión de la Creación Intelectual