A Dynamic Reconfiguration Framework to Maximize Performance/Power in Asymmetric Multicore Processors

Author: Annamalai Arunachalam
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2013

Recent trends in technology scaling have shifted the processing paradigm to multicores. Depending on the characteristics of the cores, multicores can be either symmetric or asymmetric. Prior research has shown that Asymmetric Multicore Processors (AMPs) outperform their symmetric (SMP) counterparts within a given resource and power budget. However, because of the heterogeneity in core types and time-varying workload behavior, thread-to-core assignment is always a challenge in AMPs. As computational requirements vary significantly across applications and over time, appropriate computational resources need to be allocated dynamically, on demand, to suit each application's current needs, in order to maximize performance and minimize energy consumption. The performance/power of the applications could be further increased by dynamically adapting the voltage and frequency of the cores to better fit the changing characteristics of the workloads. Not only can a core be forced into a low-power mode when its activity level is low, but the power saved by doing so can be opportunistically re-budgeted to the other cores to boost overall system throughput. To this end, we propose a novel solution that seamlessly combines heterogeneity with a Dynamic Reconfiguration Framework (DRF). The proposed framework is equipped with Dynamic Resource Allocation (DRA) and Voltage/Frequency Adaptation (DVFA) capabilities to adapt the core resources and operating conditions at runtime to the changing demands of the applications. As a proof of concept, we illustrate the proposed approach using a dual-core AMP and demonstrate significant performance/power benefits over various baselines.

ScholarWorks@UMass Amherst
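
As a rough illustration of the power re-budgeting idea in the abstract above, the following Python sketch parks low-activity cores in a low-power mode and splits the remaining chip budget evenly among the active cores. The function name `rebudget`, the activity threshold, the even split, and all wattage figures are assumptions made for this example; the abstract does not describe the actual DRF policy.

```python
def rebudget(activity, budget_w, low_power_w, idle_threshold):
    """Return a per-core power allocation in watts (hypothetical policy).

    activity       -- per-core activity level in [0, 1]
    budget_w       -- total chip power budget in watts
    low_power_w    -- power drawn by a core parked in low-power mode
    idle_threshold -- cores with activity below this value are parked
    """
    idle = [a < idle_threshold for a in activity]
    busy_count = idle.count(False)
    if busy_count == 0:
        # Every core is below the threshold, so all of them are parked.
        return [low_power_w] * len(activity)
    # Power left over after paying for the parked cores.
    remaining = budget_w - low_power_w * (len(activity) - busy_count)
    # Assumption: the leftover budget is shared equally by the busy cores.
    share = remaining / busy_count
    return [low_power_w if is_idle else share for is_idle in idle]


if __name__ == "__main__":
    # Dual-core example: core 0 is busy, core 1 is nearly idle.
    print(rebudget([0.9, 0.1], budget_w=20.0, low_power_w=2.0,
                   idle_threshold=0.2))  # prints [18.0, 2.0]
```

A real controller would also have to map each power share to a legal voltage/frequency pair and respect thermal limits, which this sketch leaves out.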

Dynamic Processor Reconfiguration for Power, Performance and Reliability Management

Author: Srinivasan Sudarshan
Publication venue: ScholarWorks@UMass Amherst
Publication date: 10/11/2016

Technology advancements have allowed more transistors to be packed into a smaller area, while the improved performance helped in achieving higher clock frequencies. This unfortunately led to a power density problem, forcing the processor industry to lower the clock frequency and integrate multiple cores on the same die. Depending on core characteristics, the multiple cores on the die can be symmetric or asymmetric. Asymmetric multi-core processors (AMPs) have been proposed as an alternative to symmetric multi-cores to improve power efficiency. AMPs comprise cores that implement the same ISA but differ in performance and power characteristics because of the varying sizes of their micro-architectural resources. As the computational bottleneck of a workload shifts from one resource to another during execution, reassigning the workload to another core, where it runs more efficiently, can improve overall power efficiency. Achieving high power efficiency in AMPs therefore requires (i) a diverse set of cores that are optimized for various program phases, (ii) runtime analysis to determine the best core to run on, and (iii) low overhead when re-assigning a thread to a different core type. Decisions to swap threads between cores in AMPs are made at a coarse granularity of millions of instructions to mitigate the impact of thread migration overhead. But the computational needs of a program change rapidly during its execution. The best core configuration for an application, one that optimizes both power consumption and performance, changes rapidly over time, at a fine granularity of thousands of instructions. This dissertation explores ways to design the core micro-architecture so that switching overhead is low enough to enable fine-grained switching and thereby achieve high power efficiency.
To take advantage of power-saving opportunities at fine granularity, this thesis explores reconfigurable/morphable architectures in which core resources are reconfigured on demand to suit the needs of the executing application. We first explore reconfigurable architectures consisting of two kinds of cores: out-of-order (OOO) big cores and in-order (InO) small cores. The big cores provide higher performance, while the small cores are more power efficient. In the proposed architecture, an OOO core reconfigures into an InO core at run time. Our proposed online management scheme decides when to switch between these core types such that we obtain significant power benefits without impacting performance. We also observe that the resource requirements of applications can be quite diverse and, consequently, resource bottlenecks or excesses can vary considerably. Reconfiguration between just two core modes may therefore not fully exploit the opportunities for power and performance improvement. We therefore explore reconfigurable architectures consisting of diverse core types that are not limited to big and little cores. A single core can reconfigure into multiple core modes, each with unique power and performance characteristics. Workload performance on a particular core mode depends on a large set of processor resources. Some workloads are highly memory intensive, some exhibit long instruction dependency chains, some experience high rates of branch misprediction, while others exhibit large exploitable instruction-level parallelism. A diverse set of core modes is needed to address the shifting resource needs of the various program phases of an application. Different trade-offs in power and performance can be achieved by reducing or expanding the size of various resources. The trade-offs for each core mode are also affected by operating voltage and frequency.
We therefore propose joint core resource resizing with dynamic voltage and frequency scaling (DVFS), which is important for applications whose performance is sensitive to changes in frequency. Thus, at fine granularity, the core should adapt its instruction window size, execution bandwidth and frequency to meet the demands of the workload at run time and improve power efficiency. Many current processors employ DVFS aggressively to improve power efficiency and maximize performance. This dissertation studies the trade-off in power efficiency between fine-grained DVFS and the reconfigurable architectures mentioned above. We also explore another important problem: continued device scaling results in higher vulnerability to soft errors. We consider dynamic core reconfiguration from the perspectives of both power efficiency and vulnerability to soft errors. An online management scheme is proposed such that core reconfiguration upon a thread switch improves power efficiency without increasing the vulnerability to soft errors. In summary, this thesis proposes several solutions for improving power efficiency by integrating heterogeneity within the core. We also examine how popular power reduction techniques such as DVFS compare with our approach. Finally, we address reliability challenges along with improving power efficiency.

ScholarWorks@UMass Amherst

Arquitecturas multiprocesador en HPC: software, métricas y aplicaciones [Multiprocessor architectures in HPC: software, metrics and applications]

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura
Dell'Oso Matías
Eguren Sebastián
Encinas Diego
Iglesias Luciano
Montezanti Diego
Méndez Mariano
Naiouf Marcelo
Paniego Juan Manuel
Pi Puig Martín
Pousa Adrián
Rodríguez Ismael Pablo
Tinetti Fernando Gustavo
Villagarcía Wanza Horacio Alfredo
Publication date: 01/04/2016

Characterize distributed multiprocessor architectures, focusing especially on cluster and cloud computing and with emphasis on those that use multicore processors (multicores, GPUs and Xeon Phi), in order to model them, study their scalability, analyze and predict the performance of parallel applications, study energy consumption and its impact on performance, and develop schemes for fault detection and fault tolerance in them. Deepen the study of GPU-based architectures and their comparison with multicore clusters, as well as the combined use of GPUs and multicores in high-performance computers. Begin experimental research on FPGA-based parallel architectures, in particular studying performance on “hybrid” clusters. Analyze and develop system software for clusters, seeking to optimize performance. Investigate asymmetric multicore architectures and develop scheduling algorithms in operating system software that enable the optimization of performance and energy consumption in general-purpose applications. Study classes of intelligent real-time applications, in particular the collaborative work of robots connected to a cloud. It should be noted that this project is coordinated with other ongoing projects at III-LIDI related to Parallel Algorithms, Distributed Systems and Real-Time Systems. Track: Distributed and Parallel Processing.

Centro de Servicios en Gestión de Información

Servicio de Difusión de la Creación Intelectual

ADACORE: Achieving Energy Efficiency via Adaptive Core Morphing at Runtime

Author: Kurella Nithesh
Publication venue: ScholarWorks@UMass Amherst
Publication date: 23/11/2015

Heterogeneous multicore processors offer an energy-efficient alternative to homogeneous multicores. Typically, heterogeneous multicore refers to a system with more than one core in which all the cores use a single ISA but differ in one or more micro-architectural configurations. A carefully designed multicore system consists of cores with diverse power and performance profiles. During execution, an application is run on the core that offers the best trade-off between performance and energy efficiency. Since the resource needs of an application may vary with time, so does the optimal core choice. Moving a thread from one core to another involves transferring the entire processor state and warming up the cache. Frequent migration leads to a large performance overhead, negating any benefits of migration. Infrequent migration, on the other hand, leads to missed opportunities. Reducing the overhead of migration is therefore integral to harnessing the benefits of heterogeneous multicores.
This work proposes AdaCore, a novel core architecture that pushes the heterogeneity exploited in heterogeneous multicores into a single core. AdaCore primarily addresses the resource bottlenecks in workloads. The design attempts to adaptively match resource demands by reconfiguring on-chip resources at a fine granularity. Adaptive core morphing allows core configurations with diverse power and performance profiles within a single core through adaptive voltage, frequency and resource reconfiguration. To this end, the proposed architecture provides energy savings while improving performance through low-overhead in-core reconfiguration. This thesis further compares AdaCore with a standard out-of-order core capable of Dynamic Voltage and Frequency Scaling (DVFS), designed to achieve energy efficiency.
The results presented in this thesis indicate that the proposed scheme can improve the performance/Watt of applications, on average, by 32% over a static out-of-order core and by 14% over DVFS. The proposed scheme improves IPS^2/Watt by 38% over a static out-of-order core.

ScholarWorks@UMass Amherst
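
The two figures of merit named in the abstract above can be written down directly. The sketch below assumes the usual reading of performance/Watt as instructions per second (IPS) divided by average power, and of IPS^2/Watt as the square of IPS divided by average power. The function names and the workload numbers are made up for illustration and are not results from the thesis.

```python
def ips(instructions, seconds):
    """Instructions per second."""
    return instructions / seconds


def perf_per_watt(instructions, seconds, avg_watts):
    """Performance/Watt, taken here as IPS divided by average power."""
    return ips(instructions, seconds) / avg_watts


def ips2_per_watt(instructions, seconds, avg_watts):
    """IPS^2/Watt: weights performance more heavily than power."""
    return ips(instructions, seconds) ** 2 / avg_watts


def improvement_pct(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical runs of the same 2e9-instruction workload.
    base = (2e9, 2.0, 10.0)      # instructions, seconds, average watts
    adaptive = (2e9, 2.0, 8.0)   # same run time, lower average power
    print(improvement_pct(perf_per_watt(*adaptive), perf_per_watt(*base)))
    print(improvement_pct(ips2_per_watt(*adaptive), ips2_per_watt(*base)))
```

Because IPS is squared in the second metric, a configuration that saves power by running slower is penalized more under IPS^2/Watt than under performance/Watt.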

Scheduling and performance characterization on heterogeneous computing systems

Author: Γεωργακούδης Γιώργης
Publication date: 01/01/2016

University of Thessaly Institutional Repository

Arquitecturas multiprocesador en HPC: software de base, métricas y aplicaciones [Multiprocessor architectures in HPC: system software, metrics and applications]

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura
Denham Mónica
Frati Emmanuel
Iglesias Luciano
Montezanti Diego
Naiouf Fernando
Pousa Adrián
Rodríguez Ismael Pablo
Tinetti Fernando Gustavo
Villagarcía Wanza Horacio Alfredo
Publication date: 01/04/2013

Characterize distributed multiprocessor architectures, focusing especially on cluster and cloud computing and with emphasis on those that use multicore processors (multicores and GPUs), in order to model them, study their scalability, analyze and predict the performance of parallel applications, and develop fault-tolerance schemes for them. Deepen the study of GPU-based architectures and their comparison with multicore clusters, as well as the combined use of GPUs and multicores in high-performance computers. Analyze energy efficiency in these parallel architectures, considering the impact of the architecture, the operating system, the programming model and the specific algorithm. Analyze and develop system software for clusters of multicores and GPUs, seeking to optimize performance. Two lines of interest were added in 2012: the study of hybrid clusters that combine multicores and GPUs, and the use of processor hardware registers to make various decisions at run time. It should be noted that this project is coordinated with two other ongoing projects at III-LIDI, related to Distributed/Parallel Algorithms and Distributed Software Systems. Paper presented at WICC 2013, held on 18 and 19 April 2013 in Paraná (Entre Ríos).

Centro de Servicios en Gestión de Información

Servicio de Difusión de la Creación Intelectual

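Arquitecturas multiprocesador en computación de alto desempeño: software, métricas, modelos y aplicaciones [Multiprocessor architectures in high-performance computing: software, metrics, models and applications]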

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura Cristina
Dell'Oso Matías
Encinas Diego
Iglesias Luciano
Montezanti Diego Miguel
Méndez Mariano
Naiouf Marcelo
Paniego Juan Manuel
Pi Puig Martín
Pousa Adrián
Rodriguez Eguren Sebastián
Rodriguez Ismael Pablo
Tinetti Fernando Gustavo
Villagarcía Wanza Horacio A.
Publication date: 01/04/2017

Characterize distributed multiprocessor architectures, focusing especially on cluster and cloud computing and with emphasis on those that use multicore processors (multicores, GPUs and Xeon Phi), in order to model them, study their scalability, analyze and predict the performance of parallel applications, study energy consumption and its impact on performance, and develop schemes for fault detection and fault tolerance in them. Deepen the study of GPU-based architectures and their comparison with multicore clusters, as well as the combined use of GPUs and multicores in high-performance computers. Begin experimental research on FPGA-based parallel architectures, in particular studying performance on “hybrid” clusters. Analyze and develop system software for clusters, seeking to optimize performance. Investigate asymmetric multicore architectures and develop scheduling algorithms in operating system software that enable the optimization of performance and energy consumption in general-purpose applications. Study classes of intelligent real-time applications, in particular the collaborative work of robots connected to a cloud, and Big Data processing. It should be noted that this project is coordinated with other ongoing projects at III-LIDI related to High-Performance Computing, Parallel Algorithms, Distributed Systems and Real-Time Systems. Track: Distributed and Parallel Processing. Red de Universidades con Carreras en Informática (RedUNCI).

Centro de Servicios en Gestión de Información

Servicio de Difusión de la Creación Intelectual

Planificación consciente de la contención y gestión de recursos en arquitecturas multicore emergentes [Contention-aware scheduling and resource management on emerging multicore architectures]

Author: García García Adrián
Publication venue: Universidad Complutense de Madrid (UCM)
Publication date: 29/03/2022

Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadores y Automática, defended on 14-12-2021. Chip multicore processors (CMPs) currently constitute the architecture of choice for most general-purpose computing systems, and they will likely continue to be dominant in the near future. Advances in technology have made it possible to pack an increasing number of cores and bigger caches onto the same chip. Nevertheless, contention for shared resources on CMPs, present since the advent of these architectures, still poses a big challenge. Cores in a CMP typically share a last-level cache (LLC) and other memory-related resources with the remaining cores, such as a DRAM controller and an interconnection network. As a result, co-running applications may compete intensively with each other for these shared resources, leading to substantial and uneven performance degradation...

Docta Complutense

Algoritmos paralelos y evaluación de rendimiento en plataformas de cómputo de altas prestaciones [Parallel algorithms and performance evaluation on high-performance computing platforms]

Author: Basgall María José
Chichizola Franco
Costanzo Manuel
De Giusti Armando Eduardo
De Giusti Laura Cristina
Frati Fernando Emmanuel
Gallo Silvana Lis
Gaudiani Adriana Angélica
Naiouf Marcelo
Pousa Adrián
Rucci Enzo
Sanz Victoria María
Sánchez Mariano
Publication date: 31/10/2022

The central focus of this R&D line is research on high-performance parallel and distributed computing, covering both the fundamentals and the construction, evaluation and optimization of applications on multiprocessor architectures. The concepts are applied to compute-intensive numerical and non-numerical problems, and to problems over large volumes of data, in order to obtain high-performance solutions. The line also includes the construction of environments for teaching concurrent and parallel programming. In the supervision of postgraduate theses there is collaboration with the HPC4EAS group (High Performance Computing for Efficient Applications and Simulation) of the Departamento de Arquitectura de Computadores y Sistemas Operativos at the Universidad Autónoma de Barcelona; with the Departamento de Arquitectura de Computadores y Automática at the Universidad Complutense de Madrid; and with the Soft Computing and Intelligent Information Systems (SCI2S) group at the Universidad de Granada, among others. Red de Universidades con Carreras en Informática

Servicio de Difusión de la Creación Intelectual