806 research outputs found

Dynamically controlled resource allocation in SMT processors

Author: Cazorla Almeida Francisco Javier
Fernandez Prieto Enrique
Ramírez Bellido Alejandro
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

SMT processors increase performance by executing instructions from several threads simultaneously. By sharing the processor's resources, these threads use them more fully, but they also compete for them. The way critical resources are distributed among threads determines the final performance. Currently, resource distribution is determined by the fetch policy, which decides which threads enter the processor to compete for resources. However, current fetch policies rely only on indirect indicators of resource usage, which can lead to resource monopolization by a single thread or to resource waste when no thread can use them. Both situations can harm performance and occur, for example, after an L2 cache miss. In this paper, we introduce the concept of dynamic resource control in SMT processors. Using this concept, we propose a novel resource allocation policy for SMT processors. This policy directly monitors the usage of resources by each thread and guarantees that all threads get their fair share of the critical shared resources, avoiding monopolization. We also define a mechanism to allow a thread to borrow resources from another thread if that thread does not require them, thereby reducing resource under-use. Simulation results show that our dynamic resource allocation policy outperforms a static resource allocation policy by 8%, on average. It also improves on the best dynamic resource-conscious fetch policies, such as FLUSH++, by 4%, on average, using the harmonic mean as a metric. This indicates that our policy does not obtain the ILP boost by unfairly running high ILP threads over slow memory-bounded threads. Instead, it achieves a better throughput-fairness balance. Peer Reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC
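
The abstract of "Dynamically controlled resource allocation in SMT processors" reports results "using the harmonic mean as a metric" and ties that to fairness. The sketch below illustrates why such a metric penalizes unfair schedules. It assumes the common formulation in which each thread's IPC under SMT is divided by its IPC when running alone, and the harmonic mean of those relative IPCs is taken; the abstract does not spell out the formula, so this formulation, the function names, and the IPC numbers are all illustrative assumptions, not values from the paper.

```python
def relative_ipcs(ipc_smt, ipc_alone):
    """Per-thread IPC under SMT divided by the same thread's IPC when run alone."""
    return [smt / alone for smt, alone in zip(ipc_smt, ipc_alone)]


def harmonic_mean_speedup(ipc_smt, ipc_alone):
    """Harmonic mean of relative IPCs (assumed form of the metric)."""
    rel = relative_ipcs(ipc_smt, ipc_alone)
    return len(rel) / sum(1.0 / r for r in rel)


def throughput(ipc_smt):
    """Raw throughput: the sum of per-thread IPCs."""
    return sum(ipc_smt)


# Hypothetical two-thread workload: one high-ILP thread, one memory-bounded thread.
ipc_alone = [2.0, 0.5]

# Schedule A starves the memory-bounded thread to favor the high-ILP thread.
unfair = [1.8, 0.05]
# Schedule B slows both threads by the same factor.
balanced = [1.2, 0.3]

print(throughput(unfair), harmonic_mean_speedup(unfair, ipc_alone))
print(throughput(balanced), harmonic_mean_speedup(balanced, ipc_alone))
```

In this made-up example the unfair schedule has the higher raw throughput, yet its harmonic mean is far lower because the starved thread dominates the sum of reciprocals. A policy that scores well on the harmonic mean therefore cannot be gaining its advantage purely by favoring high-ILP threads.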

Efficient resources assignment schemes for clustered multithreaded processors

Author: Fernando Latorre
González Colás Antonio María
González González José
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

New feature sizes provide a larger number of transistors per chip, which architects could use to further exploit instruction-level parallelism. However, these technologies also bring new challenges that complicate conventional monolithic processor designs. On the one hand, exploiting instruction-level parallelism yields diminishing returns, so other sources of parallelism, such as thread-level parallelism, must be exploited to keep raising performance with reasonable hardware complexity. On the other hand, clustered architectures have been widely studied as a way to reduce the inherent complexity of current monolithic processors. This paper studies the synergies and trade-offs between two concepts, clustering and simultaneous multithreading (SMT), in order to understand why conventional SMT resource assignment schemes are not so effective in clustered processors. These trade-offs are used to propose a novel resource assignment scheme that achieves an average speedup of 17.6% over Icount while improving fairness by 24%. Peer Reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Exploring coordinated software and hardware support for hardware resource allocation

Author: Figueiredo Boneti Carlos Santieri de
Publication venue: Universitat Politècnica de Catalunya
Publication date: 04/09/2009

Multithreaded processors are now common in the industry as they offer high performance at a low cost. Traditionally, in such processors, the assignment of hardware resources among the multiple threads is done implicitly by hardware policies. However, a new class of multithreaded hardware allows the explicit allocation of resources to be controlled or biased by the software. Currently, there is little or no coordination between the allocation of resources done by the hardware and the prioritization of tasks done by the software. This thesis aims to narrow the gap between the software and the hardware, with respect to hardware resource allocation, by proposing a new explicit resource allocation hardware mechanism and novel schedulers that use the currently available hardware resource allocation mechanisms. It approaches the problem in two different types of computing systems. In the high performance computing domain, we characterize the first processor to present a mechanism that allows the software to bias the allocation of hardware resources, the IBM POWER5. In addition, we propose the use of hardware resource allocation as a way to balance high performance computing applications. Finally, we propose two new scheduling mechanisms that are able to transparently and successfully balance applications in real systems using hardware resource allocation. In the soft real-time domain, we propose a hardware extension to the existing explicit resource allocation hardware and, in addition, two software schedulers that use the explicit allocation hardware to improve the schedulability of tasks in a soft real-time system. In this thesis, we demonstrate that system performance improves by making the software aware of the mechanisms to control the amount of resources given to each running thread.
In particular, for the high performance computing domain, we show that it is possible to decrease the execution time of MPI applications by biasing the hardware resource assignment between threads. In addition, we show that it is possible to decrease the number of missed deadlines when scheduling tasks in a soft real-time SMT system. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Architectural support for real-time task scheduling in SMT processors

Author: Cazorla Almeida Francisco Javier
Fernández Enrique
Knijnenburg Peter M.W.
Ramírez Bellido Alejandro
Sakellariou Rizos
Valero Cortés Mateo
Publication date: 01/01/2005

In Simultaneous Multithreaded (SMT) architectures, most hardware resources are shared between threads. This provides a good cost/performance trade-off, which makes these architectures suitable for use in embedded systems. However, since threads share many resources, such as caches, they also interfere with each other. As a result, execution times of applications become highly unpredictable and highly dependent on the context in which an application is executed. This poses problems if an SMT processor is to be used in a (soft) real-time system. In this paper, we propose two novel hardware mechanisms that can be used to reduce this performance variability. In contrast to previous approaches, our proposed mechanisms do not need any information beyond that already known by traditional job schedulers, nor do they require extensive profiling of workloads to determine optimal schedules. Our mechanisms are based on dynamic resource partitioning. The OS-level job scheduler needs to be slightly adapted in order to provide the hardware resource allocator with some information on how this resource partitioning needs to be done. We show that our mechanisms provide high stability for SMT architectures to be used in real-time systems: the real-time benchmarks we used meet their deadlines in more than 98% of the cases considered, while the other thread in the workload still achieves high throughput. Postprint (published version).

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

International Migration, Integration and Social Cohesion online publications

Using Hardware Resource Allocation to Balance HPC Applications

Author: Carlos Boneti
Francisco J. Cazorla
Mateo Valero
Roberto Gioiosa
Publication venue: 'IntechOpen'
Publication date: 01/01/2010

IntechOpen

Efficient Resource Allocation on a Dynamic Simultaneous Multithreaded Architecture

Author: Ortiz-Arroyo Daniel
Publication date: 01/01/2006

VBN

Hyperheuristics for explicit resource partitioning in simultaneous multithreaded processors

Author: Güney İsa A.
Küçük Gürhan
Poyraz Kemal
Özcan Ender
Publication venue: 'The Scientific and Technological Research Council of Turkey'
Publication date: 28/03/2020

Repository@Nottingham

Maximizing multithreaded multicore architectures through thread migrations

Author: Acosta Ojeda Carmelo Alexis
Cazorla Almeida Francisco Javier
Falcón Samper Ayose Jesús
Ramírez Bellido Alejandro
Santana Jaria Oliverio J.
Valero Cortés Mateo
Publication date: 01/01/2009

Heterogeneity in general-purpose workloads often ends up in non-optimal per-thread hardware resource usage. The current trend towards multicore architectures containing several multithreaded cores increases the need for a complexity-effective way to expose the heterogeneity in general-purpose workloads to the underlying hardware, in order to obtain the full potential performance of these architectures. In this paper we present the Heterogeneity-Aware Dynamic Thread Migrator (hDTM), a novel complexity-effective hardware mechanism that exposes the heterogeneity in software to the hardware, also enabling the hardware to react to dynamic behavior variations in the running applications. By means of core-to-core thread migrations, the hDTM mechanism strives to achieve the desired behavior transparently to the operating system. As an example of the general-purpose hDTM concept presented in this paper, we describe a naive hDTM implementation for a Power5-like processor and provide results on the benefits of the proposed mechanism. Our results indicate that even this simple hDTM implementation is able to get close to hDTM’s goal, not only avoiding losses due to bad thread-to-core assignments (up to 25%) but also going beyond the upper limit of the best static thread-to-core assignment. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Introducing runahead threads

Author: Pajuelo González Manuel Alejandro
Ramírez García Tanausu
Santana Jaria Oliverio J.
Valero Cortés Mateo
Publication date: 01/01/2007

Simultaneous Multithreading processors share their resources among multiple threads in order to improve performance. However, a resource control policy is needed to avoid resource conflicts and prevent some threads from monopolizing resources. Otherwise, resource conflicts would cause other threads to suffer from resource starvation, degrading overall performance. This situation is especially critical for memory-bounded threads, because they hold a significant amount of resources while long-latency accesses are being served. Several fetch policies and resource control techniques have been proposed to overcome these problems by limiting per-thread resource utilization. Nevertheless, this limitation is harmful for memory-bounded threads because it restricts the available memory-level parallelism that hides long-latency memory accesses. In this paper, we propose Runahead threads in SMT scenarios as a valuable solution for both exploiting memory-level parallelism and reducing resource contention. This approach turns a memory-bounded, resource-hungry thread into a light speculative thread, avoiding critical resource blocking among multiple threads. Furthermore, it improves thread-level parallelism by removing long-latency memory operations from the instruction window, releasing busy resources. We compare an SMT architecture using Runahead threads (SMTRA) to both state-of-the-art static fetch and dynamic resource control policies. Our results show that the SMTRA combination performs better, in terms of throughput and fairness, than any of the other policies. Postprint (published version).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC