A comparison of cache hierarchies for SMT processors

Monreal Arnal, Teresa; Suárez Gracía, Dario; Viñals Yúfera, Víctor

Authors: Teresa Monreal Arnal
Dario Suárez Gracía
Víctor Viñals Yúfera
Publication date: 1 January 2011
Publisher: Universidad de La Laguna. Servicio de Publicaciones

Abstract

In the multithread and multicore era, programs are forced to share part of the processor structures. On the one hand, the state of the art in multithreading describes how to efficiently manage and distribute inner resources such as the reorder buffer or issue windows. On the other hand, there is a substantial body of work focused on outer resources, mainly on how to effectively share last-level caches in multicores. Between these two ends, first- and second-level caches have received little attention, even though they are shared in most commercial multithreaded processors. This work analyzes multiprogrammed workloads as the worst-case scenario for cache sharing among threads. To obtain representative results, we present a sampling-based methodology that, for multiple metrics such as STP, ANTT, IPC throughput, or fairness, reduces simulation time by up to four orders of magnitude when running 8-thread workloads, with an error lower than 3% and a confidence level of 97%. Using this methodology, we compare several state-of-the-art cache hierarchies and observe that Light NUCA provides performance benefits in SMT processors regardless of the organization of the last-level cache. Most importantly, Light NUCA gains are consistent across the full range of simulated thread counts, from one to eight.

Peer reviewed. Postprint (author's final draft).
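The abstract names its metrics without defining them. As a rough illustration only, the sketch below computes them from per-program IPC values using the definitions commonly found in the multiprogram-evaluation literature: STP as the sum of per-program normalized progress, and ANTT as the mean per-program slowdown. The fairness formula (minimum over maximum normalized progress) is just one of several definitions in use and is an assumption here, as are all function names; the paper itself may define these differently.

```python
from statistics import mean


def stp(ipc_single, ipc_multi):
    """System throughput: sum of each program's normalized progress.

    ipc_single[i] is program i's IPC when it runs alone;
    ipc_multi[i] is its IPC when co-running with the other threads.
    """
    return sum(mt / st for st, mt in zip(ipc_single, ipc_multi))


def antt(ipc_single, ipc_multi):
    """Average normalized turnaround time: mean per-program slowdown."""
    return mean(st / mt for st, mt in zip(ipc_single, ipc_multi))


def ipc_throughput(ipc_multi):
    """Raw IPC throughput: sum of the co-running IPCs."""
    return sum(ipc_multi)


def fairness(ipc_single, ipc_multi):
    """Assumed definition: min over max of normalized progress (1.0 = equal slowdown)."""
    progress = [mt / st for st, mt in zip(ipc_single, ipc_multi)]
    return min(progress) / max(progress)


if __name__ == "__main__":
    # Hypothetical two-thread workload: both programs run at half their solo speed.
    alone = [2.0, 1.0]
    shared = [1.0, 0.5]
    print(stp(alone, shared), antt(alone, shared),
          ipc_throughput(shared), fairness(alone, shared))
```

Higher is better for STP, IPC throughput, and fairness, while lower is better for ANTT, which is why a methodology claiming accuracy "for multiple metrics" has to bound the error on each of them separately.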
