Article thumbnail

Co-design of on-chip caches and networks for scalable shared-memory many-core CMPs

By Woo Cheol Kwon

Abstract

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018.Cataloged from PDF version of thesis.Includes bibliographical references (pages 169-180).Chip Multi-Processors(CMPs) have become mainstream in recent years, providing increased parallelism as core counts scale. While a tiled CMP is widely accepted to be a scalable architecture for the many-core era, on-chip cache organization and coherence are far from solved problems. As the on-chip interconnect directly influences the latency and bandwidth of on-chip cache, scalable interconnect is an essential part of on-chip cache design. On the other hand, optimal design of interconnect can be determined by the traffic forms that it should handle. Thus, on-chip cache organization is inherently interleaved with on-chip interconnect design and vice versa. This dissertation aims to motivate the need for re-organization of on-chip caches to leverage the advancement of on-chip network technology to harness the full potential of future many-core CMPs. Conversely, we argue that on-chip network should also be designed to support specific functionalities required by the on-chip cache. We propose such co-design techniques to offer significant improvement of on-chip cache performance, and thus to provide scalable CMP cache solutions towards future many-core CMPs. The dissertation starts with the problem of remote on-chip cache access latency. Prior locality-aware approaches fundamentally attempt to keep data as close as possible to the requesting cores. In this dissertation, we challenge this design approach by introducing new cache organization that leverages a co-designed on-chip network that allows multi-hop single-cycle traversals. Next, the dissertation moves to cache coherence request ordering. Without built-in ordering capability within the interconnect, cache coherence protocols have to rely on external ordering points. This dissertation proposes a scalable ordered Network-on-Chip which supports ordering of requests for snoopy cache coherence. Lastly, we describe development of a 36-core research prototype chip to demonstrate that the proposed Network-on-Chip enables shared-memory CMPs to be readily scalable to many-core platforms.by Woo Cheol Kwon.Ph. D

Topics: Electrical Engineering and Computer Science.
Publisher: Massachusetts Institute of Technology
Year: 2018
OAI identifier: oai:dspace.mit.edu:1721.1/118084
Provided by: DSpace@MIT
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://hdl.handle.net/1721.1/1... (external link)
  • Suggested articles


    To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.