Cache Memory Access Patterns in the GPU Architecture

Nimkar, Yash

Authors: Yash Nimkar
Publication date: 1 July 2018
Publisher: RIT Scholar Works

Abstract

Data exchange between a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU) can be very expensive in terms of performance. The characterization of data and cache memory access patterns differs between a CPU and a GPU. The motivation of this research is to analyze the cache memory access patterns of GPU architectures and to potentially improve data exchange between a CPU and a GPU. The methodology of this work uses the Multi2Sim GPU simulator for the AMD Radeon and NVIDIA Kepler GPU architectures. This simulator, which emulates the GPU architecture in software, allows certain code modifications to the L1 and L2 cache memory blocks. Multi2Sim was configured to run multiple benchmarks to analyze and record how the benchmarks access GPU cache memory. The recorded results were used to study three main metrics: (1) Most Recently Used (MRU) and Least Recently Used (LRU) accesses for the L1 and L2 caches, (2) inter-warp and intra-warp cache memory accesses in the GPU architecture for different sets of workloads, and (3) the GPU cache access patterns of certain machine learning benchmarks compared with those of their general-purpose counterparts.
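
To make the first two metrics concrete, the following is a minimal sketch of how hits at the MRU and LRU positions, and intra-warp versus inter-warp hits, could be counted in a toy set-associative cache with LRU replacement. It is not the thesis's Multi2Sim instrumentation. The class name RecencyTrackingCache, its parameters, and the trace format are hypothetical, and the sketch assumes that a hit is "intra-warp" when the accessing warp is the same warp that last touched the line, and "inter-warp" otherwise.

```python
from collections import Counter


class RecencyTrackingCache:
    """Toy set-associative cache with LRU replacement (hypothetical model)."""

    def __init__(self, num_sets=4, ways=4, line_size=64):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # Each set is a list of (tag, last_warp) pairs,
        # ordered from MRU (index 0) to LRU (last index).
        self.sets = [[] for _ in range(num_sets)]
        self.hits_by_position = Counter()  # recency position -> hit count
        self.locality = Counter()          # "intra-warp" / "inter-warp" -> hit count
        self.misses = 0

    def access(self, address, warp_id):
        """Simulate one access; return True on a hit, False on a miss."""
        block = address // self.line_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]

        for position, (line_tag, last_warp) in enumerate(lines):
            if line_tag == tag:
                # Record where in the recency order the hit occurred.
                self.hits_by_position[position] += 1
                # Assumption: classify by the warp that last touched the line.
                kind = "intra-warp" if last_warp == warp_id else "inter-warp"
                self.locality[kind] += 1
                # Promote the line to the MRU position.
                del lines[position]
                lines.insert(0, (tag, warp_id))
                return True

        # Miss: evict the LRU line if the set is full, then fill at MRU.
        self.misses += 1
        if len(lines) == self.ways:
            lines.pop()
        lines.insert(0, (tag, warp_id))
        return False


# Hypothetical trace of (byte address, warp id) pairs.
trace = [(0, 0), (0, 0), (64, 1), (0, 1), (128, 0), (64, 0)]
cache = RecencyTrackingCache(num_sets=1, ways=2, line_size=64)
for address, warp_id in trace:
    cache.access(address, warp_id)

print("hits by recency position:", dict(cache.hits_by_position))
print("hit locality:", dict(cache.locality))
print("misses:", cache.misses)
```

In this sketch, position 0 is the MRU line of a set and position ways - 1 is the LRU line, so a histogram concentrated at position 0 would indicate that most reuse happens shortly after a line is touched.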
