DeepNVM++: Cross-Layer Modeling and Optimization Framework of Non-Volatile Memories for Deep Learning
Non-volatile memory (NVM) technologies such as spin-transfer torque magnetic
random access memory (STT-MRAM) and spin-orbit torque magnetic random access
memory (SOT-MRAM) have significant advantages compared to conventional SRAM due
to their non-volatility, higher cell density, and scalability features. While
previous work has investigated several architectural implications of NVM for
generic applications, in this work we present DeepNVM++, a framework to
characterize, model, and analyze NVM-based caches in GPU architectures for deep
learning (DL) applications by combining technology-specific circuit-level
models and the actual memory behavior of various DL workloads. We present both
iso-capacity and iso-area performance and energy analysis for systems whose
last-level caches rely on conventional SRAM and emerging STT-MRAM and SOT-MRAM
technologies. In the iso-capacity case, STT-MRAM and SOT-MRAM provide up to
3.8x and 4.7x energy-delay product (EDP) reduction and 2.4x and 2.8x area
reduction compared to conventional SRAM, respectively. Under iso-area
assumptions, STT-MRAM and SOT-MRAM provide up to 2x and 2.3x EDP reduction and
accommodate 2.3x and 3.3x cache capacity when compared to SRAM, respectively.
We also perform a scalability analysis and show that STT-MRAM and SOT-MRAM
achieve orders of magnitude EDP reduction when compared to SRAM for large cache
capacities. Our comprehensive cross-layer framework is demonstrated on
STT-/SOT-MRAM technologies and can be used for the characterization, modeling,
and analysis of any NVM technology for last-level caches in GPUs for DL
applications.
Comment: 12 pages, 10 figures
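
The abstract reports its results as energy-delay product (EDP) reduction factors relative to SRAM. As a minimal sketch of how such a factor is computed, assuming the standard definition EDP = energy x delay, the snippet below compares a baseline against a candidate. The function names and the energy and delay values are hypothetical placeholders chosen for illustration; they are not measurements from the paper or part of the DeepNVM++ framework.

```python
def edp(energy_joules, delay_seconds):
    """Energy-delay product: lower is better."""
    return energy_joules * delay_seconds


def edp_reduction(baseline, candidate):
    """Return how many times smaller the candidate's EDP is than the baseline's.

    Each argument is an (energy_joules, delay_seconds) tuple.
    """
    return edp(*baseline) / edp(*candidate)


# Placeholder values for illustration only (not taken from the paper).
sram_cache = (2.0, 1.0)   # hypothetical baseline: 2.0 J, 1.0 s
nvm_cache = (1.0, 0.5)    # hypothetical candidate: 1.0 J, 0.5 s

ratio = edp_reduction(sram_cache, nvm_cache)
print(f"EDP reduction: {ratio:.1f}x")
```

Because EDP multiplies energy by delay, a technology that halves both quantities yields a 4x reduction, which is why the metric rewards designs that improve energy and performance together.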