2 research outputs found

    Performance Implications of NoCs on 3D-Stacked Memories: Insights from the Hybrid Memory Cube

    Memories that exploit three-dimensional (3D) stacking, which integrates memory and logic dies in a single stack, are becoming popular. These memories, such as the Hybrid Memory Cube (HMC), use a network-on-chip (NoC) design to connect their internal structural organization. This novel use of a NoC, in addition to enabling processing-in-memory capabilities, provides benefits such as high bandwidth and memory-level parallelism. However, the implications of NoCs for the characteristics of 3D-stacked memories, in terms of memory access latency and bandwidth, have not been fully explored. This paper addresses this knowledge gap by (i) characterizing an HMC prototype on the AC-510 accelerator board and revealing its access latency behaviors, and (ii) investigating the implications of these behaviors for system and software design.

    Near-memory primitive support and infrastructure for sparse algorithms

    This thesis introduces an approach to mitigating the memory-latency penalties incurred by traditional accelerators. By introducing simple near-data-processing (NDP) accelerators for primitives such as SpMV (sparse matrix-vector multiplication) and DGEMM (double-precision dense matrix-matrix multiplication) kernels, applications can achieve a considerable performance boost. We evaluate our work on the SuperLU application for the HPC community. Thesis statement: reevaluating core primitives such as DGEMM, SCATTER, and GATHER for 3D-stacked PIM architectures that incorporate reconfigurable fabrics can deliver multi-fold performance improvements for SuperLU and other sparse algorithms. (M.S. thesis)
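    To make the SpMV primitive concrete, here is a minimal sketch of sparse matrix-vector multiplication (y = A·x) over the common CSR storage layout. This is illustrative only; it is not the thesis's near-data-processing implementation, and the function name and CSR-array arguments are assumptions for the example.

    ```python
    def spmv_csr(indptr, indices, data, x):
        """Multiply a CSR-format sparse matrix by a dense vector x.

        indptr[row]..indptr[row+1] delimits the stored entries of each row;
        indices holds their column positions, data their values.
        """
        y = [0.0] * (len(indptr) - 1)
        for row in range(len(indptr) - 1):
            # Accumulate only the stored (nonzero) entries of this row.
            for k in range(indptr[row], indptr[row + 1]):
                y[row] += data[k] * x[indices[k]]
        return y

    # Example: the 2x2 matrix [[1, 0], [2, 3]] times x = [1, 1]
    print(spmv_csr([0, 1, 3], [0, 0, 1], [1.0, 2.0, 3.0], [1.0, 1.0]))  # -> [1.0, 5.0]
    ```

    The indirect access `x[indices[k]]` is exactly the irregular, latency-bound memory pattern that motivates moving such kernels near the data.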