High Performance Hybrid Memory Systems with 3D-stacked DRAM

Abstract

The bandwidth of traditional DRAM is pin-limited and therefore does not scale well with the increasing demands of data-intensive workloads, which limits performance. 3D-stacked DRAM can alleviate this problem by providing substantially higher bandwidth to a processor chip. However, the capacity of 3D-stacked DRAM is not enough to replace the bulk of the memory, so it is used either as a DRAM cache or as part of a flat address space with support for data migration. The performance of both alternative designs is limited by their particular overheads. In this thesis we propose designs that improve the performance of hybrid memory systems in which 3D-stacked DRAM is used either as a cache or as part of a flat address space with data migration.

DRAM caches have shown excellent potential in capturing the spatial and temporal data locality of applications; however, they are still far from their ideal performance. Besides the unavoidable DRAM access to fetch the requested data, tag access is in the critical path, adding significant latency and energy costs. Existing approaches are not able to remove these overheads and in some cases limit DRAM cache design options. To alleviate the tag access overheads of DRAM caches, this thesis proposes Decoupled Fused Cache (DFC), a DRAM cache design that fuses DRAM cache tags with the tags of the on-chip Last Level Cache (LLC) to access the DRAM cache data directly on LLC misses. Compared to current state-of-the-art DRAM caches, DFC improves system performance by 6% on average and by 16-18% for large cacheline sizes. DFC also reduces DRAM cache traffic by 18% and DRAM cache energy consumption by 7%.

Data migration schemes have significant performance potential, but also entail overheads, which may diminish migration benefits or even lead to performance degradation. These overheads are mainly due to the high cost of swapping data between memories, which also makes selecting which data to migrate critical to performance.
To address these challenges of data migration, this thesis proposes LLC guided Data Migration (LGM). LGM uses the LLC to predict future reuse and select memory segments for migration. Furthermore, LGM reduces the data migration traffic overheads by not migrating those cache lines of a memory segment that are already present in the LLC. LGM outperforms current state-of-the-art migration designs, improving system performance by 12.1% and reducing memory system dynamic energy by 13.2%.
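
As a rough illustration of the two LGM ideas summarized above (using LLC contents to choose segments, and skipping lines already held in the LLC), the following minimal Python sketch may help. It is not the thesis's actual mechanism: the line and segment sizes, the rule of counting LLC-resident lines as a reuse predictor, the threshold, and all function names are assumptions made purely for illustration.

```python
from collections import Counter
from typing import List, Set

LINE_SIZE = 64        # bytes per cache line (assumed value)
SEGMENT_SIZE = 2048   # bytes per memory segment (assumed value)
LINES_PER_SEGMENT = SEGMENT_SIZE // LINE_SIZE


def segment_of(line: int) -> int:
    """Return the index of the segment that contains cache line `line`."""
    return line // LINES_PER_SEGMENT


def pick_segments(llc_lines: Set[int], threshold: int) -> List[int]:
    """Select segments with at least `threshold` lines resident in the LLC.

    Hypothetical predictor: a segment with many LLC-resident lines is
    assumed to be likely to see further reuse.
    """
    counts = Counter(segment_of(line) for line in llc_lines)
    return sorted(seg for seg, n in counts.items() if n >= threshold)


def lines_to_migrate(segment: int, llc_lines: Set[int]) -> List[int]:
    """List the lines of `segment` that are not already in the LLC."""
    first = segment * LINES_PER_SEGMENT
    return [line for line in range(first, first + LINES_PER_SEGMENT)
            if line not in llc_lines]


if __name__ == "__main__":
    llc = {0, 1, 2, 3, 40}               # cache-line indices held in the LLC
    for seg in pick_segments(llc, threshold=4):
        moved = lines_to_migrate(seg, llc)
        print(f"segment {seg}: migrate {len(moved)} of {LINES_PER_SEGMENT} lines")
```

In this toy setup only segment 0 passes the threshold, and the four lines of that segment already in the LLC are left out of the migration, which is the kind of traffic saving the abstract describes.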
