Next-generation, wide-issue processors will require greater memory bandwidth than is provided by present memory hierarchy designs. We propose techniques for increasing the memory bandwidth of multi-ported L1 D-caches, large on-chip L2 caches, and dedicated memory ports while minimizing cycle time impact. These approaches are evaluated within the context of an 8-way superscalar processor design and next-generation VLSI, packaging, and RAM technologies. We show that the combined L1 and L2 cache enhancements can outperform conventional techniques by over 80%, and that even with an on-chip 512KB L2 cache, board-level caches provide performance gains significant enough to justify their higher cost.