Current HPC systems provide memory resources that are statically configured
and tightly coupled with compute nodes. However, workloads on HPC systems are
evolving. Diverse workloads lead to a need for configurable memory resources to
achieve high performance and utilization. In this study, we evaluate a memory
subsystem design leveraging CXL-enabled memory pooling. Two promising use cases
of composable memory subsystems are studied -- fine-grained capacity
provisioning and scalable bandwidth provisioning. We developed an emulator to
explore the performance impact of various memory compositions. We also provide
a profiler to identify the memory usage patterns in applications and their
optimization opportunities. Seven scientific and six graph applications are
evaluated on various emulated memory configurations. Three out of seven
scientific applications had less than 10% performance impact when the pooled
memory backed 75% of their memory footprint. The results also show that a
dynamically configured high-bandwidth system can effectively support
bandwidth-intensive unstructured mesh-based applications like OpenFOAM.
Finally, we identify interference through shared memory pools as a practical
challenge for adoption on HPC systems.Comment: 10 pages, 13 figures. Accepted for publication in Workshop on Memory
Centric High Performance Computing (MCHPC'22) at SC2