1 research outputs found
A Parallel Sparse Tensor Benchmark Suite on CPUs and GPUs
Tensor computations present significant performance challenges that impact a
wide spectrum of applications ranging from machine learning, healthcare
analytics, social network analysis, data mining to quantum chemistry and signal
processing. Efforts to improve the performance of tensor computations include
exploring data layout, execution scheduling, and parallelism in common tensor
kernels. This work presents a benchmark suite for arbitrary-order sparse tensor
kernels using state-of-the-art tensor formats: coordinate (COO) and
hierarchical coordinate (HiCOO) on CPUs and GPUs. It presents a set of
reference tensor kernel implementations that are compatible with real-world
tensors and power law tensors extended from synthetic graph generation
techniques. We also propose Roofline performance models for these kernels to
provide insights of computer platforms from sparse tensor view.Comment: 13 pages, 7 figure