Host-Assisted Zero-Copy Remote Memory Access Communication on InfiniBand

Abstract

Remote memory access (RMA) is becoming an increasingly important communication model due to its excellent potential for overlapping communication and computation and for achieving high performance on modern networks with RDMA hardware, such as InfiniBand. RMA plays a vital role in supporting the emerging global address space languages and in managing advanced distributed data structures. This paper describes how RMA communication can be implemented efficiently over InfiniBand based on the 'zero-copy' approach. Capabilities not offered directly by the InfiniBand verbs layer can be implemented efficiently using a novel host-assisted approach while achieving zero-copy communication and supporting a high degree of overlap between computation and communication. For the contiguous case, we achieve a small-message latency of 7.44 µs and a peak bandwidth of 730 MB/s for 'put', and a small-message latency of 15 µs and a peak bandwidth of 689 MB/s for 'get'. These numbers are almost as good as the performance of the native VAPI layer. For the noncontiguous case, our host-assisted approach delivers close to the peak bandwidth achieved for contiguous data. We also demonstrate that host-assisted data transfer operations tolerate CPU-intensive tasks better than the traditional host-based approach, because our approach requires minimal host involvement. Our implementation also supports a very high degree of overlap between computation and communication: 99% for contiguous and up to 95% for noncontiguous data at large message sizes. Finally, the NAS MG and parallel matrix multiplication benchmarks were used to validate the effectiveness of our approach, and they demonstrated excellent overall performance.
