For several years, MPI has been the de facto standard for writing parallel
applications. One of the most popular MPI implementations is MPICH. Its
successor, MPICH2, features a completely new design that provides better
performance and greater flexibility. To ensure portability, it has a hierarchical
structure that allows porting to be carried out at different levels. In this
paper, we present our experiences designing and implementing MPICH2 over
InfiniBand. Because of its high performance and open standard, InfiniBand is
gaining popularity in the area of high-performance computing. Our study focuses
on optimizing the performance of MPI-1 functions in MPICH2. One of our
objectives is to exploit Remote Direct Memory Access (RDMA) in InfiniBand to
achieve high performance. We have based our design on the RDMA Channel
interface provided by MPICH2, which encapsulates architecture-dependent
communication functionalities into a very small set of functions. Starting with
a basic design, we apply different optimizations and also propose a
zero-copy-based design. We characterize the impact of our optimizations and
designs using microbenchmarks. We have also performed an application-level
evaluation using the NAS Parallel Benchmarks. Our optimized MPICH2
implementation achieves 7.6 μs latency and 857 MB/s bandwidth, which are
close to the raw performance of the underlying InfiniBand layer. Our study
shows that the RDMA Channel interface in MPICH2 provides a simple yet
powerful abstraction that enables high-performance implementations by
exploiting RDMA operations in InfiniBand. To the best of our knowledge, this is
the first high-performance design and implementation of MPICH2 on InfiniBand
using RDMA support.
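
To illustrate the kind of abstraction the abstract refers to, the following C sketch shows what a channel-style interface reduced to a very small set of functions might look like. The names, types, and signatures here are illustrative assumptions for exposition only, not MPICH2's actual RDMA Channel API: the point is simply that the architecture-dependent part can be confined to a pair of non-blocking vector put/get calls, which an InfiniBand port could implement with RDMA writes into pre-registered buffers.

    /*
     * Illustrative sketch only: the names and signatures below are assumed
     * for exposition and are not MPICH2's actual RDMA Channel interface.
     */
    #include <stddef.h>

    /* A contiguous segment of a message (similar in spirit to struct iovec). */
    typedef struct {
        void  *buf;   /* start of the data segment       */
        size_t len;   /* number of bytes in this segment */
    } chan_iov_t;

    /* Opaque per-connection state kept by the channel implementation
       (e.g., queue pairs and pre-registered RDMA buffers on InfiniBand). */
    typedef struct chan_conn chan_conn_t;

    /* Write as much of the I/O vector as currently possible to the peer,
       e.g., by copying into a pre-registered buffer and issuing an RDMA
       write. On return, *nwritten holds the number of bytes consumed, so
       the caller can retry later with the remainder. Returns 0 on success. */
    int chan_put_datav(chan_conn_t *conn, const chan_iov_t *iov, int iov_count,
                       size_t *nwritten);

    /* Read as much incoming data as currently available into the I/O
       vector, e.g., by polling the RDMA buffers the peer writes into.
       On return, *nread holds the number of bytes delivered. */
    int chan_get_datav(chan_conn_t *conn, chan_iov_t *iov, int iov_count,
                       size_t *nread);

    /* Example of how an upper layer might drive chan_put_datav: keep
       calling until the whole vector is consumed (non-blocking semantics). */
    static int send_whole_message(chan_conn_t *conn, chan_iov_t *iov, int count)
    {
        while (count > 0) {
            size_t done = 0;
            if (chan_put_datav(conn, iov, count, &done) != 0)
                return -1;                    /* channel reported an error */
            while (count > 0 && done >= iov[0].len) {
                done -= iov[0].len;           /* segment fully sent */
                ++iov; --count;
            }
            if (count > 0 && done > 0) {      /* partially sent segment */
                iov[0].buf = (char *)iov[0].buf + done;
                iov[0].len -= done;
            }
        }
        return 0;
    }

Under these assumptions, the non-blocking, partial-progress semantics are what let an upper layer overlap communication with computation, while everything specific to the interconnect stays behind the two channel calls.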