Reliable multicast fault tolerant MPI in the Grid environment
Grid environments have recently been developed with low stretch and overheads
that increase with the logarithm of the number of nodes in the system. Sending
data to and receiving data from a large number of nodes is gaining importance
due to an increasing number of independent data providers and the heterogeneity
of the network/Grid. One of the key challenges is to achieve a balance between low
bandwidth consumption and good reliability. In this paper we present an
implementation of a reliable multicast protocol over a fault-tolerant MPI,
MPICHV2. It offers one way to transfer large chunks of data between
applications running on a Grid with limited network
links. We first show that, by using multicast with data compression in a cluster,
we can achieve performance similar to that of the MPICH-P4 implementation. Next, we
describe a theoretical cluster organization and Grid network architecture to
harness the performance gains of multicast. Finally, we present our
conclusions and future work.