Decoupled Strategy for Imbalanced Workloads in MapReduce Frameworks
In this work, we consider the integration of MPI one-sided communication and
non-blocking I/O in HPC-centric MapReduce frameworks. Using a decoupled
strategy, we aim to overlap the Map and Reduce phases of the algorithm by
allowing processes to communicate and synchronize using solely one-sided
operations. Hence, we effectively increase the performance in situations where
the workload per process is unexpectedly unbalanced. Using a Word-Count
implementation and a large dataset from the Purdue MapReduce Benchmarks Suite
(PUMA), we demonstrate that our approach can provide up to 23% performance
improvement on average compared to a reference MapReduce implementation that
uses state-of-the-art MPI collective communication and I/O.
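
The overlap of the Map and Reduce phases described above can be illustrated with a small single-machine analogy. The sketch below is not the authors' MPI implementation: it uses Python threads and queues in place of MPI processes and one-sided operations, and the names `decoupled_word_count`, `inboxes`, and `DONE` are hypothetical, chosen only for illustration. It shows the general idea that a mapper deposits each word into the owning reducer's buffer without waiting for that reducer, so reducers can start counting while slower mappers are still working.

```python
import queue
import threading
from collections import Counter


def decoupled_word_count(chunks, num_reducers=2):
    """Count words with reducers consuming input while mappers still run."""
    # One inbox per reducer; assumed stand-in for a remotely writable buffer.
    inboxes = [queue.Queue() for _ in range(num_reducers)]
    results = [Counter() for _ in range(num_reducers)]
    DONE = object()  # sentinel each mapper sends to every reducer when finished

    def mapper(chunk):
        for word in chunk.split():
            # Deposit the word with its owner without waiting for the owner
            # (loosely analogous to a one-sided put; this is an assumption).
            inboxes[hash(word) % num_reducers].put(word)
        for inbox in inboxes:
            inbox.put(DONE)

    def reducer(rank):
        finished_mappers = 0
        # No global barrier: stop only once every mapper has signalled DONE.
        while finished_mappers < len(chunks):
            item = inboxes[rank].get()
            if item is DONE:
                finished_mappers += 1
            else:
                results[rank][item] += 1

    threads = [threading.Thread(target=mapper, args=(c,)) for c in chunks]
    threads += [threading.Thread(target=reducer, args=(r,))
                for r in range(num_reducers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Merge the per-reducer partial counts into one result.
    total = Counter()
    for partial in results:
        total.update(partial)
    return total


if __name__ == "__main__":
    # Deliberately unbalanced input: the second chunk is far larger.
    counts = decoupled_word_count(["a b a", "c " * 1000 + "a", ""])
    print(counts.most_common(3))
```

In a bulk-synchronous design, no reducer would begin until the largest chunk had been fully mapped. Here the reducers process words from the small chunks immediately, which is the kind of overlap the decoupled strategy aims for when the workload per process is unbalanced.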