
62 research outputs found

Straggler mitigation in hadoop mapreduce framework: a review

Authors: Abu Bakar Kamalrulnizam; Ajibade Lukuman Saheed; Aliyu Ahmed
Publication venue: 'The Science and Information Organization'
Publication date: 01/01/2022

Processing huge and complex data to obtain useful information is challenging, even though several big data processing frameworks have been proposed and further enhanced. One of the prominent big data processing frameworks is MapReduce, whose main concept relies on distributed and parallel processing. However, the MapReduce framework suffers serious performance degradation due to the slow execution of certain tasks, called stragglers. Failing to handle stragglers delays the job and lengthens its overall execution time. Several straggler mitigation techniques have been proposed to improve MapReduce performance. This study provides a comprehensive, qualitative review of the existing straggler mitigation solutions. In addition, a taxonomy of the available straggler mitigation solutions is presented. Critical research issues and future research directions are identified and discussed to guide researchers and scholars.
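To illustrate why one straggler affects the whole job, here is a minimal sketch. It assumes a simplified model in which a job finishes only when its slowest task finishes. The function name and the task durations are hypothetical and are not taken from the reviewed paper.

```python
def job_completion_time(task_durations):
    """Assumed model: the job ends when its slowest task ends."""
    return max(task_durations)


# Hypothetical task durations in seconds (illustrative values only).
normal_tasks = [10, 11, 9, 10]
with_straggler = [10, 11, 9, 45]  # one task runs far slower than the rest

baseline = job_completion_time(normal_tasks)
delayed = job_completion_time(with_straggler)
print(baseline, delayed)
```

Under this model, three of the four tasks are unchanged, yet the job takes as long as the single slow task.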

Universiti Teknologi Malaysia Institutional Repository

Communication-Computation Efficient Gradient Coding

Authors: Abbe Emmanuel; Ye Min
Publication date: 01/01/2018

This paper develops coding techniques to reduce the running time of distributed learning tasks. It characterizes the fundamental tradeoff in computing gradients (and, more generally, vector summations) in terms of three parameters: computation load, straggler tolerance, and communication cost. It further gives an explicit coding scheme that achieves the optimal tradeoff based on recursive polynomial constructions, coding both across data subsets and vector components. As a result, the proposed scheme makes it possible to minimize the running time of gradient computations. The implementation runs on Amazon EC2 clusters using Python with the mpi4py package. Results show that the proposed scheme maintains the same generalization error while reducing the running time by 32% compared to uncoded schemes and by 23% compared to prior coded schemes that focus only on stragglers (Tandon et al., ICML 2017).
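As a rough illustration of straggler-tolerant gradient coding, the sketch below uses the well-known three-worker toy example associated with the prior straggler-only line of work, not the recursive polynomial scheme of this paper. Each worker computes two of three partial gradients and sends one linear combination, so the full gradient sum can be recovered from any two workers. The function names, coefficient tables, and gradient values are assumptions chosen for illustration. Coding across vector components, which this paper adds to cut communication cost, is not shown.

```python
from itertools import combinations


def combine(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i], computed component by component."""
    return [sum(c * v[j] for c, v in zip(coeffs, vectors))
            for j in range(len(vectors[0]))]


# Row i holds the weights worker i applies to (g1, g2, g3).
# A zero weight means the worker never computes that partial gradient.
ENCODE = [(0.5, 1.0, 0.0),   # worker 0 sends g1/2 + g2
          (0.0, 1.0, -1.0),  # worker 1 sends g2 - g3
          (0.5, 0.0, 1.0)]   # worker 2 sends g1/2 + g3

# Decoding weights for each pair of workers that responded in time.
DECODE = {(0, 1): (2.0, -1.0), (0, 2): (1.0, 1.0), (1, 2): (1.0, 2.0)}


def worker_messages(partial_gradients):
    """Encode the three partial gradients into one message per worker."""
    return [combine(row, partial_gradients) for row in ENCODE]


def recover_sum(messages, survivors):
    """Recover g1 + g2 + g3 from the messages of any two workers."""
    key = tuple(sorted(survivors))
    return combine(DECODE[key], [messages[i] for i in key])


# Hypothetical partial gradients over three data subsets.
grads = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]
msgs = worker_messages(grads)
for pair in combinations(range(3), 2):
    print(pair, recover_sum(msgs, pair))
```

Whichever single worker straggles, the remaining two messages yield the same sum. The price is that each worker processes two data subsets instead of one, which is the computation load side of the tradeoff the abstract describes.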

arXiv.org e-Print Archive

Princeton University Open Access Repository