
    Experience with benchmarking dependability and performance of MapReduce systems

    MapReduce provides a convenient means for distributed data processing and automatic parallel execution on clusters of machines. It has various applications and is used by several services featuring fault tolerance and scalability. Many studies have investigated the dependability and performance of MapReduce, covering topics from job scheduling to data placement and replication, and from adaptive and on-demand fault tolerance to new fault tolerance models. However, the ad hoc and overly simplified settings used to evaluate most MapReduce fault tolerance and performance improvement solutions pose significant challenges to the analysis and comparison of the effectiveness of these solutions. This paper addresses precisely this issue and presents MRBS, a comprehensive benchmark suite for evaluating the dependability and performance of MapReduce systems. MRBS includes five benchmarks covering several application domains and a wide range of execution scenarios, such as data-intensive vs. compute-intensive applications, or batch applications vs. online interactive applications. MRBS allows various workloads, dataloads and faultloads to be injected, and produces extensive reliability, availability and performance statistics. We implemented the MRBS benchmark suite for Hadoop MapReduce, and we illustrate its use with various case studies running on Amazon EC2 and on a private cloud.
