Local Aggregation with Modified B+ tree in Map Reduce Data Processing

Abstract

MapReduce is widely applied in high-performance computing for large-scale data processing. However, as clusters grow, handling the huge amount of intermediate data produced in the shuffle and reduce phases (the middle step of MapReduce) weighs heavily on performance. Local aggregation (using either combiners or in-mapper combining) reduces the amount of data that must be shuffled, which alleviates the reduce straggler problem. The proposed modified B+ tree based indexing algorithm is applied to reduce the amount of intermediate data, for fast output retrieval as well as scalable data storage.
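To illustrate the local aggregation idea mentioned in the abstract, here is a minimal sketch of in-mapper combining for a word count job, written as plain Python rather than against any real MapReduce framework. The function names (`naive_map`, `in_mapper_combining_map`, `shuffle_and_reduce`) and the sample input splits are assumptions made for illustration only; the sketch does not reproduce the paper's modified B+ tree index.

```python
from collections import Counter, defaultdict

def naive_map(split):
    """Emit one (word, 1) pair per token, with no local aggregation."""
    return [(word, 1) for line in split for word in line.split()]

def in_mapper_combining_map(split):
    """Aggregate counts in mapper-local memory, then emit one pair per distinct word."""
    local = Counter()
    for line in split:
        local.update(line.split())
    return list(local.items())

def shuffle_and_reduce(mapper_outputs):
    """Simulate shuffle plus a summing reducer over all mapper outputs."""
    totals = defaultdict(int)
    for pairs in mapper_outputs:
        for key, value in pairs:
            totals[key] += value
    return dict(totals)

# Two hypothetical input splits, one per mapper.
splits = [["a b a", "b a"], ["c a", "c c"]]

naive = [naive_map(s) for s in splits]
combined = [in_mapper_combining_map(s) for s in splits]

# Number of intermediate pairs that would cross the network in each variant.
naive_count = sum(len(pairs) for pairs in naive)
combined_count = sum(len(pairs) for pairs in combined)

print(naive_count, combined_count)
print(shuffle_and_reduce(combined))
```

Both variants yield the same reduced totals, but the in-mapper version emits one pair per distinct key per mapper instead of one pair per token, so less intermediate data reaches the shuffle phase.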
