Relaxing the problem-size bound for out-of-core columnsort

Abstract

Previous implementations of out-of-core columnsort limit the problem size to N ≤ � (M/P) 3 /2, where N is the number of records to sort, P is the number of processors, and M is the total number of records that the entire system can hold in its memory (so that M/P is the number of records that a single processor can hold in its memory). We implemented two variations to out-of-core columnsort that relax this restriction. Subblock columnsort is based on an algorithmic modification of the underlying columnsort algorithm, and it improves the problem-size bound to N ≤ (M/P) 5/3 /4 2/3 but at the cost of additional disk I/O. M-columnsort changes the notion of the column size in columnsort, improving the maximum problem size to N ≤ � M 3 /2 but at the cost of additional computation and communication. Experimental results on a Beowulf cluster show that both subblock columnsort and M-columnsort run well but that M-columnsort is faster. A further advantage of M-columnsort is that it handles a wider range of problem sizes than subblock columnsort. This research was supported in part by NSF Grant EIA-98-02068.

    Similar works