21 research outputs found

    Memory-efficient hash joins

    No full text

    A Top-Down Parallel Semisort

    No full text
    Semisorting is the problem of reordering an input array of keys such that equal keys are contiguous but different keys are not necessarily in sorted order. Semisorting is important for collecting equal values and is widely used in practice. For example, it is the core of the MapReduce paradigm, is a key component of the database join operation, and has many other applications. We describe a (randomized) parallel algorithm for the problem that is theoretically efficient (linear work and logarithmic depth), but is designed to be more practically efficient than previous algorithms. We use ideas from the parallel integer sorting algorithm of Rajasekaran and Reif, but instead of processing bits of integers in a reduced range in a bottom-up fashion, we process the hashed values of keys directly top-down. We implement the algorithm and experimentally show on a variety of input distributions that it outperforms a similarly-optimized radix sort on a modern 40-core machine with hyper-threading by about a factor of 1.7–1.9, and achieves a parallel speedup of up to 38x. We discuss the various optimizations used in our implementation and present an extensive experimental analysis of its performance.
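
    To make the definition of semisorting concrete, the following is a minimal sequential sketch in Python. It is not the parallel algorithm the abstract describes; it only illustrates the output contract (equal keys contiguous, no order among distinct keys) and the general idea of splitting on hashed keys first. The names semisort, is_semisorted, and num_buckets are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def semisort(records, key=lambda r: r, num_buckets=16):
    """Reorder records so that equal keys are contiguous.

    Sequential illustration only; the bucket count is an arbitrary
    assumption, not a value taken from the paper.
    """
    # Step 1: scatter records into buckets by the hash of their key.
    # Equal keys always hash to the same bucket.
    buckets = [[] for _ in range(num_buckets)]
    for r in records:
        buckets[hash(key(r)) % num_buckets].append(r)

    # Step 2: inside each bucket, group records that share a key.
    out = []
    for bucket in buckets:
        groups = defaultdict(list)
        for r in bucket:
            groups[key(r)].append(r)
        for group in groups.values():
            out.extend(group)
    return out


def is_semisorted(seq, key=lambda r: r):
    """Check that each distinct key occupies one contiguous run."""
    seen = set()
    prev = object()  # sentinel that equals no real key
    for r in seq:
        k = key(r)
        if k != prev:
            if k in seen:
                return False  # key reappeared after its run ended
            seen.add(k)
            prev = k
    return True


if __name__ == "__main__":
    result = semisort([3, 1, 3, 2, 1, 3])
    print(result, is_semisorted(result))
```

    The result contains the same records as the input, but the relative order of the distinct keys depends on the hash function, which is exactly the freedom semisorting allows over a full sort.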

    Patience is a virtue

    No full text

    Database Scan Variants on Modern CPUs: A Performance Study

    No full text