50 research outputs found

    Parallel String Sample Sort

    Get PDF
    We discuss how string sorting algorithms can be parallelized on modern multi-core shared memory machines. As a synthesis of the best sequential string sorting algorithms and successful parallel sorting algorithms for atomic objects, we propose string sample sort. The algorithm makes effective use of the memory hierarchy, uses additional word level parallelism, and largely avoids branch mispredictions. Additionally, we parallelize variants of multikey quicksort and radix sort that are also useful in certain situations.Comment: 34 pages, 7 figures and 12 table

    Engineering Parallel String Sorting

    Get PDF
    We discuss how string sorting algorithms can be parallelized on modern multi-core shared memory machines. As a synthesis of the best sequential string sorting algorithms and successful parallel sorting algorithms for atomic objects, we first propose string sample sort. The algorithm makes effective use of the memory hierarchy, uses additional word level parallelism, and largely avoids branch mispredictions. Then we focus on NUMA architectures, and develop parallel multiway LCP-merge and -mergesort to reduce the number of random memory accesses to remote nodes. Additionally, we parallelize variants of multikey quicksort and radix sort that are also useful in certain situations. Comprehensive experiments on five current multi-core platforms are then reported and discussed. The experiments show that our implementations scale very well on real-world inputs and modern machines.Comment: 46 pages, extension of "Parallel String Sample Sort" arXiv:1305.115

    Scalable String and Suffix Sorting: Algorithms, Techniques, and Tools

    Get PDF
    This dissertation focuses on two fundamental sorting problems: string sorting and suffix sorting. The first part considers parallel string sorting on shared-memory multi-core machines, the second part external memory suffix sorting using the induced sorting principle, and the third part distributed external memory suffix sorting with a new distributed algorithmic big data framework named Thrill.Comment: 396 pages, dissertation, Karlsruher Instituts f\"ur Technologie (2018). arXiv admin note: text overlap with arXiv:1101.3448 by other author

    Analytical Comparison of Grid File and K-d-b-tree Structures

    Get PDF
    Computing and Information Scienc

    Signature Files: An Integrated Access Method for Formatted and Unformatted Databases

    Get PDF
    The signature file approach is one of the most powerful information storage and retrieval techniques which is used for finding the data objects that are relevant to the user queries. The main idea of all signature based schemes is to reflect the essence of the data items into bit pattern (descriptors or signatures) and store them in a separate file which acts as a filter to eliminate the non aualifvine data items for an information reauest. It provides an integrated access method for both formattid and formatted databases. A complative overview and discussion of the proposed signatnre generation methods and the major signature file organization schemes are presented. Applications of the signature techniques to formatted and unformatted databases, single and multiterm query cases, serial and paratlei architecture. static and dynamic environments are provided with a special emphasis on the multimedia databases where the pioneering prototype systems using signatnres yield highly encouraging results

    Search Tree Data Structures and Their Applications

    Get PDF
    This study concerns the discussion of search tree data structures and their applications. The thesis presents three new top-down updating algorithms for the concurrent data processing environment.Computing and Information Scienc

    Parallel Multiway LCP-Mergesort

    Get PDF
    In this bachelor thesis, multiway LCP-Merge is introduced, parallelized and applied to create a fully parallel LCP-Mergesort, as well as NUMA optimized pS5. As an advancement of binary LCP-Mergesort, a multiway LCP-aware tournament tree is introduced and parallelized. For dynamic load balancing, one well-known and two new strategies for splitting merge work packages are utilized. Besides the introduction of fully parallel multiway LCP-Mergesort, further focus is put on NUMA architectures. Thus \u27parallel Super Scalar String Sample Sort\u27 (pS5) is adapted to the special properties of these systems by utilising the parallel LCP-Merge. Moreover this yields an efficient and generic approach for parallelizing arbitrary sequential string sorting algorithms and making parallel algorithms NUMA-aware. Several optimizations, important for practical implementations, as well as comprehensive experiments on two current NUMA platforms, are then reported and discussed. The experiments show the good scalability of the introduced algorithms and especially, the great improvements of NUMA-aware pS5 with real-world input sets on modern machines
    corecore