Search CORE

13 research outputs found

In search of the fastest sorting algorithm

Author: Attard Cassar Emmanuel
Connections
Publication venue: University of Malta. Junior College
Publication date: 01/01/2018
Field of study

This paper explores in a chronological way the concepts, structures, and algorithms that programmers and computer scientists have tried out in their attempt to produce and improve the process of sorting. The measure of ‘fastness’ in this paper is mainly given in terms of the Big O notation.peer-reviewe

OAR@UM

Parallel String Sample Sort

Author: J. Kärkkäinen
J. Wassenberg
K. Mehlhorn
P. Sanders
P.M. McIlroy
R. Sinha
R. Sinha
R. Sinha
T. Hagerup
W. Ng
Publication venue
Publication date: 01/01/2013
Field of study

arXiv.org e-Print Archive

CiteSeerX

Crossref

KITopen

Constructing Large-Scale Semantic Web Indices for the Six RDF Collation Orders

Author: Christopher Blochwitz
Dennis Heinrich
Sven Groppe
Thilo Pionteck
Publication venue: RonPub
Publication date: 01/01/2016
Field of study

The Semantic Web community collects masses of valuable and publicly available RDF data in order to drive the success story of the Semantic Web. Efficient processing of these datasets requires their indexing. Semantic Web indices make use of the simple data model of RDF: The basic concept of RDF is the triple, which hence has only 6 different collation orders. On the one hand having 6 collation orders indexed fast merge joins (consuming the sorted input of the indices) can be applied as much as possible during query processing. On the other hand constructing the indices for 6 different collation orders is very time-consuming for large-scale datasets. Hence the focus of this paper is the efficient Semantic Web index construction for large-scale datasets on today's multi-core computers. We complete our discussion with a comprehensive performance evaluation, where our approach efficiently constructs the indices of over 1 billion triples of real world data

RonPub -- Research Online Publishing

PatTrieSort - External String Sorting based on Patricia Tries

Author: Christopher Blochwitz
Dennis Heinrich
Stefan Werner
Sven Groppe
Thilo Pionteck
Publication venue: RonPub
Publication date: 01/01/2015
Field of study

External merge sort belongs to the most efficient and widely used algorithms to sort big data: As much data as fits inside is sorted in main memory and afterwards swapped to external storage as so called initial run. After sorting all the data in this way block-wise, the initial runs are merged in a merging phase in order to retrieve the final sorted run containing the completely sorted original data. Patricia tries are one of the most space-efficient ways to store strings especially those with common prefixes. Hence, we propose to use patricia tries for initial run generation in an external merge sort variant, such that initial runs can become large compared to traditional external merge sort using the same main memory size. Furthermore, we store the initial runs as patricia tries instead of lists of sorted strings. As we will show in this paper, patricia tries can be efficiently merged having a superior performance in comparison to merging runs of sorted strings. We complete our discussion with a complexity analysis as well as a comprehensive performance evaluation, where our new approach outperforms traditional external merge sort by a factor of 4 for sorting over 4 billion strings of real world data

RonPub -- Research Online Publishing

Engineering Parallel String Sorting

Author: Bingmann Timo
Eberle Andreas
Sanders Peter
Publication venue
Publication date: 09/03/2014
Field of study

We discuss how string sorting algorithms can be parallelized on modern multi-core shared memory machines. As a synthesis of the best sequential string sorting algorithms and successful parallel sorting algorithms for atomic objects, we first propose string sample sort. The algorithm makes effective use of the memory hierarchy, uses additional word level parallelism, and largely avoids branch mispredictions. Then we focus on NUMA architectures, and develop parallel multiway LCP-merge and -mergesort to reduce the number of random memory accesses to remote nodes. Additionally, we parallelize variants of multikey quicksort and radix sort that are also useful in certain situations. Comprehensive experiments on five current multi-core platforms are then reported and discussed. The experiments show that our implementations scale very well on real-world inputs and modern machines.Comment: 46 pages, extension of "Parallel String Sample Sort" arXiv:1305.115

arXiv.org e-Print Archive

CiteSeerX

Crossref

KITopen

Cache-conscious sorting of large sets of strings with dynamic tries

Author: Sinha R
Zobel J
Publication venue: Association for Computing Machinery (USA)
Publication date: 01/01/2004
Field of study

Ongoing changes in computer architecture are affecting the efficiency of string-sorting algorithms. The size of main memory in typical computers continues to grow but memory accesses require increasing numbers of instruction cycles, which is a problem for the most efficient of the existing string-sorting algorithms as they do not utilize cache well for large data sets. We propose a new sorting algorithm for strings, burstsort, based on dynamic construction of a compact trie in which strings are kept in buckets. It is simple, fast, and efficient. We experimentally explore key implementation options and compare burstsort to existing string-sorting algorithms on large and small sets of strings with a range of characteristics. These experiments show that, for large sets of strings, burstsort is almost twice as fast as any previous algorithm, primarily due to a lower rate of cache miss

RMIT Research Repository

Cache-conscious sorting of large sets of strings with dynamic tries

Author: Andersson A.
Arge L.
Bentley J.
Hawking D.
Heinz S.
Hoare C. A. R.
Johnson D. S.
Justin Zobel
LaMarca A.
McIlroy P. M.
Ranjan Sinha
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date
Field of study

Crossref

SIMD- and cache-friendly algorithm for sorting an array of structures

Author: Gedik B.
Knuth D. E.
Lacey S.
Publication venue: 'VLDB Endowment'
Publication date
Field of study

Crossref

Parallel Multiway LCP-Mergesort

Author: Eberle Andreas
Publication venue
Publication date: 13/08/2014
Field of study

In this bachelor thesis, multiway LCP-Merge is introduced, parallelized and applied to create a fully parallel LCP-Mergesort, as well as NUMA optimized pS5. As an advancement of binary LCP-Mergesort, a multiway LCP-aware tournament tree is introduced and parallelized. For dynamic load balancing, one well-known and two new strategies for splitting merge work packages are utilized. Besides the introduction of fully parallel multiway LCP-Mergesort, further focus is put on NUMA architectures. Thus \u27parallel Super Scalar String Sample Sort\u27 (pS5) is adapted to the special properties of these systems by utilising the parallel LCP-Merge. Moreover this yields an efficient and generic approach for parallelizing arbitrary sequential string sorting algorithms and making parallel algorithms NUMA-aware. Several optimizations, important for practical implementations, as well as comprehensive experiments on two current NUMA platforms, are then reported and discussed. The experiments show the good scalability of the introduced algorithms and especially, the great improvements of NUMA-aware pS5 with real-world input sets on modern machines

KITopen