1,338 research outputs found
Sorting Integers on the AP1000
Sorting is one of the classic problems of computer science. Whilst well
understood on sequential machines, the diversity of architectures amongst
parallel systems means that algorithms do not perform uniformly on all
platforms. This document describes the implementation of a radix based
algorithm for sorting positive integers on a Fujitsu AP1000 Supercomputer,
which was constructed as an entry in the Joint Symposium on Parallel Processing
(JSPP) 1994 Parallel Software Contest (PSC94). Brief consideration is also
given to a full radix sort conducted in parallel across the machine.Comment: 1994 Project Report, 23 page
An Elegant Algorithm for the Construction of Suffix Arrays
The suffix array is a data structure that finds numerous applications in
string processing problems for both linguistic texts and biological data. It
has been introduced as a memory efficient alternative for suffix trees. The
suffix array consists of the sorted suffixes of a string. There are several
linear time suffix array construction algorithms (SACAs) known in the
literature. However, one of the fastest algorithms in practice has a worst case
run time of . The problem of designing practically and theoretically
efficient techniques remains open. In this paper we present an elegant
algorithm for suffix array construction which takes linear time with high
probability; the probability is on the space of all possible inputs. Our
algorithm is one of the simplest of the known SACAs and it opens up a new
dimension of suffix array construction that has not been explored until now.
Our algorithm is easily parallelizable. We offer parallel implementations on
various parallel models of computing. We prove a lemma on the -mers of a
random string which might find independent applications. We also present
another algorithm that utilizes the above algorithm. This algorithm is called
RadixSA and has a worst case run time of . RadixSA introduces an
idea that may find independent applications as a speedup technique for other
SACAs. An empirical comparison of RadixSA with other algorithms on various
datasets reveals that our algorithm is one of the fastest algorithms to date.
The C++ source code is freely available at
http://www.engr.uconn.edu/~man09004/radixSA.zi
Faster Radix Sort via Virtual Memory and Write-Combining
Sorting algorithms are the deciding factor for the performance of common
operations such as removal of duplicates or database sort-merge joins. This
work focuses on 32-bit integer keys, optionally paired with a 32-bit value. We
present a fast radix sorting algorithm that builds upon a
microarchitecture-aware variant of counting sort. Taking advantage of virtual
memory and making use of write-combining yields a per-pass throughput
corresponding to at least 88 % of the system's peak memory bandwidth. Our
implementation outperforms Intel's recently published radix sort by a factor of
1.5. It also compares favorably to the reported performance of an algorithm for
Fermi GPUs when data-transfer overhead is included. These results indicate that
scalar, bandwidth-sensitive sorting algorithms remain competitive on current
architectures. Various other memory-intensive applications can benefit from the
techniques described herein
- …