Experimental Analysis of a Parallel Quicksort-Based Algorithm for Suffix Array Generation
Abstract
This paper presents experiments performed with an implementation of a quicksort-based parallel indexing algorithm. Besides the expected reduction in execution time, the experiments show that the word frequency distribution of the input text database has a strong influence on performance. Communication and computational load balance is achieved by processing the same quantity of text on each processor. This works because texts are auto-similar, a property verified experimentally in this work. The experiments also show that the auto-similarity of the word frequency distribution implies that this distribution is independent of the text size. In terms of implementation, a priori knowledge of the word frequency distribution may reduce indexing time by eliminating certain parts of the algorithm.

Keywords: Parallel Processing, Information Retrieval, Index Generation, Auto-Similarity, Message Passing.

1 Introduction

Information retrieval is a research area of growing interest by..