A Practical Implementation of Compressed Suffix Arrays with Applications to Self-indexing

Publication date: 06/03/2020

Abstract: In this paper, we develop a simple and practical storage scheme for compressed suffix arrays (CSA). Our CSA can be constructed in linear time and needs $2nH_k + n + o(n)$ bits of space simultaneously for any $k \le c \log_\sigma n - 1$ and any constant $c < 1$, where $H_k$ denotes the $k$-th order entropy. We compare the performance of our method with two established compressed indexing methods, viz. the FM-index and the Sad-CSA. Experiments on the Canterbury Corpus and the Pizza&Chili Corpus show significant advantages of our algorithm over the other two indices in terms of compression and query time. Our storage scheme achieves better performance on all types of data present in these two corpora, except for evenly distributed data, such as DNA. The source code for our CSA is available online.

CiteSeerX

A Practical Implementation of Compressed Suffix Arrays with Applications to Self-Indexing

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref