research

Online Construction of Wavelet Trees

Abstract

The wavelet tree (WT) is a flexible and efficient data structure for representing character strings in succinct space, while allowing for fast generalised rank, select and access operations. As such, they play an important role in modern text indexing methods. However, despite their popularity, not many algorithms have been published concerning their construction. In particular, while the WT is capable of representing a sequence of length n over an alphabet of size m in nlg m+o(nlg m) bits, much more space is typically used for its construction. Here we propose an O(nlg m)-time online method for the construction of the WT, requiring no prior knowledge about the input alphabet. The proposed algorithm is conceptually simpler than other state-of-the-art methods, while having comparable time performance and being more space-efficient in practice, since it performs just one pass over the input text and uses little extra space other than for the structure itself, as shown both theoretically and empirically

    Similar works