
    Fully-Online Suffix Tree and Directed Acyclic Word Graph Construction for Multiple Texts

    We consider construction of the suffix tree and the directed acyclic word graph (DAWG) indexing data structures for a collection $\mathcal{T}$ of texts, where a new symbol may be appended to any text in $\mathcal{T} = \{T_1, \ldots, T_K\}$ at any time. This fully-online scenario, which arises in dynamically indexing multi-sensor data, is a natural generalization of the long-solved semi-online text indexing problem, where texts $T_1, \ldots, T_k$ are permanently fixed before the next text $T_{k+1}$ is processed, for each $1 \leq k < K$. We present fully-online algorithms that construct the suffix tree and the DAWG for $\mathcal{T}$ in $O(N \log \sigma)$ time and $O(N)$ space, where $N$ is the total length of the strings in $\mathcal{T}$ and $\sigma$ is their alphabet size. The standard explicit representation of the suffix tree leaf edges and some DAWG edges must be relaxed in our fully-online scenario, since too many updates on these edges are required in the worst case. Instead, we provide access to the updated suffix tree leaf edge labels and the DAWG edges to be redirected via auxiliary data structures, in $O(\log \sigma)$ time per added character.
    Comment: 28 pages, 6 figures, LaTe
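To make the fully-online update model concrete, the following is a minimal sketch of a naive baseline, not the algorithm from the paper. It keeps an uncompressed suffix trie (nested dictionaries) over all texts and lets a symbol be appended to any text at any time. The class name `NaiveFullyOnlineIndex` and its methods are hypothetical, chosen only for illustration. Each append touches every suffix of the affected text, so the total cost can be quadratic in $N$; this is the kind of per-update work that the $O(N \log \sigma)$ construction described above is designed to avoid.

```python
class NaiveFullyOnlineIndex:
    """Naive suffix-trie index over K texts (illustrative baseline only)."""

    def __init__(self, num_texts):
        # Trie node = dict mapping a symbol to a child node.
        self.root = {}
        # For each text, the trie nodes spelling each of its current
        # suffixes; the root stands for the empty suffix.
        self.active = [[self.root] for _ in range(num_texts)]

    def append(self, text_id, symbol):
        """Append one symbol to text `text_id` (any text, at any time)."""
        # Every suffix of the text grows by `symbol`.
        extended = [node.setdefault(symbol, {}) for node in self.active[text_id]]
        # The empty suffix is always a suffix of the updated text.
        extended.append(self.root)
        self.active[text_id] = extended

    def contains(self, pattern):
        """Return True if `pattern` occurs in at least one text."""
        node = self.root
        for symbol in pattern:
            node = node.get(symbol)
            if node is None:
                return False
        return True


# Interleaved growth of two texts: T_1 becomes "ab", T_2 becomes "ba".
index = NaiveFullyOnlineIndex(2)
index.append(0, "a")
index.append(1, "b")
index.append(0, "b")
index.append(1, "a")
```

After these interleaved appends, `index.contains("ab")` and `index.contains("ba")` both hold, while `index.contains("aa")` does not, since no single text contains that substring.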