2 research outputs found
Linear-time Computation of DAWGs, Symmetric Indexing Structures, and MAWs for Integer Alphabets
The directed acyclic word graph (DAWG) of a string of length is the
smallest (partial) DFA which recognizes all suffixes of with only
nodes and edges. In this paper, we show how to construct the DAWG for the input
string from the suffix tree for , in time for integer alphabets
of polynomial size in . In so doing, we first describe a folklore algorithm
which, given the suffix tree for , constructs the DAWG for the reversed
string of in time. Then, we present our algorithm that builds the
DAWG for in time for integer alphabets, from the suffix tree for
. We also show that a straightforward modification to our DAWG construction
algorithm leads to the first -time algorithm for constructing the affix
tree of a given string over an integer alphabet. Affix trees are a text
indexing structure supporting bidirectional pattern searches. We then discuss
how our constructions can lead to linear-time algorithms for building other
text indexing structures, such as linear-size suffix tries and symmetric CDAWGs
in linear time in the case of integer alphabets. As a further application to
our -time DAWG construction algorithm, we show that the set
of all minimal absent words (MAWs) of can be computed in
optimal, input- and output-sensitive time and
working space for integer alphabets.Comment: This is an extended version of the paper "Computing DAWGs and Minimal
Absent Words in Linear Time for Integer Alphabets" from MFCS 201
Computing DAWGs and Minimal Absent Words in Linear Time for Integer Alphabets
The directed acyclic word graph (DAWG) of a string y is the smallest (partial) DFA which recognizes all suffixes of y and has only O(n) nodes and edges. We present the first O(n)-time algorithm for computing the DAWG of a given string y of length n over an integer alphabet of polynomial size in n. We also show that a straightforward modification to our DAWG construction algorithm leads to the first O(n)-time algorithm for constructing the affix tree of a given string y over an integer alphabet. Affix trees are a text indexing structure supporting bidirectional pattern searches. As an application to our O(n)-time DAWG construction algorithm, we show that the set MAW(y) of all minimal absent words of y can be computed in optimal O(n + |MAW(y)|) time and O(n) working space for integer alphabets