1,548 research outputs found
Optimal Parallel Construction of Minimal Suffix and Factor Automata
This paper gives optimal parallel algorithms for the construction of the smallest deterministic finite automata recognizing all the suffixes and the factors of a string. The algorithms use recently discovered optimal parallel suffix tree construction algorithms together with data structures for the efficient manipulation of trees, exploiting the well known relation between suffix and factor automata and suffix trees
Can Nondeterminism Help Complementation?
Complementation and determinization are two fundamental notions in automata
theory. The close relationship between the two has been well observed in the
literature. In the case of nondeterministic finite automata on finite words
(NFA), complementation and determinization have the same state complexity,
namely Theta(2^n) where n is the state size. The same similarity between
determinization and complementation was found for Buchi automata, where both
operations were shown to have 2^\Theta(n lg n) state complexity. An intriguing
question is whether there exists a type of omega-automata whose determinization
is considerably harder than its complementation. In this paper, we show that
for all common types of omega-automata, the determinization problem has the
same state complexity as the corresponding complementation problem at the
granularity of 2^\Theta(.).Comment: In Proceedings GandALF 2012, arXiv:1210.202
An Algorithm to Compute the Character Access Count Distribution for Pattern Matching Algorithms
We propose a framework for the exact probabilistic
analysis of window-based pattern matching algorithms, such as
Boyer--Moore, Horspool, Backward DAWG Matching, Backward Oracle
Matching, and more. In particular, we develop an algorithm that
efficiently computes the distribution of a pattern matching
algorithm's running time cost (such as the number of text character
accesses) for any given pattern in a random text model. Text models
range from simple uniform models to higher-order Markov models or
hidden Markov models (HMMs). Furthermore, we provide an algorithm to
compute the exact distribution of \emph{differences} in running time
cost of two pattern matching algorithms. Methodologically, we use
extensions of finite automata which we call \emph{deterministic
arithmetic automata} (DAAs) and \emph{probabilistic arithmetic
automata} (PAAs)~\cite{Marschall2008}. Given an algorithm, a
pattern, and a text model, a PAA is constructed from which the
sought distributions can be derived using dynamic programming. To
our knowledge, this is the first time that substring- or
suffix-based pattern matching algorithms are analyzed exactly by
computing the whole distribution of running time cost.
Experimentally, we compare Horspool's algorithm, Backward DAWG
Matching, and Backward Oracle Matching on prototypical patterns of
short length and provide statistics on the size of minimal DAAs for
these computations
- …