10 research outputs found

    Weak factor automata : the failure of failure factor oracles?

    Get PDF
    In indexing of, and pattern matching on, DNA and text sequences, it is often important to represent all factors of a sequence. One e cient, compact representation is the factor oracle (FO). At the same time, any classical deterministic nite automaton (DFA) can be transformed to a so-called failure one (FDFA), which may use failure transitions to replace multiple symbol transitions, potentially yielding a more compact representation. We combine the two ideas and directly construct a failure factor oracle (FFO) from a given sequence, in contrast to ex post facto transformation to an FDFA. The algorithm is suitable for both short and long sequences. We empirically compared the resulting FFOs and FOs on number of transitions for many DNA sequences of lengths 4 - 512, showing gains of up to 10% in total number of transitions, with failure transitions also taking up less space than symbol transitions. The resulting FFOs can be used for indexing, as well as in a variant of the FO-using backward oracle matching algorithm. We discuss and classify this pattern matching algorithm in terms of the keyword pattern matching taxonomies of Watson, Cleophas and Zwaan. We also empirically compared the use of FOs and FFOs in such backward reading pattern matching algorithms, using both DNA and natural language (English) data sets. The results indicate that the decrease in pattern matching performance of an algorithm using an FFO instead of an FO may outweigh the gain in representation space by using an FFO instead of an FO.http://www.journals.co.za/ej/ejour_comp.htmlam201

    Failure deterministic finite automata

    No full text
    Inspired by failure functions found in classical pattern matching algorithms, a failure deterministic finite automaton (FDFA) is defined as a formalism to recognise a regular language. An algorithm, based on formal concept analysis, is proposed for deriving from a given deterministic finite automaton (DFA) a language-equivalent FDFA. The FDFA’s transition diagram has fewer arcs than that of the DFA. A small modification to the classical DFA’s algorithm for recognising language elements yields a corresponding algorithm for an FDFA

    Failure deterministic finite automata

    Get PDF
    Inspired by failure functions found in classical pattern matching algorithms, a failure deterministic finite automaton (FDFA) is defined as a formalism to recognise a regular language. An algorithm, based on formal concept analysis, is proposed for deriving from a given deterministic finite automaton (DFA) a language-equivalent FDFA. The FDFA’s transition diagram has fewer arcs than that of the DFA. A small modification to the classical DFA’s algorithm for recognising language elements yields a corresponding algorithm for an FDFA

    Failure Deterministic Finite Automata

    No full text
    Lettere En WysbegeerteSentrum vir Kennisdinamika & BesluitnemingPlease help us populate SUNScholar with the post print version of this article. It can be e-mailed to: [email protected]

    An assessment of algorithms for deriving failure deterministic finite automata

    Get PDF
    CITATION: Nxumalo, M., et al. 2017. An assessment of algorithms for deriving failure deterministic finite automata. South African Computer Journal, 29(1):43-68, doi:10.18489/sacj.v29i1.456.The original publication is available at http://sacj.cs.uct.ac.zaFailure deterministic finite automata (FDFAs) represent regular languages more compactly than deterministic finite automata (DFAs). Four algorithms that convert arbitrary DFAs to language-equivalent FDFAs are empirically investigated. Three are concrete variants of a previously published abstract algorithm, the DFA-Homomorphic Algorithm (DHA). The fourth builds a maximal spanning tree from the DFA to derive what it calls a delayed input DFA. A first suite of test data consists of DFAs that recognise randomised sets of finite length keywords. Since the classical Aho-Corasick algorithm builds an optimal FDFA from such a set (and only from such a set), it provides benchmark FDFAs against which the performance of the general algorithms can be compared. A second suite of test data consists of random DFAs generated by a specially designed algorithm that also builds language-equivalent FDFAs, some of which may have non-divergent cycles. These random FDFAs provide (not necessarily tight) lower bounds for assessing the effectiveness of the four general FDFA generating algorithms.http://sacj.cs.uct.ac.za/index.php/sacj/article/view/456Publisher's versio

    An assessment of algorithms for deriving failure deterministic finite automata

    No full text
    \u3cp\u3eFailure deterministic finite automata (FDFAs) represent regular languages more compactly than deterministic finite automata (DFAs). Four algorithms that convert arbitrary DFAs to language-equivalent FDFAs are empirically investigated. Three are concrete variants of a previously published abstract algorithm, the DFA-Homomorphic Algorithm (DHA). The fourth builds a maximal spanning tree from the DFA to derive what it calls a delayed input DFA. A first suite of test data consists of DFAs that recognise randomised sets of finite length keywords. Since the classical Aho-Corasick algorithm builds an optimal FDFA from such a set (and only from such a set), it provides benchmark FDFAs against which the performance of the general algorithms can be compared. A second suite of test data consists of random DFAs generated by a specially designed algorithm that also builds language-equivalent FDFAs, some of which may have non-divergent cycles. These random FDFAs provide (not necessarily tight) lower bounds for assessing the effectiveness of the four general FDFA generating algorithms.\u3c/p\u3

    An Assessment of Algorithms for Deriving Failure Deterministic Finite Automata

    No full text

    An assessment of selected algorithms for generating failure deterministic finite automata

    Get PDF
    A Failure Deterministic Finite Automaton (FDFA) o ers a deterministic and a compact representation of an automaton that is used by various algorithms to solve pattern matching problems e ciently. An abstract, concept lattice based algorithm called the DFA - Homomorphic Algorithm (DHA) was proposed to convert a deterministic nite automata (DFA) into an FDFA. The abstract DHA has several nondeterministic choices. The DHA is tuned into four decisive and specialized variants that may potentially remove the optimal possible number of symbol transitions from the DFA while adding failure transitions. The resulting specialized FDFA are: MaxIntent FDFA, MinExtent FDFA, MaxIntent-MaxExtent FDFA and MaxArcReduncdancy FDFA. Furthermore, two output based investigations are conducted whereby two speci c types of DFA-to-FDFA algorithms are compared with DHA variants. Firstly, the well-known Aho-Corasick algorithm, and its DFA is converted into DHA FDFA variants. Empirical and comparative results show that when heuristics for DHA variants are suitably chosen, the minimality attained by the Aho-Corasick algorithm in its output FDFAs can be closely approximated by DHA FDFAs. Secondly, testing DHA FDFAs in the general case whereby random DFAs and language equivalent FDFAs are carefully constructed. The random DFAs are converted into DHA FDFA types and the random FDFAs are compared with DHA FDFAs. A published non concept lattice based algorithm producing an FDFA called D2FA is also shown to perform well in all experiments. In the general context DHA performed well though not as good as the D2FA algorithm. As a by-product of general case FDFA tests, an algorithm for generating random FDFAs and a language equivalent DFAs was proposed.Dissertation (MSc)--University of Pretoria, 2016.tm2016Computer ScienceMScUnrestricte

    An Aho-Corasick based assessment of algorithms generating failure deterministic finite automata

    No full text
    The Aho-Corasick algorithm derives a failure deterministic finite automaton for finding matches of a finite set of keywords in a text. It has the minimum number of transitions needed for this task. The DFA-Homomorphic Algorithm (DHA) algorithm is more general, deriving from an arbitrary complete deterministic finite automaton a language-equivalent failure deterministic finite automaton. DHA takes formal concepts of a lattice as input. This lattice is built from a state/outtransition formal context that is derived from the complete deterministic finite automaton. In this paper, three general variants of the abstract DHA are benchmarked against the specialised Aho-Corasick algorithm. It is shown that when heuristics for these variants are suitably chosen, the minimality attained by the Aho-Corasick algorithm can be closely approximated. A published non-lattice-based algorithm is also shown to perform well in experiments

    An Aho-Corasick based assessment of algorithms generating failure deterministic finite automata

    No full text
    \u3cp\u3eThe Aho-Corasick algorithm derives a failure deterministic finite automaton for finding matches of a finite set of keywords in a text. It has the minimum number of transitions needed for this task. The DFA-Homomorphic Algorithm (DHA) algorithm is more general, deriving from an arbitrary complete deterministic finite automaton a language-equivalent failure deterministic finite automaton. DHA takes formal concepts of a lattice as input. This lattice is built from a state/outtransition formal context that is derived from the complete deterministic finite automaton. In this paper, three general variants of the abstract DHA are benchmarked against the specialised Aho-Corasick algorithm. It is shown that when heuristics for these variants are suitably chosen, the minimality attained by the Aho-Corasick algorithm can be closely approximated. A published non-lattice-based algorithm is also shown to perform well in experiments.\u3c/p\u3
    corecore