12,559 research outputs found

    Real-time Regular Expression Matching

    Full text link
    This paper is devoted to finite state automata, regular expression matching, pattern recognition, and the exponential blow-up problem, which is the growing complexity of automata exponentially depending on regular expression length. This paper presents a theoretical and hardware solution to the exponential blow-up problem for some complicated classes of regular languages, which caused severe limitations in Network Intrusion Detection Systems work. The article supports the solution with theorems on correctness and complexity.Comment: 17 pages, 11 figure

    Linear-time Minimization of Wheeler DFAs

    Get PDF
    Wheeler DFAs (WDFAs) are a sub-class of finite-state automata which is playing an important role in the emerging field of compressed data structures: as opposed to general automata, WDFAs can be stored in just log s + O(1) bits per edge, s being the alphabet's size, and support optimal-time pattern matching queries on the substring closure of the language they recognize. An important step to achieve further compression is minimization. When the input A is a general deterministic finite-state automaton (DFA), the state-of-the-art is represented by the classic Hopcroft's algorithm, which runs in O(vertical bar A vertical bar log vertical bar A vertical bar) time. This algorithm stands at the core of the only existing minimization algorithm for Wheeler DFAs, which inherits its complexity. In this work, we show that the minimum WDFA equivalent to a given input WDFA can be computed in linear O(vertical bar A vertical bar) time. When run on de Bruijn WDFAs built from real DNA datasets, an implementation of our algorithm reduces the number of nodes from 14% to 51% at a speed of more than 1 million nodes per second.Peer reviewe

    Simultaneous Finite Automata: An Efficient Data-Parallel Model for Regular Expression Matching

    Get PDF
    Automata play important roles in wide area of computing and the growth of multicores calls for their efficient parallel implementation. Though it is known in theory that we can perform the computation of a finite automaton in parallel by simulating transitions, its implementation has a large overhead due to the simulation. In this paper we propose a new automaton called simultaneous finite automaton (SFA) for efficient parallel computation of an automaton. The key idea is to extend an automaton so that it involves the simulation of transitions. Since an SFA itself has a good property of parallelism, we can develop easily a parallel implementation without overheads. We have implemented a regular expression matcher based on SFA, and it has achieved over 10-times speedups on an environment with dual hexa-core CPUs in a typical case.Comment: This paper has been accepted at the following conference: 2013 International Conference on Parallel Processing (ICPP- 2013), October 1-4, 2013 Ecole Normale Suprieure de Lyon, Lyon, Franc

    Deterministic Automata for Unordered Trees

    Get PDF
    Automata for unordered unranked trees are relevant for defining schemas and queries for data trees in Json or Xml format. While the existing notions are well-investigated concerning expressiveness, they all lack a proper notion of determinism, which makes it difficult to distinguish subclasses of automata for which problems such as inclusion, equivalence, and minimization can be solved efficiently. In this paper, we propose and investigate different notions of "horizontal determinism", starting from automata for unranked trees in which the horizontal evaluation is performed by finite state automata. We show that a restriction to confluent horizontal evaluation leads to polynomial-time emptiness and universality, but still suffers from coNP-completeness of the emptiness of binary intersections. Finally, efficient algorithms can be obtained by imposing an order of horizontal evaluation globally for all automata in the class. Depending on the choice of the order, we obtain different classes of automata, each of which has the same expressiveness as CMso.Comment: In Proceedings GandALF 2014, arXiv:1408.556

    Efficient Multistriding of Large Non-deterministic Finite State Automata for Deep Packet Inspection

    Get PDF
    Multistride automata speed up input matching because each multistriding transformation halves the size of the input string, leading to a potential 2x speedup. However, up to now little effort has been spent in optimizing the building process of multistride automata, with the result that current algorithms cannot be applied to real-life, large automata such as the ones used in commercial IDSs, because the time and the memory space needed to create the new automaton quickly becomes unfeasible. In this paper, new algorithms for efficient building of multistride NFAs for packet inspection are presented, explaining how these new techniques can outperform the previous algorithms in terms of required time and memory usag

    Efficient Online Timed Pattern Matching by Automata-Based Skipping

    Full text link
    The timed pattern matching problem is an actively studied topic because of its relevance in monitoring of real-time systems. There one is given a log ww and a specification A\mathcal{A} (given by a timed word and a timed automaton in this paper), and one wishes to return the set of intervals for which the log ww, when restricted to the interval, satisfies the specification A\mathcal{A}. In our previous work we presented an efficient timed pattern matching algorithm: it adopts a skipping mechanism inspired by the classic Boyer--Moore (BM) string matching algorithm. In this work we tackle the problem of online timed pattern matching, towards embedded applications where it is vital to process a vast amount of incoming data in a timely manner. Specifically, we start with the Franek-Jennings-Smyth (FJS) string matching algorithm---a recent variant of the BM algorithm---and extend it to timed pattern matching. Our experiments indicate the efficiency of our FJS-type algorithm in online and offline timed pattern matching

    Finite Countermodel Based Verification for Program Transformation (A Case Study)

    Get PDF
    Both automatic program verification and program transformation are based on program analysis. In the past decade a number of approaches using various automatic general-purpose program transformation techniques (partial deduction, specialization, supercompilation) for verification of unreachability properties of computing systems were introduced and demonstrated. On the other hand, the semantics based unfold-fold program transformation methods pose themselves diverse kinds of reachability tasks and try to solve them, aiming at improving the semantics tree of the program being transformed. That means some general-purpose verification methods may be used for strengthening program transformation techniques. This paper considers the question how finite countermodels for safety verification method might be used in Turchin's supercompilation method. We extract a number of supercompilation sub-algorithms trying to solve reachability problems and demonstrate use of an external countermodel finder for solving some of the problems.Comment: In Proceedings VPT 2015, arXiv:1512.0221
    corecore