
    Efficient Online Timed Pattern Matching by Automata-Based Skipping

    The timed pattern matching problem is an actively studied topic because of its relevance to the monitoring of real-time systems. One is given a log $w$ and a specification $\mathcal{A}$ (a timed word and a timed automaton, respectively, in this paper) and wishes to return the set of intervals for which the log $w$, when restricted to the interval, satisfies the specification $\mathcal{A}$. In our previous work we presented an efficient timed pattern matching algorithm that adopts a skipping mechanism inspired by the classic Boyer-Moore (BM) string matching algorithm. In this work we tackle the problem of online timed pattern matching, aiming at embedded applications where it is vital to process a vast amount of incoming data in a timely manner. Specifically, we start with the Franek-Jennings-Smyth (FJS) string matching algorithm, a recent variant of the BM algorithm, and extend it to timed pattern matching. Our experiments indicate the efficiency of our FJS-type algorithm in both online and offline timed pattern matching.

    MONAA: A Tool for Timed Pattern Matching with Automata-Based Acceleration

    We present monaa, a monitoring tool over a real-time property specified by either a timed automaton or a timed regular expression. It implements a timed pattern matching algorithm that combines 1) features suited for online monitoring, and 2) acceleration by automata-based skipping. Our experiments demonstrate monaa's performance advantage, especially in online usage. Comment: Published in: 2018 IEEE Workshop on Monitoring and Testing of Cyber-Physical Systems (MT-CPS)

    An Algorithm to Compute the Character Access Count Distribution for Pattern Matching Algorithms

    We propose a framework for the exact probabilistic analysis of window-based pattern matching algorithms, such as Boyer-Moore, Horspool, Backward DAWG Matching, Backward Oracle Matching, and more. In particular, we develop an algorithm that efficiently computes the distribution of a pattern matching algorithm's running time cost (such as the number of text character accesses) for any given pattern in a random text model. Text models range from simple uniform models to higher-order Markov models or hidden Markov models (HMMs). Furthermore, we provide an algorithm to compute the exact distribution of differences in running time cost of two pattern matching algorithms. Methodologically, we use extensions of finite automata which we call deterministic arithmetic automata (DAAs) and probabilistic arithmetic automata (PAAs) [Marschall2008]. Given an algorithm, a pattern, and a text model, a PAA is constructed from which the sought distributions can be derived using dynamic programming. To our knowledge, this is the first time that substring- or suffix-based pattern matching algorithms are analyzed exactly by computing the whole distribution of running time cost. Experimentally, we compare Horspool's algorithm, Backward DAWG Matching, and Backward Oracle Matching on prototypical patterns of short length and provide statistics on the size of minimal DAAs for these computations.

    Designing optimal- and fast-on-average pattern matching algorithms

    Given a pattern $w$ and a text $t$, the speed of a pattern matching algorithm over $t$ with regard to $w$ is the ratio of the length of $t$ to the number of text accesses performed to search for $w$ in $t$. We first propose a general method for computing the limit of the expected speed of pattern matching algorithms, with regard to $w$, over iid texts. Next, we show how to determine the greatest speed that can be achieved among a large class of algorithms, together with an algorithm achieving this speed. Since the complexity of this determination makes it impossible to deal with patterns of length greater than 4, we propose a polynomial heuristic. Finally, our approaches are compared with 9 pre-existing pattern matching algorithms from both a theoretical and a practical point of view, i.e. both in terms of limit expected speed on iid texts and in terms of observed average speed on real data. In all cases, the pre-existing algorithms are outperformed.
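The notion of speed defined in this abstract (text length divided by the number of text accesses) can be illustrated with a small sketch. The code below is not the method of the paper: it uses Horspool's algorithm as a stand-in for a window-based matcher, and the function names `horspool_with_access_count` and `speed` are hypothetical names chosen for this illustration. It assumes that re-reading the last window character for the shift lookup costs no additional access.

```python
def horspool_with_access_count(pattern, text):
    """Return (match_positions, text_accesses) for Horspool's algorithm."""
    m, n = len(pattern), len(text)
    if m == 0 or n < m:
        return [], 0
    # Shift table: distance from the rightmost occurrence of each character
    # in pattern[:-1] to the last pattern position.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    matches, accesses, pos = [], 0, 0
    while pos <= n - m:
        j = m - 1
        # Compare the window right to left, counting each text read.
        while j >= 0:
            accesses += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            matches.append(pos)
        # The last window character was already read above (assumption:
        # it is not counted a second time here).
        pos += shift.get(text[pos + m - 1], m)
    return matches, accesses


def speed(pattern, text):
    """Speed as defined in the abstract: len(text) / number of text accesses."""
    _, accesses = horspool_with_access_count(pattern, text)
    return len(text) / accesses if accesses else float("inf")


if __name__ == "__main__":
    print(horspool_with_access_count("abc", "xxxabcxxx"))
    print(speed("abc", "xxxabcxxx"))
```

A speed above 1 means that the algorithm skipped some text characters entirely, which is the behavior that window-based algorithms aim for.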

    A new problem in string searching

    We describe a substring search problem that arises in group presentation simplification processes. We suggest a two-level searching model: skip and match levels. We give two timestamp algorithms which skip searching parts of the text where there are no matches at all and prove their correctness. At the match level, we consider Harrison signature, Karp-Rabin fingerprint, Bloom filter and automata based matching algorithms and present experimental performance figures. Comment: To appear in Proceedings Fifth Annual International Symposium on Algorithms and Computation (ISAAC'94), Lecture Notes in Computer Science
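As a rough illustration of one of the match-level techniques named in this abstract, the following is a minimal sketch of Karp-Rabin fingerprint matching for ordinary strings. It is a textbook formulation, not the variant used in the paper; the function name `karp_rabin` and the parameters `base` and `mod` are assumptions made for this example.

```python
def karp_rabin(pattern, text, base=256, mod=1_000_000_007):
    """Return all start positions of pattern in text using rolling fingerprints."""
    m, n = len(pattern), len(text)
    if m == 0 or n < m:
        return []
    # Weight of the leading character in a window of length m.
    high = pow(base, m - 1, mod)
    hp = ht = 0
    for i in range(m):
        hp = (hp * base + ord(pattern[i])) % mod
        ht = (ht * base + ord(text[i])) % mod
    matches = []
    for pos in range(n - m + 1):
        # Equal fingerprints may be a collision, so verify the candidate.
        if hp == ht and text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos < n - m:
            # Roll the window: drop text[pos], append text[pos + m].
            ht = ((ht - ord(text[pos]) * high) * base + ord(text[pos + m])) % mod
    return matches


if __name__ == "__main__":
    print(karp_rabin("aba", "abababa"))
```

The explicit verification step keeps the result exact even when two different windows share a fingerprint; dropping it would turn the fingerprint into a probabilistic filter.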