Efficient Online Timed Pattern Matching by Automata-Based Skipping
The timed pattern matching problem is an actively studied topic because of
its relevance in monitoring of real-time systems. There one is given a log
and a specification (given by a timed word and a timed automaton, respectively,
in this paper), and one wishes to return the set of intervals for which the log,
when restricted to the interval, satisfies the specification.
In our previous work we presented an efficient timed pattern
matching algorithm: it adopts a skipping mechanism inspired by the classic
Boyer--Moore (BM) string matching algorithm. In this work we tackle the problem
of online timed pattern matching, towards embedded applications where it is
vital to process a vast amount of incoming data in a timely manner.
Specifically, we start with the Franek-Jennings-Smyth (FJS) string matching
algorithm---a recent variant of the BM algorithm---and extend it to timed
pattern matching. Our experiments indicate the efficiency of our FJS-type
algorithm in online and offline timed pattern matching.
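The skipping idea mentioned in this abstract can be illustrated on plain (untimed) strings. The sketch below is not the paper's timed algorithm: it shows only a Sunday-style shift, which, to the best of our knowledge, is the Boyer--Moore-type component that FJS combines with Knuth--Morris--Pratt. The function name `sunday_search` is hypothetical.

```python
def sunday_search(pattern: str, text: str) -> list[int]:
    """Return all start positions of pattern in text using a Sunday-style shift.

    Illustration only: untimed strings, and no KMP component as in full FJS.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # For each pattern character, the distance from its last occurrence
    # to one position past the end of the pattern.
    shift = {c: m - i for i, c in enumerate(pattern)}
    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        nxt = pos + m  # text character just after the current window
        if nxt >= n:
            break
        # Characters absent from the pattern allow skipping the whole window.
        pos += shift.get(text[nxt], m + 1)
    return matches


print(sunday_search("aba", "abababa"))
```

The window advances according to the character just past its right end, so large parts of the text may never be compared when that character does not occur in the pattern.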
MONAA: A Tool for Timed Pattern Matching with Automata-Based Acceleration
We present monaa, a monitoring tool over a real-time property specified by
either a timed automaton or a timed regular expression. It implements a timed
pattern matching algorithm that combines 1) features suited for online
monitoring, and 2) acceleration by automata-based skipping. Our experiments
demonstrate monaa's performance advantage, especially in online usage.
Comment: Published in: 2018 IEEE Workshop on Monitoring and Testing of
Cyber-Physical Systems (MT-CPS)
An Algorithm to Compute the Character Access Count Distribution for Pattern Matching Algorithms
We propose a framework for the exact probabilistic
analysis of window-based pattern matching algorithms, such as
Boyer--Moore, Horspool, Backward DAWG Matching, Backward Oracle
Matching, and more. In particular, we develop an algorithm that
efficiently computes the distribution of a pattern matching
algorithm's running time cost (such as the number of text character
accesses) for any given pattern in a random text model. Text models
range from simple uniform models to higher-order Markov models or
hidden Markov models (HMMs). Furthermore, we provide an algorithm to
compute the exact distribution of differences in running time
cost of two pattern matching algorithms. Methodologically, we use
extensions of finite automata which we call deterministic
arithmetic automata (DAAs) and probabilistic arithmetic
automata (PAAs) [Marschall2008]. Given an algorithm, a
pattern, and a text model, a PAA is constructed from which the
sought distributions can be derived using dynamic programming. To
our knowledge, this is the first time that substring- or
suffix-based pattern matching algorithms are analyzed exactly by
computing the whole distribution of running time cost.
Experimentally, we compare Horspool's algorithm, Backward DAWG
Matching, and Backward Oracle Matching on prototypical patterns of
short length and provide statistics on the size of minimal DAAs for
these computations.
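To make the notion of a character access count distribution concrete, the sketch below computes it by brute force for Horspool's algorithm under a uniform text model, enumerating every text of a given length. This is not the PAA-based method of the abstract, which avoids such enumeration; the function names and the choice to count only comparison accesses are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def horspool_accesses(pattern: str, text: str) -> int:
    """Count text character accesses made by Horspool's algorithm.

    Assumes a non-empty pattern. The shift lookup reuses the last window
    character, which has already been read, so it is not counted again.
    """
    m, n = len(pattern), len(text)
    # Bad-character shifts, computed from all but the last pattern character.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    accesses = 0
    pos = 0
    while pos <= n - m:
        j = m - 1
        while j >= 0:  # compare the window right to left
            accesses += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        pos += shift.get(text[pos + m - 1], m)
    return accesses


def access_distribution(pattern: str, alphabet: str, n: int) -> dict[int, Fraction]:
    """Exact distribution of the access count over uniform texts of length n."""
    counts = Counter(
        horspool_accesses(pattern, "".join(chars))
        for chars in product(alphabet, repeat=n)
    )
    total = len(alphabet) ** n
    return {cost: Fraction(k, total) for cost, k in sorted(counts.items())}


print(access_distribution("ab", "ab", 2))
```

The enumeration grows exponentially with the text length, which is the cost that an automaton-based dynamic program is meant to avoid.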
Designing optimal- and fast-on-average pattern matching algorithms
Given a pattern and a text, the speed of a pattern matching algorithm over
the text with regard to the pattern is the ratio of the length of the text to
the number of text accesses performed to search for the pattern in the text. We
first propose a general method for computing the limit of the expected speed of
pattern matching algorithms, with regard to the pattern, over iid texts. Next,
we show how to determine the greatest speed that can be achieved among a large
class of algorithms, together with an algorithm achieving this speed. Since the
complexity of this determination makes it impossible to deal with patterns of
length greater than 4, we propose a polynomial heuristic. Finally, our
approaches are compared with 9 pre-existing pattern matching algorithms from
both a theoretical and a practical point of view, i.e., both in terms of limit
expected speed on iid texts and in terms of observed average speed on real
data. In all cases, the pre-existing algorithms are outperformed.
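The definition of speed in this abstract can be written out as follows. The symbols are assumed notation, since the abstract's own symbols did not survive extraction: $w$ is the pattern, $t$ the text, $A$ the algorithm, $\operatorname{acc}_A(w, t)$ the number of text accesses, and $T_n$ a random iid text of length $n$. The second formula is one plausible reading of "limit of the expected speed".

```latex
\[
  \operatorname{speed}_A(w, t) \;=\; \frac{|t|}{\operatorname{acc}_A(w, t)},
  \qquad
  \operatorname{speed}^{\infty}_A(w) \;=\;
  \lim_{n \to \infty} \mathbb{E}\bigl[\operatorname{speed}_A(w, T_n)\bigr]
\]
```

For instance, an algorithm that reads 250 characters while searching a text of length 1000 has speed 1000 / 250 = 4 on that text.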
A new problem in string searching
We describe a substring search problem that arises in group presentation
simplification processes. We suggest a two-level searching model: skip and
match levels. We give two timestamp algorithms which skip searching parts of
the text where there are no matches at all and prove their correctness. At the
match level, we consider Harrison signature, Karp-Rabin fingerprint, Bloom
filter and automata based matching algorithms and present experimental
performance figures.
Comment: To appear in Proceedings Fifth Annual International Symposium on
Algorithms and Computation (ISAAC'94), Lecture Notes in Computer Science
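Of the match-level techniques listed in this abstract, Karp-Rabin fingerprinting is easy to sketch. The code below is a generic textbook-style version on plain strings, not the paper's implementation; the function name, base, and modulus are arbitrary choices made for illustration.

```python
def karp_rabin(pattern: str, text: str, base: int = 256, mod: int = 1_000_003) -> list[int]:
    """Return all start positions of pattern in text via rolling fingerprints."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)  # weight of the leftmost window character
    hp = ht = 0
    for i in range(m):
        hp = (hp * base + ord(pattern[i])) % mod
        ht = (ht * base + ord(text[i])) % mod
    matches = []
    for pos in range(n - m + 1):
        # Fingerprints can collide, so a hit is verified by direct comparison.
        if hp == ht and text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos < n - m:
            # Roll the window: drop text[pos], append text[pos + m].
            ht = ((ht - ord(text[pos]) * high) * base + ord(text[pos + m])) % mod
    return matches


print(karp_rabin("aba", "abababa"))
```

The fingerprint comparison rejects most windows in constant time; the explicit string comparison keeps the result exact even when two different windows share a fingerprint.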