Optimal-Hash Exact String Matching Algorithms
String matching is the problem of finding all the occurrences of a pattern in
a text. We propose improved versions of the fast family of string matching
algorithms based on hashing $q$-grams. The improvement consists of considering
minimal values of $q$ such that each $q$-gram of the pattern has a unique hash
value. The new algorithms are faster than the algorithms of the HASH family for
short patterns on large alphabets.
Comment: 14 pages
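The selection of a minimal value can be illustrated with a minimal sketch. It assumes one reading of the abstract: the q-gram starting at each position of the pattern must hash to a value that no other position produces. The function names `qgram_hash` and `minimal_q`, the shift-and-add hash, and the `table_bits` parameter are hypothetical choices made for illustration, not the authors' implementation.

```python
def qgram_hash(gram, table_bits=8):
    """Hypothetical shift-and-add hash of a q-gram, truncated to table_bits bits."""
    h = 0
    for ch in gram:
        h = (h << 1) + ord(ch)
    return h & ((1 << table_bits) - 1)


def minimal_q(pattern, table_bits=8):
    """Return the smallest q for which the q-grams at all positions of the
    pattern hash to pairwise distinct values, or None for an empty pattern."""
    for q in range(1, len(pattern) + 1):
        hashes = [qgram_hash(pattern[i:i + q], table_bits)
                  for i in range(len(pattern) - q + 1)]
        # All positions must map to different hash values.
        if len(set(hashes)) == len(hashes):
            return q
    return None


if __name__ == "__main__":
    for p in ("abc", "abab", "aaaa"):
        print(p, minimal_q(p))
```

With q equal to the pattern length there is only one q-gram, so the loop always terminates with an answer for a non-empty pattern.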
A new family and structure for Commentz-Walter-style multiple-keyword pattern matching algorithms
In this paper, I present a new family of Commentz-Walter-style multiple-keyword string pattern matching algorithms. The algorithms share a common algorithmic skeleton, which is significantly optimized compared to the original Commentz-Walter skeleton and subsequently derived improvements. The new skeleton is derived via correctness-preserving stepwise algorithmic improvements, in the Eindhoven style of programming.
Generalised Pattern Matching Revisited
In the problem of Generalised Pattern Matching (GPM)
[STOC'94, Muthukrishnan and Palem], we are given a text $T$ of length $n$ over
an alphabet $\Sigma_T$, a pattern $P$ of length $m$ over an alphabet
$\Sigma_P$, and a matching relationship $\subseteq \Sigma_T \times \Sigma_P$,
and must return all substrings of $T$ that match $P$ (reporting) or the number
of mismatches between each substring of $T$ of length $m$ and $P$ (counting).
In this work, we improve over all previously known algorithms for this problem
for various parameters describing the input instance:
* $\mathcal{D}$ being the maximum number of characters that match a fixed
character,
* $\mathcal{S}$ being the number of pairs of matching characters,
* $\mathcal{I}$ being the total number of disjoint intervals of characters
that match the $m$ characters of the pattern $P$.
At the heart of our new deterministic upper bounds for $\mathcal{D}$ and
$\mathcal{S}$ lies a faster construction of superimposed codes, which solves
an open problem posed in [FOCS'97, Indyk] and can be of independent interest.
To conclude, we demonstrate first lower bounds for GPM. We start by
showing that any deterministic or Monte Carlo algorithm for GPM must
use $\Omega(\mathcal{S})$ time, and then proceed to show higher lower bounds
for combinatorial algorithms. These bounds show that our algorithms are almost
optimal, unless a radically new approach is developed.
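To make the reporting and counting variants of the problem concrete, here is a naive quadratic-time sketch in Python. It only illustrates the problem definition and is not one of the algorithms from the paper. The function names and the representation of the matching relationship as a set of (text character, pattern character) pairs are assumptions made for this example.

```python
def gpm_counting(text, pattern, matches):
    """For every alignment, count the positions where the text character
    does not match the pattern character under the relation `matches`."""
    n, m = len(text), len(pattern)
    counts = []
    for i in range(n - m + 1):
        mismatches = sum(
            1 for j in range(m) if (text[i + j], pattern[j]) not in matches
        )
        counts.append(mismatches)
    return counts


def gpm_reporting(text, pattern, matches):
    """Return the starting positions of substrings with zero mismatches."""
    return [i for i, k in enumerate(gpm_counting(text, pattern, matches)) if k == 0]


if __name__ == "__main__":
    # Hypothetical relation: 'a' and 'c' match 'x', and 'b' matches 'y'.
    relation = {("a", "x"), ("c", "x"), ("b", "y")}
    print(gpm_counting("abcab", "xy", relation))
    print(gpm_reporting("abcab", "xy", relation))
```

This baseline takes time proportional to the product of the text and pattern lengths, which is the cost that parameterised algorithms for the problem aim to beat.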