Fast Multi-Dimensional Approximate Pattern Matching

Authors: Gonzalo Navarro, Ricardo Baeza-Yates
Publication date: 01/01/1998

We address the problem of approximate string matching in $d$ dimensions, that is, to find a pattern of size $m^d$ in a text of size $n^d$ with at most $k < m^d$ errors (substitutions, insertions and deletions along any dimension). We use a novel and very flexible error model, for which the only existing algorithm evaluates the similarity between two elements in two dimensions in $O(m^4)$ time. We extend the algorithm to $d$ dimensions, at $O(d!\,m^{2d})$ time and $O(d!\,m^{2d-1})$ space. We also give the first search algorithm for this model, which takes $O(d!\,m^d n^d)$ time and $O(d!\,m^d n^{d-1})$ space. We show how to reduce the space cost to $O(d!\,3^d m^{2d-1})$ with little time penalty. Finally, we present the first sublinear-time (on average) searching algorithm (i.e. not all text cells are inspected), which is $O(k n^d / m^{d-1})$ for $k < \left(m / (d(\log_\sigma m - \log_\sigma d))\right)^{d-1}$, where $\sigma$ is the alphabet size. After that error level the filter still remains ...
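For orientation, the one-dimensional case ($d = 1$) of the problem is classical approximate string matching. The sketch below is the standard column-wise dynamic programming search (usually attributed to Sellers), not the multi-dimensional algorithm or error model of the paper; the function name `approx_search` and its return convention are assumptions made for illustration. It runs in $O(mn)$ time, which agrees with the $O(d!\,m^d n^d)$ search bound at $d = 1$.

```python
def approx_search(pattern: str, text: str, k: int) -> list[int]:
    """Return the 0-based end positions in `text` where some substring
    matches `pattern` with at most k substitutions, insertions or deletions.

    Illustrative 1-D baseline only (hypothetical helper, not from the paper).
    """
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any substring
    # of the text ending at the current text position.
    col = list(range(m + 1))
    hits = []
    for j, c in enumerate(text):
        prev_diag = col[0]  # value of col[i-1] in the previous column
        # col[0] stays 0 so that a match may start at any text position.
        for i in range(1, m + 1):
            old = col[i]  # value of col[i] in the previous column
            cost = 0 if pattern[i - 1] == c else 1
            col[i] = min(
                prev_diag + cost,  # match or substitution
                old + 1,           # extra text character
                col[i - 1] + 1,    # missing pattern character
            )
            prev_diag = old
        if col[m] <= k:
            hits.append(j)
    return hits
```

For example, `approx_search("abc", "xxabdxx", 1)` reports end positions 3 and 4: the substring `ab` is one deletion away from the pattern and `abd` is one substitution away.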

CiteSeerX