
Error Tree: A Tree Structure for Hamming & Edit Distances & Wildcards Matching

Abstract

Error Tree is a novel tree structure aimed mainly at the approximate pattern matching problems under Hamming and edit distance, as well as the wildcard matching problem. The input is a text of length $n$ over a fixed alphabet of size $\Sigma$, a pattern $P$ of length $m$, and a threshold $k$. The output is all positions in the text that match $P$ with Hamming distance at most $k$, edit distance at most $k$, or wildcard matching. For Hamming distance and wildcard matching, the algorithm proposes a tree structure that needs $O\left(n \frac{\log_\Sigma^{k} n}{k!}\right)$ words and takes $O\left(\frac{m^k}{k!} + occ\right)$ query time ($O\left(m + \frac{\log_\Sigma^{k} n}{k!} + occ\right)$ in the average case) for any online/offline pattern, where $occ$ is the number of outputs. For edit distance, it proposes a tree structure that needs $O\left(2^{k} n \frac{\log_\Sigma^{k} n}{k!}\right)$ words and takes $O\left(\frac{m^k}{k!} + 3^{k} occ\right)$ query time ($O\left(m + \frac{\log_\Sigma^{k} n}{k!} + 3^{k} occ\right)$ in the average case) for any online/offline pattern.
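To make the input and output of the problems concrete, the following is a minimal brute-force sketch of what a query is expected to return. It is not the Error Tree and uses no index: it scans the text directly in $O(nm)$ time, so it serves only as a reference baseline. The function names, the `wildcard` parameter, and the convention of reporting start positions for Hamming/wildcard matching and end positions for edit distance are assumptions made for illustration; the edit-distance routine is the standard dynamic-programming approach to approximate matching, swapped in here in place of the paper's structure.

```python
def hamming_matches(text, pattern, k, wildcard=None):
    """Start positions where pattern matches text with at most k mismatches.

    A pattern character equal to `wildcard` (hypothetical parameter)
    matches any text character and never counts as a mismatch.
    """
    n, m = len(text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            p = pattern[j]
            if p != wildcard and p != text[i + j]:
                mismatches += 1
                if mismatches > k:
                    break  # too many errors at this alignment
        else:
            # Inner loop finished without exceeding k mismatches.
            hits.append(i)
    return hits


def edit_match_ends(text, pattern, k):
    """End positions j such that some substring of text ending at j
    is within edit distance k of pattern (column-wise DP)."""
    m = len(pattern)
    prev = list(range(m + 1))  # column for the empty text prefix
    ends = []
    for j, c in enumerate(text):
        cur = [0] * (m + 1)  # cur[0] = 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra text character
                         cur[i - 1] + 1)      # missing pattern character
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends
```

For example, `hamming_matches("abcabd", "abd", 1)` reports both alignments of the pattern, since `"abc"` differs from `"abd"` in one position. Each reported position corresponds to one unit of $occ$ in the bounds above.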
