A Hybrid Indexing Method for Approximate String Matching

By Gonzalo Navarro and Ricardo Baeza-Yates

Abstract

We present a new indexing method for the approximate string matching problem. The method is based on a suffix array combined with a partitioning of the pattern. We analyze the resulting algorithm and show that the average retrieval time is $O(n^{\lambda} \log n)$, for some $\lambda > 0$ that depends on the error fraction tolerated $\alpha$ and the alphabet size $\sigma$. It is shown that $\lambda < 1$ for approximately $\alpha < 1 - e/\sqrt{\sigma}$, where $e = 2.718\ldots$. The space required is four times the text size, which is quite moderate for this problem. We experimentally show that this index can outperform by far all the existing alternatives for indexed approximate searching. These are also the first experiments that compare the different existing schemes.
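
As a rough numerical illustration of the sublinearity condition stated in the abstract (λ < 1 for approximately α < 1 − e/√σ), the following sketch evaluates the bound for a few alphabet sizes. The function name `max_error_fraction` and the alphabet sizes tried are illustrative assumptions, not taken from the paper, and the bound itself is only approximate.

```python
import math


def max_error_fraction(sigma: int) -> float:
    """Approximate threshold 1 - e/sqrt(sigma) on the error fraction alpha.

    Below this value the abstract reports lambda < 1, i.e. sublinear
    average retrieval time. The name of this helper is hypothetical.
    """
    if sigma < 1:
        raise ValueError("alphabet size must be positive")
    return 1.0 - math.e / math.sqrt(sigma)


if __name__ == "__main__":
    # Alphabet sizes chosen for illustration only (assumption).
    for sigma in (4, 20, 26, 256):
        print(f"sigma={sigma:3d}  alpha < {max_error_fraction(sigma):.3f}")
```

Under this approximation the bound is negative for σ = 4, so the stated condition leaves no positive error fraction for such a small alphabet, while larger alphabets admit progressively larger values of α.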

Topics: Suffix tries, suffix trees, text searching allowing errors, text indexing, computational biology
Year: 2009
OAI identifier: oai:CiteSeerX.psu:10.1.1.134.9634
Provided by: CiteSeerX