
Inverted Files Versus Signature Files for Text Indexing

By Justin Zobel, Alistair Moffat and Kotagiri Ramamohanarao

Abstract

this paper. An interesting feature of compressed inverted lists is that the best compression is achieved for the longest lists, that is, the most frequent terms. In the limit (which, in the case of text indexing, is a term such as "the" that occurs in almost every record) at most one bit per record is required. There is thus no particular need to eliminate common terms from the index: the decision as to whether or not to use the inverted lists for these terms to evaluate a query can be made, as it should be, at query evaluation time.
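The one-bit-per-record limit can be illustrated with a small sketch. The excerpt above does not say which code is used, so the example assumes, purely for illustration, that each inverted list is stored as gaps between successive record numbers and that each gap is written with an Elias gamma code. The names gamma_bits and list_cost_bits are hypothetical helpers, not taken from the paper.

```python
def gamma_bits(n: int) -> int:
    """Length in bits of the Elias gamma code for a positive integer n."""
    if n < 1:
        raise ValueError("gamma codes are defined for positive integers")
    # floor(log2 n) unary prefix bits, 1 stop bit, floor(log2 n) binary bits
    return 2 * (n.bit_length() - 1) + 1


def list_cost_bits(postings: list) -> int:
    """Total bits to gamma-code a sorted list of 1-based record numbers as gaps."""
    previous = 0
    total = 0
    for record in postings:
        total += gamma_bits(record - previous)  # code the gap, not the number
        previous = record
    return total


num_records = 1000  # assumed collection size, for illustration only
every_record = list(range(1, num_records + 1))      # a term such as "the"
every_tenth = list(range(10, num_records + 1, 10))  # a rarer term

common_bits = list_cost_bits(every_record)
rare_bits = list_cost_bits(every_tenth)

print("bits per pointer, common term:", common_bits / len(every_record))
print("bits per pointer, rarer term: ", rare_bits / len(every_tenth))
print("bits per record, common term: ", common_bits / num_records)
```

Under these assumptions, a term that occurs in every record has gaps of 1, each costing a single bit, so its list needs exactly one bit per record. The rarer term costs more bits per pointer even though its list is shorter, which matches the observation that the longest lists compress best.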

Topics: Text Processing, Index generation; General Terms: Algorithms, Performance; Additional Key Words and Phrases: indexing, text indexing, text databases, inverted files, signature files
Year: 1995
OAI identifier: oai:CiteSeerX.psu:10.1.1.18.6654
Provided by: CiteSeerX
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://www.cs.rmit.edu.au/~jz/... (external link)