Page Indexing for Textual Information Retrieval Systems
Abstract
150 p.
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1983.
A number of applications exist for systems that can store very large natural language textual databases and retrieve from them interactively. This thesis discusses conventional approaches to the design of such systems. The notion of page indexing is introduced as a new scheme for information retrieval from natural language full-text databases. The structure of a page-indexed database is described, and the algorithms needed to do retrieval using the page index are presented. Some characteristics of page-indexed text are analyzed and measured in order to estimate the size of the page index and to show how the size of the index is related to the page size. One of the advantages of the page indexing scheme is the ease with which such a system can be analyzed. This analysis is based on characteristics of the hardware used to implement the system and on characteristics of queries. Finally, three hypothetical systems are proposed and analyzed using the techniques and methodologies developed in this thesis. These systems range from a microprocessor for a database of 250 megabytes to a large computer system employing multiple special-purpose processors for a database of 50 gigabytes.
U of I Only
Restricted to the U of I community indefinitely during batch ingest of legacy ETD