thesis

Interaction harvesting for document retrieval

Abstract

Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. Includes bibliographical references (p. 81-83).

Despite advances in search technology, few software systems have been developed that accurately categorize multimedia files. The most successful systems for searching images, sounds, or movies rely on keyword annotation to provide meaningful search terms for non-text documents. Unfortunately, such systems usually require the author to enter the keywords manually, a task that is commonly neglected or executed poorly. This thesis proposes an approach to document categorization called Interaction Harvesting, wherein systems establish document relationships based on organizational and curatorial cues harvested from the mouse and click gestures of an online community. Specifically, the spatial and temporal proximity and placement of documents are taken as indicators of document similarity. We propose an expansion technique whereby such proximal documents exert weighted keyword influences on each other. We hypothesize that these approaches will form a document classification framework that relieves some of the difficulty of the annotation process while providing keyword-equivalent retrieval performance.

by Noah S. Fields. S.M.
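The expansion technique described in the abstract, in which proximal documents exert weighted keyword influences on each other, might be sketched roughly as follows. This is an illustration only, not the thesis's implementation: the function names `proximity_weight` and `expand_keywords`, the exponential distance decay, the `scale` parameter, and the sample documents are all assumptions, and only spatial proximity is modeled (temporal proximity is left out).

```python
from math import exp, hypot


def proximity_weight(p, q, scale):
    """Assumed influence model: weight decays exponentially with 2-D distance."""
    return exp(-hypot(p[0] - q[0], p[1] - q[1]) / scale)


def expand_keywords(keywords, positions, scale=100.0):
    """Spread each document's keyword weights onto every other document.

    keywords:  doc id -> {keyword: weight} (documents may be unannotated)
    positions: doc id -> (x, y) placement in a shared workspace (hypothetical)
    """
    expanded = {}
    for doc, pos in positions.items():
        # Start from the document's own annotations, if it has any.
        scores = dict(keywords.get(doc, {}))
        for other, other_pos in positions.items():
            if other == doc:
                continue
            w = proximity_weight(pos, other_pos, scale)
            # Nearby documents contribute more of their keyword weight.
            for kw, weight in keywords.get(other, {}).items():
                scores[kw] = scores.get(kw, 0.0) + w * weight
        expanded[doc] = scores
    return expanded


# Hypothetical data: photo_b has no keywords but sits next to photo_a.
positions = {"photo_a": (0, 0), "photo_b": (10, 0), "clip_c": (500, 0)}
keywords = {"photo_a": {"beach": 1.0}, "clip_c": {"concert": 1.0}}
expanded = expand_keywords(keywords, positions)
```

In this sketch the unannotated `photo_b` inherits "beach" strongly from its close neighbor and "concert" only faintly from the distant clip, which is the kind of behavior that could let unannotated documents become retrievable by keyword.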
