Harvest: A Scalable, Customizable Discovery and Access System

C. Mic Bowman; Darren R. Hardy; Duane P. Wessels; Michael F. Schwartz; Peter B. Danzig; Udi Manber

Publication date: 1 January 1995

Abstract

Rapid growth in data volume, user base, and data diversity renders Internet-accessible information increasingly difficult to use effectively. In this paper we introduce Harvest, a system that provides an integrated set of customizable tools for gathering information from diverse repositories, building topic-specific content indexes, flexibly searching the indexes, widely replicating them, and caching objects as they are retrieved across the Internet. The system interoperates with WWW clients and with HTTP, FTP, Gopher, and NetNews information resources. We discuss the design and implementation of Harvest and its subsystems, give examples of its uses, and provide measurements indicating that Harvest can significantly reduce server load, network traffic, and space requirements when building indexes, compared with previous systems. We also discuss several popular indexes we have built using Harvest, underscoring the customizability and scalability of the system.

Available Versions

CiteSeerX

oai:CiteSeerX.psu:10.1.1.48.89...

Last time updated on 22/10/2014