Finding related pages in the World Wide Web

By Jeffrey Dean and Monika R. Henzinger

Abstract

When using traditional search engines, users have to formulate queries to describe their information need. This paper discusses a different approach to Web searching where the input to the search process is not a set of query terms, but instead is the URL of a page, and the output is a set of related Web pages. A related Web page is one that addresses the same topic as the original page. For example, www.washingtonpost.com is a page related to www.nytimes.com, since both are online newspapers. We describe two algorithms to identify related Web pages. These algorithms use only the connectivity information in the Web (i.e., the links between pages) and not the content of pages or usage information. We have implemented both algorithms and measured their runtime performance. To evaluate the effectiveness of our algorithms, we performed a user study comparing our algorithms with Netscape’s ‘What’s Related’ service

Topics: Search engines, Related pages, Searching paradigms
Year: 1999
OAI identifier: oai:CiteSeerX.psu:10.1.1.372.4951
Provided by: CiteSeerX
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://www.ra.ethz.ch/CDstore/... (external link)

