Lexical signatures, which consist of a small number of words chosen to
represent the "aboutness" of a page, have previously been proposed for
discovering the new URI of a missing web page. However, prior methods relied
on computing the lexical signature before the page was lost, or on using
cached or archived versions of the page to calculate it. We demonstrate a
system for constructing a lexical signature for a page from its link
neighborhood, that is, its "backlinks", the pages that link to the missing
page. After testing
various methods, we show that one can construct a lexical signature for a
missing web page using only ten backlink pages. Further, we show that only the
first level of backlinks is useful in this effort. The text that the backlinks
use to point to the missing page is used as input for the creation of a
four-word lexical signature. That lexical signature is shown to successfully
find the target URI in over half of the test cases.

Comment: 24 pages, 13 figures, 8 tables, technical report
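
As a rough illustration of the signature-building step described above, the following minimal sketch derives a four-word lexical signature from the link text of at most ten backlinks. Ranking terms by the number of backlinks whose link text contains them is an assumption made here for illustration and is not necessarily the weighting used in the study; the function name `lexical_signature`, the short `STOPWORDS` list, and the sample link texts are likewise hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "on",
             "is", "this", "here", "click"}


def lexical_signature(anchor_texts, size=4, max_backlinks=10):
    """Return the `size` terms that appear in the most backlink link texts.

    Assumption: a term's weight is the number of backlinks (out of the first
    `max_backlinks`) whose link text contains it.
    """
    counts = Counter()
    for text in anchor_texts[:max_backlinks]:
        # Count each term once per backlink, ignoring stopwords.
        terms = set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS
        counts.update(terms)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:size]]


# Made-up link texts standing in for what backlink pages might contain.
anchors = [
    "Hypertext 2009 conference",
    "ACM Hypertext conference",
    "click here for the Hypertext 2009 program",
    "Hypertext conference in Torino",
]
signature = lexical_signature(anchors)
print(signature)  # ['hypertext', 'conference', '2009', 'acm']
```

The resulting terms would then be submitted as a search engine query, with the target URI counted as found if it appears among the returned results.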