    Topic Continuity for Web Document Categorization and Ranking

    PageRank relies primarily on link structure analysis. Recent work has shown that content information can be used to improve link analysis. We propose a novel algorithm that uses the information contained in a surfer's browsing history to determine the surfer's topic of interest on a given page. Because the history is unavailable until query time, we estimate it probabilistically so that the computation can be performed offline. This leads to better web page categorization and, in turn, to better ranking of web pages.
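
For orientation, the link-only baseline that the abstract starts from can be sketched as a plain PageRank power iteration. This is a minimal sketch of standard PageRank, not the topic-continuity algorithm the paper proposes, whose details are not given here. The function name `pagerank`, the damping factor 0.85, the uniform handling of pages without out-links, and the toy graph `toy_web` are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Plain PageRank by power iteration (illustrative baseline only).

    `links` maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Rank held by pages without out-links is spread uniformly (assumption).
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}

        # Each page passes an equal share of its rank along every out-link.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += damping * rank[p] / len(outs)

        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page web used only to exercise the function.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
```

In this baseline the surfer has no memory: the next page depends only on the current page's out-links. The abstract's proposal differs precisely in conditioning on the surfer's (probabilistically estimated) history, which this sketch does not attempt to model.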