Construction of Topic Directories Using Levenshtein Similarity Weight

Abstract

Topic directories are search engines that organize categories hierarchically. Mapping a new Web page to the appropriate category is one of the major challenges faced by human-maintained topic directories, owing to the rapid growth of the WWW and the large number of categories. Mapping new pages to categories by human experts is therefore an expensive process. Automating this process is needed, and it can be performed using standard similarity measures. In this chapter, we propose an algorithm called Mapping of Web Pages to Categories using Levenshtein Similarity Weight (the MPC-LSW algorithm), which maps Web pages to categories by comparing the similarity of the pages, i.e., a page-based comparison instead of the traditional term-based comparison. The time complexity of MPC-LSW is observed to be O(mk), as the terms are eliminated, and processing is faster because pages are compressed into strings. Hence, it is an efficient method of mapping.
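The page-based comparison described in the abstract can be illustrated with a minimal sketch. This is not the MPC-LSW algorithm itself: the abstract does not give the formula for the Levenshtein Similarity Weight or say how pages are compressed into strings. The sketch assumes each page and each category is already represented by a single string, and it assumes a normalized similarity of 1 minus the edit distance divided by the longer length. The names levenshtein, similarity, and map_page_to_category are hypothetical. The edit distance shown runs in time proportional to the product of the two string lengths; whether that is what m and k denote in the abstract is not stated.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed weight: 1 - distance / longer length (not from the source)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


def map_page_to_category(page: str, categories: dict[str, str]) -> str:
    """Return the name of the category whose string is most similar to the page."""
    return max(categories, key=lambda name: similarity(page, categories[name]))


# Illustrative data only; these strings are made up for the example.
categories = {
    "sports": "football cricket tennis",
    "science": "physics chemistry biology",
}
best = map_page_to_category("physics chemistry geology", categories)
```

Here one similarity score is computed per category and the highest-scoring category is chosen. A real system would also need a rule for ties and for pages that match no category well, which the abstract does not describe.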

Last time updated on 09/12/2021

This paper was published in ePrints@Bangalore University.
