Using BFA with WordNet Ontology Based Model for Web Retrieval

Abstract

In information retrieval, the dimensionality of document vectors plays an important role. To overcome the "curse of dimensionality", which makes indexing of high-dimensional data problematic, we may need to find a few words or concepts that characterize a document based on its contents. To this end, we earlier proposed WordNet-based and WordNet+LSI (Latent Semantic Indexing) based models for dimension reduction. While LSI works on the whole collection, another procedure for feature extraction (and thus dimension reduction) exists, using binary factorization. The procedure is based on the search for attractors in a Hopfield-like associative memory. True attractors (factors) are separated from spurious ones by calculating their Lyapunov function. Applied to textual data, the procedure performed well and, moreover, showed sensitivity to the context in which the words were used. In this paper, we suggest that binary factorization may benefit from WordNet filtration.
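As an illustration of the attractor search mentioned above, the following is a minimal sketch that assumes a classical Hopfield network with ±1 states and Hebbian weights. It is not the binary factorization procedure of this paper; the function names (`train_hebbian`, `energy`, `converge`) and the toy patterns are hypothetical. It shows only the general idea: a noisy input relaxes to a stored attractor, and the Lyapunov (energy) function does not increase along the way, so it can be used to compare candidate attractors.

```python
# Minimal sketch (assumption: classical Hopfield network with +/-1 states,
# not the paper's actual procedure). All names here are hypothetical.

def train_hebbian(patterns):
    """Build a symmetric weight matrix with zero diagonal (Hebbian rule)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w

def energy(w, s):
    """Lyapunov (energy) function E(s) = -1/2 * sum_ij w_ij * s_i * s_j."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def converge(w, s, max_sweeps=100):
    """Asynchronous updates until no unit changes, i.e. an attractor is reached."""
    s = list(s)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            new_state = 1 if field >= 0 else -1
            if new_state != s[i]:
                s[i] = new_state
                changed = True
        if not changed:
            break
    return s

# Two toy stored patterns (orthogonal to each other).
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train_hebbian([p1, p2])

# Start from p1 with its first component flipped and let the network relax.
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]
recalled = converge(w, noisy)
print(recalled == p1, energy(w, noisy), energy(w, recalled))
```

In this toy setting, stored patterns sit at low energy. A spurious attractor would typically show a higher energy, which is the kind of criterion the abstract refers to when it separates true factors from spurious ones.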
