Word Sense Disambiguation using Diffusion Kernel PCA
One of the major problems in natural language processing (NLP) is the word
sense disambiguation (WSD) problem. It is the task of computationally
identifying the right sense of a polysemous word based on its context.
Resolving the WSD problem boosts the accuracy of many NLP-focused algorithms
such as text classification and machine translation. In this paper, we
introduce a new supervised algorithm for WSD, called Diffusion Kernel PCA
(DKPCA), that is based on Kernel PCA and the Semantic Diffusion Kernel. Because
DKPCA captures the semantic similarities between terms and is built on PCA, it
performs feature extraction and dimension reduction guided by semantic
similarities within the algorithm itself. Our empirical results
on SensEval data demonstrate that DKPCA achieves higher or comparable accuracy
relative to SVM and KPCA with various well-known kernels when the proportion
of labeled data is small. Since labeled data are scarce while large quantities
of unlabeled textual data are easily accessible, these are highly encouraging
first results that motivate developing DKPCA further.
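
The combination described above, a semantic kernel fed into Kernel PCA, can be
illustrated with a minimal sketch. It is not the paper's implementation. It
assumes a bag-of-words document-term matrix X, a term co-occurrence matrix
G = X^T X, and an exponential diffusion form K = X exp(lam * G) X^T; the toy
data, the function names diffusion_kernel and kernel_pca, and the value of lam
are all hypothetical choices made for illustration.

```python
import numpy as np

def diffusion_kernel(X, lam=0.1):
    """Document Gram matrix K = X exp(lam * G) X^T with G = X^T X.

    Assumed form of a semantic diffusion kernel, for illustration only.
    """
    G = X.T @ X                       # term-by-term co-occurrence (symmetric)
    w, V = np.linalg.eigh(G)          # G = V diag(w) V^T
    S = (V * np.exp(lam * w)) @ V.T   # matrix exponential of lam * G
    return X @ S @ X.T

def kernel_pca(K, n_components=2):
    """Project the training points onto the top kernel principal components."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    Kc = J @ K @ J                        # kernel centered in feature space
    w, V = np.linalg.eigh(Kc)             # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:n_components]
    w, V = w[idx], V[:, idx]
    # Projection of training point i on component k is sqrt(w_k) * V[i, k].
    return V * np.sqrt(np.clip(w, 0.0, None))

# Toy document-term counts: documents 0-1 share terms, as do documents 2-3.
X = np.array([[2, 1, 0, 0],
              [1, 2, 0, 0],
              [0, 0, 1, 2],
              [0, 0, 2, 1]], dtype=float)

K = diffusion_kernel(X, lam=0.1)
Z = kernel_pca(K, n_components=2)
print(Z.shape)
```

In a supervised WSD setting, the rows of Z would serve as reduced features for
a downstream classifier; that step is omitted here.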