Search CORE

115,169 research outputs found

Word Sense Determination from Wikipedia Data Using Neural Networks

Author: Liu Qiao
Publication venue: SJSU ScholarWorks
Publication date: 01/10/2017
Field of study

Many words have multiple meanings. For example, “plant” can mean a type of living organism or a factory. Being able to determine the sense of such words is very useful in natural language processing tasks, such as speech synthesis, question answering, and machine translation. For the project described in this report, we used a modular model to classify the sense of words to be disambiguated. This model consisted of two parts: The first part was a neural-network-based language model to compute continuous vector representations of words from data sets created from Wikipedia pages. The second part classified the meaning of the given word without explicitly knowing what the meaning is. In this unsupervised word sense determination task, we did not need human-tagged training data or a dictionary of senses for each word. We tested the model with some naturally ambiguous words, and compared our experimental results with the related work by Schütze in 1998. Our model achieved similar accuracy as Schütze’s work for some words

SJSU ScholarWorks