2 research outputs found

    A Word Sense Disambiguation Model for Amharic Words using Semi-Supervised Learning Paradigm

    The main objective of this research was to design a word sense disambiguation (WSD) prototype model for Amharic words using a semi-supervised learning method to extract training sets, which minimizes the amount of required human intervention and can produce considerable improvement in learning accuracy. Due to the unavailability of an Amharic WordNet, only five words were selected: atena (አጠና), derese (ደረሰ), tenesa (ተነሳ), bela (በላ) and ale (አለ). Separate data sets for the five ambiguous words were prepared for the development of this Amharic WSD prototype. The final classification task was performed on the fully labelled training set using the AdaBoost, bagging, and ADTree classification algorithms in the WEKA package. Keywords: Ambiguity, Bootstrapping, Word Sense Disambiguation

    Semi-supervised Word Sense Disambiguation Based on Weakly Controlled Sense Induction

    Abstract: Word Sense Disambiguation in text is still a difficult problem, as the best supervised methods require laborious and costly manual preparation of training data. On the other hand, unsupervised methods achieve significantly lower accuracy and produce results that are not satisfactory for many applications. The goal of this work is to develop a model of Word Sense Disambiguation which minimises the amount of required human intervention, but still assigns senses that come from a manually created lexical semantics resource, i.e., a wordnet. The proposed method is based on clustering text snippets that include the words in focus. Next, for each cluster a core is identified; the core is labelled with a word sense by a human and is finally used to produce a classifier. Classifiers, constructed for each word separately, are applied to text. A comparison showed that the approach is close in precision to a fully supervised one tested on the same data for Polish, and is much better than a most-frequent-sense baseline. Possible ways of overcoming the limited coverage of the approach are also discussed in the paper.