3,312 research outputs found
Boosting Applied to Word Sense Disambiguation
In this paper Schapire and Singer's AdaBoost.MH boosting algorithm is applied
to the Word Sense Disambiguation (WSD) problem. Initial experiments on a set of
15 selected polysemous words show that the boosting approach surpasses Naive
Bayes and Exemplar-based approaches, which represent state-of-the-art accuracy
on supervised WSD. In order to make boosting practical for a real learning
domain of thousands of words, several ways of accelerating the algorithm by
reducing the feature space are studied. The best variant, which we call
LazyBoosting, is tested on the largest sense-tagged corpus available containing
192,800 examples of the 191 most frequent and ambiguous English words. Again,
boosting compares favourably to the other benchmark algorithms.Comment: 12 page
Dependency relations as source context in phrase-based SMT
The Phrase-Based Statistical Machine Translation (PB-SMT) model has recently begun to include source context modeling, under the assumption that the proper lexical
choice of an ambiguous word can be determined from the context in which it appears. Various types of lexical and syntactic features such as words, parts-of-speech, and
supertags have been explored as effective source context in SMT. In this paper, we show that position-independent syntactic dependency relations of the head of a source phrase can be modeled as useful source context to improve target phrase selection and thereby improve overall performance of PB-SMT. On a Dutch—English translation task, by combining dependency relations and syntactic contextual features (part-of-speech), we achieved a 1.0 BLEU (Papineni et al., 2002) point improvement (3.1% relative) over the baseline
- …