21,640 research outputs found
Word sense disambiguation and information retrieval
It has often been thought that word sense ambiguity is a cause of poor performance in Information Retrieval
(IR) systems. The belief is that if ambiguous words can be correctly disambiguated, IR performance will
increase. However, recent research into the application of a word sense disambiguator to an IR system failed
to show any performance increase. From these results it has become clear that more basic research is needed
to investigate the relationship between sense ambiguity, disambiguation, and IR.
Using a technique that introduces additional sense ambiguity into a collection, this paper presents research
that goes beyond previous work in this field to reveal the influence that ambiguity and disambiguation have
on a probabilistic IR system. We conclude that word sense ambiguity is only problematic to an IR system
when it is retrieving from very short queries. In addition we argue that if a word sense disambiguator is to
be of any use to an IR system, the disambiguator must be able to resolve word senses to a high degree of
accuracy
Boosting Applied to Word Sense Disambiguation
In this paper Schapire and Singer's AdaBoost.MH boosting algorithm is applied
to the Word Sense Disambiguation (WSD) problem. Initial experiments on a set of
15 selected polysemous words show that the boosting approach surpasses Naive
Bayes and Exemplar-based approaches, which represent state-of-the-art accuracy
on supervised WSD. In order to make boosting practical for a real learning
domain of thousands of words, several ways of accelerating the algorithm by
reducing the feature space are studied. The best variant, which we call
LazyBoosting, is tested on the largest sense-tagged corpus available containing
192,800 examples of the 191 most frequent and ambiguous English words. Again,
boosting compares favourably to the other benchmark algorithms.Comment: 12 page
Word sense disambiguation and information retrieval
It has often been thought that word sense ambiguity is a cause of poor performance in Information Retrieval
(IR) systems. The belief is that if ambiguous words can be correctly disambiguated, IR performance will
increase. However, recent research into the application of a word sense disambiguator to an IR system failed
to show any performance increase. From these results it has become clear that more basic research is needed
to investigate the relationship between sense ambiguity, disambiguation, and IR.
Using a technique that introduces additional sense ambiguity into a collection, this paper presents research
that goes beyond previous work in this field to reveal the influence that ambiguity and disambiguation have
on a probabilistic IR system. We conclude that word sense ambiguity is only problematic to an IR system
when it is retrieving from very short queries. In addition we argue that if a word sense disambiguator is to
be of any use to an IR system, the disambiguator must be able to resolve word senses to a high degree of
accuracy
Improving Japanese Zero Pronoun Resolution by Global Word Sense Disambiguation
This paper proposes unsupervised word sense disambiguation based on automatically constructed case frames and its incorporation into our zero pronoun resolution system. The word sense disambiguation is applied to verbs and nouns. We consider that case frames define verb senses and semantic features in a thesaurus define noun senses, respectively, and perform sense disambiguation by selecting them based on case analysis. In addition, according to the one sense per discourse heuristic, the word sense disambiguation results are cached and applied globally to the subsequent words. We integrated this global word sense disambiguation into our zero pronoun resolution system, and conducted experiments of zero pronoun resolution on two different domain corpora. Both of the experimental results indicated the effectiveness of our approach.
Unsupervised, Knowledge-Free, and Interpretable Word Sense Disambiguation
Interpretability of a predictive model is a powerful feature that gains the
trust of users in the correctness of the predictions. In word sense
disambiguation (WSD), knowledge-based systems tend to be much more
interpretable than knowledge-free counterparts as they rely on the wealth of
manually-encoded elements representing word senses, such as hypernyms, usage
examples, and images. We present a WSD system that bridges the gap between
these two so far disconnected groups of methods. Namely, our system, providing
access to several state-of-the-art WSD models, aims to be interpretable as a
knowledge-based system while it remains completely unsupervised and
knowledge-free. The presented tool features a Web interface for all-word
disambiguation of texts that makes the sense predictions human readable by
providing interpretable word sense inventories, sense representations, and
disambiguation results. We provide a public API, enabling seamless integration.Comment: In Proceedings of the the Conference on Empirical Methods on Natural
Language Processing (EMNLP 2017). 2017. Copenhagen, Denmark. Association for
Computational Linguistic
- …