Search CORE

1,594 research outputs found

Distinguishing Word Senses in Untagged Text

Author: Bruce Rebecca
Pedersen Ted
Publication venue
Publication date: 01/01/1997
Field of study

This paper describes an experimental comparison of three unsupervised learning algorithms that distinguish the sense of an ambiguous word in untagged text. The methods described in this paper, McQuitty's similarity analysis, Ward's minimum-variance method, and the EM algorithm, assign each instance of an ambiguous word to a known sense definition based solely on the values of automatically identifiable features in text. These methods and feature sets are found to be more successful in disambiguating nouns rather than adjectives or verbs. Overall, the most accurate of these procedures is McQuitty's similarity analysis in combination with a high dimensional feature set.Comment: 11 pages, latex, uses aclap.st

arXiv.org e-Print Archive

CiteSeerX

Training and Scaling Preference Functions for Disambiguation

Author: Alshawi Hiyan
Carter David
Publication venue
Publication date: 01/01/1994
Field of study

We present an automatic method for weighting the contributions of preference functions used in disambiguation. Initial scaling factors are derived as the solution to a least-squares minimization problem, and improvements are then made by hill-climbing. The method is applied to disambiguating sentences in the ATIS (Air Travel Information System) corpus, and the performance of the resulting scaling factors is compared with hand-tuned factors. We then focus on one class of preference function, those based on semantic lexical collocations. Experimental results are presented showing that such functions vary considerably in selecting correct analyses. In particular we define a function that performs significantly better than ones based on mutual information and likelihood ratios of lexical associations.Comment: To appear in Computational Linguistics (probably volume 20, December 94). LaTeX, 21 page

arXiv.org e-Print Archive

CiteSeerX

The interaction of knowledge sources in word sense disambiguation

Author: Brill Eric
Daelemans Walter
Daelemans Walter
Ide Nancy
Kilgarriff Adam
Marcus Mitchell
Mark Stevenson
Masterman Margaret
McRoy Susan
Yorick Wilks
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2001
Field of study

Word sense disambiguation (WSD) is a computational linguistics task likely to benefit from the tradition of combining different knowledge sources in artificial in telligence research. An important step in the exploration of this hypothesis is to determine which linguistic knowledge sources are most useful and whether their combination leads to improved results. We present a sense tagger which uses several knowledge sources. Tested accuracy exceeds 94% on our evaluation corpus.Our system attempts to disambiguate all content words in running text rather than limiting itself to treating a restricted vocabulary of words. It is argued that this approach is more likely to assist the creation of practical systems

CiteSeerX

Crossref

White Rose Research Online

One Homonym per Translation

Author: Hauer Bradley
Kondrak Grzegorz
Publication venue
Publication date: 23/01/2020
Field of study

The study of homonymy is vital to resolving fundamental problems in lexical semantics. In this paper, we propose four hypotheses that characterize the unique behavior of homonyms in the context of translations, discourses, collocations, and sense clusters. We present a new annotated homonym resource that allows us to test our hypotheses on existing WSD resources. The results of the experiments provide strong empirical evidence for the hypotheses. This study represents a step towards a computational method for distinguishing between homonymy and polysemy, and constructing a definitive inventory of coarse-grained senses.Comment: 8 pages, including reference

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Selective Sampling for Example-based Word Sense Disambiguation

Author: Fujii Atsushi
Inui Kentaro
Tanaka Hozumi
Tokunaga Takenobu
Publication venue
Publication date: 01/01/1998
Field of study

This paper proposes an efficient example sampling method for example-based word sense disambiguation systems. To construct a database of practical size, a considerable overhead for manual sense disambiguation (overhead for supervision) is required. In addition, the time complexity of searching a large-sized database poses a considerable problem (overhead for search). To counter these problems, our method selectively samples a smaller-sized effective subset from a given example set for use in word sense disambiguation. Our method is characterized by the reliance on the notion of training utility: the degree to which each example is informative for future example sampling when used for the training of the system. The system progressively collects examples by selecting those with greatest utility. The paper reports the effectiveness of our method through experiments on about one thousand sentences. Compared to experiments with other example sampling methods, our method reduced both the overhead for supervision and the overhead for search, without the degeneration of the performance of the system.Comment: 25 pages, 14 Postscript figure

arXiv.org e-Print Archive

CiteSeerX