Co-occurrence Vectors from Corpora vs. Distance Vectors from Dictionaries

Author: Nitta Yoshihiko
Niwa Yoshiki
Publication venue: Proc. of COLING94 (pp. 304-309)
Publication date: 01/01/1994

A comparison was made of vectors derived by using ordinary co-occurrence statistics from large text corpora and of vectors derived by measuring the inter-word distances in dictionary definitions. The precision of word sense disambiguation by using co-occurrence vectors from the 1987 Wall Street Journal (20M total words) was higher than that by using distance vectors from the Collins English Dictionary (60K head words + 1.6M definition words). However, other experimental results suggest that distance vectors contain some different semantic information from co-occurrence vectors.

Comment: 6 pages, appeared in the Proc. of COLING94 (pp. 304-309)

arXiv.org e-Print Archive

CiteSeerX

Crossref

Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation

Author: C. Fellbaum
D. Yarowsky
Lucia Specia
M. Stevenson
Maria das Graças Volpe Nunes
Mark Stevenson
S. Muggleton
Y. Wilks
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/12/2010

Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD) despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD using inductive logic programming to learn theories from first-order logic representations that allows corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. It is important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources.

Crossref

White Rose Research Online