The interaction of knowledge sources in word sense disambiguation
Word sense disambiguation (WSD) is a computational linguistics task likely to benefit from the tradition of combining different knowledge sources in artificial intelligence research. An important step in the exploration of this hypothesis is to determine which linguistic knowledge sources are most useful and whether their combination leads to improved results.
We present a sense tagger which uses several knowledge sources. Tested accuracy exceeds 94% on our evaluation corpus. Our system attempts to disambiguate all content words in running text rather than limiting itself to a restricted vocabulary of words. It is argued that this approach is more likely to assist the creation of practical systems.
Learning to Resolve Natural Language Ambiguities: A Unified Approach
We analyze a few of the commonly used statistics-based and machine learning
algorithms for natural language disambiguation tasks and observe that they can
be re-cast as learning linear separators in the feature space. Each of the
methods makes a priori assumptions, which it employs, given the data, when
searching for its hypothesis. Nevertheless, as we show, it searches a space
that is as rich as the space of all linear separators. We use this to build an
argument for a data-driven approach which merely searches for a good linear
separator in the feature space, without further assumptions on the domain or a
specific problem.
We present such an approach - a sparse network of linear separators,
utilizing the Winnow learning algorithm - and show how to use it in a variety
of ambiguity resolution problems. The learning approach presented is
attribute-efficient and, therefore, appropriate for domains having a very large
number of attributes.
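As an illustration of the kind of learner named above, here is a minimal sketch of a single Winnow unit over sparse Boolean features. It assumes the textbook variant with multiplicative promotion and demotion; it is not the sparse network architecture of the paper, and the names `train_winnow`, `predict`, `alpha`, and `epochs` are hypothetical choices made for this example.

```python
from typing import List, Sequence, Set, Tuple

Example = Tuple[Set[int], int]  # (indices of active features, label in {0, 1})


def train_winnow(
    examples: Sequence[Example],
    n_features: int,
    alpha: float = 2.0,   # assumed promotion/demotion factor
    epochs: int = 10,     # assumed cap on passes over the data
) -> Tuple[List[float], float]:
    """Train one Winnow linear separator; returns (weights, threshold)."""
    weights = [1.0] * n_features      # all weights start at 1
    threshold = float(n_features)     # a common default threshold
    for _ in range(epochs):
        mistakes = 0
        for active, label in examples:
            score = sum(weights[i] for i in active)
            predicted = 1 if score >= threshold else 0
            if predicted == label:
                continue              # mistake-driven: no update when correct
            mistakes += 1
            # Promote active weights on a missed positive, demote on a false positive.
            factor = alpha if label == 1 else 1.0 / alpha
            for i in active:
                weights[i] *= factor
        if mistakes == 0:
            break                     # a full pass without mistakes
    return weights, threshold


def predict(weights: Sequence[float], threshold: float, active: Set[int]) -> int:
    """Classify an example given only its active feature indices."""
    return 1 if sum(weights[i] for i in active) >= threshold else 0


# Toy data (made up for illustration): the label is 1 exactly when feature 0 is active.
toy = [({0, 2}, 1), ({1, 3}, 0), ({0, 4}, 1), ({2, 5}, 0), ({0}, 1), ({1, 2, 3, 4, 5}, 0)]
weights, threshold = train_winnow(toy, n_features=6)
```

Each update touches only the weights of features active in the current example, so the cost of a step depends on the number of active features rather than on the total number of attributes.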
In particular, we present an extensive experimental comparison of our
approach with other methods on several well-studied lexical disambiguation
tasks such as context-sensitive spelling correction, prepositional phrase
attachment and part-of-speech tagging. In all cases we show that our approach
either outperforms other methods tried for these tasks or performs comparably
to the best
Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning
This paper describes an experimental comparison of seven different learning
algorithms on the problem of learning to disambiguate the meaning of a word
from context. The algorithms tested include statistical, neural-network,
decision-tree, rule-based, and case-based classification techniques. The
specific problem tested involves disambiguating six senses of the word "line"
using the words in the current and preceding sentence as context. The
statistical and neural-network methods perform the best on this particular
problem and we discuss a potential reason for this observed difference. We also
discuss the role of bias in machine learning and its importance in explaining
performance differences observed on specific problems. Comment: 10 page
Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation
Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD) despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD using inductive logic programming to learn theories from first-order logic representations that allows corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. It is important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources.