Cross-lingual Visual Verb Sense Disambiguation
Recent work has shown that visual context improves cross-lingual sense
disambiguation for nouns. We extend this line of work to the more challenging
task of cross-lingual verb sense disambiguation, introducing the MultiSense
dataset of 9,504 images annotated with English, German, and Spanish verbs. Each
image in MultiSense is annotated with an English verb and its translation in
German or Spanish. We show that cross-lingual verb sense disambiguation models
benefit from visual context, compared to unimodal baselines. We also show that
the verb sense predicted by our best disambiguation model can improve the
results of a text-only machine translation system when used for a multimodal
translation task.
Comment: NAACL 2019; fix typo in author nam
A new semantically annotated corpus with syntactic-semantic and cross-lingual senses
In this article, we describe a new sense-tagged corpus for Word Sense Disambiguation. The corpus consists of instances of 20 French polysemous verbs. Each verb instance is annotated with three sense labels: (1) the actual translation of the verb in the English version of this instance in a parallel corpus, (2) an entry of the verb in a computational dictionary of French (the Lexicon-Grammar tables), and (3) a fine-grained sense label resulting from the concatenation of the translation and the Lexicon-Grammar entry.
Using a Probabilistic Class-Based Lexicon for Lexical Ambiguity Resolution
This paper presents the use of probabilistic class-based lexica for
disambiguation in target-word selection. Our method employs minimal but precise
contextual information for disambiguation. That is, only information provided
by the target-verb, enriched by the condensed information of a probabilistic
class-based lexicon, is used. Induction of classes and fine-tuning to verbal
arguments is done in an unsupervised manner by EM-based clustering techniques.
The method shows promising results in an evaluation on real-world translations.
Comment: 7 pages, uses colacl.st
The interaction of knowledge sources in word sense disambiguation
Word sense disambiguation (WSD) is a computational linguistics task likely to benefit from the tradition of combining different knowledge sources in artificial intelligence research. An important step in the exploration of this hypothesis is to determine which linguistic knowledge sources are most useful and whether their combination leads to improved results.
We present a sense tagger which uses several knowledge sources. Tested accuracy exceeds 94% on our evaluation corpus. Our system attempts to disambiguate all content words in running text rather than limiting itself to treating a restricted vocabulary of words. It is argued that this approach is more likely to assist the creation of practical systems.
Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation
Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD) despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD using inductive logic programming to learn theories from first-order logic representations that allows corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. It is important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources.
A MT System from Turkmen to Turkish employing finite state and statistical methods
In this work, we present an MT system from Turkmen to Turkish. Our system exploits the similarity of the languages by using a modified version of the direct translation method. However, the complex inflectional and derivational morphology of the Turkic languages necessitates special treatment for the word-by-word translation model. We also employ morphology-aware multi-word processing and statistical disambiguation processes in our system. We believe that this approach is valid for most of the Turkic languages and that the architecture, implemented using FSTs, can be easily extended to those languages.
Selective Sampling for Example-based Word Sense Disambiguation
This paper proposes an efficient example sampling method for example-based
word sense disambiguation systems. To construct a database of practical size, a
considerable overhead for manual sense disambiguation (overhead for
supervision) is required. In addition, the time complexity of searching a
large-sized database poses a considerable problem (overhead for search). To
counter these problems, our method selectively samples a smaller-sized
effective subset from a given example set for use in word sense disambiguation.
Our method is characterized by the reliance on the notion of training utility:
the degree to which each example is informative for future example sampling
when used for the training of the system. The system progressively collects
examples by selecting those with greatest utility. The paper reports the
effectiveness of our method through experiments on about one thousand
sentences. Compared to experiments with other example sampling methods, our
method reduced both the overhead for supervision and the overhead for search,
without degrading the performance of the system.
Comment: 25 pages, 14 Postscript figure
Resolving Lexical Ambiguity in Tensor Regression Models of Meaning
This paper provides a method for improving tensor-based compositional
distributional models of meaning by the addition of an explicit disambiguation
step prior to composition. In contrast with previous research where this
hypothesis has been successfully tested against relatively simple compositional
models, in our work we use a robust model trained with linear regression. The
results we get in two experiments show the superiority of the prior
disambiguation method and suggest that the effectiveness of this approach is
model-independent.