273 research outputs found
Embeddings for word sense disambiguation: an evaluation study
Recent years have seen a dramatic growth in the popularity of word embeddings, mainly owing to their ability to capture semantic information from massive amounts of textual content. As a result, many tasks in Natural Language Processing have tried to take advantage of the potential of these distributional models. In this work, we study how word embeddings can be used in Word Sense Disambiguation (WSD), one of the oldest tasks in Natural Language Processing and Artificial Intelligence. We propose different methods through which word embeddings can be leveraged in a state-of-the-art supervised WSD system architecture, and perform a deep analysis of how different parameters affect performance. We show how a WSD system that makes use of word embeddings alone, if designed properly, can provide significant performance improvement over a state-of-the-art WSD system that incorporates several standard WSD features.
Context-aware graph segmentation for graph-based translation
In this paper, we present an improved graph-based translation model which segments an input graph into node-induced subgraphs by taking source context into consideration. Translations are generated by combining subgraph translations left-to-right using beam search. Experiments on Chinese–English and German–English demonstrate that the context-aware segmentation significantly improves the baseline graph-based model.
Disambiguierung deutschsprachiger Diskursmarker: Eine Pilot-Studie
Discourse markers such as German aber, wohl or obwohl can be regarded as valuable information for a wide range of text-linguistic applications, since they provide important cues for the interpretation of texts or text segments. Unfortunately, many of them are highly ambiguous. Thus, for their use in applications like automatic text summarization, a reliable disambiguation of discourse markers is needed. This should be done automatically, since manual disambiguation is feasible only for small amounts of data.
The aim of this pilot study, therefore, was to investigate the methodological requirements of automatic disambiguation of German discourse markers. Two different methods known from word-sense disambiguation, Naive Bayes and decision lists, were used for the highly ambiguous marker wenn. A statistical approach was taken to compare the two methods and different feature combinations.
- …