6,575 research outputs found
Using a Probabilistic Class-Based Lexicon for Lexical Ambiguity Resolution
This paper presents the use of probabilistic class-based lexica for
disambiguation in target-word selection. Our method employs minimal but precise
contextual information for disambiguation. That is, only information provided
by the target-verb, enriched by the condensed information of a probabilistic
class-based lexicon, is used. Induction of classes and fine-tuning to verbal
arguments is done in an unsupervised manner by EM-based clustering techniques.
The method shows promising results in an evaluation on real-world translations.Comment: 7 pages, uses colacl.st
Improving the translation environment for professional translators
When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view, as well as from a purely technological side.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project
Ridge Regression, Hubness, and Zero-Shot Learning
This paper discusses the effect of hubness in zero-shot learning, when ridge
regression is used to find a mapping between the example space to the label
space. Contrary to the existing approach, which attempts to find a mapping from
the example space to the label space, we show that mapping labels into the
example space is desirable to suppress the emergence of hubs in the subsequent
nearest neighbor search step. Assuming a simple data model, we prove that the
proposed approach indeed reduces hubness. This was verified empirically on the
tasks of bilingual lexicon extraction and image labeling: hubness was reduced
with both of these tasks and the accuracy was improved accordingly.Comment: To be presented at ECML/PKDD 201
- …