Filling Knowledge Gaps in a Broad-Coverage Machine Translation System
Knowledge-based machine translation (KBMT) techniques yield high quality in
domains with detailed semantic models, limited vocabulary, and controlled input
grammar. Scaling up along these dimensions means acquiring large knowledge
resources. It also means behaving reasonably when definitive knowledge is not
yet available. This paper describes how we can fill various KBMT knowledge
gaps, often using robust statistical techniques. We describe quantitative and
qualitative results from JAPANGLOSS, a broad-coverage Japanese-English MT
system.
Comment: 7 pages, compressed and uuencoded postscript. To appear: IJCAI-9
Using WordNet for Building WordNets
This paper summarises a set of methodologies and techniques for the fast
construction of multilingual WordNets. The English WordNet is used in this
approach as a backbone for Catalan and Spanish WordNets and as a lexical
knowledge resource for several subtasks.
Comment: 8 pages, postscript file. In workshop on Usage of WordNet in NL
Some Issues on Ontology Integration
The word integration has been used with different meanings in the ontology
field. This article aims at clarifying the meaning of the word "integration"
and presenting some of the relevant work done in integration. We identify
three meanings of ontology "integration": building a new ontology by reusing
(assembling, extending, specializing, or adapting) other ontologies already
available; building an ontology by merging several ontologies into a single
one that unifies all of them; and building an application that uses one or
more ontologies. We discuss the different meanings of "integration", identify
the main characteristics of the three different processes, and propose three
words to distinguish among those meanings: integration, merge, and use
Learning Semantic Correspondences in Technical Documentation
We consider the problem of translating high-level textual descriptions to
formal representations in technical documentation as part of an effort to model
the meaning of such documentation. We focus specifically on the problem of
learning translational correspondences between text descriptions and grounded
representations in the target documentation, such as formal representation of
functions or code templates. Our approach exploits the parallel nature of such
documentation, or the tight coupling between high-level text and the low-level
representations we aim to learn. Data is collected by mining technical
documents for such parallel text-representation pairs, which we use to train a
simple semantic parsing model. We report new baseline results on sixteen novel
datasets, including the standard library documentation for nine popular
programming languages across seven natural languages, and a small collection of
Unix utility manuals.
Comment: accepted to ACL-201
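The core idea above, learning word-level correspondences from mined (description, representation) pairs, can be sketched with a crude co-occurrence model in the spirit of IBM Model 1. This is an illustrative sketch only, not the paper's actual parser; the example pairs, tokenizer, and scoring function are invented for demonstration.

```python
# Illustrative sketch: learn lexical correspondences between text
# descriptions and formal representations (here, function signatures)
# from parallel pairs, then score candidate representations for a query.
from collections import defaultdict

# Hypothetical mined (description, representation) pairs.
pairs = [
    ("return the absolute value of a number", "abs(x)"),
    ("return the length of an object", "len(s)"),
    ("return the largest item", "max(iterable)"),
]

def tokenize(s):
    # Treat parentheses as separators so signature tokens split cleanly.
    return s.replace("(", " ").replace(")", " ").split()

# Co-occurrence counts serve as unnormalized translation scores.
cooc = defaultdict(lambda: defaultdict(int))
for text, rep in pairs:
    for t in tokenize(text):
        for r in tokenize(rep):
            cooc[t][r] += 1

def score(text, rep):
    """Sum of co-occurrence counts between query and candidate tokens."""
    return sum(cooc[t][r] for t in tokenize(text) for r in tokenize(rep))

# The candidate sharing the most learned correspondences wins.
best = max(("abs(x)", "len(s)", "max(iterable)"),
           key=lambda rep: score("length of an object", rep))
```

A real system would normalize these counts into translation probabilities and decode over a much larger mined corpus, but the parallel-data exploitation is the same.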
The European Language Resources and Technologies Forum: Shaping the Future of the Multilingual Digital Europe
Proceedings of the 1st FLaReNet Forum on the European Language Resources and Technologies, held in Vienna, at the Austrian Academy of Science, on 12-13 February 2009
Unification-Based Glossing
We present an approach to syntax-based machine translation that combines
unification-style interpretation with statistical processing. This approach
enables us to translate any Japanese newspaper article into English, with
quality far better than a word-for-word translation. Novel ideas include the
use of feature structures to encode word lattices and the use of unification to
compose and manipulate lattices. Unification also allows us to specify abstract
features that delay target-language synthesis until enough source-language
information is assembled. Our statistical component enables us to search
efficiently among competing translations and locate those with high English
fluency.
Comment: 8 pages, compressed and uuencoded postscript. To appear: IJCAI-9
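The unification mechanism the abstract relies on can be sketched minimally: feature structures as nested maps, with unification merging compatible structures and failing on clashing atomic values. This is a toy illustration under invented feature names, not the paper's lattice encoding.

```python
# Minimal sketch of feature-structure unification: nested dicts merge
# recursively; conflicting atomic values make unification fail (None).
FAIL = None

def unify(a, b):
    if a == b:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, val in b.items():
            if key in out:
                merged = unify(out[key], val)
                if merged is FAIL:
                    return FAIL
                out[key] = merged
            else:
                out[key] = val
        return out
    return FAIL  # incompatible atomic values

# An underspecified structure stays open until more source-language
# information arrives, which is how synthesis can be delayed.
noun = {"cat": "N", "lex": "book"}
plural = {"number": "pl"}
print(unify(noun, plural))                         # compatible: merged
print(unify({"number": "sg"}, {"number": "pl"}))   # clash: None
```

Encoding a word lattice would add disjunctive values and reentrancies on top of this, but the compose-by-unify operation is the essential step.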