Filling Knowledge Gaps in a Broad-Coverage Machine Translation System
Knowledge-based machine translation (KBMT) techniques yield high quality in
domains with detailed semantic models, limited vocabulary, and controlled input
grammar. Scaling up along these dimensions means acquiring large knowledge
resources. It also means behaving reasonably when definitive knowledge is not
yet available. This paper describes how we can fill various KBMT knowledge
gaps, often using robust statistical techniques. We describe quantitative and
qualitative results from JAPANGLOSS, a broad-coverage Japanese-English MT
system. Comment: 7 pages, compressed and uuencoded PostScript. To appear: IJCAI-9
Combining Multiple Methods for the Automatic Construction of Multilingual WordNets
This paper explores the automatic construction of a multilingual Lexical
Knowledge Base from preexisting lexical resources. First, a set of automatic
and complementary techniques for linking Spanish words collected from
monolingual and bilingual MRDs to English WordNet synsets is described.
Second, we show how the data produced by each method are combined to
produce a preliminary version of a Spanish WordNet with an accuracy over 85%.
Applying these combinations increases the number of extracted
connections by 40% without losing accuracy. Both coarse-grained (class level)
and fine-grained (synset assignment level) confidence ratios are used and
evaluated. Finally, the results for the whole process are presented. Comment: 7 pages, 4 PostScript figure
K-vec: A New Approach for Aligning Parallel Texts
Various methods have been proposed for aligning texts in two or more
languages, such as the Canadian Parliamentary Debates (Hansards). Some of these
methods generate a bilingual lexicon as a by-product. We present an alternative
alignment strategy, which we call K-vec, that starts by estimating the lexicon.
For example, it discovers that the English word "fisheries" is similar to the
French "pêches" by noting that the distribution of "fisheries" in the English
text is similar to the distribution of "pêches" in the French. K-vec does not
depend on sentence boundaries. Comment: 7 pages, uuencoded, compressed PostScript; Proc. COLING-9
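The distributional idea in this abstract can be illustrated with a small sketch. It assumes each text is cut into K roughly equal pieces and that a word is represented by a binary vector marking the pieces in which it occurs. The function names, the toy texts, and the use of the Dice coefficient as the similarity score are all assumptions made for illustration; the abstract does not say which measure K-vec uses.

```python
from typing import List


def occurrence_vector(tokens: List[str], word: str, k: int) -> List[int]:
    """Split `tokens` into k nearly equal pieces and mark the pieces containing `word`."""
    n = len(tokens)
    vec = [0] * k
    for i, tok in enumerate(tokens):
        if tok == word:
            # i * k // n maps position i (0 <= i < n) to a piece index in 0..k-1
            vec[i * k // n] = 1
    return vec


def dice(a: List[int], b: List[int]) -> float:
    """Dice coefficient of two binary vectors (an assumed stand-in similarity score)."""
    both = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2 * both / total if total else 0.0


# Toy parallel texts (hypothetical, not taken from the Hansards).
english = ("the fisheries act covers fisheries . the house met today . "
           "the minister spoke on fisheries policy .").split()
french = ("la loi sur les pêches vise les pêches . la chambre a siégé aujourd'hui . "
          "le ministre a parlé des pêches .").split()

K = 3
fisheries = occurrence_vector(english, "fisheries", K)
peches = occurrence_vector(french, "pêches", K)
chambre = occurrence_vector(french, "chambre", K)

print(dice(fisheries, peches), dice(fisheries, chambre))
```

Because the pieces are defined by token position rather than by sentence, nothing in this sketch needs sentence boundaries, which matches the property the abstract highlights.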
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.
Controlled generation in example-based machine translation
The theme of controlled translation is currently in vogue in the area of MT. Recent research (Schäler et al., 2003; Carl, 2003) hypothesises that EBMT systems are perhaps best suited to this challenging task. In this paper, we present an EBMT system where the generation of the target string is filtered by data written according to controlled language specifications. As far as we are aware, this is the only research available on this topic. In the field of controlled language applications, it is more usual to constrain the source language in this way rather than the target. We translate a small corpus of controlled English into French using the on-line MT system Logomedia, and seed the memories of our EBMT system with a set of automatically induced lexical resources using the Marker Hypothesis as a segmentation tool. We test our system on a large set of sentences extracted from a Sun Translation Memory, and provide both an automatic and a human evaluation. For comparative purposes, we also provide results for Logomedia itself.
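As a rough illustration of marker-based segmentation, the sketch below starts a new chunk whenever a closed-class "marker" word appears, provided the current chunk already contains at least one non-marker word. The marker set `MARKERS`, the function name, and this boundary rule are assumptions chosen for the example; the abstract does not specify the marker categories or the exact segmentation procedure used in the system.

```python
from typing import List

# Hypothetical marker set: a handful of English determiners, prepositions,
# and conjunctions. A real system would use a much fuller closed-class list.
MARKERS = {"the", "a", "an", "in", "on", "of", "to", "with", "and", "or", "this", "that"}


def marker_segment(sentence: str) -> List[List[str]]:
    """Split a sentence into chunks that each begin at a marker word."""
    chunks: List[List[str]] = []
    current: List[str] = []
    for tok in sentence.lower().split():
        # Open a new chunk at a marker only if the current chunk already has
        # content (a non-marker word); consecutive markers stay together.
        if tok in MARKERS and any(t not in MARKERS for t in current):
            chunks.append(current)
            current = []
        current.append(tok)
    if current:
        # Note: a sentence ending in markers leaves a marker-only final chunk.
        chunks.append(current)
    return chunks


for chunk in marker_segment("The printer is connected to the network with a cable"):
    print(" ".join(chunk))
```

Chunks obtained this way from a source sentence and its translation could then be paired to induce sub-sentential lexical resources, which is the role the abstract assigns to the segmentation step.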
- …