16,669 research outputs found

Comparing rule-based and data-driven approaches to Spanish-to-Basque machine translation

Author: Labaka Gorka
Sarasola Kepa
Stroppa Nicolas
Way Andy
Publication venue: European Association for Machine Translation
Publication date: 01/01/2007

In this paper, we compare the rule-based and data-driven approaches in the context of Spanish-to-Basque Machine Translation. The rule-based system we consider has been developed specifically for Spanish-to-Basque machine translation, and is tuned to this language pair. By contrast, the data-driven system we use is generic, and has not been specifically designed to deal with Basque. Spanish-to-Basque Machine Translation is a challenge for data-driven approaches for at least two reasons. First, there is a lack of bilingual data on which a data-driven MT system can be trained. Second, Basque is a morphologically rich agglutinative language, and translating into Basque requires generating a large amount of morphological information, a difficult task for a generic system not specifically tuned to Basque. We present the results of a series of experiments, obtained on two different corpora, one being “in-domain” and the other “out-of-domain” with respect to the data-driven system. We show that n-gram-based automatic evaluation and edit-distance-based human evaluation yield two different sets of results. According to BLEU, the data-driven system outperforms the rule-based system on the in-domain data, while according to the human evaluation, the rule-based approach achieves higher scores for both corpora.

DCU Online Research Access Service

Joining hands: developing a sign language machine translation system with and for the deaf community

Author: Morrissey Sara
Way Andy
Publication date: 01/01/2007

This paper discusses the development of an automatic machine translation (MT) system for translating spoken language text into signed languages (SLs). The motivation for our work is the improvement of accessibility to airport information announcements for D/deaf and hard of hearing people. This paper demonstrates the involvement of Deaf colleagues and members of the D/deaf community in Ireland in three areas of our research: the choice of a domain for automatic translation that has a practical use for the D/deaf community; the human translation of English text into Irish Sign Language (ISL) as well as advice on ISL grammar and linguistics; and the importance of native ISL signers as manual evaluators of our translated output.

CiteSeerX

Irish Universities

DCU Online Research Access Service

Dublin City University at CLEF 2004: experiments in monolingual, bilingual and multilingual retrieval

Author: Burke Michael
Jones Gareth J.F.
Judge John
Khasin Anna
Lam-Adesina Adenike M.
Wagner Joachim
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2005

The Dublin City University group participated in the monolingual, bilingual and multilingual retrieval tasks this year. The main focus of our investigation was extending our retrieval system to document languages other than English, and completing the multilingual task comprising four languages: English, French, Russian and Finnish. Results from our French monolingual experiments indicate that working in French is more effective for retrieval than adopting document and topic translation to English. However, comparison of our multilingual retrieval results using different topic and document translation reveals that this result does not extend to retrieved list merging for the multilingual task in a simple predictable way.

Crossref

Irish Universities

DCU Online Research Access Service

Teaching machine translation and translation technology: a contrastive study

Author: Kenny Dorothy
Way Andy
Publication date: 01/01/2001

The Machine Translation course at Dublin City University is taught to undergraduate students in Applied Computational Linguistics, while Computer-Assisted Translation is taught on two translator-training programmes, one undergraduate and one postgraduate. Given the differing backgrounds of these sets of students, the course material, methods of teaching and assessment all differ. We report here on our experiences of teaching these courses over a number of years, which we hope will be of interest to lecturers of similar existing courses, as well as providing a reference point for others who may be considering the introduction of such material.

Irish Universities

DCU Online Research Access Service

A prototype machine translation system between Turkmen and Turkish

Author: Adalı Eşref
Oflazer Kemal
Tantuğ A. Cüneyd
Publication date: 01/06/2006

In this work, we present a prototype system for translation of Turkmen texts into Turkish. Although machine translation (MT) is a very hard task, it is easier to implement an MT system between very close language pairs which have similar syntactic structure and word order. We implement a direct translation system between Turkmen and Turkish which performs a word-to-word transfer. We also use a Turkish language model to find the most probable Turkish sentence among all possible candidate translations generated by our system.

Sabanci University Research Database

Handling non-compositionality in multilingual CNLs

Author: Enache Ramona
Kolachina Prasanth
Listenmaa Inari
Publication date: 01/01/2014

In this paper, we describe methods for handling multilingual non-compositional constructions in the framework of GF. We specifically look at methods to detect and extract non-compositional phrases from parallel texts and propose methods to handle such constructions in GF grammars. We expect that the methods to handle non-compositional constructions will enrich CNLs by providing more flexibility in the design of controlled languages. We look at two specific use cases of non-compositional constructions: a general-purpose method to detect and extract multilingual multiword expressions and a procedure to identify nominal compounds in German. We evaluate our procedure for multiword expressions by performing a qualitative analysis of the results. For the experiments on nominal compounds, we incorporate the detected compounds in a full SMT pipeline and evaluate the impact of our method on the machine translation process.
Comment: CNL workshop in COLING 201

arXiv.org e-Print Archive

Crossref

Bilingual newsgroups in Catalonia: a challenge for machine translation

Author: Climent Roca Salvador
Moré López Joaquim
Oliver González Antoni
Salvatierra Mallarach Míriam
Sánchez Sáiz Imma
Taulé Delor Mariona
Vallmanya Cucurull Lluïsa
Publication venue: Wiley
Publication date: 01/01/2003

This paper presents a linguistic analysis of a corpus of messages written in Catalan and Spanish, which come from several informal newsgroups on the Universitat Oberta de Catalunya (Open University of Catalonia; henceforth, UOC) Virtual Campus. The surrounding environment is one of extensive bilingualism and contact between Spanish and Catalan. The study was carried out as part of the INTERLINGUA project conducted by the UOC's Internet Interdisciplinary Institute (IN3). Its main goal is to ascertain the linguistic characteristics of the e-mail register in the newsgroups in order to assess their implications for the creation of an online machine translation environment. The results shed empirical light on the relevance of characteristics of the e-mail register, the impact of language contact and interference, and their implications for the use of machine translation for CMC data in order to facilitate cross-linguistic communication on the Internet.

The Oberta in open access

The ATIS sign language corpus

Author: Bungeroth Jan
Dreuw Philippe
Morrissey Sara
Ney Hermann
Stein Daniel
van Zijl Lynette
Way Andy
Publication date: 01/01/2008

Systems that automatically process sign language rely on appropriate data. We therefore present the ATIS sign language corpus, which is based on the domain of air travel information. It is available for five languages: English, German, Irish Sign Language, German Sign Language and South African Sign Language. The corpus can be used for different tasks, such as automatic statistical translation and automatic sign language recognition, and it allows the specific modelling of spatial references in signing space.

Irish Universities

DCU Online Research Access Service