Improving the translation environment for professional translators
A typical professional translation workflow built on computer-aided translation systems leaves room for improvement at several stages. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these stages, from both a human-computer interaction perspective and a purely technological one.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
Introduction to the special issue on cross-language algorithms and applications
With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.
Evaluating syntax-driven approaches to phrase extraction for MT
In this paper, we examine a number of different phrase segmentation approaches for Machine Translation and how they perform when used to supplement the translation model of a phrase-based SMT system. This work represents a summary of a number of years of research carried out at Dublin City University in which it has been found that improvements can be made using hybrid translation models. However, the level of improvement achieved is dependent on the amount of training data used. We describe the various approaches to phrase segmentation and combination explored, and outline a series of experiments investigating the relative merits of each method.
Example-based machine translation of the Basque language
Basque is both a minority and a highly inflected language with free order of sentence constituents. Machine Translation of Basque is thus both a real need and a test bed for MT techniques. In this paper, we present a modular Data-Driven MT system which includes different chunkers as well as chunk aligners which can deal with the free order of sentence constituents of Basque. We conducted Basque to English translation experiments, evaluated on a large corpus (270,000 sentence pairs). The experimental results show that our system significantly outperforms state-of-the-art approaches according to several common automatic evaluation metrics.
Combining data-driven MT systems for improved sign language translation
In this paper, we investigate the feasibility of combining two data-driven machine translation (MT) systems for the translation of sign languages (SLs). We take the MT systems of two prominent data-driven research groups, the MaTrEx system developed at DCU and the Statistical Machine Translation (SMT) system developed at RWTH Aachen University, and apply their respective approaches to the task of translating Irish Sign Language and German Sign Language into English and German. In a set of experiments supported by automatic evaluation results, we show that there is a definite value to the prospective merging of MaTrEx's Example-Based MT chunks and distortion limit increase with RWTH's constraint reordering.
Technology for large-scale translation of clinical practice guidelines : a pilot study of the performance of a hybrid human and computer-assisted approach
Background: The construction of EBMPracticeNet, a national electronic point-of-care information platform in Belgium, was initiated in 2011 to optimize quality of care by promoting evidence-based decision-making. The project involved, among other tasks, the translation of 940 EBM Guidelines of Duodecim Medical Publications from English into Dutch and French. Considering the scale of the translation process, it was decided to make use of computer-aided translation performed by certificated translators with limited expertise in medical translation. Our consortium used a hybrid approach, involving a human translator supported by a translation memory (using SDL Trados Studio), terminology recognition (using SDL Multiterm termbases) from medical termbases and support from online machine translation. This has resulted in a validated translation memory which is now in use for the translation of new and updated guidelines.
Objective: The objective of this study was to evaluate the performance of the hybrid human and computer-assisted approach in comparison with translation unsupported by translation memory and terminology recognition. A comparison was also made with the translation efficiency of an expert medical translator.
Methods: We conducted a pilot trial in which two sets of 30 new and 30 updated guidelines were randomized to one of three groups. Comparable guidelines were translated (a) by certificated junior translators without medical specialization using the hybrid method, (b) by an experienced medical translator without this support, and (c) by the same junior translators without the support of the validated translation memory. A medical proofreader, who was blinded to the translation procedure, evaluated the translated guidelines for acceptability and adequacy. Translation speed was measured by recording translation and post-editing time. The Human Translation Edit Rate was calculated as a metric to evaluate the quality of the translation. A further evaluation was made of translation acceptability and adequacy.
Results: The average number of words per guideline was 1,195 and the mean total translation time was 100.2 min/1,000 words. No meaningful differences were found in the translation speed for new guidelines. The translation of updated guidelines was 59 min/1,000 words faster (95% CI 2-115; P=.044) in the computer-aided group. Revisions due to terminology accounted for one third of the overall revisions by the medical proofreader.
Conclusions: Use of the hybrid human and computer-aided translation by a non-expert translator makes the translation of updates of clinical practice guidelines faster and cheaper because of the benefits of translation memory. For the translation of new guidelines there was no apparent benefit in comparison with the efficiency of translation unsupported by translation memory (whether by an expert or non-expert translator).
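The Human Translation Edit Rate mentioned in the Methods above is usually understood as the number of edits needed to turn a draft translation into its post-edited version, divided by the length of the post-edited version. The sketch below illustrates that idea only; it is not the study's implementation. It assumes whitespace tokenization and counts word-level insertions, deletions and substitutions, omitting the block-shift operation of full TER. The function names and the example sentences are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the hyp prefix seen so far and ref[:j]
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        curr = [i]
        for j, r in enumerate(ref, 1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a draft word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def simplified_hter(draft, post_edited):
    """Edits per post-edited word (simplified: no block shifts, assumption)."""
    hyp = draft.split()
    ref = post_edited.split()
    if not ref:
        raise ValueError("post-edited text must not be empty")
    return word_edit_distance(hyp, ref) / len(ref)


if __name__ == "__main__":
    # Hypothetical example sentences, not taken from the study
    draft = "the patient take the drug daily"
    final = "the patient takes the drug twice daily"
    print(round(simplified_hter(draft, final), 3))
```

A lower value means the proofreader or post-editor changed less of the draft. Because shifts are not modelled, reordered phrases cost more here than they would under full TER.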