97,651 research outputs found

    Improving the translation environment for professional translators

    When using computer-aided translation systems in a typical professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological one. This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.

    Introduction to the special issue on cross-language algorithms and applications

    With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies, including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval, and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.

    Evaluating syntax-driven approaches to phrase extraction for MT

    In this paper, we examine a number of different phrase segmentation approaches for Machine Translation and how they perform when used to supplement the translation model of a phrase-based SMT system. This work summarizes several years of research carried out at Dublin City University, in which it has been found that improvements can be made using hybrid translation models. However, the level of improvement achieved depends on the amount of training data used. We describe the various approaches to phrase segmentation and combination explored, and outline a series of experiments investigating the relative merits of each method.

    Example-based machine translation of the Basque language

    Basque is both a minority and a highly inflected language with free order of sentence constituents. Machine Translation of Basque is thus both a real need and a test bed for MT techniques. In this paper, we present a modular Data-Driven MT system which includes different chunkers as well as chunk aligners which can deal with the free order of sentence constituents of Basque. We conducted Basque to English translation experiments, evaluated on a large corpus (270,000 sentence pairs). The experimental results show that our system significantly outperforms state-of-the-art approaches according to several common automatic evaluation metrics.

    Combining data-driven MT systems for improved sign language translation

    In this paper, we investigate the feasibility of combining two data-driven machine translation (MT) systems for the translation of sign languages (SLs). We take the MT systems of two prominent data-driven research groups, the MaTrEx system developed at DCU and the Statistical Machine Translation (SMT) system developed at RWTH Aachen University, and apply their respective approaches to the task of translating Irish Sign Language and German Sign Language into English and German. In a set of experiments supported by automatic evaluation results, we show that there is definite value in the prospective merging of MaTrEx’s Example-Based MT chunks and distortion limit increase with RWTH’s constraint reordering.

    Technology for large-scale translation of clinical practice guidelines : a pilot study of the performance of a hybrid human and computer-assisted approach

    Background: The construction of EBMPracticeNet, a national electronic point-of-care information platform in Belgium, was initiated in 2011 to optimize quality of care by promoting evidence-based decision-making. The project involved, among other tasks, the translation of 940 EBM Guidelines of Duodecim Medical Publications from English into Dutch and French. Considering the scale of the translation process, it was decided to make use of computer-aided translation performed by certificated translators with limited expertise in medical translation. Our consortium used a hybrid approach, involving a human translator supported by a translation memory (using SDL Trados Studio), terminology recognition (using SDL Multiterm termbases) from medical termbases, and support from online machine translation. This has resulted in a validated translation memory, which is now in use for the translation of new and updated guidelines. Objective: The objective of this study was to evaluate the performance of the hybrid human and computer-assisted approach in comparison with translation unsupported by translation memory and terminology recognition. A comparison was also made with the translation efficiency of an expert medical translator. Methods: We conducted a pilot trial in which two sets of 30 new and 30 updated guidelines were randomized to one of three groups. Comparable guidelines were translated (a) by certificated junior translators without medical specialization using the hybrid method, (b) by an experienced medical translator without this support, and (c) by the same junior translators without the support of the validated translation memory. A medical proofreader, who was blinded to the translation procedure, evaluated the translated guidelines for acceptability and adequacy. Translation speed was measured by recording translation and post-editing time. The Human Translation Edit Rate was calculated as a metric to evaluate the quality of the translation. A further evaluation was made of translation acceptability and adequacy. Results: The average number of words per guideline was 1,195 and the mean total translation time was 100.2 min/1,000 words. No meaningful differences were found in the translation speed for new guidelines. The translation of updated guidelines was 59 min/1,000 words faster (95% CI 2-115; P=.044) in the computer-aided group. Revisions due to terminology accounted for one third of the overall revisions by the medical proofreader. Conclusions: Use of the hybrid human and computer-aided translation by a non-expert translator makes the translation of updates of clinical practice guidelines faster and cheaper because of the benefits of translation memory. For the translation of new guidelines, there was no apparent benefit in comparison with the efficiency of translation unsupported by translation memory (whether by an expert or non-expert translator).