Designing the Business Conversation Corpus
While the progress of machine translation of written text has come far in the
past several years thanks to the increasing availability of parallel corpora
and corpora-based training technologies, automatic translation of spoken text
and dialogues remains challenging even for modern systems. In this paper, we
aim to boost the machine translation quality of conversational texts by
introducing a newly constructed Japanese-English business conversation parallel
corpus. A detailed analysis of the corpus is provided along with challenging
examples for automatic translation. We also experiment with adding the corpus
in a machine translation training scenario and show how the resulting system
benefits from its use.
A Spoken Dialogue System for Enabling Comfortable Information Acquisition and Consumption
Waseda University degree number: Shin 8137, Waseda Univ.
Design and Evaluation of Machine Translation Systems Oriented Toward Practical Application
Tohoku University, Doctor (Information Sciences) thesis
The European Language Resources and Technologies Forum: Shaping the Future of the Multilingual Digital Europe
Proceedings of the 1st FLaReNet Forum on the European Language Resources and Technologies, held in Vienna at the Austrian Academy of Sciences on 12-13 February 2009.
Post-editing machine translated text in a commercial setting: Observation and statistical analysis
Machine translation systems used in a commercial context for publishing purposes are usually combined with human post-editing. Understanding human post-editing behaviour is therefore crucial to maximising the benefit of machine translation systems. Although a number of studies on human post-editing have been carried out to date, there is a lack of large-scale studies of post-editing in industrial contexts that focus on the activity in real-life settings. This study observes the work of professional Japanese post-editors and examines how the amount of editing made during post-editing, source text characteristics, and post-editing behaviour affect the amount of post-editing effort. A mixed-methods approach was employed to analyse the data both quantitatively and qualitatively and to gain detailed insights into the post-editing activity from various viewpoints. The results indicate that a number of factors, such as sentence structure, document component types, use of product-specific terms, and post-editing patterns and behaviour, affect the amount of post-editing effort in an intertwined manner. The findings will contribute to better utilisation of machine translation systems in industry as well as to the development of post-editors’ skills and strategies.