Semantics-enhanced task-oriented dialogue translation: a case study on hotel booking
We showcase TODAY, a semantics-enhanced task-oriented dialogue translation system, whose novelties are: (i) a task-oriented named entity (NE) definition and a hybrid strategy for NE recognition and translation; and (ii) a novel grounded semantic method for dialogue understanding and task-order management. TODAY is a case-study demo that can efficiently and accurately assist customers and agents speaking different languages in reaching an agreement in a hotel-booking dialogue.
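The hybrid NE strategy described above could be sketched roughly as a dictionary lookup for known, translatable entities combined with rules for structured values such as dates. The lexicon entries, tag names, and patterns below are invented placeholders, not TODAY's actual resources.

```python
import re

# Illustrative lexicon of task-oriented NEs with translations;
# these entries are hypothetical, not taken from the TODAY system.
NE_LEXICON = {"Grand Hotel": "大酒店"}

# Rule-based fallback for date-like expressions (pattern is illustrative).
DATE_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}\b")

def recognize_entities(text):
    """Hybrid NE recognition: dictionary lookup first, rules for the rest."""
    entities = []
    for surface in NE_LEXICON:
        if surface in text:
            entities.append(("HOTEL", surface))
    for m in DATE_PATTERN.finditer(text):
        entities.append(("DATE", m.group()))
    return entities
```

On an utterance such as "Book the Grand Hotel for 12/24", the dictionary pass catches the hotel name (which can then be translated via the lexicon rather than the MT model) while the rule pass catches the date.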
Translating pro-drop languages with reconstruction models
Pronouns are frequently omitted in pro-drop languages, such as Chinese, generally leading to significant challenges with respect to the production of complete translations. To date, very little attention has been paid to the dropped pronoun (DP) problem within neural machine translation (NMT). In this work, we propose a novel reconstruction-based approach to alleviating DP translation problems for NMT models. Firstly, DPs within all source sentences are automatically annotated with parallel information extracted from the bilingual training corpus. Next, the annotated source sentence is reconstructed from hidden representations in the NMT model. With auxiliary training objectives, in terms of reconstruction scores, the parameters associated with the NMT model are guided to produce enhanced hidden representations that are encouraged as much as possible to embed annotated DP information. Experimental results on both Chinese–English and Japanese–English dialogue translation tasks show that the proposed approach significantly and consistently improves translation performance over a strong NMT baseline, which is directly built on the training data annotated with DPs.
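The auxiliary reconstruction objective described above can be sketched as a joint loss: the usual translation loss plus a weighted negative log-likelihood of the DP-annotated source tokens, predicted from the model's hidden states. The shapes, the linear output layer W, and mean-pooled NLL below are simplifying assumptions, not the paper's exact architecture.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def reconstruction_loss(hidden, W, annotated_ids):
    """NLL of the DP-annotated source tokens, predicted from NMT hidden
    states via a hypothetical linear layer W of shape (d_hidden, vocab)."""
    logits = hidden @ W                       # (seq_len, vocab)
    probs = softmax(logits)
    nll = -np.log(probs[np.arange(len(annotated_ids)), annotated_ids])
    return nll.mean()

def joint_loss(translation_loss, hidden, W, annotated_ids, lam=1.0):
    # L = L_translation + lambda * L_reconstruction (auxiliary objective)
    return translation_loss + lam * reconstruction_loss(hidden, W, annotated_ids)
```

Because the reconstruction term shares the model's hidden states, minimizing the joint loss pushes those states to retain the annotated DP information.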
Discourse-aware neural machine translation
Machine translation (MT) models usually translate a text by considering isolated sentences, based on the strict assumption that the sentences in a text are independent of one another. However, it is a truism that texts have properties of connectedness that go beyond those of their individual sentences. Disregarding dependencies across sentences harms translation quality, especially in terms of coherence, cohesion, and consistency. Previously, some discourse-aware approaches were investigated for conventional statistical machine translation (SMT); however, discourse remains a serious obstacle for state-of-the-art neural machine translation (NMT), which has recently surpassed the performance of SMT.
In this thesis, we try to incorporate useful discourse information to enhance NMT models. More specifically, we conduct research on two main parts: 1) exploiting a novel document-level NMT architecture; and 2) dealing with a specific discourse phenomenon in translation models.
Firstly, we investigate the influence of historical contextual information on the performance of NMT models. A cross-sentence context-aware NMT model is proposed to consider the influence of previous sentences in the same document. Specifically, this history is summarized using an additional hierarchical encoder. The historical representations are then integrated into the standard NMT model using different strategies. Experimental results on a Chinese–English document-level translation task show that the approach significantly improves upon a strong attention-based NMT system by up to +2.1 BLEU points. In addition, analysis and comparison provide insightful discussion and conclusions for this research direction.
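The hierarchical summarization of context described above could be sketched, in much simplified form, as two levels of pooling: word vectors pooled into sentence summaries, then sentence summaries pooled into a single history vector that is integrated into the decoder state. Mean-pooling and the gated sum below stand in for the recurrent encoders and integration strategies of the actual model.

```python
import numpy as np

def hierarchical_context(prev_sentences):
    """Summarize previous sentences: word-level pooling per sentence,
    then sentence-level pooling over the whole history (mean-pooling
    is a stand-in for the hierarchical RNN encoder)."""
    sent_vecs = [np.mean(s, axis=0) for s in prev_sentences]  # one vector per sentence
    return np.mean(sent_vecs, axis=0)                         # one vector per document history

def integrate(decoder_state, context, gate=0.5):
    # One simple integration strategy: a gated sum of the decoder
    # state and the history vector (the gate here is a fixed scalar).
    return gate * decoder_state + (1 - gate) * context
```

In the actual model the gate would itself be learned from the decoder state and the context, so the network can decide per step how much history to use.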
Secondly, we explore the impact of discourse phenomena on the performance of MT. In this thesis, we focus on the phenomenon of pronoun-dropping (pro-drop), where, in pro-drop languages, pronouns can be omitted when it is possible to infer the referent from the context. As the data for training a dropped pronoun (DP) generator is scarce, we propose to automatically annotate DPs using alignment information from a large parallel corpus. We then introduce a hybrid approach: building a neural-based DP generator and integrating it into the SMT model. Experimental results on both Chinese–English and Japanese–English translation tasks demonstrate that our approach achieves a significant improvement of up to +1.58 BLEU points with a 66% F-score for DP generation accuracy.
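The alignment-based DP annotation above could be sketched as follows: a pronoun on the non-pro-drop (target) side that aligns to no source word is taken as dropped, and a placeholder is inserted into the source. The pronoun set, tag format, and the crude prepend-placement heuristic are all illustrative; the thesis determines the insertion position from the alignments themselves.

```python
def annotate_dps(src_tokens, tgt_tokens, alignment,
                 pronouns=frozenset({"I", "you", "he", "she", "it", "we", "they"})):
    """Insert <DP:...> placeholders into the source sentence wherever a
    target-side pronoun has no aligned source word.

    alignment: list of (src_index, tgt_index) word-alignment pairs.
    """
    aligned_tgt = {j for (_, j) in alignment}
    annotated = list(src_tokens)
    for j, tok in enumerate(tgt_tokens):
        if tok in pronouns and j not in aligned_tgt:
            # Hypothetical placement: prepend the dropped-pronoun tag.
            annotated.insert(0, f"<DP:{tok}>")
    return annotated
```

For a Chinese source "喜欢 你" aligned to "I like you", the unaligned "I" yields the annotated source "<DP:I> 喜欢 你", which can then serve as training data for a DP generator.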
Motivated by this promising result, we further exploit the DP translation approach for advanced NMT models. A novel reconstruction-based model is proposed to reconstruct the DP-annotated source sentence from the hidden states of the encoder, the decoder, or both components. Experimental results on the same translation tasks show that the proposed approach significantly and consistently improves translation performance over a strong NMT baseline trained on DP-annotated parallel data.
To avoid errors propagated from an external DP prediction model, we finally investigate an end-to-end DP translation model. Specifically, we improve the reconstruction-based model from three perspectives. First, we employ a shared reconstructor to better exploit encoder and decoder representations. Second, we propose to jointly learn to translate and to predict DPs. Finally, to capture discourse information for DP prediction, we combine the hierarchical encoder with the DP translation model. Experimental results on the same translation tasks show that our approach significantly improves both translation performance and DP prediction accuracy.
New Trends in Machine Translation using Large Language Models: Case Examples with ChatGPT
Machine Translation (MT) has made significant progress in recent years using deep learning, especially after the emergence of large language models (LLMs) such as GPT-3 and ChatGPT. This brings new challenges and opportunities for MT using LLMs. In this paper, we brainstorm some interesting directions for MT using LLMs, including stylized MT, interactive MT, and Translation Memory-based MT, as well as a new evaluation paradigm using LLMs. We also discuss privacy concerns in MT using LLMs and a basic privacy-preserving method to mitigate such risks. To illustrate the potential of our proposed directions, we present several examples for the new directions mentioned above, demonstrating their feasibility and highlighting the opportunities and challenges for future research in MT using LLMs.
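Translation Memory-based MT with an LLM could be sketched as retrieving the most similar TM entry and citing it in the prompt when it clears a fuzzy-match threshold. The scoring via difflib, the 0.6 threshold, the target language, and the prompt wording are all illustrative assumptions, not the paper's actual setup.

```python
from difflib import SequenceMatcher

def best_tm_match(source, tm):
    """Fuzzy-match the source against a translation memory, given as a
    list of (src, tgt) pairs; difflib's ratio stands in for the
    edit-distance-style scores used by real TM tools."""
    scored = [(SequenceMatcher(None, source, s).ratio(), s, t) for s, t in tm]
    return max(scored)

def build_prompt(source, tm, threshold=0.6):
    """Assemble an LLM prompt that cites a similar TM entry when one is
    available (prompt wording is hypothetical)."""
    score, s, t = best_tm_match(source, tm)
    lines = []
    if score >= threshold:
        lines.append(f"A similar sentence was previously translated:\n{s} -> {t}")
    lines.append(f"Translate into German: {source}")
    return "\n".join(lines)
```

The point of the threshold is that a low-similarity TM entry is more likely to mislead the LLM than to help it, so it is simply omitted from the prompt.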