Selective Attention for Context-aware Neural Machine Translation
Despite the progress made in sentence-level NMT, current systems still fall
short of producing fluent, high-quality translations of full documents. Recent
work in context-aware NMT considers only a few previous sentences as context
and may not scale to entire documents. To this end, we propose a novel and
scalable top-down approach to hierarchical attention for context-aware NMT,
which uses sparse attention to selectively focus on relevant sentences in the
document context and then attends to key words in those sentences. We also
propose single-level attention approaches based on sentence- or word-level
information in the context. The document-level context representation, produced
from these attention modules, is integrated into the encoder or decoder of the
Transformer model depending on whether we use monolingual or bilingual context.
Our experiments and evaluation on English-German datasets in different document
MT settings show that our selective attention approach not only significantly
outperforms context-agnostic baselines but also surpasses context-aware
baselines in most cases.
Comment: Accepted at NAACL-HLT 2019.
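The top-down mechanism can be pictured as two stacked attention steps: first over context sentences, then over the words inside the selected sentences. Below is a minimal PyTorch sketch under that reading, using a hard top-k selection as a stand-in for the sparse attention the abstract describes; all function, tensor, and dimension names are illustrative, not the paper's.

```python
import torch
import torch.nn.functional as F

def selective_hierarchical_attention(query, sent_reprs, word_reprs, k=2):
    """Top-down context attention sketch: pick relevant sentences first,
    then attend over the words of those sentences only.

    query:      (d,)                   current sentence/decoder state
    sent_reprs: (n_sents, d)           one vector per context sentence
    word_reprs: (n_sents, n_words, d)  word vectors per context sentence
    """
    # Sentence-level scores; keep only the top-k sentences (a hard
    # approximation of the sparse attention used in the paper).
    sent_scores = sent_reprs @ query                    # (n_sents,)
    top_scores, top_idx = sent_scores.topk(k)
    sent_weights = F.softmax(top_scores, dim=-1)        # (k,)

    # Word-level attention inside each selected sentence.
    selected = word_reprs[top_idx]                      # (k, n_words, d)
    word_scores = selected @ query                      # (k, n_words)
    word_weights = F.softmax(word_scores, dim=-1)
    sent_context = (word_weights.unsqueeze(-1) * selected).sum(1)  # (k, d)

    # Combine sentence summaries into one document-context vector,
    # which would then be fed to the Transformer encoder or decoder.
    return (sent_weights.unsqueeze(-1) * sent_context).sum(0)      # (d,)

# Toy usage with random representations
d, n_sents, n_words = 8, 5, 6
ctx = selective_hierarchical_attention(
    torch.randn(d), torch.randn(n_sents, d), torch.randn(n_sents, n_words, d))
```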
Influence of context on users’ views about explanations for decision-tree predictions
This research was supported in part by grant DP190100006 from the Australian Research Council. Ethics approval for the user studies was obtained from the Monash University Human Research Ethics Committee (ID-24208). We thank Marko Bohanec, one of the creators of the Nursery dataset, for helping us understand the features and their values. We are also grateful to the anonymous reviewers for their helpful comments.
Peer reviewed. Postprint.
Turning Flowchart into Dialog: Plan-based Data Augmentation for Low-Resource Flowchart-grounded Troubleshooting Dialogs
Flowchart-grounded troubleshooting dialogue (FTD) systems, which follow the
instructions of a flowchart to diagnose users' problems in specific domains
(e.g., vehicle, laptop), have been gaining research interest in recent years.
However, collecting sufficient dialogues that are naturally grounded on
flowcharts is costly; thus, FTD systems are impeded by scarce training data. To
mitigate this data sparsity issue, we propose a plan-based data augmentation
(PlanDA) approach that generates diverse synthetic dialogue data at scale by
transforming concise flowcharts into dialogues. Specifically, its generative
model employs a variational framework with a hierarchical planning
strategy that includes global and local latent planning variables. Experiments
on the FloDial dataset show that synthetic dialogue produced by PlanDA improves
the performance of downstream tasks, including flowchart path retrieval and
response generation, in particular in the Out-of-Flowchart setting. In
addition, further analysis demonstrates the quality of the synthetic data
generated by PlanDA, both on paths that are covered by existing sample
dialogues and on paths that are not.
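One way to read the hierarchical planning strategy is as a global latent variable drawn once per dialogue and a local latent variable drawn per turn, each sampled via the usual reparameterization trick. The following toy PyTorch sketch illustrates that structure under this assumption; the module, dimensions, and conditioning scheme are invented for illustration and are not PlanDA's actual architecture.

```python
import torch
import torch.nn as nn

class HierarchicalPlanner(nn.Module):
    """Toy hierarchical latent planning: one global plan variable per
    flowchart/dialogue, one local plan variable per turn."""

    def __init__(self, d=64):
        super().__init__()
        self.global_head = nn.Linear(d, 2 * d)     # -> (mu, logvar) for z_global
        self.local_head = nn.Linear(2 * d, 2 * d)  # conditioned on z_global

    @staticmethod
    def sample(mu, logvar):
        # Reparameterization trick: z = mu + sigma * eps
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def forward(self, flowchart_repr, turn_reprs):
        # Global plan for the whole dialogue.
        mu_g, logvar_g = self.global_head(flowchart_repr).chunk(2, dim=-1)
        z_global = self.sample(mu_g, logvar_g)
        # One local plan per turn, conditioned on the global plan.
        locals_ = []
        for turn in turn_reprs:
            h = torch.cat([turn, z_global], dim=-1)
            mu_l, logvar_l = self.local_head(h).chunk(2, dim=-1)
            locals_.append(self.sample(mu_l, logvar_l))
        return z_global, torch.stack(locals_)

# Toy usage: one flowchart representation, three dialogue turns
planner = HierarchicalPlanner()
z_g, z_locals = planner(torch.randn(64), [torch.randn(64) for _ in range(3)])
```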
Document context neural machine translation with memory networks
We present a document-level neural machine translation model which takes both
source and target document context into account using memory networks. We model
the problem as a structured prediction problem with interdependencies among the
observed and hidden variables, i.e., the source sentences and their unobserved
target translations in the document. The resulting structured prediction
problem is tackled with a neural translation model equipped with two memory
components, one each for the source and target sides, to capture document-level
interdependencies. We train the model end-to-end, and propose an iterative
decoding algorithm based on block coordinate descent. Experimental results on
translating French, German, and Estonian documents into English show that our
model is effective in exploiting both source and target document context, and
statistically significantly outperforms the previous work in terms of BLEU and
METEOR.
Comment: Accepted at ACL 2018.
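The iterative decoding can be sketched as a coordinate-wise refinement loop: each sentence is re-translated while the current drafts of the other sentences are held fixed and serve as target-side memory. A schematic in Python, where translate_with_memory is a hypothetical stand-in for the memory-equipped NMT model described above:

```python
def iterative_decode(src_sents, translate_with_memory, num_passes=3):
    """Block-coordinate-descent-style decoding sketch.

    translate_with_memory(src, src_memory, tgt_memory) is a hypothetical
    callable standing in for the two-memory translation model: it returns
    a translation of src given source- and target-side document memories.
    """
    # Pass 0: translate each sentence with source-side memory only.
    drafts = [translate_with_memory(s, src_sents, []) for s in src_sents]

    for _ in range(num_passes):
        for i, src in enumerate(src_sents):
            # Target-side memory = current drafts of all other sentences.
            tgt_memory = drafts[:i] + drafts[i + 1:]
            # Update one "coordinate" (sentence) while the rest stay fixed.
            drafts[i] = translate_with_memory(src, src_sents, tgt_memory)
    return drafts
```

Each outer pass is one sweep of block coordinate descent over the sentences of the document.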
Contextual neural model for translating bilingual multi-speaker conversations
Recent work in neural machine translation has begun to explore document-level
translation. However, translating online multi-speaker conversations is still
an open problem. In this work, we propose the task of translating Bilingual
Multi-Speaker Conversations, and explore neural architectures which exploit
both source- and target-side conversation histories for this task. To initiate
an evaluation for this task, we introduce datasets extracted from Europarl v7
and OpenSubtitles2016. Our experiments on four language pairs confirm the
significance of leveraging conversation history, both in terms of BLEU and
manual evaluation.
Comment: WMT 2018.
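The abstract does not specify how the two conversation histories are fed to the model. One simple, purely illustrative way to serialize a bilingual multi-speaker history as model context is sketched below; the speaker/language tags and separator tokens are assumptions, not the paper's format.

```python
def build_context(turns, current_speaker, max_history=3):
    """Illustrative context builder for bilingual multi-speaker MT.

    Each turn is (speaker_id, text, lang); the history mixes the two
    languages because speakers converse across the language barrier.
    """
    history = turns[-max_history:]
    ctx = " <sep> ".join(f"<{spk}:{lang}> {text}" for spk, text, lang in history)
    return f"{ctx} <current:{current_speaker}>"

# Toy usage: an English/German exchange preceding speaker A's next turn
turns = [("A", "Hello, how are you?", "en"),
         ("B", "Sehr gut, danke!", "de")]
print(build_context(turns, "A"))
```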