Lattice Transformer for Speech Translation
Recent advances in sequence modeling have highlighted the strengths of the
transformer architecture, especially in achieving state-of-the-art machine
translation results. However, depending on the upstream systems, e.g., speech
recognition or word segmentation, the input to the translation system can vary
greatly. The goal of this work is to extend the attention mechanism of the
transformer to naturally consume the lattice in addition to the traditional
sequential input. We first propose a general lattice transformer for speech
translation, where the input is the output of an automatic speech recognition
(ASR) system and contains multiple paths and posterior scores. To leverage the extra
information from the lattice structure, we develop a novel controllable lattice
attention mechanism to obtain latent representations. On the LDC
Spanish-English speech translation corpus, our experiments show that the lattice
transformer generalizes significantly better and outperforms both a transformer
baseline and a lattice LSTM. Additionally, we validate our approach on the WMT
2017 Chinese-English translation task with lattice inputs from different BPE
segmentations. On this task, we also observe improvements over strong baselines.

Comment: accepted to ACL 2019
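The abstract describes the controllable lattice attention mechanism only at a high level. As a rough illustration of the general idea, the sketch below shows one plausible way self-attention can be restricted to lattice-reachable positions and biased by ASR posterior scores; the function name lattice_attention, the reach_mask and log_posteriors arguments, and the scalar alpha are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def lattice_attention(queries, keys, values, reach_mask, log_posteriors, alpha=1.0):
    """Scaled dot-product attention restricted to lattice-reachable nodes.

    Assumed (illustrative) inputs:
      queries, keys, values : (n, d) arrays for n lattice nodes
      reach_mask[i, j]      : 1 if node j is reachable from node i in the
                              lattice's topological order, else 0
      log_posteriors[j]     : log ASR posterior score of node j
      alpha                 : scalar controlling the posterior bias
                              (a learnable parameter in a trained model)
    """
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)           # raw attention logits
    scores = scores + alpha * log_posteriors[None, :]  # bias by posterior scores
    scores = np.where(reach_mask > 0, scores, -1e9)    # mask unreachable nodes
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values                            # (n, d) latent representations
```

With reach_mask set to an all-ones matrix and alpha = 0, this reduces to ordinary self-attention over a sequential input, which is consistent with the abstract's claim that the lattice transformer extends, rather than replaces, the standard attention mechanism.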