RETURNN as a Generic Flexible Neural Toolkit with Application to
  Translation and Speech Recognition

Alkhouli, Tamer; Ney, Hermann; Zeyer, Albert

Authors: Tamer Alkhouli, Hermann Ney, Albert Zeyer
Publication date: 1 January 2018
Publisher: Association for Computational Linguistics (ACL)

Abstract

We compare the training and decoding speed of RETURNN for attention models in translation; RETURNN is fast owing to its fast CUDA LSTM kernels and a fast pure TensorFlow beam search decoder. We show that a layer-wise pretraining scheme for recurrent attention models gives an absolute improvement of over 1% BLEU and allows training deeper recurrent encoder networks. We present promising preliminary results on maximum expected BLEU training. We are able to train state-of-the-art models for translation and end-to-end models for speech recognition, and we show results on WMT 2017 and Switchboard. The flexibility of RETURNN allows a fast research feedback loop for experimenting with alternative architectures, and its generality allows it to be used in a wide range of applications.

Comment: accepted as demo paper on ACL 201
