Sequence-to-Sequence Models for Punctuated Transcription Combining Lexical and Acoustic Features

Bell, Peter; Klejch, Ondrej; Renals, Steve

Authors: Peter Bell, Ondrej Klejch, Steve Renals
Publication date: 19 June 2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Abstract

In this paper we present an extension of our previously described system for punctuated transcription, which is based on neural machine translation. The extension allows the system to map from per-frame acoustic features to word-level representations by replacing the traditional encoder in the encoder-decoder architecture with a hierarchical encoder. Furthermore, we show that a system combining lexical and acoustic features significantly outperforms systems using only a single source of features on all measured punctuation marks. The combination of lexical and acoustic features achieves a significant absolute improvement in F-measure of 1.5 over the purely lexical system based on neural machine translation.
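
The step from per-frame acoustic features to word-level representations can be illustrated with a minimal sketch. It is not the authors' hierarchical encoder, which the abstract does not specify in detail: a plain mean over each word's frames stands in for the learned lower encoder level, and the word spans are assumed to come from some external time alignment. The function names `pool_frames_to_words` and `combine_features` are hypothetical and chosen only for this illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def pool_frames_to_words(
    frames: Sequence[Vector],
    word_spans: Sequence[Tuple[int, int]],
) -> List[Vector]:
    """Reduce frame-level vectors to one vector per word.

    Assumption: each word covers the half-open frame range [start, end).
    Mean pooling is a stand-in for a learned frame-level encoder.
    """
    word_vectors: List[Vector] = []
    for start, end in word_spans:
        segment = frames[start:end]
        if not segment:
            raise ValueError(f"word span ({start}, {end}) contains no frames")
        dim = len(segment[0])
        # Average each feature dimension over the frames of this word.
        word_vectors.append(
            [sum(frame[d] for frame in segment) / len(segment) for d in range(dim)]
        )
    return word_vectors


def combine_features(
    lexical: Sequence[Vector],
    acoustic: Sequence[Vector],
) -> List[Vector]:
    """Concatenate per-word lexical and acoustic vectors."""
    if len(lexical) != len(acoustic):
        raise ValueError("lexical and acoustic sequences must have equal length")
    return [lex + ac for lex, ac in zip(lexical, acoustic)]


# Three frames of 2-dimensional acoustic features covering two words.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
word_spans = [(0, 2), (2, 3)]
acoustic_words = pool_frames_to_words(frames, word_spans)

# Toy 1-dimensional lexical embeddings, one per word.
lexical_words = [[0.1], [0.2]]
combined = combine_features(lexical_words, acoustic_words)
```

In this sketch the combined sequence has one vector per word, so it could be fed to a word-level encoder in the same way as purely lexical input. How the paper actually fuses the two feature streams is not stated in the abstract, so the concatenation here is only one plausible choice.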