A Hybrid Convolutional Variational Autoencoder for Text Generation
In this paper we explore the effect of architectural choices on learning a
Variational Autoencoder (VAE) for text generation. In contrast to the
previously introduced VAE model for text where both the encoder and decoder are
RNNs, we propose a novel hybrid architecture that blends fully feed-forward
convolutional and deconvolutional components with a recurrent language model.
Our architecture exhibits several attractive properties, such as faster run time
and convergence and the ability to better handle long sequences. More importantly,
it helps to avoid some of the major difficulties posed by training VAE models
on textual data.
Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM
We present a state-of-the-art end-to-end Automatic Speech Recognition (ASR)
model. We learn to listen and write characters with a joint Connectionist
Temporal Classification (CTC) and attention-based encoder-decoder network. The
encoder is a deep Convolutional Neural Network (CNN) based on the VGG network.
The CTC network sits on top of the encoder and is jointly trained with the
attention-based decoder. During the beam search process, we combine the CTC
predictions, the attention-based decoder predictions and a separately trained
LSTM language model. We achieve a 5-10% error reduction compared to prior
systems on spontaneous Japanese and Chinese speech, and our end-to-end model
outperforms traditional hybrid ASR systems. Comment: Accepted for INTERSPEECH 201
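
The abstract above says that beam search combines CTC predictions, attention-decoder predictions and a separately trained language model, but it does not give the combination rule. The following is a minimal sketch that assumes a log-linear interpolation of the three log-probabilities. The function names, the hypothesis tuple layout and the default weights `ctc_weight` and `lm_weight` are hypothetical, chosen for illustration only, and are not taken from the paper.

```python
from typing import List, Tuple

# A hypothesis is (text, log p_ctc, log p_att, log p_lm). This layout is an
# assumption made for the example, not the paper's data structure.
Hypothesis = Tuple[str, float, float, float]


def joint_score(logp_ctc: float, logp_att: float, logp_lm: float,
                ctc_weight: float = 0.3, lm_weight: float = 0.1) -> float:
    """Assumed log-linear combination of the three model scores.

    ctc_weight interpolates between the CTC and attention scores;
    lm_weight scales the language-model score that is added on top.
    """
    return (ctc_weight * logp_ctc
            + (1.0 - ctc_weight) * logp_att
            + lm_weight * logp_lm)


def rank_hypotheses(hyps: List[Hypothesis],
                    ctc_weight: float = 0.3,
                    lm_weight: float = 0.1) -> List[Hypothesis]:
    """Sort beam hypotheses from best to worst by the combined score."""
    return sorted(
        hyps,
        key=lambda h: joint_score(h[1], h[2], h[3], ctc_weight, lm_weight),
        reverse=True,
    )
```

In a real decoder this scoring would be applied to partial hypotheses at every beam-search step; here it is shown on complete hypotheses to keep the sketch short.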