Speeding up Context-based Sentence Representation Learning with Non-autoregressive Convolutional Decoding

Authors: de Sa, Virginia R.; Fang, Chen; Jin, Hailin; Tang, Shuai; Wang, Zhaowen
Publication date: 01/01/2018

Context plays an important role in human language understanding, so it may also be useful for machines learning vector representations of language. In this paper, we explore an asymmetric encoder-decoder structure for unsupervised context-based sentence representation learning. We carefully designed experiments to show that neither an autoregressive decoder nor an RNN decoder is required. We then designed a model that keeps an RNN as the encoder while using a non-autoregressive convolutional decoder. We further combine a suite of effective designs to significantly improve model efficiency while also achieving better performance. Our model is trained on two different large unlabelled corpora, and in both cases its transferability is evaluated on a set of downstream NLP tasks. We empirically show that our model is simple and fast while producing rich sentence representations that excel in downstream tasks.
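
The asymmetric structure described in this abstract (an RNN encoder paired with a non-autoregressive convolutional decoder) can be illustrated with a toy forward pass. This is a minimal sketch under stated assumptions, not the authors' implementation: the layer sizes, the random weights, the use of position embeddings as decoder input, and the names `rnn_encode` and `conv_decode` are all hypothetical. The point it shows is that the decoder scores every target position in one pass from the sentence vector alone, without feeding previously predicted words back in.

```python
import math
import random


def rand_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def rnn_encode(token_ids, emb, W_x, W_h):
    """Plain tanh RNN; the final hidden state serves as the sentence vector."""
    hidden = [0.0] * len(W_h)
    for tok in token_ids:
        x = emb[tok]
        hidden = [
            math.tanh(
                sum(W_x[i][j] * x[j] for j in range(len(x)))
                + sum(W_h[i][j] * hidden[j] for j in range(len(hidden)))
            )
            for i in range(len(W_h))
        ]
    return hidden


def conv_decode(sentence_vec, pos_emb, kernel, W_out):
    """Non-autoregressive decoder: all target positions are scored at once.

    Each position's input is the sentence vector plus a position embedding
    (an assumption of this sketch); no predicted word is fed back.
    """
    d = len(sentence_vec)
    steps = len(pos_emb)
    inputs = [[sentence_vec[j] + pos_emb[t][j] for j in range(d)]
              for t in range(steps)]
    width = len(kernel)            # kernel[k] is a d x d matrix
    pad = width // 2
    zero = [0.0] * d
    padded = [zero] * pad + inputs + [zero] * pad
    logits = []
    for t in range(steps):
        feat = [0.0] * d
        for k in range(width):     # 1-D convolution over neighbouring positions
            x = padded[t + k]
            for i in range(d):
                feat[i] += sum(kernel[k][i][j] * x[j] for j in range(d))
        feat = [max(0.0, v) for v in feat]  # ReLU
        logits.append([sum(row[j] * feat[j] for j in range(d)) for row in W_out])
    return logits


rng = random.Random(0)
VOCAB, EMB, HID, TARGET_LEN, KERNEL_WIDTH = 10, 4, 6, 5, 3
emb = rand_matrix(VOCAB, EMB, rng)
W_x = rand_matrix(HID, EMB, rng)
W_h = rand_matrix(HID, HID, rng)
pos_emb = rand_matrix(TARGET_LEN, HID, rng)
kernel = [rand_matrix(HID, HID, rng) for _ in range(KERNEL_WIDTH)]
W_out = rand_matrix(VOCAB, HID, rng)

sentence_vec = rnn_encode([1, 4, 2, 7], emb, W_x, W_h)
logits = conv_decode(sentence_vec, pos_emb, kernel, W_out)
```

Here `logits` holds one row of vocabulary scores per target position. Because no row depends on a previously decoded word, the positions could be computed in parallel, which is the property the abstract credits for the efficiency gain.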

arXiv.org e-Print Archive

Crossref

Rethinking Skip-thought: A Neighborhood based Approach

Authors: de Sa, Virginia R.; Fang, Chen; Jin, Hailin; Tang, Shuai; Wang, Zhaowen
Publication date: 01/01/2017

We study the skip-thought model with neighborhood information as weak supervision. More specifically, we propose a skip-thought neighbor model that treats the adjacent sentences as a neighborhood. We train our skip-thought neighbor model on a large corpus with continuous sentences, and then evaluate the trained model on 7 tasks, which include semantic relatedness, paraphrase detection, and classification benchmarks. Both quantitative comparison and qualitative investigation are conducted. We empirically show that our skip-thought neighbor model performs as well as the skip-thought model on the evaluation tasks. In addition, we found that incorporating an autoencoder path did not improve our model's performance, while it hurt the performance of the skip-thought model.
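
The idea of treating adjacent sentences as a neighborhood can be sketched as a way of building training pairs. This is a hedged illustration rather than the paper's code: it assumes the neighborhood is the sentence immediately before and after each center sentence, and that all pairs are pooled without marking which side a neighbor came from. The names `neighbor_pairs` and `window` are hypothetical.

```python
def neighbor_pairs(sentences, window=1):
    """Pair each sentence with its adjacent sentences.

    The pairs do not record whether a neighbor precedes or follows the
    center sentence, so previous and next sentences are treated alike.
    """
    pairs = []
    for i, center in enumerate(sentences):
        lo = max(0, i - window)
        hi = min(len(sentences), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sentences[j]))
    return pairs


doc = ["I woke up.", "I made coffee.", "I went to work."]
pairs = neighbor_pairs(doc)
```

For the three-sentence `doc`, the middle sentence contributes two pairs and each end sentence contributes one, so `pairs` has four entries. The original skip-thought setup would instead keep the previous and next sentence as separate prediction targets.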

arXiv.org e-Print Archive

Crossref