SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression
Neural sequence-to-sequence models are currently the dominant approach in
several natural language processing tasks, but require large parallel corpora.
We present a sequence-to-sequence-to-sequence autoencoder (SEQ^3), consisting
of two chained encoder-decoder pairs, with words used as a sequence of discrete
latent variables. We apply the proposed model to unsupervised abstractive
sentence compression, where the first and last sequences are the input and
reconstructed sentences, respectively, while the middle sequence is the
compressed sentence. Constraining the length of the latent word sequences
forces the model to distill important information from the input. A pretrained
language model, acting as a prior over the latent sequences, encourages the
compressed sentences to be human-readable. Continuous relaxations enable us to
sample from categorical distributions, allowing gradient-based optimization,
unlike alternatives that rely on reinforcement learning. The proposed model
does not require parallel text-summary pairs, achieving promising results in
unsupervised sentence compression on benchmark datasets.
Comment: Accepted to NAACL 201
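The continuous relaxation the abstract refers to is commonly implemented with the Gumbel-softmax trick: adding Gumbel noise to the logits and applying a temperature-scaled softmax yields a "soft" categorical sample through which gradients can flow. A minimal numpy sketch (the temperature `tau` and toy logits are illustrative, not values from the paper):

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=0.5, rng=None):
    """Draw a differentiable relaxed sample from a categorical distribution.

    As tau -> 0 the output approaches a discrete one-hot draw; at higher
    temperatures it is a smooth mixture, so gradient-based optimization
    can be used instead of reinforcement learning.
    """
    rng = rng or np.random.default_rng(0)
    u = rng.uniform(1e-10, 1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))   # Gumbel(0, 1) noise
    y = (logits + gumbel) / tau
    y = y - y.max()                # shift for numerical stability
    e = np.exp(y)
    return e / e.sum()

# Toy distribution over a 5-word vocabulary
sample = gumbel_softmax_sample(np.array([2.0, 0.5, 0.1, -1.0, 0.0]))
```

The output is a probability vector over the vocabulary rather than a hard word choice, which is what makes the whole autoencoder end-to-end differentiable.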
Deep Recurrent Generative Decoder for Abstractive Text Summarization
We propose a new framework for abstractive text summarization based on a
sequence-to-sequence oriented encoder-decoder model equipped with a deep
recurrent generative decoder (DRGN).
Latent structure information implied in the target summaries is learned based
on a recurrent latent random model for improving the summarization quality.
Neural variational inference is employed to address the intractable posterior
inference for the recurrent latent variables.
Abstractive summaries are generated based on both the generative latent
variables and the discriminative deterministic states.
Extensive experiments on benchmark datasets in different languages show
that DRGN achieves improvements over state-of-the-art methods.
Comment: 10 pages, EMNLP 201
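The neural variational inference mentioned above typically relies on the reparameterization trick: a latent Gaussian variable is written as a deterministic function of its parameters plus noise, so the intractable posterior can be approximated while gradients still flow. A minimal sketch of the two core pieces (function names and dimensions are illustrative, not from the paper):

```python
import numpy as np

def reparameterize(mu, log_var, rng=None):
    """Sample z ~ N(mu, sigma^2) as mu + sigma * eps, a deterministic
    function of noise, so gradients can pass through mu and log_var."""
    rng = rng or np.random.default_rng(0)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), the regularizer in the evidence
    lower bound that keeps the latent posterior close to the prior."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

mu, log_var = np.zeros(4), np.zeros(4)
z = reparameterize(mu, log_var)
kl = kl_to_standard_normal(mu, log_var)  # zero when posterior equals prior
```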
Abstractive Multi-Document Summarization via Phrase Selection and Merging
We propose an abstraction-based multi-document summarization framework that
can construct new sentences by exploring more fine-grained syntactic units than
sentences, namely, noun/verb phrases. Different from existing abstraction-based
approaches, our method first constructs a pool of concepts and facts
represented by phrases from the input documents. Then new sentences are
generated by selecting and merging informative phrases to maximize the salience
of phrases and meanwhile satisfy the sentence construction constraints. We
employ integer linear optimization for conducting phrase selection and merging
simultaneously in order to achieve the global optimal solution for a summary.
Experimental results on the benchmark dataset TAC 2011 show that our framework
outperforms state-of-the-art models under the automated pyramid evaluation
metric, and achieves reasonably good results in manual linguistic quality
evaluation.
Comment: 11 pages, 1 figure, accepted as a full paper at ACL 201
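The core optimization is a 0/1 selection problem: pick phrases that maximize total salience subject to sentence-construction constraints. A toy brute-force version of that selection under a simple word-count budget (a real system would use an ILP solver, and the phrases and scores here are invented for illustration):

```python
from itertools import combinations

# Hypothetical phrase pool: (phrase, salience score); values are made up.
phrases = [("stock markets fell", 3.0), ("sharply on Monday", 2.0),
           ("analysts warned of", 1.5), ("further volatility", 1.8),
           ("in a press release", 0.4)]
MAX_WORDS = 8  # stand-in for the sentence construction constraints

def best_selection(pool, max_words):
    """Exhaustively solve the 0/1 selection: maximize total salience
    subject to a word-count budget (a toy stand-in for the paper's ILP)."""
    best, best_score = (), float("-inf")
    for r in range(len(pool) + 1):
        for subset in combinations(pool, r):
            words = sum(len(p.split()) for p, _ in subset)
            score = sum(s for _, s in subset)
            if words <= max_words and score > best_score:
                best, best_score = subset, score
    return [p for p, _ in best], best_score

selected, score = best_selection(phrases, MAX_WORDS)
```

Brute force is exponential in the pool size, which is exactly why the paper formulates selection and merging as an integer linear program that a solver can handle at scale.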
A Multi-task Learning Approach for Improving Product Title Compression with User Search Log Data
Obtaining effective compression of lengthy product titles is a challenging and
practical research problem for E-commerce. It is particularly important as more
users browse mobile E-commerce apps and more merchants pad original product
titles with redundant terms for Search Engine Optimization. Traditional text
summarization approaches often incur substantial preprocessing costs and do not
capture conversion rate, an important consideration in E-commerce. This paper
proposes a novel multi-task learning
approach for improving product title compression with user search log data. In
particular, a pointer-network-based sequence-to-sequence model with an
attention mechanism serves as the extractive method for title compression, and
an attentive encoder-decoder model is used to generate user search
queries. The encoding parameters (i.e., semantic embedding of original titles)
are shared among the two tasks and the attention distributions are jointly
optimized. Extensive experiments with both human-annotated data and online
deployment demonstrate the advantages of the proposed approach in both
compression quality and online business value.
Comment: 8 pages, accepted at AAAI 201
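In a pointer-style extractive branch, an attention distribution over the source-title tokens doubles as the selection signal: the decoder "points" at tokens to keep. A minimal sketch of that scoring step with dot-product attention (dimensions, the shared encoder states, and the top-k extraction rule are illustrative assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def pointer_scores(enc_states, dec_state):
    """Dot-product attention over encoder states; in a pointer network
    this distribution directly selects which source tokens to copy."""
    return softmax(enc_states @ dec_state)

rng = np.random.default_rng(0)
enc = rng.standard_normal((6, 4))  # 6 source-title tokens, hidden size 4
dec = rng.standard_normal(4)       # current decoder state
attn = pointer_scores(enc, dec)    # probability distribution over tokens
keep = np.argsort(attn)[::-1][:3]  # extract the 3 highest-scoring tokens
```

The multi-task setup in the paper shares the encoder (and jointly optimizes the attention distributions) between this compression branch and the query-generation branch, so signal from search logs shapes which tokens score highly.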
Adapting End-to-End Speech Recognition for Readable Subtitles
Automatic speech recognition (ASR) systems are primarily evaluated on
transcription accuracy. However, in some use cases such as subtitling, verbatim
transcription would reduce output readability given limited screen size and
reading time. Therefore, this work focuses on ASR with output compression, a
task challenging for supervised approaches due to the scarcity of training
data. We first investigate a cascaded system, where an unsupervised compression
model is used to post-edit the transcribed speech. We then compare several
methods of end-to-end speech recognition under output length constraints. The
experiments show that with far less data than is needed to train a model from
scratch, we can adapt a Transformer-based ASR model to incorporate both
transcription and compression capabilities. Furthermore, the best performance
in terms of WER and ROUGE scores is achieved by explicitly modeling the length
constraints within the end-to-end ASR system.
Comment: IWSLT 202
Controlling Output Length in Neural Encoder-Decoders
Neural encoder-decoder models have shown great success in many sequence
generation tasks. However, previous work has not investigated situations in
which we would like to control the length of encoder-decoder outputs. This
capability is crucial for applications such as text summarization, in which we
have to generate concise summaries with a desired length. In this paper, we
propose methods for controlling the output sequence length for neural
encoder-decoder models: two decoding-based methods and two learning-based
methods. Results show that our learning-based methods have the capability to
control length without degrading summary quality in a summarization task.
Comment: 11 pages. To appear in EMNLP 201
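One decoding-based way to enforce a length range, in the spirit of the methods above (though not a reproduction of the paper's exact algorithms), is to mask the end-of-sequence token until a minimum length is reached and force it at the maximum. A sketch with a dummy stand-in for the model:

```python
import numpy as np

EOS = 0  # token id for end-of-sequence (illustrative)

def length_controlled_decode(next_logits, min_len, max_len):
    """Greedy decoding with decoding-time length control: EOS is masked
    out before min_len tokens are emitted and forced at max_len.
    next_logits is a stand-in for the model: prefix -> vocab logits."""
    out = []
    while True:
        if len(out) >= max_len:   # budget exhausted: force EOS
            out.append(EOS)
            break
        logits = next_logits(out).copy()
        if len(out) < min_len:    # too short: forbid EOS
            logits[EOS] = -np.inf
        tok = int(np.argmax(logits))
        out.append(tok)
        if tok == EOS:
            break
    return out

# Dummy "model" that always prefers EOS: output still respects min_len.
seq = length_controlled_decode(lambda prefix: np.array([5.0, 1.0, 0.5]),
                               min_len=3, max_len=6)  # -> [1, 1, 1, 0]
```

Masking at decode time needs no retraining, which is what distinguishes the decoding-based methods from the learning-based ones the abstract favors for quality.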