Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
Inspired by how humans summarize long documents, we propose an accurate and
fast summarization model that first selects salient sentences and then rewrites
them abstractively (i.e., compresses and paraphrases) to generate a concise
overall summary. We use a novel sentence-level policy gradient method to bridge
the non-differentiable computation between these two neural networks in a
hierarchical way, while maintaining language fluency. Empirically, we achieve
the new state-of-the-art on all metrics (including human evaluation) on the
CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores.
Moreover, by first operating at the sentence-level and then the word-level, we
enable parallel decoding of our neural generative model that results in
substantially faster (10-20x) inference speed as well as 4x faster training
convergence than previous long-paragraph encoder-decoder models. We also
demonstrate the generalization of our model on the test-only DUC-2002 dataset,
where we achieve higher scores than a state-of-the-art model.
Comment: ACL 2018 (17 pages)
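To make the sentence-level policy-gradient step concrete, below is a minimal NumPy sketch of a REINFORCE-style update of the kind the abstract describes: an extractor's selection probabilities are nudged toward sentences whose rewritten versions score well. The toy reward table, learning rate, and direct logit update are illustrative assumptions, not the authors' implementation (which uses ROUGE rewards and neural extractor/abstractor networks).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy "extractor": one logit per document sentence (in the paper this comes
# from a neural sentence extractor).
theta = rng.normal(size=5)

def toy_reward(idx):
    # Stand-in for ROUGE between the rewritten sentence idx and the reference summary.
    return [0.1, 0.7, 0.3, 0.2, 0.5][idx]

lr = 0.1
for _ in range(200):
    probs = softmax(theta)
    idx = rng.choice(len(probs), p=probs)            # sample a sentence to extract
    reward = toy_reward(idx)                         # reward for its rewritten version
    baseline = sum(p * toy_reward(i) for i, p in enumerate(probs))
    grad_logp = -probs                               # d log pi(idx) / d theta = onehot(idx) - probs
    grad_logp[idx] += 1.0
    theta += lr * (reward - baseline) * grad_logp    # policy-gradient ascent

print("selection probabilities:", np.round(softmax(theta), 3))
```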
Abstractive Summarization Using Attentive Neural Techniques
In a world of proliferating data, the ability to rapidly summarize text is
growing in importance. Automatic summarization of text can be thought of as a
sequence to sequence problem. Another area of natural language processing that
solves a sequence to sequence problem is machine translation, which is rapidly
evolving due to the development of attention-based encoder-decoder networks.
This work applies these modern techniques to abstractive summarization. We
perform analysis on various attention mechanisms for summarization with the
goal of developing an approach and architecture aimed at improving the state of
the art. In particular, we modify and optimize a translation model with
self-attention for generating abstractive sentence summaries. The effectiveness
of this base model along with attention variants is compared and analyzed in
the context of standardized evaluation sets and test metrics. However, we show
that these metrics are limited in their ability to effectively score
abstractive summaries, and propose a new approach based on the intuition that
an abstractive model requires an abstractive evaluation.
Comment: Accepted for oral presentation at the 15th International Conference on Natural Language Processing (ICON 2018)
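For readers unfamiliar with the self-attention this abstract builds on, here is a minimal scaled dot-product self-attention sketch in NumPy. The single-head formulation, dimensions, and random weights are illustrative assumptions rather than the authors' model.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a sequence X of shape (T, d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))   # (T, T) attention distribution
    return weights @ V                          # contextualized token representations

rng = np.random.default_rng(0)
T, d_model, d_k = 6, 16, 8
X = rng.normal(size=(T, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)      # (6, 8)
```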
Deep Reinforced Self-Attention Masks for Abstractive Summarization (DR.SAS)
We present a novel architectural scheme for abstractive summarization on the
CNN/DM dataset that fuses Reinforcement Learning (RL) with UniLM, a pre-trained
deep learning model for various natural language tasks. We have tested the
limits of learning
fine-grained attention in Transformers to improve the summarization quality.
UniLM applies attention to the entire token space in a global fashion. We
propose DR.SAS which applies the Actor-Critic (AC) algorithm to learn a dynamic
self-attention distribution over the tokens to reduce redundancy and generate
factual and coherent summaries, improving the quality of summarization. After
performing hyperparameter tuning, we achieved better ROUGE results compared to
the baseline. Our model tends to be more extractive/factual yet coherent in
detail because of optimization over ROUGE rewards. We present detailed error
analysis with examples of the strengths and limitations of our model. Our
codebase will be publicly available on our GitHub.
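As a rough illustration of the actor-critic idea sketched above, the toy example below learns a distribution over tokens with a REINFORCE-style actor update and a scalar critic baseline. The reward function and all parameters are invented for illustration and are not the DR.SAS implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_tokens = 8
actor_logits = np.zeros(n_tokens)   # parameters of the attention distribution (the "actor")
critic_value = 0.0                  # scalar baseline (the "critic")

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def toy_reward(token):
    # Invented stand-in for a ROUGE-style reward for attending to a given token.
    return 1.0 if token in (2, 5) else 0.0

lr_actor, lr_critic = 0.1, 0.05
for _ in range(500):
    probs = softmax(actor_logits)
    a = rng.choice(n_tokens, p=probs)        # sample which token to attend to
    advantage = toy_reward(a) - critic_value
    grad = -probs                            # d log pi(a) / d logits = onehot(a) - probs
    grad[a] += 1.0
    actor_logits += lr_actor * advantage * grad
    critic_value += lr_critic * advantage    # move the baseline toward observed rewards

print(np.round(softmax(actor_logits), 2))    # mass concentrates on the rewarded tokens
```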
Neural Abstractive Text Summarization with Sequence-to-Sequence Models
In the past few years, neural abstractive text summarization with
sequence-to-sequence (seq2seq) models has gained a lot of popularity. Many
interesting techniques have been proposed to improve seq2seq models, making
them capable of handling different challenges, such as saliency, fluency, and
human readability, and of generating high-quality summaries. Generally speaking,
most of these techniques differ in one of these three categories: network
structure, parameter inference, and decoding/generation. There are also other
concerns, such as efficiency and parallelism for training a model. In this
paper, we provide a comprehensive literature survey on different seq2seq models
for abstractive text summarization from the viewpoint of network structures,
training strategies, and summary generation algorithms. Several models were
first proposed for language modeling and generation tasks, such as machine
translation, and later applied to abstractive text summarization. Hence, we
also provide a brief review of these models. As part of this survey, we also
develop an open source library, namely, Neural Abstractive Text Summarizer
(NATS) toolkit, for abstractive text summarization. An extensive set of
experiments has been conducted on the widely used CNN/Daily Mail dataset to
examine the effectiveness of several different neural network components.
Finally, we benchmark two models implemented in NATS on the two recently
released datasets, namely, Newsroom and Bytecup.
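On the decoding/generation axis such surveys cover, the standard workhorse is beam search; below is a minimal beam-search sketch over a toy next-token table. The vocabulary and probabilities are made up for illustration and are unrelated to NATS.

```python
import math

# Toy conditional next-token distribution: P[next token | last token].
P = {
    "<s>": {"the": 0.7, "cat": 0.2, "sat": 0.1},
    "the": {"cat": 0.6, "sat": 0.3, "</s>": 0.1},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "sat": {"</s>": 1.0},
}

def beam_search(beam_size=2, max_len=5):
    beams = [(["<s>"], 0.0)]                        # (tokens, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == "</s>":
                candidates.append((tokens, score))  # keep finished hypotheses as-is
                continue
            for tok, p in P[tokens[-1]].items():
                candidates.append((tokens + [tok], score + math.log(p)))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_size]
    return beams

for tokens, score in beam_search():
    print(" ".join(tokens), round(score, 3))
```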
Deep Reinforcement Learning For Sequence to Sequence Models
In recent times, sequence-to-sequence (seq2seq) models have gained a lot of
popularity and provide state-of-the-art performance in a wide variety of tasks
such as machine translation, headline generation, text summarization, speech to
text conversion, and image caption generation. The underlying framework for all
these models is usually a deep neural network comprising an encoder and a
decoder. Although simple encoder-decoder models produce competitive results,
many researchers have proposed additional improvements over these
sequence-to-sequence models, e.g., using an attention-based model over the
input, pointer-generation models, and self-attention models. However, such
seq2seq models suffer from two common problems: 1) exposure bias and 2)
inconsistency between the training objective and the test-time evaluation
metric. Recently, a completely novel
point of view has emerged in addressing these two problems in seq2seq models,
leveraging methods from reinforcement learning (RL). In this survey, we
consider seq2seq problems from the RL point of view and provide a formulation
combining the power of RL methods in decision-making with sequence-to-sequence
models that can retain long-term memory. We present some of the most
recent frameworks that combine concepts from RL and deep neural networks and
explain how these two areas could benefit from each other in solving complex
seq2seq tasks. Our work aims to provide insights into some of the problems that
inherently arise with current approaches and how we can address them with
better RL models. We also provide the source code for implementing most of the
RL models discussed in this paper to support the complex task of abstractive
text summarization.
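A common RL remedy for the two problems listed above is self-critical sequence training, where the reward of a sampled output is baselined by the reward of the greedy output. The sketch below uses a toy unigram-F1 reward in place of ROUGE, and all inputs are invented for illustration.

```python
def unigram_f1(candidate, reference):
    """Toy reward: unigram F1 overlap between two token lists (a stand-in for ROUGE)."""
    c, r = set(candidate), set(reference)
    overlap = len(c & r)
    if overlap == 0:
        return 0.0
    p, rec = overlap / len(c), overlap / len(r)
    return 2 * p * rec / (p + rec)

def self_critical_loss(sample_logprob, sampled_tokens, greedy_tokens, reference):
    """REINFORCE with the greedy decode as baseline (self-critical sequence training)."""
    advantage = unigram_f1(sampled_tokens, reference) - unigram_f1(greedy_tokens, reference)
    return -advantage * sample_logprob   # minimizing this raises log-prob when the sample beats greedy

loss = self_critical_loss(
    sample_logprob=-3.2,                           # log-prob of the sampled sequence under the model
    sampled_tokens="a cat sat on a mat".split(),
    greedy_tokens="the cat".split(),
    reference="the cat sat on the mat".split(),
)
print(round(loss, 4))
```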
Deconvolutional Paragraph Representation Learning
Learning latent representations from long text sequences is an important
first step in many natural language processing applications. Recurrent Neural
Networks (RNNs) have become a cornerstone for this challenging task. However,
the quality of sentences during RNN-based decoding (reconstruction) decreases
with the length of the text. We propose a sequence-to-sequence, purely
convolutional and deconvolutional autoencoding framework that is free of the
above issue, while also being computationally efficient. The proposed method is
simple, easy to implement and can be leveraged as a building block for many
applications. We show empirically that compared to RNNs, our framework is
better at reconstructing and correcting long paragraphs. Quantitative
evaluation on semi-supervised text classification and summarization tasks
demonstrates the potential for better utilization of long unlabeled text data.
Comment: Accepted by NIPS 2017
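A minimal PyTorch sketch of a purely convolutional/deconvolutional text autoencoder in the spirit of the abstract appears below. Layer sizes, strides, and the vocabulary projection are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ConvDeconvAE(nn.Module):
    """Sketch: encode token embeddings with strided Conv1d layers, reconstruct with ConvTranspose1d."""
    def __init__(self, vocab_size=1000, emb_dim=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.Sequential(
            nn.Conv1d(emb_dim, hidden, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(hidden, hidden, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(hidden, emb_dim, kernel_size=4, stride=2, padding=1),
        )
        self.out = nn.Linear(emb_dim, vocab_size)   # project back to vocabulary logits

    def forward(self, tokens):                       # tokens: (batch, seq_len)
        x = self.embed(tokens).transpose(1, 2)       # (batch, emb_dim, seq_len)
        z = self.encoder(x)                          # latent feature map
        recon = self.decoder(z).transpose(1, 2)      # (batch, seq_len, emb_dim)
        return self.out(recon)                       # per-position reconstruction logits

model = ConvDeconvAE()
tokens = torch.randint(0, 1000, (2, 32))
print(model(tokens).shape)   # torch.Size([2, 32, 1000])
```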
A Survey on Neural Network-Based Summarization Methods
Automatic text summarization, the automated process of shortening a text
while preserving the main ideas of the document(s), is a critical research area
in natural language processing. The aim of this literature review is to survey
the recent work on neural-based models in automatic text summarization. We
examine in detail ten state-of-the-art neural-based summarizers: five
abstractive models and five extractive models. In addition, we discuss the
related techniques that can be applied to the summarization tasks and present
promising paths for future research in neural-based summarization.
Comment: 16 pages, 4 tables
Contrastive Attention Mechanism for Abstractive Sentence Summarization
We propose a contrastive attention mechanism to extend the
sequence-to-sequence framework for the abstractive sentence summarization task,
which aims to generate a brief summary of a given source sentence. The proposed
contrastive attention mechanism accommodates two categories of attention: one
is the conventional attention that attends to relevant parts of the source
sentence; the other is the opponent attention that attends to irrelevant or
less relevant parts of the source sentence. The two attentions are trained in
opposite ways, so that the contribution from the conventional attention is
encouraged and the contribution from the opponent attention is discouraged
through a novel softmax and softmin functionality. Experiments on benchmark
datasets show that the proposed contrastive attention mechanism is more
focused on the relevant parts for the summary than the conventional attention
mechanism, and greatly advances the state-of-the-art performance on the
abstractive sentence summarization task. We release the code at
https://github.com/travel-go/Abstractive-Text-Summarization
Comment: Accepted by EMNLP 2019
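The softmax/softmin pairing can be shown in a few lines of NumPy: the conventional attention is a softmax over relevance scores, while the opponent attention is a softmin (a softmax over negated scores) that concentrates on the least relevant positions. The scores and value vectors below are toy values, not the released model.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def softmin(x):
    # softmin(x) = softmax(-x): puts most mass on the *lowest* (least relevant) scores.
    return softmax(-x)

scores = np.array([2.0, 0.5, -1.0, 1.2])   # relevance scores for 4 source tokens
values = np.arange(8.0).reshape(4, 2)      # toy value vectors, one per token

conventional = softmax(scores) @ values    # attends to relevant tokens
opponent = softmin(scores) @ values        # attends to irrelevant / less relevant tokens

print(np.round(softmax(scores), 3), np.round(softmin(scores), 3))
print(conventional, opponent)
```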
Summary Level Training of Sentence Rewriting for Abstractive Summarization
As an attempt to combine extractive and abstractive summarization, Sentence
Rewriting models adopt the strategy of extracting salient sentences from a
document first and then paraphrasing the selected ones to generate a summary.
However, the existing models in this framework mostly rely on sentence-level
rewards or suboptimal labels, causing a mismatch between the training objective
and the evaluation metric. In this paper, we present a novel training signal that
directly maximizes summary-level ROUGE scores through reinforcement learning.
In addition, we incorporate BERT into our model, making good use of its
natural language understanding ability. In extensive experiments, we show that a
combination of our proposed model and training procedure obtains new
state-of-the-art performance on both CNN/Daily Mail and New York Times
datasets. We also demonstrate that it generalizes better on the DUC-2002 test set.
Comment: EMNLP 2019 Workshop on New Frontiers in Summarization
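The sentence-level vs. summary-level distinction can be illustrated with a toy reward: scoring each rewritten sentence against the reference and averaging is not the same as scoring the concatenated summary as a whole. The unigram-F1 reward below is a stand-in for ROUGE, and the sentences are invented.

```python
def unigram_f1(candidate, reference):
    """Toy stand-in for ROUGE: unigram F1 between two whitespace-tokenized strings."""
    c, r = set(candidate.split()), set(reference.split())
    overlap = len(c & r)
    if overlap == 0:
        return 0.0
    p, rec = overlap / len(c), overlap / len(r)
    return 2 * p * rec / (p + rec)

rewritten = ["storms hit the coast", "thousands lost power overnight"]
reference = "storms hit the coast and thousands lost power"

# Sentence-level signal: reward each rewritten sentence independently, then average.
sentence_level = sum(unigram_f1(s, reference) for s in rewritten) / len(rewritten)

# Summary-level signal: score the concatenated summary as a whole.
summary_level = unigram_f1(" ".join(rewritten), reference)

print(round(sentence_level, 3), round(summary_level, 3))   # the two signals differ
```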
Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks
Auto-encoders compress input data into a latent-space representation and
reconstruct the original data from the representation. This latent
representation is not easily interpreted by humans. In this paper, we propose
training an auto-encoder that encodes input text into human-readable sentences,
and unpaired abstractive summarization is thereby achieved. The auto-encoder is
composed of a generator and a reconstructor. The generator encodes the input
text into a shorter word sequence, and the reconstructor recovers the generator
input from the generator output. To make the generator output human-readable, a
discriminator restricts the output of the generator to resemble human-written
sentences. By taking the generator output as the summary of the input text,
abstractive summarization is achieved without document-summary pairs as
training data. Promising results are shown on both English and Chinese corpora.
Comment: Accepted by EMNLP 2018
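Schematically, the generator is trained against two signals: a reconstruction term (can the reconstructor recover the input from the generated summary?) and an adversarial term (does the discriminator accept the summary as human-readable?). The toy stand-in losses below are assumptions for illustration only; the paper uses neural networks for all three components.

```python
def reconstruction_loss(original_text, reconstructed_text):
    """Toy stand-in: fraction of original tokens the reconstructor failed to recover."""
    orig = original_text.split()
    recovered = set(reconstructed_text.split())
    return sum(tok not in recovered for tok in orig) / max(len(orig), 1)

def discriminator_score(summary_tokens, max_len=8):
    """Toy stand-in for the discriminator: prefers short, non-repetitive word sequences."""
    distinct = len(set(summary_tokens)) / max(len(summary_tokens), 1)
    brevity = 1.0 if len(summary_tokens) <= max_len else max_len / len(summary_tokens)
    return distinct * brevity   # higher = more "human-readable" to this toy critic

def generator_objective(original, summary, reconstruction, adv_weight=0.5):
    # The generator is trained so the reconstructor succeeds AND the discriminator is satisfied.
    rec = reconstruction_loss(original, reconstruction)
    adv = 1.0 - discriminator_score(summary.split())
    return rec + adv_weight * adv

original = "heavy storms hit the coast and thousands lost power overnight"
summary = "storms hit coast thousands lost power"
reconstruction = "storms hit the coast and thousands lost power"
print(round(generator_objective(original, summary, reconstruction), 3))
```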