A Deep Reinforced Model for Abstractive Summarization
Attentional, RNN-based encoder-decoder models for abstractive summarization
have achieved good performance on short input and output sequences. For longer
documents and summaries, however, these models often generate repetitive and
incoherent phrases. We introduce a neural network model with a novel
intra-attention that attends over the input and continuously generated output
separately, and a new training method that combines standard supervised word
prediction and reinforcement learning (RL). Models trained only with supervised
learning often exhibit "exposure bias": they assume the ground truth is provided
at each step during training. However, when standard word prediction is
combined with the global sequence-prediction training of RL, the resulting
summaries become more readable. We evaluate this model on the CNN/Daily Mail
and New York Times datasets. Our model obtains a 41.16 ROUGE-1 score on the
CNN/Daily Mail dataset, an improvement over previous state-of-the-art models.
Human evaluation also shows that our model produces higher-quality summaries.
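As a concrete picture of the combined objective this abstract describes, here is a minimal PyTorch sketch mixing a token-level cross-entropy loss with a self-critical policy-gradient term. The function name, the tensor inputs, and the default mixing weight are illustrative assumptions, not the authors' code.

```python
import torch

def mixed_loss(sample_log_prob, sample_reward, baseline_reward, ml_loss, gamma=0.9984):
    """Combine supervised word prediction with a self-critical RL term.

    sample_log_prob: (batch,) tensor, summed log-probabilities of a sampled summary
    sample_reward:   (batch,) tensor, e.g. ROUGE of the sampled summary
    baseline_reward: (batch,) tensor, e.g. ROUGE of the greedy (baseline) summary
    ml_loss:         scalar token-level cross-entropy loss
    gamma:           assumed mixing weight; treat as a tuning knob
    """
    # Self-critical advantage: how much the sample beats the greedy baseline.
    advantage = (sample_reward - baseline_reward).detach()
    rl_loss = -(advantage * sample_log_prob).mean()
    return gamma * rl_loss + (1.0 - gamma) * ml_loss
```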
Fine-tune BERT for Extractive Summarization
BERT, a pre-trained Transformer model, has achieved ground-breaking
performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple
variant of BERT, for extractive summarization. Our system is the state of the
art on the CNN/Daily Mail dataset, outperforming the previous best-performing
system by 1.65 on ROUGE-L. The code to reproduce our results is available at
https://github.com/nlpyang/BertSum
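A rough sketch of the extractive-scoring idea, written against the Hugging Face transformers API rather than the BERTSUM code itself: prepend a [CLS] token to each sentence, run BERT once, and score every sentence from its [CLS] vector. The linear scorer here is untrained, and the interval segment embeddings of the actual system are omitted.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
scorer = torch.nn.Linear(bert.config.hidden_size, 1)  # per-sentence score head

def score_sentences(sentences):
    # One [CLS] per sentence; truncation keeps us under BERT's 512-token limit.
    text = " ".join(f"[CLS] {s} [SEP]" for s in sentences)
    inputs = tokenizer(text, return_tensors="pt",
                       add_special_tokens=False, truncation=True)
    hidden = bert(**inputs).last_hidden_state[0]           # (seq_len, hidden)
    cls_positions = (inputs["input_ids"][0] == tokenizer.cls_token_id)
    cls_vectors = hidden[cls_positions.nonzero().squeeze(-1)]  # one per sentence
    return torch.sigmoid(scorer(cls_vectors)).squeeze(-1)  # extraction scores
```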
Learning to Extract Coherent Summary via Deep Reinforcement Learning
Coherence plays a critical role in producing a high-quality summary from a
document. In recent years, neural extractive summarization has become
increasingly attractive. However, most existing models ignore the coherence of
summaries when extracting sentences. As an effort towards extracting coherent
summaries, we propose a neural coherence model to capture the cross-sentence
semantic and syntactic coherence patterns. The proposed neural coherence model
obviates the need for feature engineering and can be trained in an end-to-end
fashion using unlabeled data. Empirical results show that the proposed neural
coherence model can efficiently capture the cross-sentence coherence patterns.
Using the combined output of the neural coherence model and ROUGE package as
the reward, we design a reinforcement learning method to train a neural
extractive summarizer, the Reinforced Neural Extractive Summarization (RNES)
model. RNES learns to optimize the coherence and informativeness of the summary
simultaneously. Experimental results show that RNES outperforms existing
baselines and achieves state-of-the-art performance in terms of ROUGE on the
CNN/Daily Mail dataset. Qualitative evaluation indicates that summaries
produced by RNES are more coherent and readable.
Comment: 8 pages, 1 figure, presented at AAAI-2018
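To make the reward design concrete, a hedged Python sketch follows. Here coherence_model and rouge_score are hypothetical callables standing in for the neural coherence model and the ROUGE package, and the weighted-sum combination with weight lam is an assumption about the exact form used in RNES.

```python
def combined_reward(extracted, reference, coherence_model, rouge_score, lam=0.5):
    """Mix cross-sentence coherence with ROUGE-based informativeness.

    extracted: list of extracted sentences; reference: gold summary string.
    """
    # Average coherence over adjacent sentence pairs in the extracted summary.
    pairs = zip(extracted, extracted[1:])
    coherence = sum(coherence_model(a, b) for a, b in pairs) / max(len(extracted) - 1, 1)
    informativeness = rouge_score(" ".join(extracted), reference)
    return lam * coherence + (1.0 - lam) * informativeness
```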
Abstractive Summarization of Reddit Posts with Multi-level Memory Networks
We address the problem of abstractive summarization in two directions:
proposing a novel dataset and a new model. First, we collect the Reddit TIFU
dataset, consisting of 120K posts from the online discussion forum Reddit. We
use these informal, crowd-generated posts as the text source, in contrast with
existing datasets that mostly use formal documents, such as news articles, as
the source. Thus, our dataset suffers less from the biases that key sentences
are usually located at the beginning of the text and that favorable summary
candidates already appear in the text in similar forms. Second, we propose a
novel abstractive summarization model named multi-level memory networks (MMN),
equipped with multi-level memory to store the information of text from
different levels of abstraction. With quantitative evaluation and user studies
via Amazon Mechanical Turk, we show that the Reddit TIFU dataset is highly
abstractive and that the MMN outperforms state-of-the-art summarization models.
Comment: Published in NAACL-HLT 2019 (Oral)
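A loose PyTorch sketch of the multi-level memory idea: encode the source at several levels of abstraction and let a decoder query each one. The dilated-convolution encoder and the layer sizes here are illustrative assumptions, not the MMN architecture itself.

```python
import torch
import torch.nn as nn

class MultiLevelMemory(nn.Module):
    def __init__(self, dim=256, levels=3):
        super().__init__()
        # Growing dilation = growing receptive field = higher abstraction level.
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, dim, kernel_size=3, padding=2**i, dilation=2**i)
            for i in range(levels)
        )

    def forward(self, src, query):
        # src: (batch, seq, dim) source states; query: (batch, dim) decoder state.
        x, reads = src.transpose(1, 2), []
        for conv in self.convs:
            x = torch.relu(conv(x))                    # one abstraction level
            mem = x.transpose(1, 2)                    # (batch, seq, dim)
            attn = torch.softmax(mem @ query.unsqueeze(-1), dim=1)
            reads.append((mem * attn).sum(dim=1))      # attentive read per level
        return torch.cat(reads, dim=-1)                # concatenate level reads
```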
Reference and Document Aware Semantic Evaluation Methods for Korean Language Summarization
Text summarization refers to the process of generating a shorter form of text
from the source document while preserving salient information. Many existing
works for text summarization are generally evaluated by using recall-oriented
understudy for gisting evaluation (ROUGE) scores. However, as ROUGE scores are
computed based on n-gram overlap, they do not reflect semantic meaning
correspondences between generated and reference summaries. Because Korean is an
agglutinative language that combines various morphemes into words that express
several meanings, ROUGE is not suitable for Korean summarization. In this
paper, we propose an evaluation metric that reflects the semantic meaning of
both the reference summary and the original document: the Reference and
Document Aware Semantic Score (RDASS). We then propose a method for improving
the correlation of the metric with human judgment. Evaluation results show that
the correlation with human judgment is significantly higher for our metric
than for ROUGE scores.
Comment: COLING 2020
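A minimal sketch of the RDASS computation as described above, assuming the sentence-transformers library: score a generated summary by its embedding similarity to both the reference and the source document, then average. The specific multilingual model name and the simple average are assumptions; the paper's fine-tuning of the encoder is not reproduced.

```python
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def rdass(generated, reference, document):
    g, r, d = encoder.encode([generated, reference, document],
                             convert_to_tensor=True)
    sim_ref = util.cos_sim(g, r).item()   # generated vs. reference summary
    sim_doc = util.cos_sim(g, d).item()   # generated vs. source document
    return (sim_ref + sim_doc) / 2.0
```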
From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information
Text summarization is the research area aiming at creating a short and
condensed version of the original document, which conveys the main idea of the
document in a few words. This research topic has started to attract the
attention of a large community of researchers, and it is nowadays counted as
one of the most promising research areas. In general, text summarization
algorithms take a plain-text document as input and output a summary. However,
in real-world applications, most data is not in plain-text format. Instead,
there is manifold information to be summarized, such as a web page to be
summarized with respect to a search-engine query, extremely long documents
(e.g., academic papers), dialogue histories, and so on. In this paper, we
survey these new summarization tasks and approaches as they arise in
real-world applications.
Comment: Accepted by IJCAI 2020 Survey Track
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
Inspired by how humans summarize long documents, we propose an accurate and
fast summarization model that first selects salient sentences and then rewrites
them abstractively (i.e., compresses and paraphrases) to generate a concise
overall summary. We use a novel sentence-level policy gradient method to bridge
the non-differentiable computation between these two neural networks in a
hierarchical way, while maintaining language fluency. Empirically, we achieve
the new state-of-the-art on all metrics (including human evaluation) on the
CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores.
Moreover, by first operating at the sentence-level and then the word-level, we
enable parallel decoding of our neural generative model that results in
substantially faster (10-20x) inference speed as well as 4x faster training
convergence than previous long-paragraph encoder-decoder models. We also
demonstrate the generalization of our model on the test-only DUC-2002 dataset,
where we achieve higher scores than a state-of-the-art model.
Comment: ACL 2018 (17 pages)
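The two-stage pipeline above can be sketched in a few lines; extractor and abstractor are hypothetical stand-ins for the paper's two networks, and the sentence-level policy-gradient training that bridges them is not shown.

```python
def summarize(document_sentences, extractor, abstractor):
    """extractor: selects indices of salient sentences, e.g. [3, 0, 7];
    abstractor: rewrites one sentence into a compressed paraphrase."""
    selected = extractor(document_sentences)
    # Each selected sentence is rewritten independently, which is what
    # permits the parallel decoding (and speedup) reported in the abstract.
    return [abstractor(document_sentences[i]) for i in selected]
```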
Bidirectional Attentional Encoder-Decoder Model and Bidirectional Beam Search for Abstractive Summarization
Sequence generative models with RNN variants, such as LSTM, GRU, show
promising performance on abstractive document summarization. However, they
still have some issues that limit their performance, especially when dealing
with long sequences. One of the issues is that, to the best of our knowledge,
all current models employ a unidirectional decoder, which reasons only about
the past and is limited in retaining future context when making a prediction.
This causes these models to generate unbalanced outputs.
Moreover, unidirectional attention-based document summarization can only
capture partial aspects of attentional regularities due to the inherent
challenges of document summarization. To this end, we propose an end-to-end
trainable bidirectional RNN model to tackle the aforementioned issues. The
model has a bidirectional encoder-decoder architecture; in which the encoder
and the decoder are bidirectional LSTMs. The forward decoder is initialized
with the last hidden state of the backward encoder while the backward decoder
is initialized with the last hidden state of the forward encoder. In addition,
a bidirectional beam search mechanism is proposed as an approximate inference
algorithm for generating the output summaries from the bidirectional model.
This enables the model to reason about the past and future and to generate
balanced outputs as a result. Experimental results on the CNN/Daily Mail
dataset show that the proposed model outperforms the current abstractive
state-of-the-art models by a considerable margin.
Comment: Preprint
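A minimal PyTorch sketch of the cross-initialization described above: the forward decoder is seeded from the backward encoder's final state, and vice versa. Dimensions are illustrative, and attention plus the bidirectional beam search are omitted.

```python
import torch
import torch.nn as nn

dim = 128
enc_fwd = nn.LSTM(dim, dim, batch_first=True)
enc_bwd = nn.LSTM(dim, dim, batch_first=True)
dec_fwd = nn.LSTM(dim, dim, batch_first=True)
dec_bwd = nn.LSTM(dim, dim, batch_first=True)

src = torch.randn(1, 20, dim)                      # encoded source embeddings
_, fwd_state = enc_fwd(src)                        # reads source left-to-right
_, bwd_state = enc_bwd(torch.flip(src, dims=[1]))  # reads source right-to-left

tgt = torch.randn(1, 15, dim)                      # target-side embeddings
# Forward decoder starts from the backward encoder's final (h, c) state;
# backward decoder starts from the forward encoder's final (h, c) state.
out_fwd, _ = dec_fwd(tgt, bwd_state)
out_bwd, _ = dec_bwd(torch.flip(tgt, dims=[1]), fwd_state)
```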
Robust Neural Abstractive Summarization Systems and Evaluation against Adversarial Information
Sequence-to-sequence (seq2seq) neural models have been actively investigated
for abstractive summarization. Nevertheless, existing neural abstractive
systems frequently generate factually incorrect summaries and are vulnerable to
adversarial information, suggesting a crucial lack of semantic understanding.
In this paper, we propose a novel semantic-aware neural abstractive
summarization model that learns to generate high quality summaries through
semantic interpretation over salient content. A novel evaluation scheme with
adversarial samples is introduced to measure how well a model identifies
off-topic information, where our model yields significantly better performance
than the popular pointer-generator summarizer. Human evaluation also confirms
that our system's summaries are uniformly more informative and faithful, as
well as less redundant, than those of the seq2seq model.
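One way to picture the adversarial evaluation idea is the sketch below, which splices off-topic sentences into a document and measures how often they leak into the summary. The exact-containment check is a deliberately crude proxy, not the paper's evaluation scheme, and summarize is a hypothetical callable.

```python
import random

def adversarial_leak_rate(summarize, document_sents, offtopic_sents,
                          trials=100, seed=0):
    """Fraction of trials in which an injected off-topic sentence
    reappears verbatim in the generated summary."""
    rng, leaks = random.Random(seed), 0
    for _ in range(trials):
        distractor = rng.choice(offtopic_sents)
        corrupted = list(document_sents)
        corrupted.insert(rng.randrange(len(corrupted) + 1), distractor)
        summary = summarize(" ".join(corrupted))
        leaks += distractor in summary  # crude verbatim-leak check
    return leaks / trials
```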
A novel repetition normalized adversarial reward for headline generation
While reinforcement learning can effectively improve language generation
models, it often suffers from generating incoherent and repetitive phrases
(Paulus et al., 2017). In this paper, we propose a novel repetition-normalized
adversarial reward to mitigate these problems. Our repetition-penalized reward
greatly reduces the repetition rate, and adversarial training mitigates the
generation of incoherent phrases. Our model significantly outperforms the
baseline model on ROUGE-1 (+3.24) and ROUGE-L (+2.25), with a decreased
repetition rate (-4.98%).
Comment: Accepted by ICASSP 2019
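A hedged sketch of what a repetition-normalized reward can look like: scale a base reward (e.g. ROUGE or an adversarial discriminator score) down by the fraction of repeated tokens in the generated headline. The exact normalization in the paper may differ; this shows the general shape.

```python
def repetition_rate(tokens):
    """Fraction of tokens that are repeats; 0.0 means no repetition."""
    return 1.0 - len(set(tokens)) / max(len(tokens), 1)

def normalized_reward(base_reward, generated):
    tokens = generated.split()
    # Penalize the reward in proportion to how repetitive the output is.
    return base_reward * (1.0 - repetition_rate(tokens))
```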