251 research outputs found
From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information
Text summarization is the research area aiming at creating a short and
condensed version of the original document, which conveys the main idea of the
document in a few words. This research topic has started to attract the
attention of a large community of researchers, and it is nowadays counted as
one of the most promising research areas. In general, text summarization
algorithms aim at using a plain text document as input and then output a
summary. However, in real-world applications, most of the data is not in a
plain text format. Instead, there is much manifold information to be
summarized, such as the summary of a web page based on a query in a search
engine, extremely long documents (e.g., academic papers), dialogue history,
and so on. In this paper, we survey these new summarization tasks and
approaches in real-world applications.
Comment: Accepted by IJCAI 2020 Survey Track
Abstractive Summarization of Reddit Posts with Multi-level Memory Networks
We address the problem of abstractive summarization in two directions:
proposing a novel dataset and a new model. First, we collect Reddit TIFU
dataset, consisting of 120K posts from the online discussion forum Reddit. We
use such informal crowd-generated posts as text source, in contrast with
existing datasets that mostly use formal documents as source such as news
articles. Thus, our dataset is less subject to the biases that key
sentences usually appear at the beginning of the text and that favorable
summary candidates already occur in the text in similar forms. Second, we propose a
novel abstractive summarization model named multi-level memory networks (MMN),
equipped with multi-level memory to store the information of text from
different levels of abstraction. With quantitative evaluation and user studies
via Amazon Mechanical Turk, we show the Reddit TIFU dataset is highly
abstractive and the MMN outperforms the state-of-the-art summarization models.
Comment: Published in NAACL-HLT 2019 (Oral)
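The idea of reading from memories at several abstraction levels can be pictured with a minimal sketch. This is illustrative only: the actual MMN builds its memories with convolutions and learns how to combine the reads, whereas all names, dimensions, and the simple mean-combination below are invented.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_read(query, memory):
    """Attend over one memory level: dot-product scores -> weighted sum."""
    scores = memory @ query            # (n_slots,)
    weights = softmax(scores)          # attention over slots
    return weights @ memory            # (d,) read vector

def multi_level_read(query, levels):
    """Read from several memory levels (e.g., word-level and sentence-level)
    and average the reads into one context vector (toy combination)."""
    reads = [memory_read(query, m) for m in levels]
    return np.mean(reads, axis=0)

rng = np.random.default_rng(0)
d = 8
word_mem = rng.normal(size=(20, d))    # fine-grained memory slots
sent_mem = rng.normal(size=(5, d))     # coarse-grained memory slots
query = rng.normal(size=d)
context = multi_level_read(query, [word_mem, sent_mem])
print(context.shape)  # (8,)
```

The point of the sketch is only that each abstraction level contributes its own attention read, which a decoder could then condition on.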
Abstractive Dialogue Summarization with Sentence-Gated Modeling Optimized by Dialogue Acts
Neural abstractive summarization has been increasingly studied, where the
prior work mainly focused on summarizing single-speaker documents (news,
scientific publications, etc). In dialogues, there are different interactions
between speakers, which are usually defined as dialogue acts. The interactive
signals may provide informative cues for better summarizing dialogues. This
paper proposes to explicitly leverage dialogue acts in a neural summarization
model, where a sentence-gated mechanism is designed for modeling the
relationship between dialogue acts and the summary. The experiments show that
our proposed model significantly improves the abstractive summarization
performance compared to the state-of-the-art baselines on AMI meeting corpus,
demonstrating the usefulness of the interactive signal provided by dialogue
acts.
Comment: 8 pages, accepted by SLT 2018
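A sentence-gated mechanism of this kind can be sketched as a sigmoid gate computed from a dialogue-act representation and used to scale a sentence vector. This is a simplified sketch under invented names (`W`, `b`, the dimensions); the paper's gate sits inside a full encoder-decoder rather than operating on raw vectors.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sentence_gate(sent_vec, act_vec, W, b):
    """Compute a scalar gate from the dialogue-act vector and use it to
    scale the sentence representation (simplified gating sketch)."""
    g = sigmoid(W @ act_vec + b)   # scalar in (0, 1)
    return g * sent_vec, g

rng = np.random.default_rng(1)
d = 6
sent = rng.normal(size=d)   # sentence representation
act = rng.normal(size=d)    # dialogue-act representation
W = rng.normal(size=d)      # toy gate parameters
b = 0.0
gated, g = sentence_gate(sent, act, W, b)
print(0.0 < g < 1.0)  # True
```

The gate lets the dialogue-act signal softly decide how much each sentence contributes to the summary representation.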
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
fairseq is an open-source sequence modeling toolkit that allows researchers
and developers to train custom models for translation, summarization, language
modeling, and other text generation tasks. The toolkit is based on PyTorch and
supports distributed training across multiple GPUs and machines. We also
support fast mixed-precision training and inference on modern GPUs. A demo
video can be found at https://www.youtube.com/watch?v=OtgDdWtHvto
Comment: NAACL 2019 Demo paper
What is this Article about? Extreme Summarization with Topic-aware Convolutional Neural Networks
We introduce 'extreme summarization', a new single-document summarization
task which aims at creating a short, one-sentence news summary answering the
question "What is the article about?". We argue that extreme summarization,
by nature, is not amenable to extractive strategies and requires an abstractive
modeling approach. In the hope of driving research on this task further: (a) we
collect a real-world, large scale dataset by harvesting online articles from
the British Broadcasting Corporation (BBC); and (b) propose a novel abstractive
model which is conditioned on the article's topics and based entirely on
convolutional neural networks. We demonstrate experimentally that this
architecture captures long-range dependencies in a document and recognizes
pertinent content, outperforming an oracle extractive system and
state-of-the-art abstractive approaches when evaluated automatically and by
humans on the extreme summarization dataset.
Comment: Accepted to appear in Journal of Artificial Intelligence Research (JAIR), 37 pages
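Topic conditioning in a convolutional encoder can be pictured as injecting a document-level topic vector into the token representations before convolving over them. This is a rough sketch only: the actual model conditions on learned topic distributions inside a full convolutional seq2seq stack, and every name and dimension below is invented.

```python
import numpy as np

def conv1d(x, kernel):
    """Valid 1-D convolution over time: x is (T, d), kernel is (k, d)."""
    T, d = x.shape
    k = kernel.shape[0]
    return np.array([np.sum(x[t:t + k] * kernel) for t in range(T - k + 1)])

rng = np.random.default_rng(2)
T, d = 10, 4
tokens = rng.normal(size=(T, d))   # token embeddings
topic = rng.normal(size=d)         # document topic vector
conditioned = tokens + topic       # broadcast the topic into every position
kernel = rng.normal(size=(3, d))   # one width-3 convolution filter
features = conv1d(conditioned, kernel)
print(features.shape)  # (8,)
```

Because every position sees the topic vector, each convolutional feature is computed with global topical context even though the filter itself is local.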
Leveraging Graph to Improve Abstractive Multi-Document Summarization
Graphs that capture relations between textual units have great benefits for
detecting salient information from multiple documents and generating overall
coherent summaries. In this paper, we develop a neural abstractive
multi-document summarization (MDS) model which can leverage well-known graph
representations of documents such as similarity graph and discourse graph, to
more effectively process multiple input documents and produce abstractive
summaries. Our model utilizes graphs to encode documents in order to capture
cross-document relations, which is crucial to summarizing long documents. Our
model can also take advantage of graphs to guide the summary generation
process, which is beneficial for generating coherent and concise summaries.
Furthermore, pre-trained language models can be easily combined with our model,
which further improves the summarization performance significantly. Empirical
results on the WikiSum and MultiNews datasets show that the proposed
architecture brings substantial improvements over several strong baselines.
Comment: Accepted by ACL 2020
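One simple way a similarity graph can steer attention, sketched below: shift the attention logits by the log of the graph edge weights so that strongly connected paragraphs receive more probability mass. This is an illustrative sketch, not the paper's architecture; the names, the log-bias form, and the toy numbers are all invented.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def graph_attention(query, nodes, adj_row):
    """Attention over paragraph vectors, biased by similarity-graph weights:
    dot-product scores are shifted by log edge weights, so paragraphs with
    near-zero graph connection get almost no attention mass."""
    scores = nodes @ query + np.log(adj_row + 1e-9)
    weights = softmax(scores)
    return weights @ nodes, weights

rng = np.random.default_rng(3)
n, d = 4, 5
paras = rng.normal(size=(n, d))         # paragraph representations
q = rng.normal(size=d)                  # decoder query
adj = np.array([1.0, 0.5, 0.1, 0.0])    # graph weights to each paragraph
ctx, w = graph_attention(q, paras, adj)
print(abs(w.sum() - 1.0) < 1e-9)  # True
```

The paragraph with edge weight 0.0 is effectively masked out, which is how cross-document structure can both focus encoding and guide generation.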
Robust Neural Abstractive Summarization Systems and Evaluation against Adversarial Information
Sequence-to-sequence (seq2seq) neural models have been actively investigated
for abstractive summarization. Nevertheless, existing neural abstractive
systems frequently generate factually incorrect summaries and are vulnerable to
adversarial information, suggesting a crucial lack of semantic understanding.
In this paper, we propose a novel semantic-aware neural abstractive
summarization model that learns to generate high quality summaries through
semantic interpretation over salient content. A novel evaluation scheme with
adversarial samples is introduced to measure how well a model identifies
off-topic information, where our model yields significantly better performance
than the popular pointer-generator summarizer. Human evaluation also confirms
that our system summaries are uniformly more informative and faithful as well
as less redundant than the seq2seq model.
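The pointer-generator summarizer used as the baseline above mixes a vocabulary distribution with copy probabilities taken from the attention over source tokens. Here is a minimal sketch of that final distribution with toy numbers (not from any trained model):

```python
import numpy as np

def final_distribution(p_gen, vocab_dist, attn, src_ids):
    """Pointer-generator mixture: with probability p_gen emit from the
    vocabulary distribution, otherwise copy a source token according to
    the attention weights."""
    dist = p_gen * vocab_dist
    for a, idx in zip(attn, src_ids):
        dist[idx] += (1.0 - p_gen) * a   # scatter copy probability onto ids
    return dist

vocab_dist = np.array([0.1, 0.2, 0.3, 0.2, 0.1, 0.1])
attn = np.array([0.5, 0.5])   # attention over two source tokens
src_ids = [2, 4]              # their vocabulary ids
dist = final_distribution(0.7, vocab_dist, attn, src_ids)
print(round(dist.sum(), 6))  # 1.0
```

Token 2 ends up with probability 0.7*0.3 + 0.3*0.5 = 0.36: generation and copying each contribute, which is exactly what makes the model prone to copying adversarial source content the abstract warns about.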
Neural Abstractive Text Summarization with Sequence-to-Sequence Models
In the past few years, neural abstractive text summarization with
sequence-to-sequence (seq2seq) models has gained a lot of popularity. Many
interesting techniques have been proposed to improve seq2seq models, making
them capable of handling different challenges, such as saliency, fluency, and
human readability, and of generating high-quality summaries. Generally speaking,
most of these techniques differ in one of these three categories: network
structure, parameter inference, and decoding/generation. There are also other
concerns, such as efficiency and parallelism for training a model. In this
paper, we provide a comprehensive literature survey on different seq2seq models
for abstractive text summarization from the viewpoint of network structures,
training strategies, and summary generation algorithms. Several models were
first proposed for language modeling and generation tasks, such as machine
translation, and later applied to abstractive text summarization. Hence, we
also provide a brief review of these models. As part of this survey, we also
develop an open-source library, the Neural Abstractive Text Summarizer
(NATS) toolkit, for abstractive text summarization. An extensive set of
experiments have been conducted on the widely used CNN/Daily Mail dataset to
examine the effectiveness of several different neural network components.
Finally, we benchmark two models implemented in NATS on the two recently
released datasets, namely, Newsroom and Bytecup.
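On the decoding/generation side surveyed above, beam search is the standard generation algorithm for these seq2seq models. The toy sketch below runs it over a fixed table of per-step log-probabilities; a real decoder would condition each step's scores on the generated prefix, which this sketch deliberately omits.

```python
import math

def beam_search(step_logprobs, beam_size=2):
    """Toy beam search: keep the beam_size highest-scoring prefixes at
    each step, scoring a sequence by the sum of its token log-probs."""
    beams = [([], 0.0)]
    for logprobs in step_logprobs:
        candidates = []
        for seq, score in beams:
            for tok, lp in enumerate(logprobs):
                candidates.append((seq + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]   # prune to the best beam_size
    return beams

table = [
    [math.log(0.6), math.log(0.3), math.log(0.1)],  # step 1 scores
    [math.log(0.2), math.log(0.7), math.log(0.1)],  # step 2 scores
]
best_seq, best_score = beam_search(table)[0]
print(best_seq)  # [0, 1]
```

With a beam of size 1 this reduces to greedy decoding; larger beams trade computation for a wider search over candidate summaries.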
A Large-Scale Multi-Length Headline Corpus for Analyzing Length-Constrained Headline Generation Model Evaluation
News articles are now browsed on multiple devices, and the lengths of
news article headlines have precise upper bounds, dictated by the size of the
display of the relevant device or interface. Therefore, controlling the length
of headlines is essential when applying the task of headline generation to news
production. However, because there is no corpus of headlines of multiple
lengths for a given article, previous research on controlling output length in
headline generation has not discussed whether the system outputs could be
adequately evaluated without multiple references of different lengths. In this
paper, we introduce two corpora, which are Japanese News Corpus (JNC) and
JApanese MUlti-Length Headline Corpus (JAMUL), to confirm the validity of
previous evaluation settings. The JNC provides common supervision data for
headline generation. The JAMUL is a large-scale evaluation dataset for
headlines of three different lengths composed by professional editors. We
report new findings on these corpora; for example, although the longest
reference headline can appropriately evaluate existing methods for controlling
output length, this evaluation setting has several problems.
Comment: Accepted by INLG 2019
AI-Powered Text Generation for Harmonious Human-Machine Interaction: Current State and Future Directions
In the last two decades, the landscape of text generation has undergone
tremendous changes and is being reshaped by the success of deep learning. New
technologies for text generation, ranging from template-based methods to neural
network-based methods, have emerged. Meanwhile, the research objectives have
also shifted from generating smooth and coherent sentences to infusing
personalized traits that enrich the diversity of newly generated content. Given
the rapid development of text generation solutions, a comprehensive survey is
urgently needed to summarize the achievements and track the state of the art.
In this survey paper, we present a general systematic framework, illustrate the
widely used models, and summarize the classic applications of text generation.
Comment: Accepted by IEEE UIC 2019