Self-Attentive Model for Headline Generation
Headline generation is a special type of text summarization task. While the amount of available training data for this task is almost unlimited, the task remains challenging, since generating headlines for news articles requires strong natural language reasoning from the model. To overcome this issue, we applied the recent Universal Transformer architecture paired with a byte-pair encoding technique and achieved new state-of-the-art results on the New York Times Annotated Corpus, with a ROUGE-L F1-score of 24.84 and a ROUGE-2 F1-score of 13.48. We also present the new RIA corpus, on which we reach a ROUGE-L F1-score of 36.81 and a ROUGE-2 F1-score of 22.15.
Comment: accepted for ECIR 201
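The byte-pair encoding named above is the standard subword-segmentation technique of Sennrich et al. (2016). As a minimal sketch of the merge-learning step (not the authors' actual tokenizer, whose settings the abstract does not give):

```python
from collections import Counter

def learn_bpe_merges(words, num_merges):
    # Represent each word as a tuple of symbols, starting from characters;
    # repeatedly merge the most frequent adjacent symbol pair.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])  # merge the pair
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

# e.g. learns ('l', 'o') then ('lo', 'w') on this toy corpus
print(learn_bpe_merges(["low", "lower", "lowest", "low"], 2))
```

Applying the learned merges to unseen text yields a compact subword vocabulary, which lets the headline model cover the effectively open vocabulary of news text.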
Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction
Automatic sentence summarization produces a shorter version of a sentence,
while preserving its most important information. A good summary is
characterized by language fluency and high information overlap with the source
sentence. We model these two aspects in an unsupervised objective function,
consisting of language modeling and semantic similarity metrics. We search for
a high-scoring summary by discrete optimization. Our proposed method achieves a new state-of-the-art for unsupervised sentence summarization according to ROUGE scores. Additionally, we demonstrate that the commonly reported ROUGE F1 metric is sensitive to summary length. Since this is inadvertently exploited in recent work, we emphasize that future evaluations should explicitly group summarization systems into output-length brackets.
Comment: Accepted at ACL 202
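To make the objective-plus-search idea above concrete, here is a hedged sketch: a summary is a word-level extraction (a binary mask over the source sentence) scored by a supplied objective, and a simple random-restart hill climber searches for a high-scoring mask. Both `toy_score` and the hill climber are illustrative stand-ins, not the paper's scorers or optimizer.

```python
import random

def search_summary(source, score, restarts=10, steps=200, seed=0):
    # Hill-climb over binary inclusion masks: flip one word in or out,
    # keep the flip only if the objective improves.
    random.seed(seed)
    n = len(source)
    best_mask, best_score = None, float("-inf")
    for _ in range(restarts):
        mask = [random.random() < 0.5 for _ in range(n)]
        for _ in range(steps):
            i = random.randrange(n)
            mask[i] = not mask[i]
            cand = [w for w, keep in zip(source, mask) if keep]
            s = score(cand, source)
            if s > best_score:
                best_mask, best_score = mask[:], s
            else:
                mask[i] = not mask[i]  # revert the unhelpful flip
    return [w for w, keep in zip(source, best_mask) if keep]

def toy_score(summary, source):
    # Illustrative objective only: reward covering distinct source words
    # (a crude similarity term) and penalize length (a crude fluency term).
    # The paper instead combines a language model with semantic similarity.
    if not summary:
        return float("-inf")
    coverage = len(set(summary)) / len(set(source))
    return coverage - 0.05 * len(summary)

print(search_summary("the quick brown fox jumps over the lazy dog".split(), toy_score))
```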
Distilling Knowledge Learned in BERT for Text Generation
Large-scale pre-trained language models such as BERT have achieved great success in language understanding tasks. However, it remains an open question how to utilize BERT for language generation. In this paper, we present a novel approach, Conditional Masked Language Modeling (C-MLM), that enables fine-tuning BERT on target generation tasks. The fine-tuned BERT (the teacher) is then exploited as extra supervision to improve conventional Seq2Seq models (the student) for better text generation performance. By leveraging BERT's idiosyncratic bidirectional nature, distilling the knowledge learned in BERT can encourage auto-regressive Seq2Seq models to plan ahead, imposing global sequence-level supervision for coherent text generation. Experiments show that the proposed approach significantly outperforms strong Transformer baselines on multiple language generation tasks such as machine translation and text summarization. Our proposed model also achieves new state-of-the-art results on the IWSLT German-English and English-Vietnamese MT datasets. Code is available at https://github.com/ChenRocks/Distill-BERT-Textgen.
Comment: ACL 202
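A hedged sketch of how such teacher supervision could be wired up in PyTorch: the fine-tuned BERT teacher yields a probability distribution per target position (via conditional masked language modeling), and the Seq2Seq student is trained both against the ground truth and against the teacher's soft targets. The mixing weight `alpha` and the exact KL formulation are assumptions; see the linked repository for the real objective.

```python
import torch
import torch.nn.functional as F

def cmlm_distillation_loss(student_logits, teacher_probs, target_ids,
                           alpha=0.5, pad_id=0):
    # student_logits: (batch, tgt_len, vocab) from the Seq2Seq student
    # teacher_probs:  (batch, tgt_len, vocab) soft targets from the C-MLM teacher
    # target_ids:     (batch, tgt_len) gold token ids
    vocab = student_logits.size(-1)
    # hard supervision: standard cross-entropy against the reference
    ce = F.cross_entropy(student_logits.view(-1, vocab),
                         target_ids.view(-1), ignore_index=pad_id)
    # soft supervision: match the bidirectional teacher's distributions,
    # which carry information about tokens to the *right* of each position
    kl = F.kl_div(F.log_softmax(student_logits, dim=-1),
                  teacher_probs, reduction="batchmean")
    return alpha * ce + (1 - alpha) * kl
```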
Hooks in the Headline: Learning to Generate Headlines with Controlled Styles
Current summarization systems produce only plain, factual headlines and do not meet the practical need for memorable titles that increase exposure. We propose a new task, Stylistic Headline Generation (SHG), to enrich headlines with three style options (humor, romance, and clickbait) in order to attract more readers. With no style-specific article-headline pairs (only a standard headline summarization dataset and mono-style corpora), our method TitleStylist generates style-specific headlines by combining the summarization and reconstruction tasks in a multitasking framework. We also introduce a novel parameter-sharing scheme to further disentangle the style from the text. Through both automatic and human evaluation, we demonstrate that TitleStylist can generate relevant, fluent headlines in all three target styles: humor, romance, and clickbait. The attraction score of headlines generated by our model surpasses that of the state-of-the-art summarization model by 9.68% and even outperforms human-written references.
Comment: ACL 202
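The multitask combination can be pictured as two losses sharing one seq2seq model: supervised headline summarization, plus denoising reconstruction on the mono-style corpora. The sketch below is an outline under stated assumptions (placeholder `model`, `noise`, and batch objects are hypothetical) and omits the paper's style-specific parameters:

```python
def train_step(model, optimizer, summ_batch, style_batch, noise):
    # `model` is a shared seq2seq network returning a loss for a (src, tgt)
    # pair; `noise` corrupts text (e.g., token dropout). All interfaces here
    # are hypothetical placeholders, not the paper's actual code.
    optimizer.zero_grad()
    # (a) summarization: article -> factual headline (paired data)
    loss_summ = model(src=summ_batch.article, tgt=summ_batch.headline)
    # (b) reconstruction: corrupted styled sentence -> original sentence,
    #     which teaches the decoder the target style without paired headlines
    loss_recon = model(src=noise(style_batch.sentence), tgt=style_batch.sentence)
    (loss_summ + loss_recon).backward()
    optimizer.step()
```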
Zero-shot topic generation
We present an approach to generating topics using a model trained only for document title generation, with zero examples of topics given during training. We leverage features that capture the relevance of a candidate span in a document to the generation of a title for that document. The output is a weighted collection of the phrases that are most relevant for describing the document and distinguishing it within a corpus, without requiring access to the rest of the corpus. We conducted a double-blind trial in which human annotators scored the quality of our machine-generated topics alongside the original human-written topics associated with news articles from The Guardian and The Huffington Post. The results show that our zero-shot model generates topic labels for news documents that are, on average, of equal or higher quality than those written by humans, as judged by humans.
Comment: 12 pages, 9 figures, 3 tables
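The output described above, a weighted collection of phrases, might be assembled along the following lines; the `relevance` callable is a hypothetical stand-in for the paper's features derived from the title-generation model:

```python
def generate_topics(doc_tokens, candidate_spans, relevance, top_k=5):
    # Score each candidate span (i, j) for how relevant it is to generating
    # a title for this document, then emit the top-weighted phrases.
    scored = [(" ".join(doc_tokens[i:j]), relevance(doc_tokens, i, j))
              for i, j in candidate_spans]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]  # list of (phrase, weight) pairs
```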
What comes next? Extractive summarization by next-sentence prediction
Existing approaches to automatic summarization assume that a length limit for
the summary is given, and view content selection as an optimization problem to
maximize informativeness and minimize redundancy within this budget. This
framework ignores the fact that human-written summaries have rich internal
structure which can be exploited to train a summarization system. We present
NEXTSUM, a novel approach to summarization based on a model that predicts the
next sentence to include in the summary using not only the source article, but
also the summary produced so far. We show that such a model successfully
captures summary-specific discourse moves, and leads to better content
selection performance, in addition to automatically predicting how long the
target summary should be. We perform experiments on the New York Times
Annotated Corpus of summaries, where NEXTSUM outperforms lead and content-model
summarization baselines by significant margins. We also show that the lengths
of summaries produced by our system correlate with the lengths of the human-written gold standards.
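The decoding loop implied by next-sentence prediction can be sketched as follows. `next_score` and `stop_score` are hypothetical stand-ins for the trained NEXTSUM model, and the greedy strategy is an assumption; the key point is that summary length falls out of the stop decision rather than being fixed in advance:

```python
def nextsum_decode(sentences, next_score, stop_score, max_sents=20):
    # Repeatedly pick the source sentence the model rates highest as the
    # *next* summary sentence given the summary built so far; stop when the
    # model prefers ending the summary over any continuation.
    summary, remaining = [], list(range(len(sentences)))
    while remaining and len(summary) < max_sents:
        best = max(remaining, key=lambda i: next_score(sentences, summary, i))
        if stop_score(sentences, summary) > next_score(sentences, summary, best):
            break  # the model predicts the summary is complete
        summary.append(best)
        remaining.remove(best)
    return [sentences[i] for i in summary]
```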
Conditional Neural Generation using Sub-Aspect Functions for Extractive News Summarization
Much progress has been made in text summarization, fueled by neural
architectures using large-scale training corpora. However, in the news domain,
neural models easily overfit by leveraging position-related features due to the
prevalence of the inverted pyramid writing style. In addition, there is an
unmet need to generate a variety of summaries for different users. In this
paper, we propose a neural framework that can flexibly control summary
generation by introducing a set of sub-aspect functions (i.e., importance, diversity, and position). These sub-aspect functions are regulated by a set of control codes that decide which sub-aspect to focus on during summary generation. We demonstrate that extracted summaries with minimal position bias are comparable to those generated by standard models that take advantage of position preference. We also show that news summaries generated with a focus on diversity can be preferred by human raters. These results suggest that a more flexible neural summarization framework offering more control options could be desirable for tailoring output to different user preferences; this is useful since it is often impractical to articulate such preferences for different applications a priori.
Comment: Accepted to Findings of EMNLP 202
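One way to picture control codes over sub-aspect functions is as weights on per-aspect scores for each candidate sentence. This is only an illustration: in the paper the codes condition a neural generator, whereas here scalar weights and hypothetical aspect functions stand in for that control:

```python
def controlled_score(idx, selected, doc, codes, aspects):
    # codes:   {"importance": w1, "diversity": w2, "position": w3}
    # aspects: same keys, mapping to functions (idx, selected, doc) -> float
    return sum(codes[name] * fn(idx, selected, doc)
               for name, fn in aspects.items())

# e.g. emphasize diversity and switch off the position sub-aspect, which
# mirrors the "minimal position bias" setting discussed above
codes = {"importance": 1.0, "diversity": 2.0, "position": 0.0}
```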
Conceptual Text Summarizer: A new model in continuous vector space
Traditional (manual) methods of summarization are neither cost-effective nor practical today. Extractive summarization is a process that automatically extracts the most important sentences from a text and generates a short, informative summary. In this work, we propose an unsupervised method for summarizing Persian texts. The method is a novel hybrid approach that clusters the concepts of the text using deep learning combined with traditional statistical methods. First, we build word embeddings from the Hamshahri2 corpus together with a dictionary of word frequencies. The proposed algorithm then extracts the keywords of the document, clusters its concepts, and finally ranks the sentences to produce the summary. We evaluated the proposed method on the Pasokh single-document corpus using the ROUGE evaluation measure. Without using any hand-crafted features, it achieves state-of-the-art results. Compared with the best supervised Persian methods, our unsupervised method achieves an overall improvement of 7.5% in ROUGE-2 recall.
Comment: The experimental results complete
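A minimal sketch of a cluster-then-rank pipeline of the kind described above: embed sentences (e.g., as averaged word vectors trained on Hamshahri2), cluster them into concepts, and rank sentences by proximity to their concept centroid. The concrete choices here (k-means, Euclidean centrality) are assumptions, not the paper's exact algorithm:

```python
import numpy as np
from sklearn.cluster import KMeans

def rank_sentences(sent_vecs, n_concepts=5):
    # sent_vecs: (n_sentences, dim) array of sentence embeddings
    km = KMeans(n_clusters=n_concepts, n_init=10).fit(sent_vecs)
    # distance of each sentence to its own concept centroid
    dists = np.linalg.norm(sent_vecs - km.cluster_centers_[km.labels_], axis=1)
    return np.argsort(dists)  # sentence indices, most central first
```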
News Article Teaser Tweets and How to Generate Them
In this work, we define the task of teaser generation and provide an evaluation benchmark and baseline systems for it. A teaser is a short, illustrative reading suggestion for an article that includes curiosity-arousing elements to entice potential readers to read particular news items; teasers are one of the main vehicles for transmitting news to social media users. We compile a novel dataset of teasers by systematically accumulating tweets and selecting those that conform to the teaser definition. We compared a number of neural abstractive architectures on the task of teaser generation; the overall best-performing system is the seq2seq model with a pointer network of See et al. (2017).
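The pointer network referenced above is the copy mechanism of the See et al. (2017) pointer-generator: the final word distribution mixes generating from the vocabulary with copying source tokens via attention. A minimal PyTorch rendering of that mixture:

```python
import torch

def pointer_generator_dist(p_gen, vocab_dist, attn, src_ids):
    # p_gen:      (batch, 1) probability of generating vs. copying
    # vocab_dist: (batch, vocab) softmax over the output vocabulary
    # attn:       (batch, src_len) attention weights over source tokens
    # src_ids:    (batch, src_len) vocabulary ids of the source tokens
    gen_part = p_gen * vocab_dist
    # scatter the copy probability mass onto each source token's vocab entry
    return gen_part.scatter_add(1, src_ids, (1 - p_gen) * attn)
```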
Unified Language Model Pre-training for Natural Language Understanding and Generation
This paper presents a new Unified pre-trained Language Model (UniLM) that can
be fine-tuned for both natural language understanding and generation tasks. The
model is pre-trained using three types of language modeling tasks:
unidirectional, bidirectional, and sequence-to-sequence prediction. The unified
modeling is achieved by employing a shared Transformer network and utilizing
specific self-attention masks to control what context the prediction conditions
on. UniLM compares favorably with BERT on the GLUE benchmark, and the SQuAD 2.0
and CoQA question answering tasks. Moreover, UniLM achieves new
state-of-the-art results on five natural language generation datasets,
including improving the CNN/DailyMail abstractive summarization ROUGE-L to
40.51 (2.04 absolute improvement), the Gigaword abstractive summarization
ROUGE-L to 35.75 (0.86 absolute improvement), the CoQA generative question
answering F1 score to 82.5 (37.1 absolute improvement), the SQuAD question
generation BLEU-4 to 22.12 (3.75 absolute improvement), and the DSTC7
document-grounded dialog response generation NIST-4 to 2.67 (human performance
is 2.65). The code and pre-trained models are available at https://github.com/microsoft/unilm.
Comment: Accepted by NeurIPS-19. Code and pre-trained models: https://github.com/microsoft/unilm
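The mask-based unification is concrete enough to sketch. Over a packed sequence of source tokens followed by target tokens, each pre-training objective corresponds to a different binary self-attention mask (1 = may attend, 0 = blocked); special tokens and padding are omitted here for brevity:

```python
import torch

def unilm_attention_mask(src_len, tgt_len, mode):
    n = src_len + tgt_len
    if mode == "bidirectional":
        # BERT-style: every token sees every token
        return torch.ones(n, n)
    if mode == "unidirectional":
        # GPT-style: each token sees only itself and the past
        return torch.tril(torch.ones(n, n))
    if mode == "seq2seq":
        # source tokens see the whole source; target tokens see the
        # source plus the target tokens generated so far
        mask = torch.tril(torch.ones(n, n))
        mask[:src_len, :src_len] = 1.0
        return mask
    raise ValueError(f"unknown mode: {mode}")
```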