Abstract Meaning Representation for Multi-Document Summarization
Generating an abstract from a collection of documents is a desirable
capability for many real-world applications. However, abstractive approaches to
multi-document summarization have not been thoroughly investigated. This paper
studies the feasibility of using Abstract Meaning Representation (AMR), a
semantic representation of natural language grounded in linguistic theory, as a
form of content representation. Our approach condenses source documents to a
set of summary graphs following the AMR formalism. The summary graphs are then
transformed into a set of summary sentences in a surface realization step. The
framework is fully data-driven and flexible. Each component can be optimized
independently using small-scale, in-domain training data. We perform
experiments on benchmark summarization datasets and report promising results.
We also describe opportunities and challenges for advancing this line of
research.
Comment: 13 pages
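To make the condense-then-realize pipeline concrete, the following is a minimal sketch in Python. Only the penman library, a standard package for reading and writing AMR graphs, is real here; parse_to_amr and realize are hypothetical placeholders, and the frequency-based content selection is a toy stand-in for the paper's learned subgraph selection.

```python
# Sketch of the AMR summarization pipeline: parse sentences to AMR graphs,
# condense them into a summary graph, then verbalize it. parse_to_amr() and
# realize() are hypothetical placeholders, not components from the paper.
from collections import Counter
import penman

def parse_to_amr(sentence: str) -> penman.Graph:
    """Placeholder: a real AMR parser maps a sentence to a graph."""
    raise NotImplementedError

def select_summary_graph(graphs: list[penman.Graph], budget: int) -> penman.Graph:
    """Toy content selection: keep the most frequent concepts.

    The paper learns this step from data; concept frequency is only an
    illustration of condensing many source graphs into one summary graph.
    """
    counts = Counter(t.target for g in graphs for t in g.instances())
    keep = {concept for concept, _ in counts.most_common(budget)}
    # Retain instance triples for surviving concepts plus all edges;
    # a real system would also prune edges to dropped nodes.
    triples = [
        t for g in graphs for t in g.triples
        if t[1] != ':instance' or t[2] in keep
    ]
    return penman.Graph(triples)

def realize(graph: penman.Graph) -> str:
    """Placeholder: a real AMR-to-text generator verbalizes the graph."""
    raise NotImplementedError

def summarize(sentences: list[str], budget: int = 20) -> str:
    graphs = [parse_to_amr(s) for s in sentences]
    return realize(select_summary_graph(graphs, budget))
```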
SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression
Neural sequence-to-sequence models are currently the dominant approach in
several natural language processing tasks, but require large parallel corpora.
We present a sequence-to-sequence-to-sequence autoencoder (SEQ^3), consisting
of two chained encoder-decoder pairs, with words used as a sequence of discrete
latent variables. We apply the proposed model to unsupervised abstractive
sentence compression, where the first and last sequences are the input and
reconstructed sentences, respectively, while the middle sequence is the
compressed sentence. Constraining the length of the latent word sequences
forces the model to distill important information from the input. A pretrained
language model, acting as a prior over the latent sequences, encourages the
compressed sentences to be human-readable. Continuous relaxations enable us to
sample from categorical distributions, allowing gradient-based optimization,
unlike alternatives that rely on reinforcement learning. The proposed model
does not require parallel text-summary pairs and achieves promising results in
unsupervised sentence compression on benchmark datasets.
Comment: Accepted to NAACL 2019
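The differentiable discrete bottleneck is the core mechanism here, so a minimal PyTorch sketch of it follows. The single-layer GRUs, sizes, and teacher-forced reconstruction are illustrative assumptions rather than the paper's exact architecture, and the language-model prior is omitted; F.gumbel_softmax is the standard continuous relaxation the abstract alludes to.

```python
# Sketch of a SEQ^3-style discrete bottleneck: compress the input to a short
# sequence of "words" sampled via Gumbel-softmax, then reconstruct the input.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiscreteBottleneckAE(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.to_vocab = nn.Linear(hidden, vocab_size)
        self.latent_enc = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, x: torch.Tensor, latent_len: int, tau: float = 1.0):
        h, _ = self.encoder(self.embed(x))
        # Emit one vocabulary distribution per compressed position (the real
        # model decodes the compressed sequence autoregressively).
        logits = self.to_vocab(h[:, :latent_len, :])
        # Differentiable "sampling" of discrete words: hard=True yields
        # one-hot vectors with straight-through gradients.
        words = F.gumbel_softmax(logits, tau=tau, hard=True)
        latent_emb = words @ self.embed.weight     # embed the sampled words
        _, state = self.latent_enc(latent_emb)     # summarize the compression
        # Teacher-forced reconstruction conditioned on the compressed text.
        d, _ = self.decoder(self.embed(x), state)
        return self.out(d), words

model = DiscreteBottleneckAE(vocab_size=10_000)
tokens = torch.randint(0, 10_000, (4, 20))         # a batch of token ids
recon_logits, latent_words = model(tokens, latent_len=8)
# Next-token reconstruction loss; constraining latent_len is what forces
# the middle sequence to distill the input.
loss = F.cross_entropy(
    recon_logits[:, :-1].reshape(-1, 10_000), tokens[:, 1:].reshape(-1)
)
```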
Topic-Oriented Spoken Dialogue Summarization for Customer Service with Saliency-Aware Topic Modeling
In a customer service system, dialogue summarization can boost service
efficiency by automatically creating summaries for long spoken dialogues in
which customers and agents try to address issues about specific topics. In this
work, we focus on topic-oriented dialogue summarization, which generates highly
abstractive summaries that preserve the main ideas from dialogues. In spoken
dialogues, abundant noise and generic semantics can obscure the
underlying informative content, making general topic modeling approaches
difficult to apply. In addition, for customer service, role-specific
information matters and is an indispensable part of a summary. To effectively
perform topic modeling on dialogues and capture multi-role information, in this
work we propose a novel topic-augmented two-stage dialogue summarizer (TDS)
jointly with a saliency-aware neural topic model (SATM) for topic-oriented
summarization of customer service dialogues. Comprehensive studies on a
real-world Chinese customer service dataset demonstrate the superiority of our
method over several strong baselines.
Comment: Accepted by AAAI 2021, 9 pages
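As a rough illustration of what a saliency-aware neural topic model could look like, here is a VAE-style sketch in PyTorch where the bag-of-words reconstruction loss is re-weighted toward salient words. Treating saliency as a per-word weight is only one plausible reading of the abstract; the paper's SATM may realize the idea differently, and nothing below is from the paper's released code.

```python
# Sketch of a saliency-weighted neural topic model (VAE over bag-of-words).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SaliencyNTM(nn.Module):
    def __init__(self, vocab_size: int, n_topics: int = 50, hidden: int = 200):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(vocab_size, hidden), nn.Softplus())
        self.mu = nn.Linear(hidden, n_topics)
        self.logvar = nn.Linear(hidden, n_topics)
        # Maps topic proportions to logits over the vocabulary.
        self.beta = nn.Linear(n_topics, vocab_size, bias=False)

    def forward(self, bow: torch.Tensor, saliency: torch.Tensor):
        h = self.enc(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick for the topic posterior.
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        theta = F.softmax(z, dim=-1)                  # topic proportions
        log_probs = F.log_softmax(self.beta(theta), dim=-1)
        # Saliency-weighted reconstruction: informative words count more,
        # dialogue noise (low saliency) counts less.
        recon = -(saliency * bow * log_probs).sum(-1).mean()
        kld = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return recon + kld, theta

ntm = SaliencyNTM(vocab_size=5_000)
bow = torch.rand(8, 5_000)            # bag-of-words counts per dialogue
sal = torch.ones_like(bow)            # uniform saliency as a neutral default
loss, theta = ntm(bow, sal)           # theta could condition the summarizer
```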