Self-Supervised and Controlled Multi-Document Opinion Summarization
We address the problem of unsupervised abstractive summarization of
collections of user generated reviews with self-supervision and control. We
propose a self-supervised setup that considers an individual document as a
target summary for a set of similar documents. This setting makes training
simpler than previous approaches by relying only on standard log-likelihood
loss. We address the problem of hallucinations through the use of control
codes, to steer the generation towards more coherent and relevant
summaries. Finally, we extend the Transformer architecture to allow for multiple
reviews as input. Our benchmarks on two datasets against graph-based and recent
neural abstractive unsupervised models show that our proposed method generates
summaries with superior quality and relevance. This is confirmed in our human
evaluation, which focuses explicitly on the faithfulness of generated summaries.
We also provide an ablation study, which shows the importance of the control
setup in controlling hallucinations and achieving high sentiment and topic
alignment of the summaries with the input reviews.
Comment: 18 pages including 5 pages appendi
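The self-supervised setup described in this abstract (one review serves as the target summary for a set of similar reviews) can be illustrated with a minimal sketch. The token-level Jaccard similarity, the function names `jaccard` and `build_pairs`, and the parameter `k` are assumptions made for illustration; the abstract does not say how similar documents are chosen.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (an assumed similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def build_pairs(reviews: List[str], k: int = 2) -> List[Tuple[List[str], str]]:
    """For each review, pair it (as target) with its k most similar reviews (as input)."""
    pairs = []
    for i, target in enumerate(reviews):
        # Every other review is a candidate input document for this target.
        others = [r for j, r in enumerate(reviews) if j != i]
        # Stable sort: ties keep their original order.
        others.sort(key=lambda r: jaccard(r, target), reverse=True)
        pairs.append((others[:k], target))
    return pairs
```

Each resulting pair could then be fed to a sequence-to-sequence model trained with a standard log-likelihood loss, as the abstract states.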
Read what you need: Controllable Aspect-based Opinion Summarization of Tourist Reviews
Manually extracting relevant aspects and opinions from large volumes of
user-generated text is a time-consuming process. Summaries, on the other hand,
help readers with limited time budgets to quickly consume the key ideas from
the data. State-of-the-art approaches for multi-document summarization,
however, do not consider user preferences while generating summaries. In this
work, we argue for the need for, and propose a solution to, generating personalized
aspect-based opinion summaries from large collections of online tourist
reviews. We let our readers decide and control several attributes of the
summary such as the length and specific aspects of interest among others.
Specifically, we take an unsupervised approach to extract coherent aspects from
tourist reviews posted on TripAdvisor. We then propose an Integer Linear
Programming (ILP) based extractive technique to select an informative subset of
opinions around the identified aspects while respecting the user-specified
values for various control parameters. Finally, we evaluate and compare our
summaries using crowdsourcing and ROUGE-based metrics and obtain competitive
results.
Comment: 4 pages, accepted in the Proceedings of the 43rd International ACM
SIGIR Conference on Research and Development in Information Retrieval
(SIGIR), 202
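The selection step in this abstract (choose an informative subset of opinions for the requested aspects under user-specified limits) is a 0/1 optimization problem. The sketch below solves a tiny instance by exhaustive enumeration rather than with an ILP solver, so it only illustrates the objective and constraints. The tuple layout, the integer scores, and the names `select_opinions`, `max_words`, and `aspects` are assumptions, not the paper's formulation.

```python
from itertools import combinations
from typing import List, Set, Tuple

Opinion = Tuple[str, str, int]  # (text, aspect, informativeness score), assumed layout


def select_opinions(opinions: List[Opinion], max_words: int,
                    aspects: Set[str]) -> List[Opinion]:
    """Pick the highest-scoring subset within the word budget and aspect filter."""
    # Constraint 1: only opinions about aspects the user asked for.
    candidates = [o for o in opinions if o[1] in aspects]
    best: List[Opinion] = []
    best_score = 0
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            words = sum(len(text.split()) for text, _, _ in subset)
            score = sum(s for _, _, s in subset)
            # Constraint 2: total length must respect the user's budget.
            if words <= max_words and score > best_score:
                best, best_score = list(subset), score
    return best
```

Enumeration is exponential in the number of candidates, which is why a real system would hand the same objective and constraints to an ILP solver.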
OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization
Opinion summarization sets itself apart from other types of summarization
tasks due to its distinctive focus on aspects and sentiments. Although certain
automated evaluation methods like ROUGE have gained popularity, we have found
them to be unreliable measures for assessing the quality of opinion summaries.
In this paper, we present OpinSummEval, a dataset comprising human judgments
and outputs from 14 opinion summarization models. We further explore the
correlation between 24 automatic metrics and human ratings across four
dimensions. Our findings indicate that metrics based on neural networks
generally outperform non-neural ones. However, even metrics built on powerful
backbones, such as BART and GPT-3/3.5, do not consistently correlate well
across all dimensions, highlighting the need for advancements in automated
evaluation methods for opinion summarization. The code and data are publicly
available at https://github.com/A-Chicharito-S/OpinSummEval/tree/main.
Comment: preprint, included 2 more metrics compared with the previous
submissio
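Checking how well an automatic metric tracks human ratings, as this abstract describes, comes down to a rank correlation between two score lists. The sketch below computes Kendall's tau-a in plain Python. The choice of Kendall's tau and the name `kendall_tau` are assumptions; the abstract does not name the correlation coefficient used.

```python
from typing import Sequence


def kendall_tau(metric_scores: Sequence[float], human_scores: Sequence[float]) -> float:
    """Kendall's tau-a between two equally long score lists (ties count as neither)."""
    n = len(metric_scores)
    if n != len(human_scores) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Positive product: both lists order the pair (i, j) the same way.
            sign = (metric_scores[i] - metric_scores[j]) * (human_scores[i] - human_scores[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs
```

A value near 1 means the metric ranks summaries the way human annotators do; a value near 0 means the metric carries little information about human preference on that dimension.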
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Selecting the "right" amount of information to include in a summary is a
difficult task. A good summary should be detailed and entity-centric without
being overly dense and hard to follow. To better understand this tradeoff, we
solicit increasingly dense GPT-4 summaries with what we refer to as a "Chain
of Density" (CoD) prompt. Specifically, GPT-4 generates an initial
entity-sparse summary before iteratively incorporating missing salient entities
without increasing the length. Summaries generated by CoD are more abstractive,
exhibit more fusion, and have less of a lead bias than GPT-4 summaries
generated by a vanilla prompt. We conduct a human preference study on 100 CNN
DailyMail articles and find that humans prefer GPT-4 summaries that are
more dense than those generated by a vanilla prompt and almost as dense as
human written summaries. Qualitative analysis supports the notion that there
exists a tradeoff between informativeness and readability. 500 annotated CoD
summaries, as well as an extra 5,000 unannotated summaries, are freely
available on HuggingFace
(https://huggingface.co/datasets/griffin/chain_of_density).
Comment: preprin
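The iterative procedure in this abstract (start with an entity-sparse summary, then repeatedly fold in missing salient entities without increasing the length) can be sketched as a simple loop. The `generate` callable is a hypothetical stand-in for any text-generation call, and the prompt wording, the default `steps`, and the default `max_words` are assumptions, not the prompt used in the paper.

```python
from typing import Callable, List


def chain_of_density(article: str, generate: Callable[[str], str],
                     steps: int = 5, max_words: int = 80) -> List[str]:
    """Return successive summaries, each denser than the last at a fixed length."""
    first_prompt = (
        f"Write an entity-sparse summary of at most {max_words} words.\n\n"
        f"Article:\n{article}"
    )
    summaries = [generate(first_prompt)]
    for _ in range(steps - 1):
        # Each round sees the previous summary and must add what it missed.
        prompt = (
            "Rewrite the summary below so that it also covers salient entities "
            "from the article that it currently misses, without exceeding "
            f"{max_words} words.\n\n"
            f"Article:\n{article}\n\nCurrent summary:\n{summaries[-1]}"
        )
        summaries.append(generate(prompt))
    return summaries
```

Passing the model call in as a parameter keeps the sketch independent of any particular API and lets the loop be exercised with a stub function.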
AaKOS: Aspect-adaptive Knowledge-based Opinion Summarization
The rapid growth of information on the Internet has led to an overwhelming
amount of opinions and comments on various activities, products, and services.
This makes it difficult and time-consuming for users to process all the
available information when making decisions. Text summarization, a Natural
Language Processing (NLP) task, has been widely explored to help users quickly
retrieve relevant information by generating short and salient content from long
or multiple documents. Recent advances in pre-trained language models, such as
ChatGPT, have demonstrated the potential of Large Language Models (LLMs) in
text generation. However, LLMs require massive amounts of data and resources
and are challenging to implement as offline applications. Furthermore, existing
text summarization approaches often lack the "adaptive" nature required to
capture diverse aspects in opinion summarization, which is particularly
detrimental to users with specific requirements or preferences. In this paper,
we propose an Aspect-adaptive Knowledge-based Opinion Summarization model for
product reviews, which effectively captures the adaptive nature required for
opinion summarization. The model generates aspect-oriented summaries given a
set of reviews for a particular product, efficiently providing users with
useful information on specific aspects they are interested in, ensuring the
generated summaries are more personalized and informative. Extensive
experiments have been conducted using real-world datasets to evaluate the
proposed model. The results demonstrate that our model outperforms
state-of-the-art approaches and is adaptive and efficient in generating
summaries that focus on particular aspects, enabling users to make
well-informed decisions and catering to their diverse interests and
preferences.
Comment: 21 pages, 4 figures, 7 table
Scientific Opinion Summarization: Meta-review Generation with Checklist-guided Iterative Introspection
Opinions in the scientific domain can be divergent, leading to controversy or
consensus among reviewers. However, current opinion summarization datasets
mostly focus on product review domains, which do not account for this
variability under the assumption that the input opinions are non-controversial.
To address this gap, we propose the task of scientific opinion summarization,
where research paper reviews are synthesized into meta-reviews. To facilitate
this task, we introduce a new ORSUM dataset covering 10,989 paper meta-reviews
and 40,903 paper reviews from 39 conferences. Furthermore, we propose the
Checklist-guided Iterative Introspection (CGI) approach, which breaks down
the task into several stages and iteratively refines the summary under the
guidance of questions from a checklist. We conclude that (1) human-written
summaries are not always reliable since many do not follow the guidelines, and
(2) the combination of task decomposition and iterative self-refinement shows
promising discussion involvement ability and can be applied to other complex
text generation using black-box LLM