Summarizing Dialogic Arguments from Social Media
Online argumentative dialog is a rich source of information on popular
beliefs and opinions that could be useful to companies as well as governmental
or public policy agencies. Compact, easy-to-read summaries of these dialogues
would thus be highly valuable. A priori, it is not even clear what form such a
summary should take. Previous work on summarization has primarily focused on
summarizing written texts, where the notion of an abstract of the text is well
defined. We collect gold standard training data consisting of five human
summaries for each of 161 dialogues on the topics of Gay Marriage, Gun Control
and Abortion. We present several different computational models aimed at
identifying segments of the dialogues whose content should be used for the
summary, using linguistic features and Word2vec features with both SVMs and
Bidirectional LSTMs. We show that we can identify the most important arguments
by using the dialog context with a best F-measure of 0.74 for gun control, 0.71
for gay marriage, and 0.67 for abortion.
Comment: Proceedings of the 21st Workshop on the Semantics and Pragmatics of Dialogue (SemDial 2017).
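The reported numbers score binary segment selection against the gold summaries. As a minimal sketch of how such an F-measure is computed over selected dialogue segments (the linguistic/Word2vec features and the SVM/BiLSTM classifiers are omitted; the segment indices below are illustrative, not from the paper's data):

```python
def f1_score(gold: set, predicted: set) -> float:
    """Binary F1 over the indices of segments selected as summary-worthy."""
    if not gold or not predicted:
        return 0.0
    tp = len(gold & predicted)  # segments both gold and model marked important
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Illustrative: segments 0, 2, 5 are gold-important; a model picked 0, 2, 7.
print(round(f1_score({0, 2, 5}, {0, 2, 7}), 2))  # → 0.67
```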
Evaluating Emotional Nuances in Dialogue Summarization
Automatic dialogue summarization is a well-established task that aims to
identify the most important content from human conversations to create a short
textual summary. Despite recent progress in the field, we show that most of the
research has focused on summarizing the factual information, leaving aside the
affective content, which can yet convey useful information to analyse, monitor,
or support human interactions. In this paper, we propose and evaluate a set of
measures to quantify how much emotion is preserved in dialogue summaries.
Results show that state-of-the-art summarization models do not preserve the
emotional content well in their summaries. We also show that by reducing the
training set to only emotional dialogues, the emotional content is better
preserved in the generated summaries, while conserving the most salient factual
information.
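The abstract does not specify the proposed measures, but the underlying idea can be illustrated with a toy preservation ratio: the fraction of emotion-bearing word types in the dialogue that survive into the summary. The tiny `EMOTION_WORDS` set here is a hypothetical stand-in for a real emotion lexicon, and the formula is an assumption for illustration, not the authors' measure:

```python
# Hypothetical mini-lexicon; a real system would use a full emotion lexicon.
EMOTION_WORDS = {"happy", "sad", "angry", "scared", "love", "hate", "worried"}

def _emotion_types(text: str) -> set:
    """Lowercase, strip trailing punctuation, keep only lexicon hits."""
    return {w.strip(".,!?") for w in text.lower().split()} & EMOTION_WORDS

def preservation_ratio(dialogue: str, summary: str) -> float:
    src = _emotion_types(dialogue)
    if not src:
        return 1.0  # nothing emotional to preserve
    return len(src & _emotion_types(summary)) / len(src)

print(preservation_ratio("I am so happy but also worried.", "They are happy."))  # → 0.5
```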
What's the issue here?: Task-based evaluation of reader comment summarization systems
Automatic summarization of reader comments in on-line news is an extremely challenging task and a capability for which there is a
clear need. Work to date has focussed on producing extractive summaries using well-known techniques imported from other areas of
language processing. But are extractive summaries of comments what users really want? Do they support users in performing the sorts
of tasks they are likely to want to perform with reader comments? In this paper we address these questions by doing three things. First,
we offer a specification of one possible summary type for reader comment, based on an analysis of reader comment in terms of issues
and viewpoints. Second, we define a task-based evaluation framework for reader comment summarization that allows summarization
systems to be assessed in terms of how well they support users in a time-limited task of identifying issues and characterising opinion on
issues in comments. Third, we describe a pilot evaluation in which we used the task-based evaluation framework to evaluate a prototype
reader comment clustering and summarization system, demonstrating the viability of the evaluation framework and illustrating the sorts
of insight such an evaluation affords.
He Said, She Said: Style Transfer for Shifting the Perspective of Dialogues
In this work, we define a new style transfer task: perspective shift, which
reframes a dialogue from informal first person to a formal third person
rephrasing of the text. This task requires challenging coreference resolution,
emotion attribution, and interpretation of informal text. We explore several
baseline approaches and discuss further directions on this task when applied to
short dialogues. As a sample application, we demonstrate that applying
perspective shifting to a dialogue summarization dataset (SAMSum) substantially
improves the zero-shot performance of extractive news summarization models on
this data. Additionally, supervised extractive models perform better when
trained on perspective shifted data than on the original dialogues. We release
our code publicly.
Comment: Findings of EMNLP 2022, 18 pages.
Evaluating Robustness of Dialogue Summarization Models in the Presence of Naturally Occurring Variations
The dialogue summarization task involves summarizing long conversations while
preserving the most salient information. Real-life dialogues often involve
naturally occurring variations (e.g., repetitions, hesitations) and existing
dialogue summarization models suffer performance drops on such
conversations. In this study, we systematically investigate the impact of such
variations on state-of-the-art dialogue summarization models using publicly
available datasets. To simulate real-life variations, we introduce two types of
perturbations: utterance-level perturbations that modify individual utterances
with errors and language variations, and dialogue-level perturbations that add
non-informative exchanges (e.g., repetitions, greetings). We conduct our
analysis along three dimensions of robustness: consistency, saliency, and
faithfulness, which capture different aspects of the summarization model's
performance. We find that both fine-tuned and instruction-tuned models are
affected by input variations, with the latter being more susceptible,
particularly to dialogue-level perturbations. We also validate our findings via
human evaluation. Finally, we investigate if the robustness of fine-tuned
models can be improved by training them with a fraction of perturbed data and
observe that this approach is insufficient to address robustness challenges
with current models and thus warrants a more thorough investigation to identify
better solutions. Overall, our work highlights robustness challenges in
dialogue summarization and provides insights for future research.
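The two perturbation families described above can be mimicked in a few lines. The hesitation filler, the greeting exchange, and the insertion points below are illustrative choices, not the paper's exact perturbations:

```python
import random

def perturb_utterance(utterance: str, rng: random.Random) -> str:
    """Utterance-level: inject a hesitation filler at a random word boundary."""
    words = utterance.split()
    pos = rng.randrange(len(words) + 1)
    return " ".join(words[:pos] + ["uh,"] + words[pos:])

def perturb_dialogue(turns: list) -> list:
    """Dialogue-level: prepend a non-informative greeting exchange."""
    return ["A: Hi there!", "B: Hello! How are you?"] + turns

rng = random.Random(0)  # seeded for reproducibility
dialogue = ["A: The meeting moved to Friday.", "B: Thanks for letting me know."]
noisy = perturb_dialogue([perturb_utterance(t, rng) for t in dialogue])
```

A robustness study would then feed both `dialogue` and `noisy` to the summarizer and compare the resulting summaries along the consistency, saliency, and faithfulness dimensions.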
- …