10 research outputs found
Unsupervised Reference-Free Summary Quality Evaluation via Contrastive Learning
Evaluation of a document summarization system is a critical factor in the
success of the summarization task. Previous approaches, such as ROUGE, mainly
consider the informativeness of the assessed summary and require
human-generated references for each test summary. In this work, we propose to
evaluate summary quality without reference summaries via unsupervised
contrastive learning. Specifically, we design a new metric, based on BERT,
that covers both linguistic quality and semantic informativeness. To learn the
metric, for each summary, we construct different types of negative samples with
respect to different aspects of the summary qualities, and train our model with
a ranking loss. Experiments on Newsroom and CNN/Daily Mail demonstrate that our
new evaluation method outperforms other metrics even without reference
summaries. Furthermore, we show that our method is general and transferable
across datasets.
Comment: Long Paper in EMNLP 202
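The training objective sketched in the abstract (scoring a real summary above constructed negative samples) is a standard margin ranking loss. Below is a minimal, self-contained sketch of that loss on scalar quality scores; the scores themselves are hypothetical stand-ins for the output of the paper's BERT-based scorer, and the margin value is an assumption.

```python
def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    # Hinge-style ranking loss: penalize the model whenever the positive
    # (original) summary does not outscore a negative (corrupted) summary
    # by at least `margin`. Zero loss once the ranking gap is wide enough.
    return max(0.0, margin - (pos_score - neg_score))

# Hypothetical scorer outputs (placeholders, not values from the paper):
loss_violating = margin_ranking_loss(pos_score=0.9, neg_score=0.2)  # ~0.3
loss_satisfied = margin_ranking_loss(pos_score=1.5, neg_score=0.1)  # 0.0
```

In the paper's setup, one such term would be computed for each type of negative sample (e.g., with linguistic quality or informativeness degraded) and summed into the overall training loss.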
Transform, Contrast and Tell: Coherent Entity-Aware Multi-Image Captioning
Coherent entity-aware multi-image captioning aims to generate coherent
captions for neighboring images in a news document. Coherence relationships
exist among neighboring images because they often describe the same entities
or events. These relationships are important for entity-aware multi-image
captioning but are neglected in entity-aware single-image captioning. Most
existing work focuses on single-image captioning, while multi-image
captioning has not been explored before. Hence, this paper proposes
a coherent entity-aware multi-image captioning model by making use of coherence
relationships. The model consists of a Transformer-based caption generation
model and two types of contrastive learning-based coherence mechanisms. The
generation model generates the caption by paying attention to the image and the
accompanying text. The caption-caption coherence mechanism encourages
entities in an image's caption to also appear in the captions of neighboring
images. The caption-image-text coherence mechanism encourages entities in an
image's caption to also appear in the accompanying text. To evaluate coherence
between captions, two coherence evaluation metrics are proposed. A new
dataset, DM800K, is constructed; it contains more images per document than
the two existing datasets GoodNews and NYT800K and is thus more suitable for
multi-image captioning. Experiments on three datasets show that the proposed
captioning model
outperforms 7 baselines according to BLEU, ROUGE, METEOR, and entity precision
and recall scores. Experiments also show that the generated captions are more
coherent than those of the baselines according to caption entity scores,
caption ROUGE scores, the two proposed coherence evaluation metrics, and
human evaluations.
Comment: 32 pages, 11 tables, 3 figures
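The abstract does not specify how its two coherence evaluation metrics are computed. One simple, purely illustrative way to quantify the caption-caption coherence idea (shared entities across neighboring captions) is the Jaccard overlap of the entity sets; the function below is a hypothetical sketch, not the paper's actual metric.

```python
def entity_overlap(entities_a, entities_b):
    # Jaccard overlap between the entity sets of two neighboring captions:
    # |intersection| / |union|. Higher values mean the captions mention
    # more of the same entities, a crude proxy for coherence.
    a, b = set(entities_a), set(entities_b)
    if not a and not b:
        return 1.0  # two entity-free captions are trivially "coherent"
    return len(a & b) / len(a | b)

# Hypothetical extracted entities from two neighboring captions:
score = entity_overlap(["Obama", "Paris"], ["Obama", "Berlin"])  # 1/3
```

A real implementation would first run named-entity recognition over each generated caption to obtain the entity sets.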