252 research outputs found
Collaborative Summarization of Topic-Related Videos
Large collections of videos are grouped into clusters by a topic keyword,
such as Eiffel Tower or Surfing, with many important visual concepts repeating
across them. Such a topically close set of videos have mutual influence on each
other, which could be used to summarize one of them by exploiting information
from others in the set. We build on this intuition to develop a novel approach
to extract a summary that simultaneously captures both important
particularities arising in the given video, as well as, generalities identified
from the set of videos. The topic-related videos provide visual context to
identify the important parts of the video being summarized. We achieve this by
developing a collaborative sparse optimization method which can be efficiently
solved by a half-quadratic minimization algorithm. Our work builds upon the
idea of collaborative techniques from information retrieval and natural
language processing, which typically use the attributes of other similar
objects to predict the attribute of a given object. Experiments on two
challenging and diverse datasets well demonstrate the efficacy of our approach
over state-of-the-art methods.Comment: CVPR 201
Abstractive Multi-Document Summarization via Phrase Selection and Merging
We propose an abstraction-based multi-document summarization framework that
can construct new sentences by exploring more fine-grained syntactic units than
sentences, namely, noun/verb phrases. Different from existing abstraction-based
approaches, our method first constructs a pool of concepts and facts
represented by phrases from the input documents. Then new sentences are
generated by selecting and merging informative phrases to maximize the salience
of phrases and meanwhile satisfy the sentence construction constraints. We
employ integer linear optimization for conducting phrase selection and merging
simultaneously in order to achieve the global optimal solution for a summary.
Experimental results on the benchmark data set TAC 2011 show that our framework
outperforms the state-of-the-art models under automated pyramid evaluation
metric, and achieves reasonably well results on manual linguistic quality
evaluation.Comment: 11 pages, 1 figure, accepted as a full paper at ACL 201
Multi-Document Summarization using Distributed Bag-of-Words Model
As the number of documents on the web is growing exponentially,
multi-document summarization is becoming more and more important since it can
provide the main ideas in a document set in short time. In this paper, we
present an unsupervised centroid-based document-level reconstruction framework
using distributed bag of words model. Specifically, our approach selects
summary sentences in order to minimize the reconstruction error between the
summary and the documents. We apply sentence selection and beam search, to
further improve the performance of our model. Experimental results on two
different datasets show significant performance gains compared with the
state-of-the-art baselines
Enumeration of Extractive Oracle Summaries
To analyze the limitations and the future directions of the extractive
summarization paradigm, this paper proposes an Integer Linear Programming (ILP)
formulation to obtain extractive oracle summaries in terms of ROUGE-N. We also
propose an algorithm that enumerates all of the oracle summaries for a set of
reference summaries to exploit F-measures that evaluate which system summaries
contain how many sentences that are extracted as an oracle summary. Our
experimental results obtained from Document Understanding Conference (DUC)
corpora demonstrated the following: (1) room still exists to improve the
performance of extractive summarization; (2) the F-measures derived from the
enumerated oracle summaries have significantly stronger correlations with human
judgment than those derived from single oracle summaries.Comment: 12 page
- …