Noisy Submodular Maximization via Adaptive Sampling with Applications to Crowdsourced Image Collection Summarization
We address the problem of maximizing an unknown submodular function that can
only be accessed via noisy evaluations. Our work is motivated by the task of
summarizing content, e.g., image collections, by leveraging users' feedback in
the form of clicks or ratings. For summarization tasks with the goal of maximizing
coverage and diversity, submodular set functions are a natural choice. When the
underlying submodular function is unknown, users' feedback can provide noisy
evaluations of the function that we seek to maximize. We provide a generic
algorithm -- ExpGreedy -- for maximizing an unknown submodular function under
cardinality constraints. This algorithm makes use of a novel exploration module
-- TopX -- that proposes good elements based on adaptively sampling noisy
function evaluations. TopX is able to accommodate different kinds of
observation models such as value queries and pairwise comparisons. We provide
PAC-style guarantees on the quality and sampling cost of the solution obtained
by ExpGreedy. We demonstrate the effectiveness of our approach in an
interactive, crowdsourced image collection summarization application.
Comment: Extended version of AAAI'16 paper
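The setting described in this abstract (greedy selection under a cardinality constraint when only noisy evaluations are available) can be illustrated with a minimal sketch. This is not the paper's algorithm: it averages a fixed number of noisy samples per candidate instead of sampling adaptively, and the coverage objective, the Gaussian noise model, and all function names and parameters are assumptions made for illustration only.

```python
import random


def coverage(selected, sets):
    """Hidden objective (assumed): number of distinct items covered."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)


def noisy_value(selected, sets, rng, sigma=0.5):
    """Noisy evaluation: true value plus zero-mean Gaussian noise (assumed model)."""
    return coverage(selected, sets) + rng.gauss(0.0, sigma)


def noisy_greedy(sets, k, rng, samples=30, sigma=0.5):
    """Pick k elements greedily, estimating each candidate by averaging noisy samples."""
    selected = []
    for _ in range(k):
        best, best_est = None, float("-inf")
        for e in range(len(sets)):
            if e in selected:
                continue
            # Fixed-budget averaging; an adaptive scheme would stop sampling
            # a candidate once it is clearly better or worse than the rest.
            total = sum(noisy_value(selected + [e], sets, rng, sigma)
                        for _ in range(samples))
            est = total / samples
            if est > best_est:
                best, best_est = e, est
        selected.append(best)
    return selected


sets = [{1, 2, 3, 4}, {3, 4}, {5, 6, 7}, {1}, {8}]
picked = noisy_greedy(sets, k=2, rng=random.Random(0))
print(picked)
```

Averaging a fixed number of samples spends the same budget on every candidate, including ones that are clearly poor; the adaptive sampling described in the abstract is aimed at reducing that cost.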
Self-Supervised and Controlled Multi-Document Opinion Summarization
We address the problem of unsupervised abstractive summarization of
collections of user generated reviews with self-supervision and control. We
propose a self-supervised setup that considers an individual document as a
target summary for a set of similar documents. This setting makes training
simpler than previous approaches by relying only on standard log-likelihood
loss. We address the problem of hallucinations through the use of control
codes, to steer the generation towards more coherent and relevant
summaries. Finally, we extend the Transformer architecture to allow for multiple
reviews as input. Our benchmarks on two datasets against graph-based and recent
neural abstractive unsupervised models show that our proposed method generates
summaries with superior quality and relevance. This is confirmed in our human
evaluation, which focuses explicitly on the faithfulness of generated summaries.
We also provide an ablation study, which shows the importance of the control
setup in controlling hallucinations and achieving high sentiment and topic
alignment of the summaries with the input reviews.
Comment: 18 pages including 5 pages appendix
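The self-supervised setup in this abstract (one review serves as the target summary for a set of similar reviews) can be sketched as a data-construction step. The abstract does not say how similarity is measured, so the word-level Jaccard similarity, the `k` parameter, and the function names below are hypothetical choices for illustration, not the authors' method.

```python
def jaccard(a, b):
    """Word-level Jaccard similarity between two texts (assumed similarity measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def build_pairs(reviews, k=2):
    """For each review, pair it (as target) with its k most similar other reviews (as input)."""
    pairs = []
    for i, target in enumerate(reviews):
        scored = [(jaccard(target, r), j) for j, r in enumerate(reviews) if j != i]
        # Highest similarity first; ties broken by original position.
        scored.sort(key=lambda t: (-t[0], t[1]))
        inputs = [reviews[j] for _, j in scored[:k]]
        pairs.append((inputs, target))
    return pairs


reviews = [
    "great pizza and friendly staff",
    "friendly staff and great pizza crust",
    "the pizza was great",
    "parking was difficult to find",
]
pairs = build_pairs(reviews, k=2)
for inputs, target in pairs:
    print(target, "<-", inputs)
```

Each `(inputs, target)` pair could then be fed to a sequence-to-sequence model trained with a standard log-likelihood loss, which is the training signal the abstract mentions.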