546 research outputs found
Collaborative Summarization of Topic-Related Videos
Large collections of videos are grouped into clusters by a topic keyword,
such as Eiffel Tower or Surfing, with many important visual concepts repeating
across them. Such a topically close set of videos have mutual influence on each
other, which could be used to summarize one of them by exploiting information
from others in the set. We build on this intuition to develop a novel approach
to extract a summary that simultaneously captures both important
particularities arising in the given video, as well as, generalities identified
from the set of videos. The topic-related videos provide visual context to
identify the important parts of the video being summarized. We achieve this by
developing a collaborative sparse optimization method which can be efficiently
solved by a half-quadratic minimization algorithm. Our work builds upon the
idea of collaborative techniques from information retrieval and natural
language processing, which typically use the attributes of other similar
objects to predict the attribute of a given object. Experiments on two
challenging and diverse datasets well demonstrate the efficacy of our approach
over state-of-the-art methods.Comment: CVPR 201
Multi-View Frame Reconstruction with Conditional GAN
Multi-view frame reconstruction is an important problem particularly when
multiple frames are missing and past and future frames within the camera are
far apart from the missing ones. Realistic coherent frames can still be
reconstructed using corresponding frames from other overlapping cameras. We
propose an adversarial approach to learn the spatio-temporal representation of
the missing frame using conditional Generative Adversarial Network (cGAN). The
conditional input to each cGAN is the preceding or following frames within the
camera or the corresponding frames in other overlapping cameras, all of which
are merged together using a weighted average. Representations learned from
frames within the camera are given more weight compared to the ones learned
from other cameras when they are close to the missing frames and vice versa.
Experiments on two challenging datasets demonstrate that our framework produces
comparable results with the state-of-the-art reconstruction method in a single
camera and achieves promising performance in multi-camera scenario.Comment: 5 pages, 4 figures, 3 tables, Accepted at IEEE Global Conference on
Signal and Information Processing, 201
- …