Collaborative Summarization of Topic-Related Videos
Large collections of videos are grouped into clusters by a topic keyword,
such as Eiffel Tower or Surfing, with many important visual concepts repeating
across them. Such a topically close set of videos has mutual influence on each
other, which could be used to summarize one of them by exploiting information
from others in the set. We build on this intuition to develop a novel approach
to extract a summary that simultaneously captures both the important
particularities arising in the given video and the generalities identified
from the set of videos. The topic-related videos provide visual context to
identify the important parts of the video being summarized. We achieve this by
developing a collaborative sparse optimization method which can be efficiently
solved by a half-quadratic minimization algorithm. Our work builds upon the
idea of collaborative techniques from information retrieval and natural
language processing, which typically use the attributes of other similar
objects to predict the attribute of a given object. Experiments on two
challenging and diverse datasets demonstrate the efficacy of our approach
over state-of-the-art methods.
Comment: CVPR 201
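The collaborative sparse selection idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' formulation: it assumes a row-sparse (L2,1) objective in which frames of the target video `X` must reconstruct both `X` itself and the frames `Y` of the related videos, and it uses an iteratively reweighted least-squares loop, one common half-quadratic-style scheme. The function name `select_frames` and the parameters `lam`, `alpha`, and `k` are hypothetical.

```python
import numpy as np

def select_frames(X, Y, lam=1.0, alpha=0.5, k=3, iters=50, eps=1e-8):
    """Pick k representative columns (frames) of X.

    X: (d, n) features of the video being summarized.
    Y: (d, m) features pooled from the topic-related videos.
    Assumed objective (not taken from the paper):
        ||X - X Z||_F^2 + alpha * ||Y - X Zc||_F^2 + lam * ||[Z, Zc]||_{2,1}
    """
    n = X.shape[1]
    # Stack both reconstruction targets so one coefficient matrix W = [Z, Zc]
    # shares its row sparsity across the two terms.
    T = np.hstack([X, np.sqrt(alpha) * Y])
    G = X.T @ X
    B = X.T @ T
    # Ridge solution as a starting point.
    W = np.linalg.solve(G + lam * np.eye(n), B)
    for _ in range(iters):
        # Reweighting step: each row gets weight 1 / ||w_i||_2 (smoothed by eps).
        row_norms = np.sqrt((W ** 2).sum(axis=1) + eps)
        D = np.diag(1.0 / row_norms)
        # Weighted least-squares step with the weights held fixed.
        W = np.linalg.solve(G + 0.5 * lam * D, B)
    scores = np.linalg.norm(W, axis=1)  # large row norm = frame is used often
    return np.argsort(-scores)[:k], scores
```

Frames whose rows in `W` have large norm are the ones reused to reconstruct both the target video and its topic-related neighbours, which is the sense in which the related videos supply context.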
Autocalibration with the Minimum Number of Cameras with Known Pixel Shape
In 3D reconstruction, the recovery of the calibration parameters of the
cameras is paramount since it provides metric information about the observed
scene, e.g., measures of angles and ratios of distances. Autocalibration
enables the estimation of the camera parameters without using a calibration
device, but by enforcing simple constraints on the camera parameters. In the
absence of information about the internal camera parameters such as the focal
length and the principal point, the knowledge of the camera pixel shape is
usually the only available constraint. Given a projective reconstruction of a
rigid scene, we address the problem of the autocalibration of a minimal set of
cameras with known pixel shape and otherwise arbitrarily varying intrinsic and
extrinsic parameters. We propose an algorithm that only requires 5 cameras (the
theoretical minimum), thus halving the number of cameras required by previous
algorithms based on the same constraint. To this end, we introduce as our
basic geometric tool the six-line conic variety (SLCV), consisting of the set
of planes that intersect six given lines of 3D space in points of a conic. We
show that the set of solutions of the Euclidean upgrading problem for three
cameras with known pixel shape can be parameterized in a computationally
efficient way. This parameterization is then used to solve autocalibration from
five or more cameras, reducing the three-dimensional search space to a
two-dimensional one. We provide experiments with real images showing the good
performance of the technique.
Comment: 19 pages, 14 figures, 7 tables, J. Math. Imaging Vi
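The known-pixel-shape constraint mentioned in this abstract can be illustrated with a short check. This is only a sketch of the constraint, not of the SLCV algorithm: it assumes the pixel shape has been normalized to square (zero skew, unit aspect ratio), in which case the image of the absolute conic, omega = K^-T K^-1, has a zero (1,2) entry and equal (1,1) and (2,2) entries. The helper names `iac` and `has_square_pixels` are hypothetical.

```python
import numpy as np

def iac(K):
    """Image of the absolute conic for an intrinsic matrix K."""
    K_inv = np.linalg.inv(K)
    return K_inv.T @ K_inv

def has_square_pixels(omega, tol=1e-9):
    """Check the two square-pixel conditions on omega, up to scale."""
    omega = omega / omega[0, 0]  # remove the projective scale
    zero_skew = abs(omega[0, 1]) < tol
    unit_aspect = abs(omega[1, 1] - 1.0) < tol
    return zero_skew and unit_aspect

# Focal length and principal point vary freely; only the pixel shape is fixed.
K_square = np.array([[800.0, 0.0, 320.0],
                     [0.0, 800.0, 240.0],
                     [0.0, 0.0, 1.0]])
# Non-zero skew and unequal focal lengths violate the constraint.
K_skewed = np.array([[800.0, 5.0, 320.0],
                     [0.0, 820.0, 240.0],
                     [0.0, 0.0, 1.0]])
```

Calling `has_square_pixels(iac(K_square))` should hold while the same call on `K_skewed` should not, whatever focal length and principal point are chosen.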
Contractive De-noising Auto-encoder
An auto-encoder is a special kind of neural network based on reconstruction.
The de-noising auto-encoder (DAE) is an improved auto-encoder that is made
robust to its input by first corrupting the original data and then
reconstructing the original input by minimizing the reconstruction error. The
contractive auto-encoder (CAE) is another improved auto-encoder that learns
robust features by penalizing the Frobenius norm of the Jacobian matrix of the
learned features with respect to the original input. In this paper, we combine
the de-noising auto-encoder and the contractive auto-encoder and propose
another improved auto-encoder, the contractive de-noising auto-encoder (CDAE),
which is robust to both the original input and the learned feature. We stack
CDAEs to extract more abstract features and apply an SVM for classification.
Experimental results on the MNIST benchmark dataset show that the proposed CDAE
performs better than both DAE and CAE, demonstrating the effectiveness of our
method.
Comment: Figures edite
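The combination of the two penalties described in this abstract can be sketched as a single loss. This is a minimal illustration under stated assumptions, not the paper's exact objective: it assumes a one-layer sigmoid encoder with tied weights, masking noise for the corruption step, a squared reconstruction error against the clean input, and a Jacobian penalty evaluated at the corrupted input. The function `cdae_loss` and the parameters `lam` and `noise` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cdae_loss(x, W, b, c, lam=0.1, noise=0.3, rng=None):
    """Loss for one batch.

    x: (batch, n_in) clean inputs; W: (n_hidden, n_in); b: (n_hidden,); c: (n_in,).
    Returns (total, reconstruction term, contractive term).
    """
    rng = np.random.default_rng() if rng is None else rng
    # De-noising part: zero out roughly a `noise` fraction of the inputs.
    mask = rng.random(x.shape) >= noise
    x_tilde = x * mask
    h = sigmoid(x_tilde @ W.T + b)   # encoder
    x_hat = sigmoid(h @ W + c)       # decoder with tied weights (assumption)
    # The reconstruction is compared against the clean input x.
    recon = np.mean(np.sum((x_hat - x) ** 2, axis=1))
    # Contractive part: for a sigmoid encoder, dh/dx = diag(h * (1 - h)) @ W,
    # so the squared Frobenius norm has the closed form below.
    jac = np.mean(np.sum((h * (1 - h)) ** 2 * np.sum(W ** 2, axis=1), axis=1))
    return recon + lam * jac, recon, jac
```

Setting `lam=0` gives a plain de-noising loss and `noise=0` gives a plain contractive loss, so the two baselines compared in the abstract are both special cases of this sketch.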