A Deep Siamese Network for Scene Detection in Broadcast Videos
We present a model that automatically divides broadcast videos into coherent
scenes by learning a distance measure between shots. Experiments are performed
to demonstrate the effectiveness of our approach by comparing our algorithm
against recent proposals for automatic scene segmentation. We also propose an
improved performance measure that aims to reduce the gap between numerical
evaluation and expected results, and propose and release a new benchmark
dataset. Comment: ACM Multimedia 201
Acoustic Scene Classification
This work was supported by the Centre for Digital Music Platform (grant EP/K009559/1) and a Leadership Fellowship
(EP/G007144/1) both from the United Kingdom Engineering and Physical Sciences Research Council
Toward a Taxonomy and Computational Models of Abnormalities in Images
The human visual system can spot an abnormal image, and reason about what
makes it strange. This task has not received enough attention in computer
vision. In this paper we study various types of atypicalities in images in a
more comprehensive way than has been done before. We propose a new dataset of
abnormal images showing a wide range of atypicalities. We design human subject
experiments to discover a coarse taxonomy of the reasons for abnormality. Our
experiments reveal three major categories of abnormality: object-centric,
scene-centric, and contextual. Based on this taxonomy, we propose a
comprehensive computational model that can predict all different types of
abnormality in images and outperform prior art in abnormality recognition. Comment: To appear in the Thirtieth AAAI Conference on Artificial Intelligence
(AAAI 2016)
A comparative evaluation of interest point detectors and local descriptors for visual SLAM
In this paper we compare the behavior of different interest point detectors and descriptors under the
conditions they must meet to serve as landmarks in vision-based simultaneous localization and mapping (SLAM).
We evaluate the repeatability of the detectors, as well as the invariance and distinctiveness of the descriptors,
under different perceptual conditions using sequences of images representing planar objects as well as 3D scenes.
We believe that this information will be useful when selecting an appropriat