From Traditional to Modern: Domain Adaptation for Action Classification in Short Social Video Clips
Short internet video clips like vines present a significantly wild
distribution compared to traditional video datasets. In this paper, we focus on
the problem of unsupervised action classification in wild vines using
traditional labeled datasets. To this end, we use a data augmentation based
simple domain adaptation strategy. We utilise semantic word2vec space as a
common subspace to embed video features from both the labeled source domain and
the unlabeled target domain. Our method incrementally augments the labeled source
with target samples and iteratively modifies the embedding function to bring
the source and target distributions together. Additionally, we utilise a
multi-modal representation that incorporates noisy semantic information
available in the form of hash-tags. We show the effectiveness of this simple
adaptation technique on a test set of vines and achieve notable improvements in
performance.
Comment: 9 pages, GCPR, 201
RGB-D datasets using Microsoft Kinect or similar sensors: a survey
RGB-D data has proven to be a very useful representation of indoor scenes for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle, and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.
The THUMOS Challenge on Action Recognition for Videos "in the Wild"
Automatically recognizing and localizing wide ranges of human actions has
crucial importance for video understanding. Towards this goal, the THUMOS
challenge was introduced in 2013 to serve as a benchmark for action
recognition. Until then, video action recognition, including the THUMOS challenge,
had focused primarily on the classification of pre-segmented (i.e., trimmed)
videos, which is an artificial task. In THUMOS 2014, we elevated action
recognition to a more practical level by introducing temporally untrimmed
videos. These also include 'background videos' which share similar scenes and
backgrounds as action videos, but are devoid of the specific actions. The three
editions of the challenge organized in 2013-2015 have made THUMOS a common
benchmark for action classification and detection and the annual challenge is
widely attended by teams from around the world.
In this paper we describe the THUMOS benchmark in detail and give an overview
of data collection and annotation procedures. We present the evaluation
protocols used to quantify results in the two THUMOS tasks of action
classification and temporal detection. We also present results of submissions
to the THUMOS 2015 challenge and review the participating approaches.
Additionally, we include a comprehensive empirical study evaluating the
differences in action recognition between trimmed and untrimmed videos, and how
well methods trained on trimmed videos generalize to untrimmed videos. We
conclude by proposing several directions and improvements for future THUMOS
challenges.
Comment: Preprint submitted to Computer Vision and Image Understandin