2,992 research outputs found
Unsupervised Video Understanding by Reconciliation of Posture Similarities
Understanding human activity and being able to explain it in detail surpasses
mere action classification by far in both complexity and value. The challenge
is thus to describe an activity on the basis of its most fundamental
constituents, the individual postures and their distinctive transitions.
Supervised learning of such a fine-grained representation based on elementary
poses is very tedious and does not scale. Therefore, we propose a completely
unsupervised deep learning procedure based solely on video sequences, which
starts from scratch without requiring pre-trained networks, predefined body
models, or keypoints. A combinatorial sequence matching algorithm proposes
relations between frames from subsets of the training data, while a CNN is
reconciling the transitivity conflicts of the different subsets to learn a
single concerted pose embedding despite changes in appearance across sequences.
Without any manual annotation, the model learns a structured representation of
postures and their temporal development. The model not only enables retrieval
of similar postures but also temporal super-resolution. Additionally, based on
a recurrent formulation, next frames can be synthesized.Comment: Accepted by ICCV 201
Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis evolved from earlier schemes that are often limited to controlled
environments to nowadays advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications from video surveillance to human-computer interaction, scientific
milestones in action recognition are achieved more rapidly, eventually leading
to the demise of what used to be good in a short time. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations, and then, navigate into the realm of deep
learning based approaches. We aim to remain objective throughout this survey,
touching upon encouraging improvements as well as inevitable fallbacks, in the
hope of raising fresh questions and motivating new research directions for the
reader
Unsupervised Human Action Detection by Action Matching
We propose a new task of unsupervised action detection by action matching.
Given two long videos, the objective is to temporally detect all pairs of
matching video segments. A pair of video segments are matched if they share the
same human action. The task is category independent---it does not matter what
action is being performed---and no supervision is used to discover such video
segments. Unsupervised action detection by action matching allows us to align
videos in a meaningful manner. As such, it can be used to discover new action
categories or as an action proposal technique within, say, an action detection
pipeline. Moreover, it is a useful pre-processing step for generating video
highlights, e.g., from sports videos.
We present an effective and efficient method for unsupervised action
detection. We use an unsupervised temporal encoding method and exploit the
temporal consistency in human actions to obtain candidate action segments. We
evaluate our method on this challenging task using three activity recognition
benchmarks, namely, the MPII Cooking activities dataset, the THUMOS15 action
detection benchmark and a new dataset called the IKEA dataset. On the MPII
Cooking dataset we detect action segments with a precision of 21.6% and recall
of 11.7% over 946 long video pairs and over 5000 ground truth action segments.
Similarly, on THUMOS dataset we obtain 18.4% precision and 25.1% recall over
5094 ground truth action segment pairs.Comment: IEEE International Conference on Computer Vision and Pattern
Recognition CVPR 2017 Workshop
Towards an Intelligent Database System Founded on the SP Theory of Computing and Cognition
The SP theory of computing and cognition, described in previous publications,
is an attractive model for intelligent databases because it provides a simple
but versatile format for different kinds of knowledge, it has capabilities in
artificial intelligence, and it can also function like established database
models when that is required.
This paper describes how the SP model can emulate other models used in
database applications and compares the SP model with those other models. The
artificial intelligence capabilities of the SP model are reviewed and its
relationship with other artificial intelligence systems is described. Also
considered are ways in which current prototypes may be translated into an
'industrial strength' working system
- …