8,896 research outputs found
Multimodal Multipart Learning for Action Recognition in Depth Videos
The articulated and complex nature of human actions makes the task of action
recognition difficult. One approach to handle this complexity is dividing it to
the kinetics of body parts and analyzing the actions based on these partial
descriptors. We propose a joint sparse regression based learning method which
utilizes the structured sparsity to model each action as a combination of
multimodal features from a sparse set of body parts. To represent dynamics and
appearance of parts, we employ a heterogeneous set of depth and skeleton based
features. The proper structure of multimodal multipart features are formulated
into the learning framework via the proposed hierarchical mixed norm, to
regularize the structured features of each part and to apply sparsity between
them, in favor of a group feature selection. Our experimental results expose
the effectiveness of the proposed learning method in which it outperforms other
methods in all three tested datasets while saturating one of them by achieving
perfect accuracy
Point-wise mutual information-based video segmentation with high temporal consistency
In this paper, we tackle the problem of temporally consistent boundary
detection and hierarchical segmentation in videos. While finding the best
high-level reasoning of region assignments in videos is the focus of much
recent research, temporal consistency in boundary detection has so far only
rarely been tackled. We argue that temporally consistent boundaries are a key
component to temporally consistent region assignment. The proposed method is
based on the point-wise mutual information (PMI) of spatio-temporal voxels.
Temporal consistency is established by an evaluation of PMI-based point
affinities in the spectral domain over space and time. Thus, the proposed
method is independent of any optical flow computation or previously learned
motion models. The proposed low-level video segmentation method outperforms the
learning-based state of the art in terms of standard region metrics
- …