13,438 research outputs found

    Histogram of Fuzzy Local Spatio-Temporal Descriptors for Video Action Recognition

    Get PDF
    Feature extraction plays a vital role in visual action recognition. Many existing gradient-based feature extractors, including histogram of oriented gradients (HOG), histogram of optical flow (HOF), motion boundary histograms (MBH), and histogram of motion gradients (HMG), build histograms for representing different actions over the spatio-temporal domain in a video. However, these methods require to set the number of bins for information aggregation in advance. Varying numbers of bins usually lead to inherent uncertainty within the process of pixel voting with regard to the bins in the histogram. This paper proposes a novel method to handle such uncertainty by fuzzifying these feature extractors. The proposed approach has two advantages: i) it better represents the ambiguous boundarie between the bins and thus the fuzziness of th spatio-temporal visual information entailed in videos, and ii) the contribution of each pixel is flexibly controlled by a fuzziness parameter for various scenarios. The proposed family of fuzzy descriptors and a combination of them were evaluate on two publicly available datasets, demonstrating that the proposed approach outperforms the original counterparts and other state-of-the-art methods

    Activity Recognition based on a Magnitude-Orientation Stream Network

    Full text link
    The temporal component of videos provides an important clue for activity recognition, as a number of activities can be reliably recognized based on the motion information. In view of that, this work proposes a novel temporal stream for two-stream convolutional networks based on images computed from the optical flow magnitude and orientation, named Magnitude-Orientation Stream (MOS), to learn the motion in a better and richer manner. Our method applies simple nonlinear transformations on the vertical and horizontal components of the optical flow to generate input images for the temporal stream. Experimental results, carried on two well-known datasets (HMDB51 and UCF101), demonstrate that using our proposed temporal stream as input to existing neural network architectures can improve their performance for activity recognition. Results demonstrate that our temporal stream provides complementary information able to improve the classical two-stream methods, indicating the suitability of our approach to be used as a temporal video representation.Comment: 8 pages, SIBGRAPI 201

    Action Classification with Locality-constrained Linear Coding

    Full text link
    We propose an action classification algorithm which uses Locality-constrained Linear Coding (LLC) to capture discriminative information of human body variations in each spatiotemporal subsequence of a video sequence. Our proposed method divides the input video into equally spaced overlapping spatiotemporal subsequences, each of which is decomposed into blocks and then cells. We use the Histogram of Oriented Gradient (HOG3D) feature to encode the information in each cell. We justify the use of LLC for encoding the block descriptor by demonstrating its superiority over Sparse Coding (SC). Our sequence descriptor is obtained via a logistic regression classifier with L2 regularization. We evaluate and compare our algorithm with ten state-of-the-art algorithms on five benchmark datasets. Experimental results show that, on average, our algorithm gives better accuracy than these ten algorithms.Comment: ICPR 201

    Geodesic Distance Histogram Feature for Video Segmentation

    Full text link
    This paper proposes a geodesic-distance-based feature that encodes global information for improved video segmentation algorithms. The feature is a joint histogram of intensity and geodesic distances, where the geodesic distances are computed as the shortest paths between superpixels via their boundaries. We also incorporate adaptive voting weights and spatial pyramid configurations to include spatial information into the geodesic histogram feature and show that this further improves results. The feature is generic and can be used as part of various algorithms. In experiments, we test the geodesic histogram feature by incorporating it into two existing video segmentation frameworks. This leads to significantly better performance in 3D video segmentation benchmarks on two datasets
    • …
    corecore