6 research outputs found
Action Classification with Locality-constrained Linear Coding
We propose an action classification algorithm which uses Locality-constrained
Linear Coding (LLC) to capture discriminative information of human body
variations in each spatiotemporal subsequence of a video sequence. Our proposed
method divides the input video into equally spaced overlapping spatiotemporal
subsequences, each of which is decomposed into blocks and then cells. We use
the Histogram of Oriented Gradient (HOG3D) feature to encode the information in
each cell. We justify the use of LLC for encoding the block descriptor by
demonstrating its superiority over Sparse Coding (SC). Our sequence descriptor
is obtained via a logistic regression classifier with L2 regularization. We
evaluate and compare our algorithm with ten state-of-the-art algorithms on five
benchmark datasets. Experimental results show that, on average, our algorithm
gives better accuracy than these ten algorithms. Comment: ICPR 201
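The LLC encoding step mentioned in this abstract can be illustrated with a minimal sketch of the commonly used approximated LLC solver: pick the k nearest codebook atoms, solve a small regularized least-squares problem, and normalize the weights to sum to one. The function name `llc_encode`, the parameters `k` and `beta`, and the random codebook are assumptions for illustration only; the abstract does not give the authors' codebook size, block descriptors, or solver settings.

```python
import numpy as np

def llc_encode(x, codebook, k=5, beta=1e-4):
    """Approximated LLC code of descriptor x (shape (d,)) over codebook (shape (m, d)).

    Hypothetical helper: k and beta are illustrative defaults, not values from the paper.
    """
    # Locality: keep only the k codebook atoms closest to x.
    dists = np.sum((codebook - x) ** 2, axis=1)
    idx = np.argsort(dists)[:k]

    # Shift the selected atoms so that x is the origin.
    z = codebook[idx] - x
    # Local covariance of the shifted atoms, regularized for numerical stability.
    C = z @ z.T
    C += np.eye(k) * beta * np.trace(C)

    # Solve C w = 1, then enforce the sum-to-one constraint.
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()

    # Scatter the k weights into a full-length, mostly zero code vector.
    code = np.zeros(codebook.shape[0])
    code[idx] = w
    return code

rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 8))   # 32 atoms of dimension 8 (made-up sizes)
x = rng.normal(size=8)
code = llc_encode(x, codebook, k=5)
```

Unlike sparse coding, which requires an iterative L1 solver, this variant needs only a nearest-neighbour search and one k-by-k linear solve per descriptor.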
A discussion on the validation tests employed to compare human action recognition methods using the MSR Action3D dataset
This paper aims to determine which is the best human action recognition
method based on features extracted from RGB-D devices, such as the Microsoft
Kinect. A review of all the papers that make reference to MSR Action3D, the
most widely used dataset that includes depth information acquired from an RGB-D device,
has been performed. We found that the validation method used by each work
differs from the others, so a direct comparison among works cannot be made.
However, almost all the works compare their results with those of others without
taking this issue into account. Therefore, we present different rankings
according to the validation methodology used, in order to clarify the
existing confusion. Comment: 16 pages and 7 table
Histogram of Oriented Principal Components for Cross-View Action Recognition
Existing techniques for 3D action recognition are sensitive to viewpoint
variations because they extract features from depth images which are viewpoint
dependent. In contrast, we directly process pointclouds for cross-view action
recognition from unknown and unseen views. We propose the Histogram of Oriented
Principal Components (HOPC) descriptor that is robust to noise, viewpoint,
scale and action speed variations. At a 3D point, HOPC is computed by
projecting the three scaled eigenvectors of the pointcloud within its local
spatio-temporal support volume onto the vertices of a regular dodecahedron.
HOPC is also used for the detection of Spatio-Temporal Keypoints (STK) in 3D
pointcloud sequences so that view-invariant STK descriptors (or Local HOPC
descriptors) at these key locations only are used for action recognition. We
also propose a global descriptor computed from the normalized spatio-temporal
distribution of STKs in 4-D, which we refer to as STK-D. We have evaluated the
performance of our proposed descriptors against nine existing techniques on two
cross-view and three single-view human action recognition datasets. The
experimental results show that our techniques provide significant improvement
over state-of-the-art methods.
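As a rough illustration of the projection step described in this abstract, the sketch below eigen-decomposes the covariance of the points inside a support volume, scales each eigenvector by its eigenvalue, and projects it onto the 20 vertex directions of a regular dodecahedron. It is a simplified sketch and not the authors' implementation: the function names are hypothetical, negative projections are simply clipped to zero, and the eigenvector sign disambiguation and any quantization a full descriptor would need are omitted.

```python
import itertools
import numpy as np

def dodecahedron_vertices():
    """Return the 20 vertex directions of a regular dodecahedron as unit vectors."""
    phi = (1 + np.sqrt(5)) / 2  # golden ratio
    # Eight cube corners (+-1, +-1, +-1).
    verts = [list(v) for v in itertools.product((-1, 1), repeat=3)]
    # Twelve more vertices from cyclic permutations of (0, +-1/phi, +-phi).
    for a, b in itertools.product((-1, 1), repeat=2):
        verts.append([0, a / phi, b * phi])
        verts.append([a / phi, b * phi, 0])
        verts.append([a * phi, 0, b / phi])
    verts = np.array(verts, dtype=float)
    return verts / np.linalg.norm(verts, axis=1, keepdims=True)

def hopc_like_descriptor(points):
    """Simplified HOPC-style descriptor for an (n, 3) array of neighbouring points."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals = np.clip(eigvals[order], 0, None)  # guard against tiny negative values
    eigvecs = eigvecs[:, order]

    directions = dodecahedron_vertices()        # shape (20, 3)
    parts = []
    for lam, vec in zip(eigvals, eigvecs.T):
        proj = np.maximum(directions @ vec, 0)  # keep only positive projections
        parts.append(lam * proj)                # scale by the eigenvalue
    return np.concatenate(parts)                # shape (60,)

rng = np.random.default_rng(1)
neighbourhood = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.2])
descriptor = hopc_like_descriptor(neighbourhood)
```

In a spatio-temporal setting, `points` would hold the 3D points from several consecutive frames that fall inside the support volume around the query point.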
A Comparative Review of Recent Kinect-based Action Recognition Algorithms
Video-based human action recognition is currently one of the most active
research areas in computer vision. Various research studies indicate that the
performance of action recognition is highly dependent on the type of features
being extracted and how the actions are represented. Since the release of the
Kinect camera, a large number of Kinect-based human action recognition
techniques have been proposed in the literature. However, there still does not
exist a thorough comparison of these Kinect-based techniques under the grouping
of feature types, such as handcrafted versus deep learning features and
depth-based versus skeleton-based features. In this paper, we analyze and
compare ten recent Kinect-based algorithms for both cross-subject action
recognition and cross-view action recognition using six benchmark datasets. In
addition, we have implemented and improved some of these techniques and
included their variants in the comparison. Our experiments show that the
majority of methods perform better on cross-subject action recognition than
cross-view action recognition, that skeleton-based features are more robust for
cross-view recognition than depth-based features, and that deep learning
features are suitable for large datasets. Comment: Accepted by the IEEE Transactions on Image Processin
Feature Mapping Techniques for Improving the Performance of Fault Diagnosis of Synchronous Generator
The support vector machine (SVM) is a popular machine learning algorithm used extensively in machine fault diagnosis. In this paper, linear, radial basis function (RBF), polynomial, and sigmoid kernels are evaluated for diagnosing inter-turn faults in a 3 kVA synchronous generator. The preliminary results show that the performance of the baseline system is not satisfactory, since the statistical features are nonlinear and do not match the kernels used. In this work, the features are linearized to a higher-dimensional space using two feature mapping techniques, sparse coding and locality-constrained linear coding (LLC), to improve the performance of the fault diagnosis system for a synchronous generator. Experimental results show that LLC is superior to sparse coding for improving the performance of fault diagnosis of a synchronous generator. For the balanced data set, LLC improves the overall fault identification accuracy of the baseline RBF system by 22.56%, 18.43% and 17.05% for the R, Y and B phase faults, respectively.
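For reference, the four kernel families named in this abstract can be written as Gram-matrix functions in a few lines. This is a sketch using the common LIBSVM-style parameterization; the parameter names `gamma`, `coef0` and `degree`, their default values, and the random feature matrix are assumptions, since the abstract does not report the settings used.

```python
import numpy as np

def linear_kernel(X, Y):
    """K(x, y) = x . y"""
    return X @ Y.T

def rbf_kernel(X, Y, gamma=0.5):
    """K(x, y) = exp(-gamma * ||x - y||^2)"""
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0))  # clip tiny negative round-off

def polynomial_kernel(X, Y, gamma=1.0, coef0=1.0, degree=3):
    """K(x, y) = (gamma * x . y + coef0) ** degree"""
    return (gamma * X @ Y.T + coef0) ** degree

def sigmoid_kernel(X, Y, gamma=0.1, coef0=0.0):
    """K(x, y) = tanh(gamma * x . y + coef0)"""
    return np.tanh(gamma * X @ Y.T + coef0)

rng = np.random.default_rng(2)
features = rng.normal(size=(6, 4))  # 6 samples with 4 statistical features (made up)
K_lin = linear_kernel(features, features)
K_rbf = rbf_kernel(features, features)
K_poly = polynomial_kernel(features, features)
K_sig = sigmoid_kernel(features, features)
```

A feature mapping such as sparse coding or LLC would be applied to `features` before the kernel is evaluated, so that the classifier operates on the mapped codes rather than on the raw statistics.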