2,500 research outputs found
Histogram of Oriented Principal Components for Cross-View Action Recognition
Existing techniques for 3D action recognition are sensitive to viewpoint
variations because they extract features from depth images which are viewpoint
dependent. In contrast, we directly process pointclouds for cross-view action
recognition from unknown and unseen views. We propose the Histogram of Oriented
Principal Components (HOPC) descriptor that is robust to noise, viewpoint,
scale and action speed variations. At a 3D point, HOPC is computed by
projecting the three scaled eigenvectors of the pointcloud within its local
spatio-temporal support volume onto the vertices of a regular dodecahedron.
HOPC is also used for the detection of Spatio-Temporal Keypoints (STK) in 3D
pointcloud sequences so that view-invariant STK descriptors (or Local HOPC
descriptors) at these key locations only are used for action recognition. We
also propose a global descriptor computed from the normalized spatio-temporal
distribution of STKs in 4-D, which we refer to as STK-D. We have evaluated the
performance of our proposed descriptors against nine existing techniques on two
cross-view and three single-view human action recognition datasets. The
Experimental results show that our techniques provide significant improvement
over state-of-the-art methods
Adaptive Image Denoising by Targeted Databases
We propose a data-dependent denoising procedure to restore noisy images.
Different from existing denoising algorithms which search for patches from
either the noisy image or a generic database, the new algorithm finds patches
from a database that contains only relevant patches. We formulate the denoising
problem as an optimal filter design problem and make two contributions. First,
we determine the basis function of the denoising filter by solving a group
sparsity minimization problem. The optimization formulation generalizes
existing denoising algorithms and offers systematic analysis of the
performance. Improvement methods are proposed to enhance the patch search
process. Second, we determine the spectral coefficients of the denoising filter
by considering a localized Bayesian prior. The localized prior leverages the
similarity of the targeted database, alleviates the intensive Bayesian
computation, and links the new method to the classical linear minimum mean
squared error estimation. We demonstrate applications of the proposed method in
a variety of scenarios, including text images, multiview images and face
images. Experimental results show the superiority of the new algorithm over
existing methods.Comment: 15 pages, 13 figures, 2 tables, journa
Multi-view Graph Embedding with Hub Detection for Brain Network Analysis
Multi-view graph embedding has become a widely studied problem in the area of
graph learning. Most of the existing works on multi-view graph embedding aim to
find a shared common node embedding across all the views of the graph by
combining the different views in a specific way. Hub detection, as another
essential topic in graph mining has also drawn extensive attentions in recent
years, especially in the context of brain network analysis. Both the graph
embedding and hub detection relate to the node clustering structure of graphs.
The multi-view graph embedding usually implies the node clustering structure of
the graph based on the multiple views, while the hubs are the boundary-spanning
nodes across different node clusters in the graph and thus may potentially
influence the clustering structure of the graph. However, none of the existing
works in multi-view graph embedding considered the hubs when learning the
multi-view embeddings. In this paper, we propose to incorporate the hub
detection task into the multi-view graph embedding framework so that the two
tasks could benefit each other. Specifically, we propose an auto-weighted
framework of Multi-view Graph Embedding with Hub Detection (MVGE-HD) for brain
network analysis. The MVGE-HD framework learns a unified graph embedding across
all the views while reducing the potential influence of the hubs on blurring
the boundaries between node clusters in the graph, thus leading to a clear and
discriminative node clustering structure for the graph. We apply MVGE-HD on two
real multi-view brain network datasets (i.e., HIV and Bipolar). The
experimental results demonstrate the superior performance of the proposed
framework in brain network analysis for clinical investigation and application
Gait recognition and understanding based on hierarchical temporal memory using 3D gait semantic folding
Gait recognition and understanding systems have shown a wide-ranging application prospect. However, their use of unstructured data from image and video has affected their performance, e.g., they are easily influenced by multi-views, occlusion, clothes, and object carrying conditions. This paper addresses these problems using a realistic 3-dimensional (3D) human structural data and sequential pattern learning framework with top-down attention modulating mechanism based on Hierarchical Temporal Memory (HTM). First, an accurate 2-dimensional (2D) to 3D human body pose and shape semantic parameters estimation method is proposed, which exploits the advantages of an instance-level body parsing model and a virtual dressing method. Second, by using gait semantic folding, the estimated body parameters are encoded using a sparse 2D matrix to construct the structural gait semantic image. In order to achieve time-based gait recognition, an HTM Network is constructed to obtain the sequence-level gait sparse distribution representations (SL-GSDRs). A top-down attention mechanism is introduced to deal with various conditions including multi-views by refining the SL-GSDRs, according to prior knowledge. The proposed gait learning model not only aids gait recognition tasks to overcome the difficulties in real application scenarios but also provides the structured gait semantic images for visual cognition. Experimental analyses on CMU MoBo, CASIA B, TUM-IITKGP, and KY4D datasets show a significant performance gain in terms of accuracy and robustness
- …