5,392 research outputs found
Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic models
The gap between our ability to collect interesting data and our ability to
analyze these data is growing at an unprecedented rate. Recent algorithmic
attempts to fill this gap have employed unsupervised tools to discover
structure in data. Some of the most successful approaches have used
probabilistic models to uncover latent thematic structure in discrete data.
Despite the success of these models on textual data, they have not generalized
as well to image data, in part because of the spatial and temporal structure
that may exist in an image stream.
We introduce a novel unsupervised machine learning framework that
incorporates the ability of convolutional autoencoders to discover features
from images that directly encode spatial information, within a Bayesian
nonparametric topic model that discovers meaningful latent patterns within
discrete data. By using this hybrid framework, we overcome the fundamental
dependency of traditional topic models on rigidly hand-coded data
representations, while simultaneously encoding spatial dependency in our topics
without adding model complexity. We apply this model to the motivating
application of high-level scene understanding and mission summarization for
exploratory marine robots. Our experiments on a seafloor dataset collected by a
marine robot show that the proposed hybrid framework outperforms current
state-of-the-art approaches on the task of unsupervised seafloor terrain
characterization.Comment: 8 page
Down-Sampling coupled to Elastic Kernel Machines for Efficient Recognition of Isolated Gestures
In the field of gestural action recognition, many studies have focused on
dimensionality reduction along the spatial axis, to reduce both the variability
of gestural sequences expressed in the reduced space, and the computational
complexity of their processing. It is noticeable that very few of these methods
have explicitly addressed the dimensionality reduction along the time axis.
This is however a major issue with regard to the use of elastic distances
characterized by a quadratic complexity. To partially fill this apparent gap,
we present in this paper an approach based on temporal down-sampling associated
to elastic kernel machine learning. We experimentally show, on two data sets
that are widely referenced in the domain of human gesture recognition, and very
different in terms of quality of motion capture, that it is possible to
significantly reduce the number of skeleton frames while maintaining a good
recognition rate. The method proves to give satisfactory results at a level
currently reached by state-of-the-art methods on these data sets. The
computational complexity reduction makes this approach eligible for real-time
applications.Comment: ICPR 2014, International Conference on Pattern Recognition, Stockholm
: Sweden (2014
Latent Semantic Learning with Structured Sparse Representation for Human Action Recognition
This paper proposes a novel latent semantic learning method for extracting
high-level features (i.e. latent semantics) from a large vocabulary of abundant
mid-level features (i.e. visual keywords) with structured sparse
representation, which can help to bridge the semantic gap in the challenging
task of human action recognition. To discover the manifold structure of
midlevel features, we develop a spectral embedding approach to latent semantic
learning based on L1-graph, without the need to tune any parameter for graph
construction as a key step of manifold learning. More importantly, we construct
the L1-graph with structured sparse representation, which can be obtained by
structured sparse coding with its structured sparsity ensured by novel L1-norm
hypergraph regularization over mid-level features. In the new embedding space,
we learn latent semantics automatically from abundant mid-level features
through spectral clustering. The learnt latent semantics can be readily used
for human action recognition with SVM by defining a histogram intersection
kernel. Different from the traditional latent semantic analysis based on topic
models, our latent semantic learning method can explore the manifold structure
of mid-level features in both L1-graph construction and spectral embedding,
which results in compact but discriminative high-level features. The
experimental results on the commonly used KTH action dataset and unconstrained
YouTube action dataset show the superior performance of our method.Comment: The short version of this paper appears in ICCV 201
Multi-hierarchical Convolutional Network for Efficient Remote Photoplethysmograph Signal and Heart Rate Estimation from Face Video Clips
Heart beat rhythm and heart rate (HR) are important physiological parameters
of the human body. This study presents an efficient multi-hierarchical
spatio-temporal convolutional network that can quickly estimate remote
physiological (rPPG) signal and HR from face video clips. First, the facial
color distribution characteristics are extracted using a low-level face feature
Generation (LFFG) module. Then, the three-dimensional (3D) spatio-temporal
stack convolution module (STSC) and multi-hierarchical feature fusion module
(MHFF) are used to strengthen the spatio-temporal correlation of multi-channel
features. In the MHFF, sparse optical flow is used to capture the tiny motion
information of faces between frames and generate a self-adaptive region of
interest (ROI) skin mask. Finally, the signal prediction module (SP) is used to
extract the estimated rPPG signal. The experimental results on the three
datasets show that the proposed network outperforms the state-of-the-art
methods.Comment: 33 pages,9 figure
A Survey on Human Activity Analysis Techniques
Human Activity Recognition(HAR) is Popular research topic in Computer vision and Image Processing area. This Paper Provide an exhaustive survey on the Entire Process of identify or Recognize Human activity. Basically, There are Four steps are involved in HAR process, which are Pre-processing, Feature extraction, Training, and Classification of different activities from video. The need of data preprocessing , and segmentation based on camera movements are presented. This paper provide detailed survey on different features for HAR, feature extraction and selection method , and Classification methods with advantages and disadvantages. Finally, A brief discussion about various classification techniques are presented
- …