13,801 research outputs found
Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds
Sparsity-based representations have recently led to notable results in
various visual recognition tasks. In a separate line of research, Riemannian
manifolds have been shown useful for dealing with features and models that do
not lie in Euclidean spaces. With the aim of building a bridge between the two
realms, we address the problem of sparse coding and dictionary learning over
the space of linear subspaces, which form Riemannian structures known as
Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into
the space of symmetric matrices by an isometric mapping. This in turn enables
us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we
propose closed-form solutions for learning a Grassmann dictionary, atom by
atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann
sparse coding and dictionary learning algorithms through embedding into Hilbert
spaces.
Experiments on several classification tasks (gender recognition, gesture
classification, scene analysis, face recognition, action recognition and
dynamic texture classification) show that the proposed approaches achieve
considerable improvements in discrimination accuracy, in comparison to
state-of-the-art methods such as kernelized Affine Hull Method and
graph-embedding Grassmann discriminant analysis.Comment: Appearing in International Journal of Computer Visio
Micro Fourier Transform Profilometry (FTP): 3D shape measurement at 10,000 frames per second
Recent advances in imaging sensors and digital light projection technology
have facilitated a rapid progress in 3D optical sensing, enabling 3D surfaces
of complex-shaped objects to be captured with improved resolution and accuracy.
However, due to the large number of projection patterns required for phase
recovery and disambiguation, the maximum fame rates of current 3D shape
measurement techniques are still limited to the range of hundreds of frames per
second (fps). Here, we demonstrate a new 3D dynamic imaging technique, Micro
Fourier Transform Profilometry (FTP), which can capture 3D surfaces of
transient events at up to 10,000 fps based on our newly developed high-speed
fringe projection system. Compared with existing techniques, FTP has the
prominent advantage of recovering an accurate, unambiguous, and dense 3D point
cloud with only two projected patterns. Furthermore, the phase information is
encoded within a single high-frequency fringe image, thereby allowing
motion-artifact-free reconstruction of transient events with temporal
resolution of 50 microseconds. To show FTP's broad utility, we use it to
reconstruct 3D videos of 4 transient scenes: vibrating cantilevers, rotating
fan blades, bullet fired from a toy gun, and balloon's explosion triggered by a
flying dart, which were previously difficult or even unable to be captured with
conventional approaches.Comment: This manuscript was originally submitted on 30th January 1
Learning Deep Representations of Appearance and Motion for Anomalous Event Detection
We present a novel unsupervised deep learning framework for anomalous event
detection in complex video scenes. While most existing works merely use
hand-crafted appearance and motion features, we propose Appearance and Motion
DeepNet (AMDN) which utilizes deep neural networks to automatically learn
feature representations. To exploit the complementary information of both
appearance and motion patterns, we introduce a novel double fusion framework,
combining both the benefits of traditional early fusion and late fusion
strategies. Specifically, stacked denoising autoencoders are proposed to
separately learn both appearance and motion features as well as a joint
representation (early fusion). Based on the learned representations, multiple
one-class SVM models are used to predict the anomaly scores of each input,
which are then integrated with a late fusion strategy for final anomaly
detection. We evaluate the proposed method on two publicly available video
surveillance datasets, showing competitive performance with respect to state of
the art approaches.Comment: Oral paper in BMVC 201
- …