Stay-At-Home Motor Rehabilitation: Optimizing Spatiotemporal Learning on Low-Cost Capacitive Sensor Arrays
Repeated, consistent, and precise gesture performance is a key part of recovery for stroke and other motor-impaired patients. Close professional supervision of these exercises is also essential to ensure proper neuromotor repair, and it consumes a large amount of medical resources. Gesture recognition systems are emerging as stay-at-home solutions to this problem, but the best solutions are expensive, and the inexpensive solutions are not universal enough to tackle patient-to-patient variability. While many methods have been studied and implemented, the gesture recognition system designer does not have a strategy to effectively predict the right method to fit the needs of a patient. This thesis establishes such a strategy by outlining the strengths and weaknesses of several spatiotemporal learning architectures combined with deep learning, specifically when low-cost, low-resolution capacitive sensor arrays are used. This is done by testing the immunity and robustness of those architectures to the type of variability that is common among stroke patients, investigating select hyperparameters and their impact on the architectures’ training progressions, and comparing test performance in different applications and scenarios. The models analyzed here are trained on a mixture of high-quality, healthy gestures and personalized, imperfectly performed gestures using a low-cost recognition system.
Improved probabilistic distance based locality preserving projections method to reduce dimensionality in large datasets
In this paper, dimensionality reduction is achieved in large datasets using the proposed distance-based Non-integer Matrix Factorization (NMF) technique, which is intended to solve the data dimensionality problem. Here, NMF and distance measurement aim to resolve the non-orthogonality problem due to increased dataset dimensionality. The method initially partitions the datasets and organizes them into a defined geometric structure, and it avoids capturing the dataset structure through a distance-based similarity measurement. The proposed method is designed to fit dynamic datasets, and it includes the intrinsic structure using data geometry. Therefore, the complexity of data is further avoided using an Improved Distance-based Locality Preserving Projection. The proposed method is evaluated against existing methods in terms of accuracy, average accuracy, mutual information and average mutual information.
Recurrent Attention Models for Depth-Based Person Identification
We present an attention-based model that reasons on human body shape and
motion dynamics to identify individuals in the absence of RGB information,
hence in the dark. Our approach leverages unique 4D spatio-temporal signatures
to address the identification problem across days. Formulated as a
reinforcement learning task, our model is based on a combination of
convolutional and recurrent neural networks with the goal of identifying small,
discriminative regions indicative of human identity. We demonstrate that our
model produces state-of-the-art results on several published datasets given
only depth images. We further study the robustness of our model towards
viewpoint, appearance, and volumetric changes. Finally, we share insights
gleaned from interpretable 2D, 3D, and 4D visualizations of our model's
spatio-temporal attention.
Comment: Computer Vision and Pattern Recognition (CVPR) 201
Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds
Sparsity-based representations have recently led to notable results in
various visual recognition tasks. In a separate line of research, Riemannian
manifolds have been shown useful for dealing with features and models that do
not lie in Euclidean spaces. With the aim of building a bridge between the two
realms, we address the problem of sparse coding and dictionary learning over
the space of linear subspaces, which form Riemannian structures known as
Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into
the space of symmetric matrices by an isometric mapping. This in turn enables
us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we
propose closed-form solutions for learning a Grassmann dictionary, atom by
atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann
sparse coding and dictionary learning algorithms through embedding into Hilbert
spaces.
Experiments on several classification tasks (gender recognition, gesture
classification, scene analysis, face recognition, action recognition and
dynamic texture classification) show that the proposed approaches achieve
considerable improvements in discrimination accuracy, in comparison to
state-of-the-art methods such as kernelized Affine Hull Method and
graph-embedding Grassmann discriminant analysis.
Comment: Appearing in International Journal of Computer Visio
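The embedding step described in this abstract can be illustrated with a small sketch. The abstract only says that Grassmann manifolds are embedded into the space of symmetric matrices by an isometric mapping; the sketch below assumes one well-known choice, the projection embedding X -> X X^T, and the function names are hypothetical. It illustrates the idea and is not the authors' implementation.

```python
import numpy as np

def orthonormal_basis(A):
    """Return an orthonormal basis (n x p) for the column span of A via reduced QR."""
    Q, _ = np.linalg.qr(A)
    return Q

def projection_embedding(X):
    """Map an orthonormal basis X (n x p) to the symmetric projection matrix X X^T.

    Assumption: this is the projection embedding, one common way to represent
    a subspace as a symmetric matrix; the abstract does not name its mapping.
    """
    return X @ X.T

def chordal_distance(X, Y):
    """Distance between two subspaces measured in the embedded (symmetric) space.

    The Frobenius distance between the projection matrices, divided by sqrt(2),
    equals the square root of the summed squared sines of the principal angles.
    """
    diff = projection_embedding(X) - projection_embedding(Y)
    return np.linalg.norm(diff, "fro") / np.sqrt(2)
```

Because X X^T does not change when X is replaced by another orthonormal basis of the same subspace, ordinary matrix arithmetic on the embedded points is well defined on subspaces rather than on particular bases.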
Dynamic gesture recognition using transformation invariant hand shape recognition
In this thesis a detailed framework is presented for accurate real time gesture recognition. Our approach to develop a hand-shape classifier, trained using computer animation, along with its application in dynamic gesture recognition is described. The system developed operates in real time and provides accurate gesture recognition. It operates using a single low resolution camera and operates in Matlab on a conventional PC running Windows XP.
The hand shape classifier outlined in this thesis uses transformation invariant subspaces created using Principal Component Analysis (PCA). These subspaces are created from a large vocabulary created in a systematic manner using computer animation. In recognising dynamic gestures we utilise both hand shape and hand position information; these are two of the main features used by humans in distinguishing gestures. Hidden Markov Models (HMMs) are trained and employed to recognise this combination of hand shape and hand position features.
During the course of this thesis we have described in detail the inspiration and motivation behind our research and its possible applications. In this work our emphasis is on achieving a high speed system that works in real time with high accuracy.
Sparse and low rank approximations for action recognition
Action recognition is a crucial area of research in computer vision with a wide range of
applications in surveillance, patient-monitoring systems, video indexing,
Human-Computer Interaction and many more. These applications require automated
action recognition. Robust classification methods are still sought after despite influential
research in this field over the past decade. The data resources have grown
tremendously owing to the advances in the digital revolution which cannot be
compared to the meagre resources in the past. The main limitation on a system
when dealing with video data is the computational burden due to large dimensions
and data redundancy. Sparse and low rank approximation methods have evolved
recently which aim at concise and meaningful representation of data. This thesis
explores the application of sparse and low rank approximation methods in the
context of video data classification with the following contributions.
1. An approach for solving the problem of action and gesture classification is
proposed within the sparse representation domain, effectively dealing with
large feature dimensions.
2. A low-rank matrix completion approach is proposed to jointly classify more
than one action.
3. Deep features are proposed for robust classification of multiple actions
within the matrix completion framework, which can handle data deficiencies.
This thesis starts with the applicability of sparse-representation-based classification
methods to the problem of action and gesture recognition. Random projection
is used to reduce the dimensionality of the features. These are referred
to as compressed features in this thesis. The dictionary formed with compressed
features has proved to be efficient for the classification task achieving comparable
results to the state of the art.
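The compression step can be sketched in a few lines. The abstract does not say which kind of random matrix is used, so the sketch assumes a dense Gaussian projection; `random_project`, `target_dim`, and `seed` are hypothetical names chosen for illustration.

```python
import numpy as np

def random_project(features, target_dim, seed=0):
    """Reduce feature dimensionality with a random Gaussian projection.

    features: array of shape (n_samples, d).
    Returns an array of shape (n_samples, target_dim).
    Assumption: entries are i.i.d. N(0, 1/target_dim), so that squared
    distances between samples are preserved in expectation.
    """
    rng = np.random.default_rng(seed)
    d = features.shape[1]
    R = rng.standard_normal((d, target_dim)) / np.sqrt(target_dim)
    return features @ R
```

The same projection matrix has to be applied to training and test features, which is why the sketch derives it from a fixed seed.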
Next, this thesis addresses the more promising problem of simultaneous classification
of multiple actions. This is treated as a matrix completion problem under a
transductive setting. Matrix completion methods are considered the generic
extension of sparse representation methods from a compressed sensing point
of view. The features and corresponding labels of the training and test data are
concatenated and placed as columns of a matrix. The unknown test labels would
be the missing entries in that matrix. This is solved using rank minimization
techniques based on the assumption that the underlying complete matrix would
be a low rank one. This approach has achieved results better than the state of the art on datasets with varying complexities.
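A minimal sketch of this construction follows. It makes several assumptions that are not in the abstract: labels are encoded as a single row of +1/-1 values, the rank is known in advance, and the solver is a simple fixed-rank iterative SVD imputation used as a stand-in for the rank-minimization techniques the thesis refers to. All function names are hypothetical.

```python
import numpy as np

def complete_low_rank(M, observed, rank, n_iters=100):
    """Fill the entries where `observed` is False by iterative rank-`rank` SVD imputation."""
    Z = np.where(observed, M, 0.0)  # start with zeros in the missing entries
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # keep the observed entries fixed, update only the missing ones
        Z = np.where(observed, M, low_rank)
    return Z

def classify_by_completion(X_train, y_train, X_test, rank, n_iters=100):
    """Each column is one sample: a label entry stacked on top of its features.

    The test labels are the missing entries; their sign after completion
    is returned as the predicted class (+1 or -1).
    """
    n_train, n_test = X_train.shape[1], X_test.shape[1]
    labels = np.concatenate([y_train, np.zeros(n_test)])
    M = np.vstack([labels, np.hstack([X_train, X_test])])
    observed = np.ones(M.shape, dtype=bool)
    observed[0, n_train:] = False  # unknown test labels
    Z = complete_low_rank(M, observed, rank, n_iters)
    return np.sign(Z[0, n_train:])
```

All test samples are completed jointly in one matrix, which is what makes the setting transductive: the test features take part in estimating the low-rank structure.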
This thesis then extends the matrix completion framework for joint classification
of actions to handle the missing features besides missing test labels. In
this context, deep features from a convolutional neural network are proposed.
A convolutional neural network is trained on the training data and features are
extracted from train and test data from the trained network. The performance
of the deep features has proved to be promising when compared to
state-of-the-art hand-crafted features.
A framework for developing motion-based games
Dissertation submitted for the degree of Master in Computer Engineering.
Nowadays, whenever one intends to develop an application that allows interaction
through more or less complex gestures, it is necessary to go through a long process. In this process, the gesture recognition system may not achieve high accuracy, particularly across different users.
Since the total number of applications for mobile systems such as Android and iOS is
close to a million and a half and still increasing, it appears essential to develop a platform that abstracts developers from all the low-level gesture gathering and streamlines, in a standardized way, the development of applications that use this kind of interaction. In this case, such a platform was developed for the iOS system.
At present, given existing environmental issues, it is desirable to attract the attention of, motivate, and influence the greatest number of people toward more pro-environmental behaviors. Thus, as a proof of concept for the developed framework,
an educational game was created, using persuasive technology, to influence players'
behaviors and attitudes in a pro-environmental way.
Building on this idea, a game was also developed that is presented
on a public ambient display and can be played by any participant close to the display who has a device running iOS.
FastDTW is approximate and Generally Slower than the Algorithm it Approximates
Many time series data mining problems can be solved with repeated use of
a distance measure. Examples of such tasks include similarity search, clustering,
classification, anomaly detection and segmentation. For over two decades it has
been known that the Dynamic Time Warping (DTW) distance measure is the best
measure to use for most tasks, in most domains. Because the classic DTW
algorithm has quadratic time complexity, many ideas have been introduced to
reduce its amortized time, or to quickly approximate it. One of the most cited
approximate approaches is FastDTW. The FastDTW algorithm has well over a
thousand citations and has been explicitly used in several hundred research
efforts. In this work, we make a surprising claim. In any realistic data mining
application, the approximate FastDTW is much slower than the exact DTW. This
fact clearly has implications for the community that uses this algorithm:
allowing it to address much larger datasets, get exact results, and do so in
less time.
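For reference, the classic exact DTW that the abstract compares against is a short dynamic program. The sketch below is a plain-Python illustration for 1-D sequences, not the authors' code; the squared point cost and the optional `window` argument (a Sakoe-Chiba style band) are assumptions added here.

```python
import math

def dtw_distance(a, b, window=None):
    """Exact DTW distance between two 1-D sequences.

    Runs in O(len(a) * len(b)) time without a band, which is the quadratic
    cost mentioned in the abstract. `window` (hypothetical parameter) limits
    the warping path to a band around the diagonal, reducing the work.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    window = max(window, abs(n - m))  # the band must reach the final cell
    inf = math.inf
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefixes align at zero cost
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # extend the cheapest of the three allowed predecessor alignments
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])
```

Only two rows of the cost table are kept, so memory is linear in the length of the second sequence even though time stays quadratic without a band.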