255 research outputs found

    Motion-capture-based hand gesture recognition for computing and control

    Get PDF
    This dissertation focuses on the study and development of algorithms that enable the analysis and recognition of hand gestures in a motion capture environment. Central to this work is the study of unlabeled point sets in a more abstract sense. Evaluations of proposed methods focus on examining their generalization to users not encountered during system training. In an initial exploratory study, we compare various classification algorithms based upon multiple interpretations and feature transformations of point sets, including those based upon aggregate features (e.g. mean) and a pseudo-rasterization of the capture space. We find aggregate feature classifiers to be balanced across multiple users but relatively limited in maximum achievable accuracy. Certain classifiers based upon the pseudo-rasterization performed best among tested classification algorithms. We follow this study with targeted examinations of certain subproblems. For the first subproblem, we introduce the a fortiori expectation-maximization (AFEM) algorithm for computing the parameters of a distribution from which unlabeled, correlated point sets are presumed to be generated. Each unlabeled point is assumed to correspond to a target with independent probability of appearance but correlated positions. We propose replacing the expectation phase of the algorithm with a Kalman filter modified within a Bayesian framework to account for the unknown point labels which manifest as uncertain measurement matrices. We also propose a mechanism to reorder the measurements in order to improve parameter estimates. In addition, we use a state-of-the-art Markov chain Monte Carlo sampler to efficiently sample measurement matrices. In the process, we indirectly propose a constrained k-means clustering algorithm. Simulations verify the utility of AFEM against a traditional expectation-maximization algorithm in a variety of scenarios. In the second subproblem, we consider the application of positive definite kernels and the earth mover\u27s distance (END) to our work. Positive definite kernels are an important tool in machine learning that enable efficient solutions to otherwise difficult or intractable problems by implicitly linearizing the problem geometry. We develop a set-theoretic interpretation of ENID and propose earth mover\u27s intersection (EMI). a positive definite analog to ENID. We offer proof of EMD\u27s negative definiteness and provide necessary and sufficient conditions for ENID to be conditionally negative definite, including approximations that guarantee negative definiteness. In particular, we show that ENID is related to various min-like kernels. We also present a positive definite preserving transformation that can be applied to any kernel and can be used to derive positive definite EMD-based kernels, and we show that the Jaccard index is simply the result of this transformation applied to set intersection. Finally, we evaluate kernels based on EMI and the proposed transformation versus ENID in various computer vision tasks and show that END is generally inferior even with indefinite kernel techniques. Finally, we apply deep learning to our problem. We propose neural network architectures for hand posture and gesture recognition from unlabeled marker sets in a coordinate system local to the hand. As a means of ensuring data integrity, we also propose an extended Kalman filter for tracking the rigid pattern of markers on which the local coordinate system is based. We consider fixed- and variable-size architectures including convolutional and recurrent neural networks that accept unlabeled marker input. We also consider a data-driven approach to labeling markers with a neural network and a collection of Kalman filters. Experimental evaluations with posture and gesture datasets show promising results for the proposed architectures with unlabeled markers, which outperform the alternative data-driven labeling method

    Human-Centric Machine Vision

    Get PDF
    Recently, the algorithms for the processing of the visual information have greatly evolved, providing efficient and effective solutions to cope with the variability and the complexity of real-world environments. These achievements yield to the development of Machine Vision systems that overcome the typical industrial applications, where the environments are controlled and the tasks are very specific, towards the use of innovative solutions to face with everyday needs of people. The Human-Centric Machine Vision can help to solve the problems raised by the needs of our society, e.g. security and safety, health care, medical imaging, and human machine interface. In such applications it is necessary to handle changing, unpredictable and complex situations, and to take care of the presence of humans

    ENHANCEMENTS TO THE MODIFIED COMPOSITE PATTERN METHOD OF STRUCTURED LIGHT 3D CAPTURE

    Get PDF
    The use of structured light illumination techniques for three-dimensional data acquisition is, in many cases, limited to stationary subjects due to the multiple pattern projections needed for depth analysis. Traditional Composite Pattern (CP) multiplexing utilizes sinusoidal modulation of individual projection patterns to allow numerous patterns to be combined into a single image. However, due to demodulation artifacts, it is often difficult to accurately recover the subject surface contour information. On the other hand, if one were to project an image consisting of many thin, identical stripes onto the surface, one could, by isolating each stripe center, recreate a very accurate representation of surface contour. But in this case, recovery of depth information via triangulation would be quite difficult. The method described herein, Modified Composite Pattern (MCP), is a conjunction of these two concepts. Combining a traditional Composite Pattern multiplexed projection image with a pattern of thin stripes allows for accurate surface representation combined with non-ambiguous identification of projection pattern elements. In this way, it is possible to recover surface depth characteristics using only a single structured light projection. The technique described utilizes a binary structured light projection sequence (consisting of four unique images) modulated according to Composite Pattern methodology. A stripe pattern overlay is then applied to the pattern. Upon projection and imaging of the subject surface, the stripe pattern is isolated, and the composite pattern information demodulated and recovered, allowing for 3D surface representation. In this research, the MCP technique is considered specifically in the context of a Hidden Markov Process Model. Updated processing methodologies explained herein make use of the Viterbi algorithm for the purpose of optimal analysis of MCP encoded images. Additionally, we techniques are introduced which, when implemented, allow fully automated processing of the Modified Composite Pattern image

    Multi-modal on-body sensing of human activities

    Get PDF
    Increased usage and integration of state-of-the-art information technology in our everyday work life aims at increasing the working efficiency. Due to unhandy human-computer-interaction methods this progress does not always result in increased efficiency, for mobile workers in particular. Activity recognition based contextual computing attempts to balance this interaction deficiency. This work investigates wearable, on-body sensing techniques on their applicability in the field of human activity recognition. More precisely we are interested in the spotting and recognition of so-called manipulative hand gestures. In particular the thesis focuses on the question whether the widely used motion sensing based approach can be enhanced through additional information sources. The set of gestures a person usually performs on a specific place is limited -- in the contemplated production and maintenance scenarios in particular. As a consequence this thesis investigates whether the knowledge about the user's hand location provides essential hints for the activity recognition process. In addition, manipulative hand gestures -- due to their object manipulating character -- typically start in the moment the user's hand reaches a specific place, e.g. a specific part of a machinery. And the gestures most likely stop in the moment the hand leaves the position again. Hence this thesis investigates whether hand location can help solving the spotting problem. Moreover, as user-independence is still a major challenge in activity recognition, this thesis investigates location context as a possible key component in a user-independent recognition system. We test a Kalman filter based method to blend absolute position readings with orientation readings based on inertial measurements. A filter structure is suggested which allows up-sampling of slow absolute position readings, and thus introduces higher dynamics to the position estimations. In such a way the position measurement series is made aware of wrist motions in addition to the wrist position. We suggest location based gesture spotting and recognition approaches. Various methods to model the location classes used in the spotting and recognition stages as well as different location distance measures are suggested and evaluated. In addition a rather novel sensing approach in the field of human activity recognition is studied. This aims at compensating drawbacks of the mere motion sensing based approach. To this end we develop a wearable hardware architecture for lower arm muscular activity measurements. The sensing hardware based on force sensing resistors is designed to have a high dynamic range. In contrast to preliminary attempts the proposed new design makes hardware calibration unnecessary. Finally we suggest a modular and multi-modal recognition system; modular with respect to sensors, algorithms, and gesture classes. This means that adding or removing a sensor modality or an additional algorithm has little impact on the rest of the recognition system. Sensors and algorithms used for spotting and recognition can be selected and fine-tuned separately for each single activity. New activities can be added without impact on the recognition rates of the other activities

    Hidden Markov Models

    Get PDF
    Hidden Markov Models (HMMs), although known for decades, have made a big career nowadays and are still in state of development. This book presents theoretical issues and a variety of HMMs applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environment protection and engineering. I hope that the reader will find this book useful and helpful for their own research

    Automatic Recognition and Generation of Affective Movements

    Get PDF
    Body movements are an important non-verbal communication medium through which affective states of the demonstrator can be discerned. For machines, the capability to recognize affective expressions of their users and generate appropriate actuated responses with recognizable affective content has the potential to improve their life-like attributes and to create an engaging, entertaining, and empathic human-machine interaction. This thesis develops approaches to systematically identify movement features most salient to affective expressions and to exploit these features to design computational models for automatic recognition and generation of affective movements. The proposed approaches enable 1) identifying which features of movement convey affective expressions, 2) the automatic recognition of affective expressions from movements, 3) understanding the impact of kinematic embodiment on the perception of affective movements, and 4) adapting pre-defined motion paths in order to "overlay" specific affective content. Statistical learning and stochastic modeling approaches are leveraged, extended, and adapted to derive a concise representation of the movements that isolates movement features salient to affective expressions and enables efficient and accurate affective movement recognition and generation. In particular, the thesis presents two new approaches to fixed-length affective movement representation based on 1) functional feature transformation, and 2) stochastic feature transformation (Fisher scores). The resulting representations are then exploited for recognition of affective expressions in movements and for salient movement feature identification. For functional representation, the thesis adapts dimensionality reduction techniques (namely, principal component analysis (PCA), Fisher discriminant analysis, Isomap) for functional datasets and applies the resulting reduction techniques to extract a minimal set of features along which affect-specific movements are best separable. Furthermore, the centroids of affect-specific clusters of movements in the resulting functional PCA subspace along with the inverse mapping of functional PCA are used to generate prototypical movements for each affective expression. The functional discriminative modeling is however limited to cases where affect-specific movements also have similar kinematic trajectories and does not address the interpersonal and stochastic variations inherent to bodily expression of affect. To account for these variations, the thesis presents a novel affective movement representation in terms of stochastically-transformed features referred to as Fisher scores. The Fisher scores are derived from affect-specific hidden Markov model encoding of the movements and exploited to discriminate between different affective expressions using a support vector machine (SVM) classification. Furthermore, the thesis presents a new approach for systematic identification of a minimal set of movement features most salient to discriminating between different affective expressions. The salient features are identified by mapping Fisher scores to a low-dimensional subspace where dependencies between the movements and their affective labels are maximized. This is done by maximizing Hilbert Schmidt independence criterion between the Fisher score representation of movements and their affective labels. The resulting subspace forms a suitable basis for affective movement recognition using nearest neighbour classification and retains the high recognition rates achieved by SVM classification in the Fisher score space. The dimensions of the subspace form a minimal set of salient features and are used to explore the movement kinematic and dynamic cues that connote affective expressions. Furthermore, the thesis proposes the use of movement notation systems from the dance community (specifically, the Laban system) for abstract coding and computational analysis of movement. A quantification approach for Laban Effort and Shape is proposed and used to develop a new computational model for affective movement generation. Using the Laban Effort and Shape components, the proposed generation approach searches a labeled dataset for movements that are kinematically similar to a desired motion path and convey a target emotion. A hidden Markov model of the identified movements is obtained and used with the desired motion path in the Viterbi state estimation. The estimated state sequence is then used to generate a novel movement that is a version of the desired motion path, modulated to convey the target emotion. Various affective human movement corpora are used to evaluate and demonstrate the efficacy of the developed approaches for the automatic recognition and generation of affective expressions in movements. Finally, the thesis assesses the human perception of affective movements and the impact of display embodiment and the observer's gender on the affective movement perception via user studies in which participants rate the expressivity of synthetically-generated and human-generated affective movements animated on anthropomorphic and non-anthropomorphic embodiments. The user studies show that the human perception of affective movements is mainly shaped by intended emotions, and that the display embodiment and the observer's gender can significantly impact the perception of affective movements

    On computational models of animal movement behaviour

    Get PDF
    Finding structures and patterns in animal movement data is essential towards understanding a variety of behavioural phenomena, as well as shedding light into the relationships between animals among conspecifics and across different taxa with respect to their environments. The recent advances in the field of computational intelligence coupled with the proliferation of low-cost telemetry devices have made the gathering and analyses of behavioural data of animals in their natural habitat and in a wide range of context possible with aid of devices such as Global Positioning System (GPS). The sensory input that animals receive from their environment, and the corresponding motor output, as well as the neural basis of this relationship most especially as it affects movement, encode a lot of information regarding the welfare and survival of these animals and other organisms in nature's ecosystem. This has huge implications in the area of biodiversity monitoring, global health and understanding disease progression. Encoding, decoding and quantifying these functional relationships however can be challenging, boring and labour intensive. Artificial intelligence holds promise in solving some of these problems and even stand to benefit as understanding natural intelligence for instance can aid in the advancement of artificial intelligence. In this thesis, I investigate and propose several computational methods leveraging information theoretic metrics and also modern machine learning methods including supervised, unsupervised and a novel combination of both towards understanding, predicting, forecasting and quantifying a variety of animal movement phenomena at different time scales across different taxa and species. Most importantly the models proposed in this thesis tackle important problems bordering on human and animal welfare as well as their intersection. Crucially, I investigate several information theoretic metrics towards mining animal movement data, after which I propose machine learning and statistical techniques for automatically quantifying abnormal movement behaviour in sheep with Batten disease using unsupervised methods. In addition, I propose a predictive model capable of forecasting migration patterns in Turkey vulture as well as their stop-over decisions using bidirectional recurrent neural networks. And finally, I propose a model of sheep movement behaviour in a flock leveraging insights in cognitive neuroscience with modern deep learning models. Overall, the models of animal movement behaviour developed in this thesis are useful to a wide range of scientists in the field of neuroscience, ethology, veterinary science, conservation and public health. Although these models have been designed for understanding and predicting animal movement behaviour, in a lot of cases they scale easily into other domains such as human behaviour modelling with little modifications. I highlight the importance of continuous research in developing computational models of animal movement behaviour towards improving our understanding of nature in relation to the interaction between animals and their environments
    • …
    corecore