188 research outputs found

    Multimodal Affect Recognition: Current Approaches and Challenges

    Many factors render multimodal affect recognition approaches appealing. First, humans employ a multimodal approach in emotion recognition. It is only fitting that machines, which attempt to reproduce elements of human emotional intelligence, employ the same approach. Second, the combination of multiple affective signals not only provides a richer collection of data but also helps alleviate the effects of uncertainty in the raw signals. Lastly, multimodal approaches potentially afford us the flexibility to classify emotions even when one or more source signals cannot be retrieved. However, the multimodal approach presents challenges pertaining to the fusion of individual signals, dimensionality of the feature space, and incompatibility of collected signals in terms of time resolution and format. In this chapter, we explore the aforementioned challenges while presenting the latest scholarship on the topic. We first discuss the various modalities used in affect classification. Second, we explore the fusion of modalities. Third, we present publicly accessible multimodal datasets designed to expedite work on the topic by eliminating the laborious task of dataset collection. Fourth, we analyze representative works on the topic. Finally, we summarize the current challenges in the field and provide ideas for future research directions.
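
The claim that a multimodal system can still classify when a source signal is missing can be illustrated with decision-level (late) fusion. The following is a minimal sketch, not the chapter's method: it assumes each modality has its own classifier that outputs per-class probabilities, and it simply averages over whichever modalities are present. The function name `late_fusion` and the modality names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def late_fusion(scores: Dict[str, Optional[Sequence[float]]]) -> List[float]:
    """Average per-class probabilities over the modalities that are available.

    A modality whose signal could not be retrieved is passed as None and
    is skipped, so the remaining modalities still yield a prediction.
    """
    available = [s for s in scores.values() if s is not None]
    if not available:
        raise ValueError("no modality available")
    n_classes = len(available[0])
    if any(len(s) != n_classes for s in available):
        raise ValueError("all modalities must score the same classes")
    return [sum(s[i] for s in available) / len(available) for i in range(n_classes)]


# Hypothetical example: the voice signal is missing for this sample.
fused = late_fusion({
    "face": [0.75, 0.25],
    "voice": None,
    "physiology": [0.5, 0.5],
})
print(fused)
```

Plain averaging is only one fusion rule; weighted or learned fusion would replace the final line of the function.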

    Real Time Animation of Virtual Humans: A Trade-off Between Naturalness and Control

    Virtual humans are employed in many interactive applications using 3D virtual environments, including (serious) games. The motion of such virtual humans should look realistic (or ‘natural’) and allow interaction with the surroundings and other (virtual) humans. Current animation techniques differ in the trade-off they offer between motion naturalness and the control that can be exerted over the motion. We show mechanisms to parametrize, combine (on different body parts) and concatenate motions generated by different animation techniques. We discuss several aspects of motion naturalness and show how it can be evaluated. We conclude by showing the promise of combinations of different animation paradigms to enhance both naturalness and control.

    2D Articulated Human Pose Estimation and Retrieval in (Almost) Unconstrained Still Images

    We present a technique for estimating the spatial layout of humans in still images: the position of the head, torso and arms. The theme we explore is that once a person is localized using an upper body detector, the search for their body parts can be considerably simplified using weak constraints on position and appearance arising from that detection. Our approach is capable of estimating upper body pose in highly challenging uncontrolled images, without prior knowledge of background, clothing, lighting, or the location and scale of the person in the image. People are only required to be upright and seen from the front or the back (not side). We evaluate the stages of our approach experimentally using ground truth layout annotation on a variety of challenging material, such as images from the PASCAL VOC 2008 challenge and video frames from TV shows and feature films. We also propose and evaluate techniques for searching a video dataset for people in a specific pose. To this end, we develop three new pose descriptors and compare their classification and retrieval performance to two baselines built on state-of-the-art object detection models.

    Understanding human motion: recognition and retrieval of human activities

    Ankara: The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2008. Thesis (Ph.D.) -- Bilkent University, 2008. Includes bibliographical references (leaves 111-121). Within the ever-growing video archives is a vast amount of interesting information regarding human actions and activities. In this thesis, we approach the problem of extracting this information and understanding human motion from a computer vision perspective. We propose solutions for two distinct scenarios, ordered from simple to complex. In the first scenario, we deal with the problem of single action recognition in relatively simple settings. We believe that human pose encapsulates many useful clues for recognizing the ongoing action, and that we can represent this shape information for 2D single actions in very compact forms, before going into the details of complex modeling. We show that high-accuracy single human action recognition is possible 1) using spatial oriented histograms of rectangular regions when the silhouette is extractable, and 2) using the distribution of boundary-fitted lines when the silhouette information is missing. We demonstrate that, inside videos, we can further improve recognition accuracy by adding local and global motion information. We also show that within a discriminative framework, shape information is quite useful even in the case of human action recognition in still images. Our second scenario involves recognition and retrieval of complex human activities within more complicated settings, such as changing backgrounds and viewpoints. We describe a method of representing human activities in 3D that allows a collection of motions to be queried without examples, using a simple and effective query language. Our approach is based on units of activity at segments of the body, which can be composed across time and across the body to produce complex queries. 
The presence of search units is inferred automatically by tracking the body, lifting the tracks to 3D and comparing them to models trained using motion capture data. Our models of short time scale limb behaviour are built using a labelled motion capture set. Our query language makes use of finite state automata and requires simple text encoding and no visual examples. We show results for a large range of queries applied to a collection of complex motion and activity. We compare with discriminative methods applied to tracker data; our method offers significantly improved performance. We show experimental evidence that our method is robust to view direction and is unaffected by some important changes of clothing. İkizler, Nazlı. Ph.D.
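
The idea of querying activity with a finite state automaton can be sketched as follows. This is an illustration under stated assumptions, not the thesis's query language: it assumes each video frame has already been assigned one activity label for a body segment, and that a query is an ordered list of labels, each of which must hold for one or more consecutive frames. The function name `matches` and the labels are hypothetical.

```python
from typing import List, Sequence


def matches(query: Sequence[str], frames: Sequence[str]) -> bool:
    """Return True if `frames` contains the query as consecutive runs of labels.

    The query [a, b, c] corresponds to the regular pattern a+ b+ c+ found
    anywhere in the frame sequence. It is evaluated by simulating a small
    nondeterministic finite automaton: state 0 is the start state, and
    state i means the run for query[i - 1] is currently in progress.
    """
    n = len(query)
    active = {0}
    for label in frames:
        nxt = {0}  # a match may start at any frame
        for state in active:
            if state > 0 and query[state - 1] == label:
                nxt.add(state)  # the current run continues
            if state < n and query[state] == label:
                nxt.add(state + 1)  # the next run begins
        if n in nxt:
            return True
        active = nxt
    return n == 0


# Hypothetical per-frame labels for one body segment.
frames: List[str] = ["walk", "walk", "stand", "stand", "wave", "walk"]
print(matches(["walk", "stand", "wave"], frames))
print(matches(["stand", "walk"], frames))
```

Composing queries across several body segments, as the abstract describes, would require one label stream per segment; the sketch covers composition across time only.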

    Expressive movement generation with machine learning

    Movement is an essential aspect of our lives. Not only do we move to interact with our physical environment, but we also express ourselves and communicate with others through our movements. In an increasingly computerized world where various technologies and devices surround us, our movements are essential parts of our interaction with and consumption of computational devices and artifacts. In this context, incorporating an understanding of our movements within the design of the technologies surrounding us can significantly improve our daily experiences. This need has given rise to the field of movement computing: developing computational models of movement that can perceive, manipulate, and generate movements. In this thesis, we contribute to the field of movement computing by building machine-learning-based solutions for automatic movement generation. In particular, we focus on using machine learning techniques and motion capture data to create controllable, generative movement models. We also contribute to the field by creating datasets, tools, and libraries that we have developed during our research. We start our research by reviewing the works on building automatic movement generation systems using machine learning techniques and motion capture data. Our review covers background topics such as high-level movement characterization, training data, feature representation, machine learning models, and evaluation methods. Building on our literature review, we present WalkNet, an interactive agent walking movement controller based on neural networks. The expressivity of virtual, animated agents plays an essential role in their believability. Therefore, WalkNet integrates controlling the expressive qualities of movement with the goal-oriented behaviour of an animated virtual agent. It allows us to control the generation based on the valence and arousal levels of affect, the movement’s walking direction, and the mover’s movement signature in real-time. 
Following WalkNet, we look at controlling movement generation using more complex stimuli such as music represented by audio signals (i.e., non-symbolic music). Music-driven dance generation involves a highly non-linear mapping between temporally dense stimuli (i.e., the audio signal) and movements, which makes it a more challenging movement modelling problem. To this end, we present GrooveNet, a real-time machine learning model for music-driven dance generation.

    Automatic Recognition and Generation of Affective Movements

    Body movements are an important non-verbal communication medium through which affective states of the demonstrator can be discerned. For machines, the capability to recognize affective expressions of their users and generate appropriate actuated responses with recognizable affective content has the potential to improve their life-like attributes and to create an engaging, entertaining, and empathic human-machine interaction. This thesis develops approaches to systematically identify movement features most salient to affective expressions and to exploit these features to design computational models for automatic recognition and generation of affective movements. The proposed approaches enable 1) identifying which features of movement convey affective expressions, 2) the automatic recognition of affective expressions from movements, 3) understanding the impact of kinematic embodiment on the perception of affective movements, and 4) adapting pre-defined motion paths in order to "overlay" specific affective content. Statistical learning and stochastic modeling approaches are leveraged, extended, and adapted to derive a concise representation of the movements that isolates movement features salient to affective expressions and enables efficient and accurate affective movement recognition and generation. In particular, the thesis presents two new approaches to fixed-length affective movement representation based on 1) functional feature transformation, and 2) stochastic feature transformation (Fisher scores). The resulting representations are then exploited for recognition of affective expressions in movements and for salient movement feature identification. 
For functional representation, the thesis adapts dimensionality reduction techniques (namely, principal component analysis (PCA), Fisher discriminant analysis, Isomap) for functional datasets and applies the resulting reduction techniques to extract a minimal set of features along which affect-specific movements are best separable. Furthermore, the centroids of affect-specific clusters of movements in the resulting functional PCA subspace along with the inverse mapping of functional PCA are used to generate prototypical movements for each affective expression. The functional discriminative modeling is however limited to cases where affect-specific movements also have similar kinematic trajectories and does not address the interpersonal and stochastic variations inherent to bodily expression of affect. To account for these variations, the thesis presents a novel affective movement representation in terms of stochastically-transformed features referred to as Fisher scores. The Fisher scores are derived from affect-specific hidden Markov model encoding of the movements and exploited to discriminate between different affective expressions using support vector machine (SVM) classification. Furthermore, the thesis presents a new approach for systematic identification of a minimal set of movement features most salient to discriminating between different affective expressions. The salient features are identified by mapping Fisher scores to a low-dimensional subspace where dependencies between the movements and their affective labels are maximized. This is done by maximizing the Hilbert-Schmidt independence criterion between the Fisher score representation of movements and their affective labels. The resulting subspace forms a suitable basis for affective movement recognition using nearest neighbour classification and retains the high recognition rates achieved by SVM classification in the Fisher score space. 
The dimensions of the subspace form a minimal set of salient features and are used to explore the movement kinematic and dynamic cues that connote affective expressions. Furthermore, the thesis proposes the use of movement notation systems from the dance community (specifically, the Laban system) for abstract coding and computational analysis of movement. A quantification approach for Laban Effort and Shape is proposed and used to develop a new computational model for affective movement generation. Using the Laban Effort and Shape components, the proposed generation approach searches a labeled dataset for movements that are kinematically similar to a desired motion path and convey a target emotion. A hidden Markov model of the identified movements is obtained and used with the desired motion path in the Viterbi state estimation. The estimated state sequence is then used to generate a novel movement that is a version of the desired motion path, modulated to convey the target emotion. Various affective human movement corpora are used to evaluate and demonstrate the efficacy of the developed approaches for the automatic recognition and generation of affective expressions in movements. Finally, the thesis assesses the human perception of affective movements and the impact of display embodiment and the observer's gender on the affective movement perception via user studies in which participants rate the expressivity of synthetically-generated and human-generated affective movements animated on anthropomorphic and non-anthropomorphic embodiments. The user studies show that the human perception of affective movements is mainly shaped by intended emotions, and that the display embodiment and the observer's gender can significantly impact the perception of affective movements.
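
The Viterbi state estimation step mentioned in this abstract can be sketched in a few lines. This is a simplified illustration, not the thesis's implementation: it assumes a hidden Markov model with discrete observation symbols, whereas a motion path would normally be continuous, and every parameter value below is made up for the example.

```python
import math
from typing import List, Sequence


def viterbi(obs: Sequence[int],
            start: Sequence[float],
            trans: Sequence[Sequence[float]],
            emit: Sequence[Sequence[float]]) -> List[int]:
    """Most likely hidden-state sequence of a discrete HMM, computed in log space.

    start[s] is the initial probability of state s, trans[r][s] the
    probability of moving from state r to state s, and emit[s][o] the
    probability that state s emits observation symbol o.
    """
    if not obs:
        return []

    def log(p: float) -> float:
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(start)
    score = [log(start[s]) + log(emit[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] is the best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans[best][s]) + log(emit[s][o]))
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Made-up two-state model: each state mostly emits its own symbol.
start = [0.5, 0.5]
trans = [[0.8, 0.2], [0.2, 0.8]]
emit = [[0.9, 0.1], [0.1, 0.9]]
print(viterbi([0, 0, 1, 1], start, trans, emit))
```

In the generation approach described above, the estimated state sequence would then drive synthesis of the modulated movement; that step is outside this sketch.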

    Human body tracking and pose estimation from monocular image sequences

    This thesis describes a bottom-up approach to estimating human pose over time based on monocular views with no restriction on human activities. Three techniques are proposed to address the weaknesses of existing approaches: building a specific appearance model using clustering, utilising both the generic and specific appearance models in the estimation, and building an uncontaminated appearance model by removing background pixels from the training samples. Experimental results show that the proposed system significantly outperforms existing systems.