Towards multimodal affective expression: merging facial expressions and body motion into emotion
Affect recognition plays an important role in everyday human life, and expressions are a substantial channel of communication. Humans can rely on different channels of information to understand the affective messages communicated by others. Similarly, an automatic affect recognition system is expected to be able to analyse different types of emotion expressions. In this respect, an important issue to be addressed is the fusion of different channels of expression, taking into account the relationship and correlation across modalities. In this work, facial expressions and body motion are addressed as channels for the communication of affect within an emotion recognition system. A probabilistic approach is used to combine features from the two modalities, incorporating geometric facial expression features and skeleton-based body motion features. Preliminary results show that the presented approach has potential for automatic emotion recognition and can be used for human-robot interaction.
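As a rough illustration of the kind of probabilistic fusion described above, the following Python sketch combines per-class probabilities from a facial-expression classifier and a body-motion classifier into a single posterior; the weighted-sum rule and the weights are assumptions for illustration, not the formulation used in the work.

```python
import numpy as np

# Minimal sketch (not the paper's exact formulation): fuse per-modality class
# probabilities from a facial-expression classifier and a body-motion
# classifier into a single posterior. The weights w_face / w_body are
# hypothetical and would normally be learned or set from per-modality confidence.

def fuse_modalities(p_face, p_body, w_face=0.5, w_body=0.5):
    """p_face, p_body: arrays of per-class probabilities from each modality."""
    p_face = np.asarray(p_face, dtype=float)
    p_body = np.asarray(p_body, dtype=float)
    fused = w_face * p_face + w_body * p_body   # weighted sum of modality outputs
    return fused / fused.sum()                  # renormalise to a posterior

# Example with three emotion classes (e.g. happy, sad, neutral)
posterior = fuse_modalities([0.7, 0.2, 0.1], [0.5, 0.3, 0.2])
print(posterior)
```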
Affective facial expressions recognition for human-robot interaction
Affective facial expression is a key feature of nonverbal behaviour and is considered a symptom of an internal emotional state. Emotion recognition plays an important role in social communication: human-to-human and also human-to-robot. Taking this as inspiration, this work aims at the development of a framework able to recognise human emotions through facial expressions for human-robot interaction. Features based on distances and angles between facial landmarks are extracted to feed a dynamic probabilistic classification framework. The public online dataset Karolinska Directed Emotional Faces (KDEF) [1] is used to learn seven different emotions (i.e. angry, fearful, disgusted, happy, sad, surprised, and neutral) performed by seventy subjects. A new dataset was created in order to record stimulated affect, in which participants watched video sessions intended to awaken their emotions, unlike the KDEF dataset, where participants are actors (i.e. performing expressions when asked to). Offline and on-the-fly tests were carried out: leave-one-out cross-validation tests on the datasets and on-the-fly tests with human-robot interactions. Results show that the proposed framework can correctly recognise human facial expressions, with potential to be used in human-robot interaction scenarios.
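The following is a minimal sketch of how distance and angle features could be computed from detected facial landmarks before feeding a classifier; the landmark indexing scheme and the choice of feature pairs and triplets are assumptions, not the exact features of the framework.

```python
import numpy as np

# Minimal sketch, assuming 2D facial landmarks are already available (e.g. from
# a landmark detector). Which pairs/triplets to use is illustrative only.

def landmark_features(landmarks, pairs, triplets):
    """landmarks: (N, 2) array of (x, y) points.
    pairs: index pairs for distance features.
    triplets: (a, b, c) index triples for the angle at vertex b."""
    lm = np.asarray(landmarks, dtype=float)
    dists = [np.linalg.norm(lm[i] - lm[j]) for i, j in pairs]
    angles = []
    for a, b, c in triplets:
        v1, v2 = lm[a] - lm[b], lm[c] - lm[b]
        cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
        angles.append(np.arccos(np.clip(cosang, -1.0, 1.0)))
    return np.array(dists + angles)
```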
Representation framework of perceived object softness characteristics for active robotic hand exploration
In recent years, the principles and demands guiding the design and implementation of robotic platforms have been changing. Nowadays, robotic platforms tend to be equipped with a combination of multi-modal artificial perception systems (stereo vision, tactile sensing) and complex actuation systems (multi-articulated robotic hands, arms and legs). These artificial perception systems are required by robotic systems to navigate and interact with the environment and with people. This work is focused on the artificial perception systems related to the manipulation strategies used to dexterously interact with deformable objects in the environment. In this context, dexterous robotic manipulation of objects requires that the framework used to represent the characteristics of the deformable manipulated objects be suitable to receive inputs from multiple exploratory elements (multi-fingered robotic hands) and to progressively update that representation as the exploration progresses over time. The framework should also be designed to incorporate the uncertainty and errors associated with the sensing process in this type of dynamic environment, and to deal with novelty by characterizing objects whose softness characteristics are new to the system based on previous knowledge and interactions with a restricted set of reference materials, which constitute the haptic memory of the system. In order to provide robotic hands with the capability to differentiate deformable objects with distinct softness characteristics and to dexterously manipulate them, this work analyses the principles and strategies used by humans to successfully perform such tasks, using predominantly haptic information. During object exploration, the perception and discrimination of softness characteristics depend on both cutaneous and kinesthetic information obtained by executing press and release movements [2], i.e. active haptic perception. This has been demonstrated by experiments performed by Srinivasan and LaMotte [5].
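As a rough sketch of the idea of progressively updating an object representation from press-and-release explorations while accounting for sensing uncertainty, the snippet below maintains a Bayesian belief over a small set of reference softness classes; the Gaussian observation model and the reference stiffness values are assumptions for illustration, not the framework proposed in the work.

```python
import numpy as np

# Hypothetical haptic memory: normalised stiffness of reference materials.
REFERENCE_STIFFNESS = {"soft": 0.2, "medium": 0.5, "hard": 0.9}

def update_belief(belief, measurement, noise_std=0.1):
    """belief: dict class -> probability; measurement: stiffness estimate
    from one press-release cycle (cutaneous + kinesthetic cues)."""
    posterior = {}
    for cls, prob in belief.items():
        mu = REFERENCE_STIFFNESS[cls]
        likelihood = np.exp(-0.5 * ((measurement - mu) / noise_std) ** 2)
        posterior[cls] = prob * likelihood          # Bayes update per class
    total = sum(posterior.values())
    return {c: p / total for c, p in posterior.items()}

belief = {c: 1 / 3 for c in REFERENCE_STIFFNESS}    # uniform prior
for m in [0.55, 0.48, 0.52]:                        # successive explorations
    belief = update_belief(belief, m)
print(belief)                                       # mass concentrates on "medium"
```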
A Probabilistic Approach for Human Everyday Activities Recognition using Body Motion from RGB-D Images
In this work, we propose an approach that relies on cues from depth perception in RGB-D images, where features related to human body motion (3D skeleton features) are used with multiple learning classifiers in order to recognize human activities on a benchmark dataset. A Dynamic Bayesian Mixture Model (DBMM) is designed to combine multiple classifier likelihoods into a single form, assigning weights (obtained from an uncertainty measure) to counterbalance the likelihoods as a posterior probability. Temporal information is incorporated in the DBMM by means of prior probabilities, taking previous probabilistic inference into consideration to reinforce current-frame classification. The publicly available Cornell Activity Dataset [1] with 12 different human activities was used to evaluate the proposed approach. Reported results on the test set show that our approach outperforms state-of-the-art methods in terms of precision, recall and overall accuracy. The developed work allows activity classification to be used in applications where human behaviour recognition is important, such as human-robot interaction and assisted living for elderly care, among others.
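A formulation consistent with this description of the DBMM (base-classifier weights from an uncertainty measure, temporal prior from the previous time step) could be written as follows; the exact expression in the paper may differ.

```latex
% One expression consistent with the description above; the paper's exact
% formulation may differ.
P(C_t \mid A_t) \;=\; \beta \, P(C_t \mid C_{t-1}) \sum_{i=1}^{N} w_i \, P_i(A_t \mid C_t),
\qquad \sum_{i=1}^{N} w_i = 1,
```

where $C_t$ is the activity class at time $t$, $A_t$ the observed skeleton features, $P_i$ the likelihood of the $i$-th base classifier, $w_i$ its uncertainty-based weight, and $\beta$ a normalisation constant.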
British Sign Language Recognition via Late Fusion of Computer Vision and Leap Motion with Transfer Learning to American Sign Language
In this work, we show that a late-fusion approach to multimodality in sign language recognition improves the overall ability of the model in comparison with the singular approaches of image classification (88.14%) and Leap Motion data classification (72.73%). With a large synchronous dataset of 18 BSL gestures collected from multiple subjects, two deep neural networks are benchmarked and compared to derive the best topology for each. The vision model is implemented by a Convolutional Neural Network and an optimised Artificial Neural Network, and the Leap Motion model is implemented by an evolutionary search of Artificial Neural Network topology. Next, the two best networks are fused for synchronised processing, which results in a better overall result (94.44%) as complementary features are learnt in addition to the original task. The hypothesis is further supported by application of the three models to a set of completely unseen data, where the multimodality approach achieves the best results relative to the single-sensor methods. When transfer learning with the weights trained via British Sign Language, all three models outperform standard random weight distribution when classifying American Sign Language (ASL), and the best model overall for ASL classification was the transfer-learning multimodality approach, which scored 82.55% accuracy.
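A minimal sketch of the fusion step, assuming the two unimodal networks expose penultimate feature vectors that are concatenated into a small fusion head; layer sizes and names are illustrative, not the benchmarked topologies from the paper.

```python
import torch
import torch.nn as nn

# Sketch of late fusion: two unimodal networks (vision and Leap Motion) are
# trained separately, then their penultimate features are concatenated and fed
# to a small fusion classifier over the 18 gesture classes.

class FusionHead(nn.Module):
    def __init__(self, vision_dim, leap_dim, n_classes=18):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(vision_dim + leap_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, vision_feat, leap_feat):
        fused = torch.cat([vision_feat, leap_feat], dim=1)  # synchronised samples
        return self.classifier(fused)

# Example with dummy features for a batch of 4 synchronised samples
head = FusionHead(vision_dim=256, leap_dim=64)
logits = head(torch.randn(4, 256), torch.randn(4, 64))
print(logits.shape)  # torch.Size([4, 18])
```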
A human activity recognition framework using max-min features and key poses with differential evolution random forests classifier
This paper presents a novel framework for human daily activity recognition that is intended to rely on few training examples while achieving fast training times, making it suitable for real-time applications. The proposed framework starts with a feature extraction stage, where each activity is divided into actions of variable size based on key poses. Each action window is delimited by two consecutive, automatically identified key poses, from which static (i.e. geometrical) and max-min dynamic (i.e. temporal) features are extracted. These features are first used to train a random forest (RF) classifier, which was tested using the CAD-60 dataset, obtaining relevant overall average results. In a second stage, an extension of the RF is proposed in which the differential evolution meta-heuristic algorithm is used as the node-splitting methodology. The main advantage of its inclusion is that the differential evolution random forest has no thresholds to tune, but rather a few adjustable parameters with well-defined behaviour.
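The sketch below illustrates the max-min idea for one action window delimited by two key poses: per-joint ranges of motion serve as dynamic features, and inter-joint distances at the key poses serve as static features. Treating the window boundaries as the key poses and the choice of joint pairs are simplifying assumptions, not the paper's exact feature set.

```python
import numpy as np

# Sketch of feature extraction for one action window of skeleton frames.

def window_features(frames, joint_pairs):
    """frames: (T, J, 3) array of 3D joint positions for one action window.
    joint_pairs: index pairs used for static (geometrical) distances."""
    frames = np.asarray(frames, dtype=float)
    # Dynamic (temporal) features: max-min range per joint coordinate.
    dynamic = (frames.max(axis=0) - frames.min(axis=0)).ravel()
    # Static features: inter-joint distances at the two delimiting key poses.
    key_poses = [frames[0], frames[-1]]
    static = [np.linalg.norm(pose[i] - pose[j])
              for pose in key_poses for i, j in joint_pairs]
    return np.concatenate([dynamic, np.array(static)])
```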
Dynamic Bayesian Network for Time-Dependent Classification Problems in Robotics
This chapter discusses the use of dynamic Bayesian networks (DBNs) for time-dependent classification problems in mobile robotics, where Bayesian inference is used to infer the class, or category of interest, given the observed data and prior knowledge. Formulating the DBN as a time-dependent classification problem, and by making some assumptions, a general expression for a DBN is given in terms of classifier priors and likelihoods through the time steps. Since multi-class problems are addressed, and because of the number of time slices in the model, additive smoothing is used to prevent the values of the priors from being close to zero. To demonstrate the effectiveness of DBNs in time-dependent classification problems, experimental results are reported on semantic place recognition and daily-activity classification.
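A generic form of such a time-dependent classification expression, with additive smoothing applied to the propagated prior, might look like the following; the chapter's own notation may differ.

```latex
% Generic DBN classification recursion with additive smoothing on the prior;
% shown for illustration, not necessarily the chapter's exact expression.
P(C_t \mid A_{1:t}) \;=\; \frac{1}{\alpha}\, P(A_t \mid C_t)
    \sum_{C_{t-1}} P(C_t \mid C_{t-1})\, \tilde{P}(C_{t-1} \mid A_{1:t-1}),
\qquad
\tilde{P}(C_{t-1} \mid A_{1:t-1}) \;=\;
    \frac{P(C_{t-1} \mid A_{1:t-1}) + \epsilon}{1 + K\epsilon},
```

where $\alpha$ is a normalisation constant, $K$ the number of classes, and $\epsilon$ a small additive-smoothing constant that keeps the propagated priors away from zero.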
Social activity recognition based on probabilistic merging of skeleton features with proximity priors from RGB-D data
Social activity based on body motion is a key feature of non-verbal and physical behaviour, serving as a communicative signal in social interaction between individuals. Social activity recognition is important to study human-human communication and also human-robot interaction. Based on that, this research has three goals: (1) recognition of social behaviour (e.g. human-human interaction) using a probabilistic approach that merges spatio-temporal features from individual bodies and social features from the relationship between two individuals; (2) learning priors based on physical proximity between individuals during an interaction, using proxemics theory, to feed a probabilistic ensemble of activity classifiers; and (3) providing a public dataset with RGB-D data of social daily activities, including risk situations, useful to test approaches for assisted living, since this type of dataset is still missing. Results show that the proposed approach, designed to merge features with different semantics and proximity priors, improves classification performance in terms of precision, recall and accuracy when compared with other approaches that employ alternative strategies.
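As an illustration of a proximity prior derived from proxemics theory, the sketch below maps the distance between two people to one of Hall's proxemics zones and looks up a per-zone prior over social activities; the zone thresholds are the commonly cited values, while the prior table and activity labels are hypothetical placeholders rather than the learned priors of the paper.

```python
# Hypothetical P(activity | proxemics zone) table, for illustration only.
ZONE_PRIORS = {
    "intimate": {"hugging": 0.6, "handshake": 0.3, "talking": 0.1},
    "personal": {"hugging": 0.1, "handshake": 0.5, "talking": 0.4},
    "social":   {"hugging": 0.05, "handshake": 0.15, "talking": 0.8},
}

def proxemics_zone(distance_m):
    """Map the distance between two skeletons (e.g. torso joints) to a zone.
    Distances beyond 3.6 m would fall in Hall's public zone; they are grouped
    with the social zone here for simplicity."""
    if distance_m < 0.45:
        return "intimate"
    if distance_m < 1.2:
        return "personal"
    return "social"

prior = ZONE_PRIORS[proxemics_zone(0.8)]   # prior fed to the activity ensemble
print(prior)
```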
Automatic Detection of Human Interactions from RGB-D Data for Social Activity Classification
We present a system for the temporal detection of social interactions. Many works until now have succeeded in recognising activities from clipped videos in datasets, but for robotic applications it is important to be able to move to more realistic data. For this reason, the proposed approach temporally detects intervals where individual or social activity is occurring. Recognition of human activities is a key feature for analysing human behaviour. In particular, recognition of social activities is useful to trigger human-robot interactions or to detect situations of potential danger. Based on that, this research has three goals: (1) define a new set of descriptors able to characterise human interactions; (2) develop a computational model to segment temporal intervals with social interaction or individual behaviour; (3) provide a public dataset with RGB-D data containing continuous streams of individual activities and social interactions. Results show that the proposed approach attained relevant performance in the temporal segmentation of social activities.
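A minimal sketch of the temporal-detection idea, assuming a per-frame probability of social interaction is already available from a classifier: contiguous frames above a threshold are grouped into detected intervals. The threshold and minimum duration are illustrative parameters, not values from the paper.

```python
import numpy as np

def detect_intervals(p_interaction, threshold=0.5, min_len=10):
    """p_interaction: per-frame probability of social interaction.
    Returns a list of (start, end) frame indices of detected interaction intervals."""
    active = np.asarray(p_interaction) >= threshold
    intervals, start = [], None
    for t, flag in enumerate(active):
        if flag and start is None:
            start = t                          # interval opens
        elif not flag and start is not None:
            if t - start >= min_len:           # keep only sufficiently long intervals
                intervals.append((start, t))
            start = None                       # interval closes
    if start is not None and len(active) - start >= min_len:
        intervals.append((start, len(active)))
    return intervals
```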