Automatic affective dimension recognition from naturalistic facial expressions based on wavelet filtering and PLS regression
Continuous automatic recognition of affective dimensions from facial expressions in naturalistic contexts is a challenging research topic and an important one for human-computer interaction. This paper proposes an automatic recognition system that continuously predicts affective dimensions such as Arousal, Valence and Dominance in naturalistic facial expression videos. First, visual and vocal features are extracted from the image frames and audio segments of the videos. Second, a wavelet-transform-based digital filtering method is applied to remove irrelevant noise from the feature space. Third, Partial Least Squares regression is used to predict the affective dimensions from both the video and audio modalities. Finally, the two modalities are combined in a decision fusion step to boost overall performance. The proposed method is tested on the dataset of the fourth international Audio/Visual Emotion Recognition Challenge (AVEC2014) and compared with other state-of-the-art methods in the affect recognition sub-challenge, where it achieves good performance.
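The filtering and fusion steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the wavelet family, the thresholding rule, or the fusion weights, so the single-level Haar transform, the hard threshold, and the `video_weight` parameter below are all assumptions chosen for brevity, and the Partial Least Squares regression step is omitted.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_step, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def wavelet_smooth(signal, threshold):
    """Zero detail coefficients at or below the threshold, then reconstruct.

    Assumption: a hard threshold on one Haar level stands in for the
    unspecified wavelet filter of the abstract.
    """
    approx, detail = haar_step(signal)
    detail = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_inverse(approx, detail)

def fuse(video_pred, audio_pred, video_weight=0.5):
    """Decision-level fusion as a weighted average of per-frame predictions.

    Assumption: the weight is a hypothetical parameter, not a reported value.
    """
    return [video_weight * v + (1.0 - video_weight) * a
            for v, a in zip(video_pred, audio_pred)]
```

With a threshold of zero the signal is reconstructed unchanged (up to floating-point error); a large threshold replaces each pair of samples by its mean, which is the smoothing effect the filter relies on.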
Affect Recognition in Ads with Application to Computational Advertising
Advertisements (ads) often include strongly emotional content to leave a lasting impression on the viewer. This work (i) compiles an affective ad dataset capable of evoking coherent emotions across users, as determined from the affective opinions of five experts and 14 annotators; (ii) explores the efficacy of convolutional neural network (CNN) features for encoding emotions, and observes through extensive experimentation that CNN features outperform low-level audio-visual emotion descriptors; and (iii) demonstrates, based on a study involving 17 users, how enhanced affect prediction facilitates computational advertising and leads to a better viewing experience while watching an online video stream embedded with ads. We model ad emotions based on subjective human opinions as well as objective multimodal features, and show how effectively modeling ad emotions can positively impact a real-life application.
Comment: Accepted at the ACM International Conference on Multimedia (ACM MM) 201
Fusion of Learned Multi-Modal Representations and Dense Trajectories for Emotional Analysis in Videos
When designing a video affective content analysis algorithm, one of the most important steps is selecting discriminative features that represent video segments effectively. Most existing affective content analysis methods either use low-level audio-visual features or generate handcrafted higher-level representations from those low-level features. In this work we propose to use deep learning methods, in particular convolutional neural networks (CNNs), to automatically learn and extract mid-level representations from raw data. To this end, we exploit the audio and visual modalities of videos by employing Mel-Frequency Cepstral Coefficients (MFCC) and color values in the HSV color space. We also incorporate dense-trajectory-based motion features to further enhance the performance of the analysis. By means of multi-class support vector machines (SVMs) and fusion mechanisms, music video clips are classified into one of four affective categories representing the four quadrants of the Valence-Arousal (VA) space. Results obtained on a subset of the DEAP dataset show (1) that higher-level representations perform better than low-level features, and (2) that incorporating motion information leads to a notable performance gain, independently of the chosen representation.
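The four-quadrant labeling and the fusion of per-modality classifier outputs mentioned in this abstract might look like the sketch below. The abstract does not give the rating scale, the quadrant boundary, or the fusion rule, so the `midpoint` default of 5.0 (the center of an assumed 1-to-9 rating scale), the quadrant names, and the score-averaging rule are all hypothetical choices made for illustration.

```python
def va_quadrant(valence, arousal, midpoint=5.0):
    """Map a (valence, arousal) rating pair to one of four VA quadrants.

    Assumption: ratings share one scale and `midpoint` splits high from low.
    """
    high_v = valence >= midpoint
    high_a = arousal >= midpoint
    if high_a and high_v:
        return "HA-HV"  # high arousal, high valence
    if high_a:
        return "HA-LV"  # high arousal, low valence
    if high_v:
        return "LA-HV"  # low arousal, high valence
    return "LA-LV"      # low arousal, low valence

def fuse_scores(per_modality_scores):
    """Late fusion: average each class score over modalities, pick the best.

    `per_modality_scores` is a list of dicts mapping class label to score,
    one dict per modality (for example audio, visual, motion).
    """
    totals = {}
    for scores in per_modality_scores:
        for label, score in scores.items():
            totals[label] = totals.get(label, 0.0) + score
    n = len(per_modality_scores)
    averaged = {label: total / n for label, total in totals.items()}
    return max(averaged, key=averaged.get)
```

For example, `va_quadrant(7.2, 3.1)` falls in the low-arousal, high-valence quadrant under these assumptions, and `fuse_scores` lets a confident modality outvote an uncertain one.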