Spontaneous Subtle Expression Detection and Recognition based on Facial Strain
Optical strain is an extension of optical flow that is capable of quantifying
subtle changes on faces and representing the minute facial motion intensities
at the pixel level. This is computationally essential for the relatively new
field of spontaneous micro-expression research, where subtle expressions can be
technically challenging to pinpoint. In this paper, we present a novel method
for detecting and recognizing micro-expressions by utilizing facial optical
strain magnitudes to construct optical strain features and optical strain
weighted features. The two sets of features are then concatenated to form the
resultant feature histogram. Experiments were performed on the CASME II and
SMIC databases. We demonstrate on both databases the usefulness of optical
strain information and, more importantly, that our best approaches outperform
the original baseline results for both detection and recognition tasks. A
comparison of the proposed method with other existing spatio-temporal feature
extraction approaches is also presented.
Comment: 21 pages (including references), single column format, accepted to
Signal Processing: Image Communication journal
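For context, optical strain is the symmetric gradient of the optical flow field (u, v), eps = 0.5 * (grad(u) + grad(u)^T), with per-pixel magnitude sqrt(eps_xx^2 + eps_yy^2 + 2 * eps_xy^2). The sketch below illustrates that computation using OpenCV's Farneback flow; the flow estimator, its parameters, and the function name are illustrative assumptions, not the paper's exact pipeline.

    import cv2
    import numpy as np

    def optical_strain_magnitude(prev_gray, next_gray):
        """Estimate per-pixel optical strain magnitude between two frames."""
        # Dense optical flow (Farneback here; the paper's flow method may differ).
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, next_gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        u, v = flow[..., 0], flow[..., 1]

        # Spatial derivatives of the flow components (rows = y, cols = x).
        du_dy, du_dx = np.gradient(u)
        dv_dy, dv_dx = np.gradient(v)

        # Strain tensor components from the symmetric flow gradient.
        e_xx = du_dx
        e_yy = dv_dy
        e_xy = 0.5 * (du_dy + dv_dx)

        # Magnitude: sqrt(e_xx^2 + e_yy^2 + e_xy^2 + e_yx^2), with e_xy = e_yx.
        return np.sqrt(e_xx**2 + e_yy**2 + 2.0 * e_xy**2)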
Tune in to your emotions: a robust personalized affective music player
The emotional power of music is exploited in a personalized affective music player (AMP) that selects music for mood enhancement. A biosignal approach is used to measure listeners’ personal emotional reactions to their own music as input for affective user models. Regression and kernel density estimation are applied to model the physiological changes the music elicits. Using these models, personalized music selections based on an affective goal state can be made. The AMP was validated in real-world trials over the course of several weeks. Results show that our models can cope with noisy situations and handle large inter-individual differences in the music domain. The AMP augments music listening by enabling automated affect guidance. Our approach provides valuable insights for affective computing and user modeling, for which the AMP is a suitable carrier application.
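As an illustration of the kernel-density modeling step described above, the sketch below fits a density estimate over a single physiological response feature per song and scores candidate songs against an affective goal state. The data, the single-feature setup, and all names are hypothetical simplifications, not the authors' implementation.

    import numpy as np
    from scipy.stats import gaussian_kde

    # Hypothetical per-song physiological responses (e.g., normalized skin
    # conductance change), gathered over repeated listens for one listener.
    responses = {
        "song_a": np.array([0.20, 0.35, 0.30, 0.28]),
        "song_b": np.array([-0.10, 0.05, -0.20, 0.00]),
    }

    def song_scores(goal_state, responses):
        """Score each song by the estimated density of its physiological
        response at the desired affective goal state."""
        scores = {}
        for song, samples in responses.items():
            kde = gaussian_kde(samples)        # model the response distribution
            scores[song] = kde(goal_state)[0]  # likelihood of eliciting the goal
        return scores

    goal = 0.3  # hypothetical target arousal change
    scores = song_scores(goal, responses)
    best_song = max(scores, key=scores.get)  # pick the most promising track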
Collective efficacy belief, within-group agreement, and performance quality among instrumental chamber ensembles
We examined collective efficacy beliefs, including levels of within-group agreement and correlation with performance quality, of instrumental chamber ensembles (70 musicians, representing 18 ensembles). Participants were drawn from collegiate programs and intensive summer music festivals located in the Northwestern and Western regions of the United States. Individuals completed a 5-item survey gauging confidence in their group’s performance abilities; each ensemble’s aggregated results represented its collective efficacy score. Ensembles provided a video-recorded performance excerpt that was rated by a panel of four string specialists. Analyses revealed moderately strong levels of collective efficacy belief and uniformly high within-group agreement. There was a significant, moderately strong correlation between collective efficacy belief and within-group agreement (r_s = .67, p < .01). We found no relationship between collective efficacy belief and performance quality across the total sample, but those factors correlated significantly for festival-based ensembles (r_s = .82, p < .05). Reliability estimates suggest that our collective efficacy survey may be suitable for use with string chamber ensembles. Correlational findings provide partial support for the theorized link between efficacy belief and performance quality in chamber music settings, suggesting the importance for music educators to ensure that positive efficacy beliefs become well founded through quality instruction.
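For readers who want to reproduce this kind of analysis, the sketch below computes a Spearman rank correlation between per-ensemble efficacy scores and within-group agreement indices using SciPy; the numbers are hypothetical stand-ins, not the study's data.

    import numpy as np
    from scipy.stats import spearmanr

    # Hypothetical per-ensemble values: mean collective efficacy score and a
    # within-group agreement index for each ensemble.
    efficacy = np.array([4.1, 3.8, 4.5, 3.2, 4.0, 4.4, 3.6, 4.2])
    agreement = np.array([0.92, 0.85, 0.95, 0.70, 0.88, 0.93, 0.80, 0.90])

    r_s, p_value = spearmanr(efficacy, agreement)
    print(f"r_s = {r_s:.2f}, p = {p_value:.3f}")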
EmoNets: Multimodal deep learning approaches for emotion recognition in video
The task of the Emotion Recognition in the Wild (EmotiW) Challenge is to
assign one of seven emotions to short video clips extracted from
Hollywood-style movies. The videos depict acted-out emotions under realistic conditions
with a large degree of variation in attributes such as pose and illumination,
making it worthwhile to explore approaches which consider combinations of
features from multiple modalities for label assignment. In this paper we
present our approach to learning several specialist models using deep learning
techniques, each focusing on one modality. Among these are a convolutional
neural network, focusing on capturing visual information in detected faces, a
deep belief net focusing on the representation of the audio stream, a
K-Means-based "bag-of-mouths" model, which extracts visual features around the
mouth region, and a relational autoencoder, which addresses spatio-temporal aspects of
videos. We explore multiple methods for the combination of cues from these
modalities into one common classifier. This achieves a considerably greater
accuracy than predictions from our strongest single-modality classifier. Our
method was the winning submission in the 2013 EmotiW challenge and achieved a
test set accuracy of 47.67% on the 2014 dataset.
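To illustrate the combination step in general terms, the sketch below fuses per-modality class probabilities with a weighted average; the paper explores multiple combination methods, so the weights, shapes, and function here are illustrative assumptions rather than the authors' exact approach.

    import numpy as np

    def weighted_late_fusion(modality_probs, weights):
        """Combine per-modality class probabilities into one prediction.

        modality_probs: list of (n_clips, n_classes) arrays, one per modality
        (e.g., face CNN, audio deep belief net, bag-of-mouths).
        weights: one non-negative weight per modality, e.g. tuned on validation.
        """
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()  # normalize so outputs stay valid
        fused = sum(w * p for w, p in zip(weights, modality_probs))
        return fused.argmax(axis=1)  # predicted emotion label per clip

    # Hypothetical predictions: 2 clips, 7 emotion classes, 3 modalities.
    rng = np.random.default_rng(0)
    probs = [rng.dirichlet(np.ones(7), size=2) for _ in range(3)]
    labels = weighted_late_fusion(probs, weights=[0.5, 0.3, 0.2])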