
4 research outputs found

How Well Can a Music Emotion Recognition System Predict the Emotional Responses of Participants?

Author: Simon Dixon
Yading Song

(Abstract to follow)

ZENODO

Creating positive atmosphere and emotion in an office-like environment: A methodology for the lit environment

Author: Kim DH
Mansfield K
Publication date: 01/05/2021

This study investigated whether positive human emotion can be set as a goal through the lighting design process. The study first used a model of emotion, the circumplex model of affect, to characterise four different emotion states (liveliness, relaxation, tension and gloom). Second, five professional lighting designers were recruited and asked to devise concepts for lively and relaxing lit workspace environments. A total of fifteen lighting scenarios intended to explore the four emotion states were configured, and their emotional effect was investigated through a controlled experiment using a self-reported questionnaire with 42 participants (within-subject design). The results indicate that positive emotions of liveliness can be cued under two lighting settings and those of relaxation under three lighting settings of varying colour temperatures and light distribution. There was also a promising link between perceived atmosphere and human emotion, indicating that atmosphere could be a predictor for human emotion.

UCL Discovery

Music Emotion Detection Using Hierarchical Sparse Kernel Machines

Author: Chang-Hong Lin
Ernestasia Siahaan
Jia-Ching Wang
Yu-Hao Chin
Publication venue: Hindawi Limited
Publication date: 01/01/2014

For music emotion detection, this paper presents a music emotion verification system based on hierarchical sparse kernel machines. With the proposed system, we intend to verify whether a music clip possesses happiness emotion or not. There are two levels in the hierarchical sparse kernel machines. In the first level, a set of acoustical features is extracted, and principal component analysis (PCA) is applied to reduce the dimension. The acoustical features are used to generate the first-level decision vector, which is a vector with each element being a significant value of an emotion. The significant values of eight main emotional classes are used in this paper. To calculate the significant value of an emotion, we construct its 2-class SVM with calm emotion as the global (non-target) side of the SVM. The probability distributions of the adopted acoustical features are calculated, and the probability product kernel is applied in the first-level SVMs to obtain the first-level decision vector feature. In the second level of the hierarchical system, we merely construct a 2-class relevance vector machine (RVM) with happiness as the target side and other emotions as the background side of the RVM. The first-level decision vector is used as the feature with a conventional radial basis function kernel. The happiness verification threshold is built on the probability value. In the experimental results, the detection error tradeoff (DET) curve shows that the proposed system performs well at verifying whether a music clip reveals happiness emotion.
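As a rough illustration of the probability product kernel mentioned in this abstract, the sketch below computes its Bhattacharyya special case (exponent 1/2) between two clips. It rests on assumptions that do not come from the abstract: each clip's frame-level features are summarised by a diagonal Gaussian, and the function names `fit_diag_gaussian` and `bhattacharyya_kernel` are hypothetical. The paper may use a different density model or exponent.

```python
import math


def fit_diag_gaussian(frames):
    """Summarise a clip as per-dimension (means, variances).

    Assumption: a diagonal Gaussian is an adequate density model.
    `frames` is a list of equal-length feature vectors.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    # Floor the variance so constant features do not cause division by zero.
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def bhattacharyya_kernel(p, q):
    """Probability product kernel with exponent 1/2 for diagonal Gaussians.

    K(p, q) is the integral of sqrt(p(x) * q(x)), which factorises over
    dimensions when both covariances are diagonal.
    """
    (mu_p, var_p), (mu_q, var_q) = p, q
    k = 1.0
    for m1, v1, m2, v2 in zip(mu_p, var_p, mu_q, var_q):
        s = v1 + v2
        scale = math.sqrt(2.0 * math.sqrt(v1 * v2) / s)
        k *= scale * math.exp(-((m1 - m2) ** 2) / (4.0 * s))
    return k


# Toy frame-level features for two hypothetical clips.
clip_a = [[0.0, 1.0], [1.0, 3.0], [2.0, 2.0]]
clip_b = [[0.5, 1.5], [1.5, 3.5], [2.5, 2.5]]
similarity = bhattacharyya_kernel(fit_diag_gaussian(clip_a), fit_diag_gaussian(clip_b))
```

The kernel equals 1 for identical distributions and shrinks towards 0 as the distributions move apart, so a matrix of such values could be passed to an SVM that accepts precomputed kernels.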

Crossref

Directory of Open Access Journals

PubMed Central

Improving the Generalizability of Speech Emotion Recognition: Methods for Handling Data and Label Variability

Author: Zhang Biqiao
Publication date: 01/01/2018

Emotion is an essential component in our interaction with others. It transmits information that helps us interpret the content of what others say. Therefore, detecting emotion from speech is an important step towards enabling machine understanding of human behaviors and intentions. Researchers have demonstrated the potential of emotion recognition in areas such as interactive systems in smart homes and mobile devices, computer games, and computational medical assistants. However, emotion communication is variable: individuals may express emotion in a manner that is uniquely their own; different speech content and environments may shape how emotion is expressed and recorded; individuals may perceive emotional messages differently. Practically, this variability is reflected in both the audio-visual data and the labels used to create speech emotion recognition (SER) systems. SER systems must be robust and generalizable to handle the variability effectively. The focus of this dissertation is on the development of speech emotion recognition systems that handle variability in emotion communications. We break the dissertation into three parts, according to the type of variability we address: (I) in the data, (II) in the labels, and (III) in both the data and the labels. Part I: The first part of this dissertation focuses on handling variability present in data. We approximate variations in environmental properties and expression styles by corpus and gender of the speakers. We find that training on multiple corpora and controlling for the variability in gender and corpus using multi-task learning result in more generalizable models, compared to the traditional single-task models that do not take corpus and gender variability into account. Another source of variability present in the recordings used in SER is the phonetic modulation of acoustics. On the other hand, phonemes also provide information about the emotion expressed in speech content. 
We discover that we can make more accurate predictions of emotion by explicitly considering both roles of phonemes. Part II: The second part of this dissertation addresses variability present in emotion labels, including the differences between emotion expression and perception, and the variations in emotion perception. We discover that it is beneficial to jointly model both the perception of others and how one perceives one’s own expression, compared to focusing on either one. Further, we show that the variability in emotion perception is a modelable signal and can be captured using probability distributions that describe how groups of evaluators perceive emotional messages. Part III: The last part of this dissertation presents methods that handle variability in both data and labels. We reduce the data variability due to non-emotional factors using deep metric learning and model the variability in emotion perception using soft labels. We propose a family of loss functions and show that by pairing examples that potentially vary in expression styles and lexical content and preserving the real-valued emotional similarity between them, we develop systems that generalize better across datasets and are more robust to over-training. These works demonstrate the importance of considering data and label variability in the creation of robust and generalizable emotion recognition systems. We conclude this dissertation with the following future directions: (1) the development of real-time SER systems; (2) the personalization of general SER systems.

PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies

https://deepblue.lib.umich.edu/bitstream/2027.42/147639/1/didizbq_1.pd
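The idea of representing evaluator disagreement as a probability distribution (a soft label) can be sketched as follows. This is only an illustration under stated assumptions: the label set `EMOTIONS`, the helper names `soft_label` and `soft_cross_entropy`, and the use of plain cross-entropy as the training loss are hypothetical and are not taken from the dissertation, which proposes its own family of loss functions.

```python
import math
from collections import Counter

# Hypothetical categorical label set; the dissertation's labels may differ.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def soft_label(votes, smoothing=0.0):
    """Convert a list of evaluator votes into a distribution over EMOTIONS.

    `smoothing` adds the same pseudo-count to every class (an assumption,
    useful when only a few evaluators rated the utterance).
    """
    counts = Counter(votes)
    total = len(votes) + smoothing * len(EMOTIONS)
    return [(counts[e] + smoothing) / total for e in EMOTIONS]


def soft_cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a soft target and a predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


# Five hypothetical evaluators rate the same utterance.
votes = ["happy", "happy", "neutral", "happy", "sad"]
target = soft_label(votes)

# A prediction that spreads mass like the evaluators is penalised less
# than an over-confident one-hot prediction of the majority class.
calibrated_loss = soft_cross_entropy(target, [0.05, 0.55, 0.2, 0.2])
one_hot_loss = soft_cross_entropy(target, [0.0, 1.0, 0.0, 0.0])
```

With a majority-vote hard label, the two minority votes would be discarded; the soft target keeps them, so a model is rewarded for reproducing the spread of perceptions rather than only the most common one.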

Deep Blue Documents at the University of Michigan