Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons
The temporal structure of music is essential for the cognitive processes related to the emotions expressed in music. However, such temporal information is often disregarded in typical Music Information Retrieval (MIR) modeling tasks that predict higher-level cognitive or semantic aspects of music such as emotion, genre, and similarity. This paper addresses the hypothesis that temporal information is essential for predicting expressed emotions in music, as a prototypical example of a cognitive aspect of music. We propose to test this hypothesis using a novel processing pipeline: 1) extracting audio features for each track, resulting in a multivariate "feature time series"; 2) using generative models to represent these time series (acquiring a complete track representation), specifically Gaussian Mixture models, Vector Quantization, Autoregressive models, and Markov and Hidden Markov models; 3) utilizing the generative models in a discriminative setting by selecting the Probability Product Kernel as the natural kernel for all considered track representations.
We evaluate the representations using a kernel-based model specifically extended to support the robust two-alternative forced-choice self-report paradigm used for eliciting expressed emotions in music. The methods are evaluated on two data sets and show increased predictive performance when temporal information is used, supporting the overall hypothesis.
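For the Gaussian representations in step 2, the Probability Product Kernel of step 3 has a closed form. The sketch below (a minimal NumPy illustration, not the paper's code) implements the ρ = 1 special case, the "expected likelihood" kernel K(p, q) = ∫ p(x) q(x) dx, which for two Gaussians equals the density N(μ_p; μ_q, Σ_p + Σ_q):

```python
import numpy as np

def ppk_gaussian(mu_p, cov_p, mu_q, cov_q):
    """Probability Product Kernel with rho = 1 ("expected likelihood" kernel)
    between two multivariate Gaussians: the integral of p(x) * q(x),
    which equals the Gaussian density N(mu_p; mu_q, cov_p + cov_q)."""
    d = mu_p.shape[0]
    cov = cov_p + cov_q
    diff = mu_p - mu_q
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

# Two diagonal Gaussians standing in for two track representations
k = ppk_gaussian(np.zeros(2), np.eye(2), np.ones(2), np.eye(2))
```

Because the kernel is symmetric and positive semi-definite, the resulting Gram matrix can be plugged directly into any kernel machine for the discriminative step.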
Ratings or pairwise comparisons? An experimental study on scale usability
A series of experiments was run to evaluate the usability of two measurement approaches: ratings and ranks (pairwise comparisons). Respondents were asked to assess perceived characteristics (i.e., height and length) of different physical objects using either a rating or a ranking scale. An artificial neural network model was built to analyse the ranks, and standard statistical tests were applied to analyse the ratings. The results were then statistically compared with the actual (real) characteristics of the objects (i.e., their real height and length). Both measurement systems were found equally valid in projecting reality. These findings offer methodological and epistemological insights, as they provide information on the measurement power of each scale in terms of approximating real-life phenomena.
Revealing spatio-spectral electroencephalographic dynamics of musical mode and tempo perception by independent component analysis.
Background: Music conveys emotion by manipulating musical structures, particularly musical mode and tempo. The neural correlates of musical mode and tempo perception revealed by electroencephalography (EEG) have not been adequately addressed in the literature. Method: This study used independent component analysis (ICA) to systematically assess the spatio-spectral EEG dynamics associated with changes of musical mode and tempo. Results: Empirical results showed that, compared to minor-mode music, major-mode music augmented delta-band activity over the right sensorimotor cortex, suppressed theta activity over the superior parietal cortex, and moderately suppressed beta activity over the medial frontal cortex, whereas fast-tempo music engaged significant alpha suppression over the right sensorimotor cortex. Conclusion: The resultant EEG brain sources were comparable with those of previous studies obtained with other neuroimaging modalities, such as functional magnetic resonance imaging (fMRI) and positron emission tomography (PET). In conjunction with advanced dry and mobile EEG technology, these results might facilitate the translation from laboratory-oriented research to real-life applications for music therapy, training, and entertainment in naturalistic environments.
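The core ICA step, unmixing statistically independent sources from channel mixtures, can be illustrated on synthetic signals (the sources, mixing matrix, and use of scikit-learn's FastICA below are illustrative assumptions, not the study's EEG pipeline):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
# Two synthetic sources standing in for independent brain processes
s1 = np.sin(2 * np.pi * 1.0 * t)          # slow oscillation
s2 = 2 * (t * 3 - np.floor(t * 3 + 0.5))  # sawtooth wave
S = np.c_[s1, s2]

A = np.array([[1.0, 0.5],                 # made-up mixing matrix: each
              [0.4, 1.0]])                # "channel" sums both sources
X = S @ A.T + 0.01 * rng.normal(size=(2000, 2))  # observed channel mixtures

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)              # recovered sources (up to sign/scale/order)
```

Each recovered component correlates strongly with exactly one true source; in the EEG setting, per-component spectral analysis then yields the band-specific effects reported above.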
Ratings are overrated!
Are ratings of any use in human–computer interaction and user studies at large? If ratings are of limited use, is there a better alternative for quantitative subjective assessment? Beyond the intrinsic shortcomings of human reporting, there are a number of supplementary limitations and fundamental methodological flaws associated with rating-based questionnaires, i.e., questionnaires that ask participants to rate their level of agreement with a given statement, such as a Likert item. While the effect of these pitfalls has been largely downplayed, recent findings from diverse areas of study question the reliability of using ratings. Rank-based questionnaires, i.e., questionnaires that ask participants to rank two or more options, appear as the evident alternative that not only eliminates the core limitations of ratings but also simplifies the use of sound methodologies that yield more reliable models of the underlying reported construct: user emotion, preference, or opinion. This paper solicits recent findings from various disciplines interlinked with psychometrics and offers a quick guide for the use, processing, and analysis of rank-based questionnaires for the unique advantages they offer. The paper challenges the traditional state of practice in human–computer interaction and psychometrics, directly contributing toward a paradigm shift in subjective reporting.
Grounding truth via ordinal annotation
The question of how to best annotate affect within available content has been a milestone challenge for affective computing. Appropriate methods and tools addressing that question can provide better estimations of the ground truth which, in turn, may lead to more efficient affect detection and more reliable models of affect. This paper introduces a rank-based real-time annotation tool, which we name AffectRank, and compares it against the popular rating-based real-time FeelTrace tool through a proof-of-concept video annotation experiment. The results suggest that the proposed rank-based (ordinal) annotation approach yields significantly higher inter-rater reliability and, thereby, a better approximation of the underlying ground truth. The key findings of the paper demonstrate that the current dominant practice in continuous affect annotation via rating-based labeling is detrimental to advancements in the field of affective computing.
The authors would like to thank all annotators who participated in the reported experiments. We would also like to thank Gary Hili and Ryan Abela for providing access to the Eryi dataset. The work is supported, in part, by the EU-funded FP7 ICT iLearnRW project (project no: 318803).
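Inter-rater reliability for ordinal annotation traces is commonly summarized with a rank correlation. A minimal sketch (the annotation values below are hypothetical, not data from the AffectRank experiment) using SciPy's Kendall's tau, which handles tied ordinal labels:

```python
import numpy as np
from scipy.stats import kendalltau

# Hypothetical ordinal traces from two annotators: per-window arousal
# change labels (-1 = decrease, 0 = stable, +1 = increase)
ann_a = np.array([1, 1, 0, -1, -1, 0, 1, 0, -1, 1])
ann_b = np.array([1, 0, 0, -1, -1, 0, 1, 1, -1, 1])

tau, p_value = kendalltau(ann_a, ann_b)  # tau-b: rank correlation with tie correction
```

Values of tau near 1 indicate that the two annotators order the windows consistently, which is the sense of "inter-rater reliability" relevant to ordinal labels.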
The ordinal nature of emotions
Representing everyday emotional states computationally is a challenging task and, arguably, one of the most fundamental for affective computing. Standard practice in emotion annotation is to ask humans to assign an absolute value of intensity to each emotional behavior they observe. Psychological theories and evidence from multiple disciplines, including neuroscience, economics, and artificial intelligence, however, suggest that the task of assigning reference-based (relative) values to subjective notions is better aligned with the underlying representations than assigning absolute values. Evidence also shows that we use reference points, or anchors, against which we evaluate values such as the emotional state of a stimulus, suggesting again that ordinal labels are a more suitable way to represent emotions. This paper draws together the theoretical reasons to favor relative over absolute labels for representing and annotating emotion, reviewing the literature across several disciplines. We go on to discuss good and bad practices for treating ordinal and other forms of annotation data, and make the case for preference learning methods as the appropriate approach for treating ordinal labels. We finally discuss the advantages of relative annotation with respect to both reliability and validity through a number of case studies in affective computing, and address common objections to the use of ordinal data. Overall, the thesis that emotions are by nature relative is supported by both theoretical arguments and evidence, and opens new horizons for the way emotions are viewed, represented, and analyzed computationally.
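A standard preference learning technique for ordinal labels is to reduce pairwise comparisons to binary classification over difference vectors (the RankSVM idea). The sketch below is a toy illustration under made-up features and a made-up latent intensity, not a method from the paper:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
# Hypothetical feature vectors for 60 stimuli; a hidden weight vector
# defines a latent intensity observed only through pairwise comparisons.
X = rng.normal(size=(60, 4))
w_true = np.array([1.5, -2.0, 0.5, 1.0])
latent = X @ w_true

# Simulated pairwise labels: y = +1 if the first item of the pair is preferred
pairs = rng.integers(0, 60, size=(300, 2))
pairs = pairs[pairs[:, 0] != pairs[:, 1]]
diff = X[pairs[:, 0]] - X[pairs[:, 1]]
y = np.sign(latent[pairs[:, 0]] - latent[pairs[:, 1]])

# Learning a linear ranking function reduces to classifying difference vectors
clf = LinearSVC(C=1.0).fit(diff, y)
w_hat = clf.coef_.ravel()
```

The learned direction `w_hat` orders unseen stimuli by predicted intensity without ever committing to absolute intensity values, which is exactly what relative annotation provides.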
Affective Music Information Retrieval
Much of the appeal of music lies in its power to convey emotions/moods and to evoke them in listeners. In consequence, the past decade witnessed a growing interest in modeling emotions from musical signals in the music information retrieval (MIR) community. In this article, we present a novel generative approach to music emotion modeling, with a specific focus on the valence-arousal (VA) dimensional model of emotion. The presented generative model, called acoustic emotion Gaussians (AEG), better accounts for the subjectivity of emotion perception through the use of probability distributions. Specifically, it learns from the emotion annotations of multiple subjects a Gaussian mixture model in the VA space with prior constraints on the corresponding acoustic features of the training music pieces. Such a computational framework is technically sound, capable of learning in an online fashion, and thus applicable to a variety of applications, including user-independent (general) and user-dependent (personalized) emotion recognition and emotion-based music retrieval. We report evaluations of the aforementioned applications of AEG on a large-scale emotion-annotated corpus, AMG1608, to demonstrate the effectiveness of AEG and to showcase how evaluations are conducted for research on emotion-based MIR. Directions for future work are also discussed.
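The core idea of fitting a Gaussian mixture to multiple subjects' VA annotations can be sketched as follows. This is a plain scikit-learn GMM on synthetic annotations, without AEG's acoustic-feature priors; all data values are made up for illustration:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Made-up valence-arousal annotations for one clip from many subjects:
# one group hears it as happy (high valence), another as tense (low valence)
happy = rng.normal(loc=[0.6, 0.5], scale=0.1, size=(100, 2))
tense = rng.normal(loc=[-0.4, 0.6], scale=0.1, size=(100, 2))
va = np.vstack([happy, tense])

gmm = GaussianMixture(n_components=2, random_state=0).fit(va)
means = gmm.means_[np.argsort(gmm.means_[:, 0])]  # components sorted by valence
```

A mixture rather than a single mean captures exactly the subjectivity the abstract emphasizes: disagreeing subjects produce multiple modes in VA space instead of being averaged away.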