17,253 research outputs found

Exploring Language-Independent Emotional Acoustic Features via Feature Selection

Author: Chen Ke
Shaukat Arslan
Publication date: 08/08/2010

We propose a novel feature selection strategy to discover language-independent acoustic features that tend to be responsible for emotions regardless of languages, linguistics and other factors. Experimental results suggest that the language-independent feature subset discovered yields performance comparable to the full feature set on various emotional speech corpora. Comment: 15 pages, 2 figures, 6 tables

arXiv.org e-Print Archive

The University of Manchester - Institutional Repository

Speech-based recognition of self-reported and observed emotion in a dimensional space

Author: Jong Franciska M.G. de
Leeuwen David A. van
Truong Khiet P.
Publication venue: Elsevier
Publication date: 01/01/2012

The differences between self-reported and observed emotion have only marginally been investigated in the context of speech-based automatic emotion recognition. We address this issue by comparing self-reported emotion ratings to observed emotion ratings and look at how differences between these two types of ratings affect the development and performance of automatic emotion recognizers developed with these ratings. A dimensional approach to emotion modeling is adopted: the ratings are based on continuous arousal and valence scales. We describe the TNO-Gaming Corpus that contains spontaneous vocal and facial expressions elicited via a multiplayer videogame and that includes emotion annotations obtained via self-report and observation by outside observers. Comparisons show that there are discrepancies between self-reported and observed emotion ratings which are also reflected in the performance of the emotion recognizers developed. Using Support Vector Regression in combination with acoustic and textual features, recognizers of arousal and valence are developed that can predict points in a 2-dimensional arousal-valence space. The results of these recognizers show that the self-reported emotion is much harder to recognize than the observed emotion, and that averaging ratings from multiple observers improves performance.

Crossref

Radboud Repository

University of Twente Research Information

Automatic Emotion Recognition from Mandarin Speech

Author: Gu Yu
Publication venue: [s.n.]
Publication date: 01/01/2018

Tilburg University Repository

The Emotion Probe: On the Universality of Cross-Linguistic and Cross-Gender Speech Emotion Recognition via Machine Learning

Author: Daniele Casali
Emilia Parada-Cabaleiro
Giovanni Costantini
Valerio Cesarini
Publication venue: country:CH
Publication date: 01/03/2022

Machine Learning (ML) algorithms within a human–computer framework are the leading force in speech emotion recognition (SER). However, few studies explore cross-corpora aspects of SER; this work aims to explore the feasibility and characteristics of a cross-linguistic, cross-gender SER. Three ML classifiers (SVM, Naïve Bayes and MLP) are applied to acoustic features, obtained through a procedure based on Kononenko’s discretization and correlation-based feature selection. The system encompasses five emotions (disgust, fear, happiness, anger and sadness), using the Emofilm database, consisting of short clips of English movies and the respective Italian and Spanish dubbed versions, for a total of 1115 annotated utterances. The results show MLP to be the most effective classifier, with accuracies higher than 90% for single-language approaches, while the cross-language classifier still yields accuracies higher than 80%. The results show cross-gender tasks to be more difficult than those involving two languages, suggesting greater differences between emotions expressed by male versus female subjects than between different languages. Four feature domains, namely, RASTA, F0, MFCC and spectral energy, are algorithmically assessed as the most effective, refining existing literature and approaches based on standard sets. To our knowledge, this is one of the first studies encompassing cross-gender and cross-linguistic assessments on SER.
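The correlation-based feature selection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the standard CFS merit heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), uses Pearson correlation in place of a measure computed on discretized features, and replaces any more elaborate search with a plain greedy forward search. The function names (`pearson`, `cfs_merit`, `greedy_cfs`), the feature names and the toy values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def cfs_merit(subset, features, labels):
    """CFS merit: rewards feature-label correlation, penalizes feature-feature redundancy."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], labels)) for f in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(features, labels):
    """Greedy forward search: keep adding the best feature while the merit improves."""
    selected, best = [], 0.0
    remaining = list(features)
    while remaining:
        candidate = max(remaining, key=lambda f: cfs_merit(selected + [f], features, labels))
        merit = cfs_merit(selected + [candidate], features, labels)
        if merit <= best:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = merit
    return selected, best


# Invented toy data (not from the Emofilm database): 8 utterances, binary emotion label.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "f0_range":     [0, 0, 0, 1, 0, 1, 1, 1],  # informative
    "mfcc_1":       [0, 0, 1, 0, 1, 0, 1, 1],  # informative, uncorrelated with f0_range
    "f0_range_dup": [0, 0, 0, 1, 0, 1, 1, 0],  # largely redundant with f0_range
    "noise":        [1, 0, 0, 1, 1, 0, 0, 1],  # unrelated to the label
}

selected, best_merit = greedy_cfs(toy_features, toy_labels)
print(selected, round(best_merit, 4))
```

On this toy data the search keeps the two complementary features and stops before adding the redundant or the noisy one, which is the behaviour the merit heuristic is designed to produce.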

Directory of Open Access Journals

PubMed Central

ART

Current Challenges and Visions in Music Recommender Systems Research

Author: Chen Ching-Wei
Deldjoo Yashar
Elahi Mehdi
Schedl Markus
Zamani Hamed
Publication venue: Springer Science and Business Media LLC
Publication date: 21/03/2018

Music recommender systems (MRS) have experienced a boom in recent years, thanks to the emergence and success of online streaming services, which nowadays make available almost all music in the world at the user's fingertips. While today's MRS considerably help users to find interesting music in these huge catalogs, MRS research is still facing substantial challenges. In particular, when it comes to building, incorporating, and evaluating recommendation strategies that integrate information beyond simple user-item interactions or content-based descriptors, and dig deep into the very essence of listener needs, preferences, and intentions, MRS research becomes a big endeavor and related publications are quite sparse. The purpose of this trends and survey article is twofold. We first identify and shed light on what we believe are the most pressing challenges MRS research is facing, from both academic and industry perspectives. We review the state of the art towards solving these challenges and discuss its limitations. Second, we detail possible future directions and visions we contemplate for the further evolution of the field. The article should therefore serve two purposes: giving the interested reader an overview of current challenges in MRS research and providing guidance for young researchers by identifying interesting, yet under-researched, directions in the field.

arXiv.org e-Print Archive

JKU | ePub

Auditory communication in domestic dogs: vocal signalling in the extended social environment of a companion animal

Author: Adachi
Archer
Ashdown
Aubergé
August
Bachorowski
Bahrick
Banse
Baru
Bekoff
Bloom
Bradshaw
Brady
Bryant
Burnham
Buttelmann
Cain
Chuenwattanapranithi
Cohen
Coleman
Coppinger
Corbett
Coren
Custance
Deaux
Düpjan
Estes
Fant
Faragó
Feddersen-Petersen
Fernald
Fitch
Fox
Frank
Frynta
Fukuzawa
Ghazanfar
Gisiner
Gittleman
Griebel
Hare
Harrington
Hauser
Herman
Herrel
Hillis
Hirsh-Pasek
Jaeger
Joslin
Kaminski
Koler-Matznick
Landau
Leaver
Lieberman
MacNulty
Markman
Marós
Mazzini
McComb
McConnell
Merola
Miklósi
Miles
Mills
Moehlman
Molnár
Morton
Ohala
Owings
Owren
Pepperberg
Peters
Pilley
Piérard
Plotsky
Pongrácz
Prato-Previde
Price
Proops
Puts
Ramos
Reby
Rendall
Riede
Robbins
Rutter
Ryalls
Sauter
Savage-Rumbaugh
Schassburger
Scheider
Schmidt-Nielsen
Shamir
Sillero-Zubiri
Smith
Taylor
Theberge
Titze
Van der Zee
Vitulli
Volodin
Wayne
Yin
Zaccaroni
Zuberbuhler
Publication venue: Elsevier BV
Publication date: 01/01/2014

Domestic dogs produce a range of vocalisations, including barks, growls, and whimpers, which are shared with other canid species. The source–filter model of vocal production can be used as a theoretical and applied framework to explain how and why the acoustic properties of some vocalisations are constrained by physical characteristics of the caller, whereas others are more dynamic, influenced by transient states such as arousal or motivation. This chapter thus reviews how and why particular call types are produced to transmit specific types of information, and how such information may be perceived by receivers. As domestication is thought to have caused a divergence in the vocal behaviour of dogs as compared to the ancestral wolf, evidence of both dog–human and human–dog communication is considered. Overall, it is clear that domestic dogs have the potential to acoustically broadcast a range of information, which is available to conspecific and human receivers. Moreover, dogs are highly attentive to human speech and are able to extract speaker identity, emotional state, and even some types of semantic information.

Crossref

Sussex Research Online

Computer audition for emotional wellbeing

Author: Baird Alice
Publication date: 06/02/2023

This thesis is focused on the application of computer audition (i.e., machine listening) methodologies for monitoring states of emotional wellbeing. Computer audition is a growing field and has been successfully applied to an array of use cases in recent years. There are several advantages to audio-based computational analysis; for example, audio can be recorded non-invasively, stored economically, and can capture rich information on happenings in a given environment, e.g., human behaviour. With this in mind, maintaining emotional wellbeing is a challenge for humans and emotion-altering conditions, including stress and anxiety, have become increasingly common in recent years. Such conditions manifest in the body, inherently changing how we express ourselves. Research shows these alterations are perceivable within vocalisation, suggesting that speech-based audio monitoring may be valuable for developing artificially intelligent systems that target improved wellbeing. Furthermore, computer audition applies machine learning and other computational techniques to audio understanding, and so by combining computer audition with applications in the domain of computational paralinguistics and emotional wellbeing, this research concerns the broader field of empathy for Artificial Intelligence (AI). To this end, speech-based audio modelling that incorporates and understands paralinguistic wellbeing-related states may be a vital cornerstone for improving the degree of empathy that an artificial intelligence has. To summarise, this thesis investigates the extent to which speech-based computer audition methodologies can be utilised to understand human emotional wellbeing. A fundamental background on the fields in question as they pertain to emotional wellbeing is first presented, followed by an outline of the applied audio-based methodologies.
Next, detail is provided for several machine learning experiments focused on emotional wellbeing applications, including analysis and recognition of under-researched phenomena in speech, e.g., anxiety, and markers of stress. Core contributions from this thesis include the collection of several related datasets, hybrid fusion strategies for an emotional gold standard, novel machine learning strategies for data interpretation, and an in-depth acoustic-based computational evaluation of several human states. All of these contributions focus on ascertaining the advantage of audio in the context of modelling emotional wellbeing. Given the sensitive nature of human wellbeing, the ethical implications involved with developing and applying such systems are discussed throughout.

OPUS Augsburg