7 research outputs found

    Modelling emotional valence and arousal of non-linguistic utterances for sound design support

    Non-Linguistic Utterances (NLUs), produced for popular media, computers, robots, and public spaces, can quickly and wordlessly convey emotional characteristics of a message. They have been studied in terms of their ability to convey affect in robot communication. The objective of this research is to develop a model that correctly infers the emotional Valence and Arousal of an NLU. On a 7-point Likert scale, 17 subjects evaluated the relative Valence and Arousal of 560 sounds collected from popular movies, TV shows, and video games, including NLUs and other character utterances. Three audio feature sets were used to extract features including spectral energy, spectral spread, zero-crossing rate (ZCR), Mel Frequency Cepstral Coefficients (MFCCs), and audio chroma, as well as pitch, jitter, formant, shimmer, loudness, and Harmonics-to-Noise Ratio, among others. After feature reduction by Factor Analysis, the best-performing models inferred average Valence with a Mean Absolute Error (MAE) of 0.107 and Arousal with an MAE of 0.097 on audio samples held out from training. These results suggest the model infers the Valence and Arousal of most NLUs with an error smaller than the spacing between successive points on the 7-point Likert scale (0.14). This inference system is applicable to the development of novel NLUs to augment robot-human communication or to the design of sounds for other systems, machines, and settings.
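    A minimal sketch of the pipeline this abstract describes (frame-level acoustic features, Factor Analysis reduction, then regression scored by MAE), assuming librosa and scikit-learn; the feature set, factor count, and regressor below are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: frame-level acoustic features -> Factor Analysis ->
# regression of Valence/Arousal, scored by MAE on a held-out split.
# Feature set, factor count, and regressor are illustrative assumptions.
import numpy as np
import librosa
from sklearn.decomposition import FactorAnalysis
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

def extract_features(path):
    """Mean-pool frame-level features for one clip (a common simplification)."""
    y, sr = librosa.load(path, sr=None)
    feats = [
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13),        # MFCCs
        librosa.feature.zero_crossing_rate(y),              # ZCR
        librosa.feature.spectral_bandwidth(y=y, sr=sr),     # spectral spread
        librosa.feature.chroma_stft(y=y, sr=sr),            # audio chroma
    ]
    return np.concatenate([f.mean(axis=1) for f in feats])

def fit_valence_arousal(paths, ratings, n_factors=10):
    """paths: audio clips; ratings: (n, 2) array of [valence, arousal]."""
    X = np.vstack([extract_features(p) for p in paths])
    X = FactorAnalysis(n_components=n_factors).fit_transform(X)
    X_tr, X_te, y_tr, y_te = train_test_split(X, ratings, test_size=0.2)
    model = RandomForestRegressor().fit(X_tr, y_tr)
    return model, mean_absolute_error(y_te, model.predict(X_te))
```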

    Affect Recognition in Human Emotional Speech using Probabilistic Support Vector Machines

    The problem of automatically inferring human emotional state from speech has become one of the central problems in Man-Machine Interaction (MMI). Though Support Vector Machines (SVMs) have been used in several works on emotion recognition from speech, the potential of probabilistic SVMs for this task has not been explored. The emphasis of the current work is on how to use probabilistic SVMs for the efficient recognition of emotions from speech. Emotional speech corpora for two Dravidian languages, Telugu and Tamil, were constructed for assessing the recognition accuracy of probabilistic SVMs. The recognition accuracy of the proposed model is analyzed on both the Telugu and Tamil emotional speech corpora and compared with three existing works. Experimental results indicate that the proposed model performs significantly better than the existing methods.
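    The "probabilistic SVM" idea can be sketched with scikit-learn, whose SVC fits Platt scaling when probability=True and so returns per-class posteriors rather than bare decisions; the features and kernel below are assumptions, not the paper's setup.

```python
# Hedged sketch: scikit-learn's SVC with probability=True fits Platt
# scaling over the SVM decision values, yielding per-class posterior
# probabilities. Features and kernel settings are assumptions, not the
# paper's exact setup.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_probabilistic_svm(X, labels):
    """X: (n, d) acoustic feature vectors; labels: one emotion per utterance."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
    return clf.fit(X, labels)

# clf.predict_proba(X_new) then returns a posterior over emotion classes,
# so low-confidence utterances can be flagged or rejected.
```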

    Emotion recognition from syllabic units using k-nearest-neighbor classification and energy distribution

    In this article, we present an automatic technique for recognizing emotional states from speech signals. The main focus of this paper is to present an efficient, reduced set of acoustic features that allows us to recognize four basic human emotions (anger, sadness, joy, and neutral). The proposed feature vector is composed of twenty-eight measurements: standard acoustic features such as formants and fundamental frequency (obtained with the Praat software), as well as new features based on the energies in specific frequency bands and their distributions (computed with MATLAB code). The measurements are extracted from consonant/vowel (CV) syllabic units derived from the Moroccan Arabic dialect emotional database (MADED) corpus. The collected data are then used to train a k-nearest-neighbor (KNN) classifier to perform the automated recognition phase. The results reach 64.65% accuracy for multi-class classification and 94.95% for classification between positive and negative emotions.
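    A minimal sketch of the band-energy features, assuming NumPy and scikit-learn: the spectrum is split into fixed bands, each band's share of total energy becomes one feature, and a k-NN classifier is trained on the resulting vectors. The band edges are illustrative, and the Praat-derived formant and F0 measurements are omitted.

```python
# Hedged sketch of the band-energy features: split the spectrum into fixed
# bands, take each band's share of total energy as a feature, classify with
# k-NN. Band edges are illustrative; the Praat-derived formant/F0 features
# are omitted.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def band_energy_features(signal, sr, edges=(0, 500, 1000, 2000, 4000, 8000)):
    spec = np.abs(np.fft.rfft(signal)) ** 2                 # power spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    energies = np.array([spec[(freqs >= lo) & (freqs < hi)].sum()
                         for lo, hi in zip(edges[:-1], edges[1:])])
    return energies / (energies.sum() or 1.0)               # energy distribution

def train_knn(feature_rows, labels, k=5):
    return KNeighborsClassifier(n_neighbors=k).fit(feature_rows, labels)
```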

    Time-Distributed Attention-Layered Convolution Neural Network with Ensemble Learning using Random Forest Classifier for Speech Emotion Recognition

    Speech Emotion Recognition (SER) is the task of identifying human emotions from speech utterances, which combine linguistic and non-linguistic information. Non-linguistic SER provides a generalized solution for human-computer interaction applications because it overcomes the language barrier. Machine learning and deep learning techniques have previously been proposed for classifying emotions using handpicked features. To achieve effective and generalized SER, feature extraction can instead be performed with deep neural networks and classification with ensemble learning. The proposed model employs a time-distributed attention-layered convolution neural network (TDACNN) to extract spatiotemporal features in the first stage and a random forest (RF) ensemble classifier for efficient, generalized classification of emotions in the second stage. The proposed model was implemented on the RAVDESS and IEMOCAP corpora and compared with CNN-SVM and CNN-RF models for SER. The TDACNN-RF model exhibited test classification accuracies of 92.19 percent on RAVDESS and 90.27 percent on IEMOCAP. The experimental results show that the proposed model is effective at extracting spatiotemporal features from time-series speech signals and classifies emotions with good accuracy. Class confusion among emotions was reduced on both corpora, indicating that the model generalizes well.
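    The two-stage design can be sketched with Keras and scikit-learn: a TimeDistributed CNN with a simple additive attention layer pools segment embeddings into one utterance vector, which then trains a Random Forest in place of a softmax head. The input shape, filter counts, and attention form below are assumptions, not the paper's exact TDACNN.

```python
# Hedged sketch of the two-stage design: a TimeDistributed CNN with simple
# additive attention pools segment embeddings into one utterance vector,
# which then trains a Random Forest instead of a softmax head. Shapes,
# filter counts, and the attention form are assumptions.
import tensorflow as tf
from sklearn.ensemble import RandomForestClassifier

def build_extractor(segments=10, frames=32, n_mels=64):
    inp = tf.keras.Input(shape=(segments, frames, n_mels, 1))
    x = tf.keras.layers.TimeDistributed(
        tf.keras.layers.Conv2D(32, 3, activation="relu"))(inp)
    x = tf.keras.layers.TimeDistributed(
        tf.keras.layers.GlobalMaxPooling2D())(x)             # (batch, segments, 32)
    scores = tf.keras.layers.Dense(1, activation="tanh")(x)  # attention scores
    weights = tf.keras.layers.Softmax(axis=1)(scores)        # over the segment axis
    emb = tf.keras.layers.Lambda(
        lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([x, weights])
    return tf.keras.Model(inp, emb)

def fit_two_stage(extractor, X, y):
    """Stage 2: Random Forest on the (pre-trained, frozen) CNN embeddings."""
    return RandomForestClassifier(n_estimators=300).fit(extractor.predict(X), y)
```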

    Internet and Biometric Web Based Business Management Decision Support

    Internet and Biometric Web Based Business Management Decision Support. MICROBE MOOC material prepared under IO1/A5, "Development of the MICROBE personalized MOOCs content and teaching materials". Prepared by: A. Kaklauskas, A. Banaitis, I. Ubarte, Vilnius Gediminas Technical University, Lithuania. Project No: 2020-1-LT01-KA203-07810

    KEER2022

    Half title: KEER2022. Diversities. Resource description: 25 July 202