16 research outputs found

    Multiresolution analysis (discrete wavelet transform) through Daubechies family for emotion recognition in speech

    We propose a study of the mathematical properties of voice as an audio signal. This work includes signals in which the channel conditions are not ideal for emotion recognition. Multiresolution analysis (discrete wavelet transform) was performed using the Daubechies wavelet family (Db1-Haar, Db6, Db8, Db10), decomposing the initial audio signal into sets of coefficients from which a set of features was extracted and analyzed statistically in order to differentiate emotional states. ANNs proved to be a system that allows an appropriate classification of such states. This study shows that the features extracted using wavelet decomposition are sufficient to analyze and extract emotional content from audio signals, yielding a high accuracy rate in the classification of emotional states without the need for other classical time-frequency features. Accordingly, this paper seeks to characterize mathematically the six basic emotions in humans: boredom, disgust, happiness, anxiety, anger and sadness, plus neutrality, for a total of seven states to identify.
    20th Argentinean Bioengineering Society Congress, SABI 2015 (XX Congreso Argentino de Bioingeniería y IX Jornadas de Ingeniería Clínica), 28–30 October 2015, San Nicolás de los Arroyos, Argentina
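
    As a minimal illustration of the kind of decomposition the abstract describes, the sketch below applies a multilevel discrete wavelet transform with a Daubechies wavelet (PyWavelets) and summarizes each coefficient set with simple statistics. The wavelet choice, decomposition level and statistics are illustrative assumptions, not the authors' exact feature set.

import numpy as np
import pywt

def wavelet_features(signal, wavelet="db6", level=4):
    # Multilevel DWT: returns [cA_n, cD_n, ..., cD_1]
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    features = []
    for c in coeffs:
        c = np.asarray(c, dtype=float)
        # Per-band statistics (illustrative): mean, standard deviation, energy
        features.extend([c.mean(), c.std(), float(np.sum(c ** 2))])
    return np.array(features)

# Example on a dummy signal standing in for one second of 16 kHz speech
x = np.random.randn(16000)
print(wavelet_features(x, wavelet="db8").shape)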

    Employing Emotion Cues to Verify Speakers in Emotional Talking Environments

    Usually, people talk neutrally in environments where there are no abnormal talking conditions such as stress and emotion. Other emotional conditions, such as happiness, anger, and sadness, can also affect a person's speaking tone, and such emotions are directly affected by the patient's health status. In neutral talking environments, speakers can be verified easily; in emotional talking environments, however, they cannot. Consequently, speaker verification systems do not perform as well in emotional talking environments as they do in neutral ones. In this work, a two-stage approach has been employed and evaluated to improve speaker verification performance in emotional talking environments. This approach employs speaker emotion cues (a text-independent, emotion-dependent speaker verification problem) based on both Hidden Markov Models (HMMs) and Suprasegmental Hidden Markov Models (SPHMMs) as classifiers. The approach consists of two cascaded stages that combine and integrate an emotion recognizer and a speaker recognizer into one recognizer. The architecture has been tested on two different and separate emotional speech databases: our collected database and the Emotional Prosody Speech and Transcripts database. The results show that the proposed approach gives promising results, with a significant improvement over previous studies and over other approaches such as the emotion-independent speaker verification approach and the emotion-dependent speaker verification approach based completely on HMMs.
    Comment: Journal of Intelligent Systems, Special Issue on Intelligent Healthcare Systems, De Gruyter, 201
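
    Purely as a structural sketch (an assumption, not the paper's implementation), the snippet below shows how an emotion recognizer can gate emotion-dependent speaker verification: stage one picks the most likely emotion from per-emotion models, stage two scores the claimed speaker's model for that emotion against a background model. hmmlearn's GaussianHMM stands in for the HMM/SPHMM classifiers; emotion_models and speaker_models are hypothetical dictionaries of already-trained models.

import numpy as np
from hmmlearn.hmm import GaussianHMM

def verify_speaker(features, claimed_speaker, emotion_models, speaker_models,
                   threshold=0.0):
    # features: array of shape (n_frames, n_features) for one utterance
    # Stage 1: identify the emotion via per-emotion log-likelihoods
    emotion = max(emotion_models, key=lambda e: emotion_models[e].score(features))
    # Stage 2: emotion-dependent speaker verification by log-likelihood ratio
    target = speaker_models[(emotion, claimed_speaker)].score(features)
    background = speaker_models[(emotion, "background")].score(features)
    accepted = (target - background) > threshold
    return accepted, emotion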

    A Review of Accent-Based Automatic Speech Recognition Models for E-Learning Environment

    The adoption of electronic learning (e-learning) as a method of disseminating knowledge in the global educational system is growing at a rapid rate, and has created a shift in knowledge acquisition from conventional classrooms and tutors to the distributed e-learning technique, which enables access to various learning resources much more conveniently and flexibly. However, notwithstanding the adaptive advantages of learner-centric content in e-learning programmes, the distributed e-learning environment has unconsciously adopted a few international languages as the languages of communication among participants, despite the various accents (mother-language influence) among these participants. Adjusting to and accommodating these various accents has brought about the introduction of accent-based automatic speech recognition into e-learning to resolve the effects of accent differences. This paper reviews over 50 research papers to determine the development made so far in the design and implementation of accent-based automatic speech recognition models for e-learning between 2001 and 2021. The analysis shows that 50% of the models reviewed adopted English, 46.50% adopted the major Chinese and Indian languages, and 3.50% adopted Swedish as the mode of communication. It is therefore found that the majority of ASR models are centred on European, American and Asian accents, while excluding the various accent peculiarities associated with less technologically resourced continents.

    Emotion Recognition from Speech with Acoustic, Non-Linear and Wavelet-based Features Extracted in Different Acoustic Conditions

    ABSTRACT: In recent years, there has been great progress in automatic speech recognition. The challenge now is not only to recognize the semantic content of speech but also the so-called "paralinguistic" aspects of speech, including the emotions and the personality of the speaker. This research work aims at the development of a methodology for automatic emotion recognition from speech signals under non-controlled noise conditions. For that purpose, different sets of acoustic, non-linear, and wavelet-based features are used to characterize emotions in different databases created for this purpose.
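
    As an illustrative sketch only (standard librosa calls, not the authors' feature sets), the snippet below extracts a few common frame-level acoustic descriptors and summarizes them per utterance; the paper's actual acoustic, non-linear and wavelet-based features are considerably richer.

import numpy as np
import librosa

def acoustic_features(path, sr=16000, n_mfcc=13):
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    zcr = librosa.feature.zero_crossing_rate(y)             # (1, frames)
    rms = librosa.feature.rms(y=y)                           # (1, frames)
    frames = np.vstack([mfcc, zcr, rms])
    # Summarize frame-level trajectories with per-descriptor mean and std
    return np.concatenate([frames.mean(axis=1), frames.std(axis=1)])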

    The classification problem in machine learning: an overview with study cases in emotion recognition and music-speech differentiation

    This work addresses the well-known classification problem in machine learning. The goal of this study is to introduce the reader to the methodological aspects of feature extraction, feature selection and classifier performance through simple and understandable theoretical aspects and two study cases. Finally, a very good classification performance was obtained for emotion recognition from speech.
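
    A minimal sketch of the kind of pipeline such an overview covers is shown below: feature scaling, univariate feature selection, a classifier, and cross-validated accuracy, using scikit-learn. The dummy data and the choices of selector, classifier and k are illustrative assumptions rather than the study's actual setup.

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Placeholder data: 200 utterances, 40 features, 7 emotional states
X = np.random.randn(200, 40)
y = np.random.randint(0, 7, size=200)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif, k=20)),
    ("clf", SVC(kernel="rbf", C=1.0)),
])
scores = cross_val_score(pipe, X, y, cv=5)
print("mean cross-validated accuracy:", scores.mean())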