301 research outputs found

Improved Emotion Recognition Using Gaussian Mixture Model and Extreme Learning Machine in Speech and Glottal Signals

Authors: Hariharan Muthusamy, Kemal Polat, Sazali Yaacob
Publication venue: Hindawi Limited
Publication date: 01/01/2015

Recently, researchers have paid increasing attention to inferring an individual's emotional state from his/her speech signals, as speech is the fastest and most natural method of communication between individuals. In this work, a new feature enhancement method using a Gaussian mixture model (GMM) was proposed to enhance the discriminatory power of the features extracted from speech and glottal signals. Three different emotional speech databases were used to evaluate the proposed methods. Extreme learning machine (ELM) and k-nearest neighbor (kNN) classifiers were employed to classify the different types of emotions. Several experiments were conducted, and the results show that the proposed methods significantly improved speech emotion recognition performance compared to research works published in the literature.

Directory of Open Access Journals

Fusion for Audio-Visual Laughter Detection

Author: Reuderink B.
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2007

Laughter is a highly variable signal and can express a spectrum of emotions. This makes the automatic detection of laughter a challenging but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is performed by combining (fusing) the results of separate audio and video classifiers at the decision level. The video classifier uses features based on the principal components of 20 tracked facial points; for audio, we use the commonly used PLP and RASTA-PLP features. Our results indicate that RASTA-PLP features outperform PLP features for laughter detection in audio. We compared classifiers based on hidden Markov models (HMMs), Gaussian mixture models (GMMs) and support vector machines (SVMs), and found that RASTA-PLP combined with a GMM resulted in the best performance for the audio modality. The video features classified using an SVM resulted in the best single-modality performance. Fusion at the decision level resulted in laughter detection with significantly better performance than single-modality classification.

University of Twente Research Information

Automatic classification of speaker characteristics

Authors: Huang Xu, Nguyen Phuoc, Sharma Dharmendra, Tran Dat
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

University of Canberra Research Repository

Music Emotion Recognition: From Content- to Context-Based Models

Authors: Barthet M, Fazekas G, Sandler M
Publication date: 13/01/2018

Normalization and Transformation Techniques for Robust Speaker Recognition

Authors: Baojie Li, Dalei Wu, Hui Jiang
Publication venue: IntechOpen
Publication date: 01/11/2008