845 research outputs found

    A Review on Emotion Recognition Algorithms using Speech Analysis

    In recent years, there has been growing interest in speech emotion recognition (SER), which infers emotion by analyzing input speech. SER can be treated as a pattern recognition task comprising feature extraction, a classifier, and a speech emotion database. The objective of this paper is to provide a comprehensive review of the literature on SER. Several audio features are available, including linear predictive coding coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC), and Teager energy based features. For the classifier, many algorithms are available, including hidden Markov models (HMM), Gaussian mixture models (GMM), vector quantization (VQ), artificial neural networks (ANN), and deep neural networks (DNN). In this paper, we also review various speech emotion databases. Finally, recent related work on SER using DNNs is discussed.

    Need of Boosted GMM in Speech Emotion Recognition System Implemented Using Gaussian Mixture Model

    Speech emotion recognition is an important problem that affects human-machine interaction. Automatic recognition of human emotion in speech aims at recognizing the underlying emotional state of a speaker from the speech signal. Gaussian mixture models (GMMs) and the minimum error rate classifier (i.e., the Bayes optimal classifier) are widespread and effective tools for speech emotion recognition. Typically, GMMs are used to model the class-conditional distributions of acoustic features, and their parameters are estimated by the expectation maximization (EM) algorithm based on a training data set. In this paper, we introduce a boosting algorithm for reliably and accurately estimating the class-conditional GMMs. The resulting algorithm is called the Boosted-GMM algorithm. Our speech emotion recognition experiments show that emotion recognition rates are effectively and significantly boosted by the Boosted-GMM algorithm compared to the EM-GMM algorithm. Human beings normally use their essential potentials to improve communication among themselves as well as between human and machine. During this interaction, human beings have feelings that they want to convey to their communication partner, who may be a human or a machine. This dissertation work concerns recognizing human emotion from the speech signal. Emotion recognition from a speaker’s speech is very difficult for the following reasons. Different sentences, speakers, speaking styles, and speaking rates introduce acoustic variability. The same utterance may show different emotions, so it is very difficult to differentiate these portions of the utterance. Emotion expression also depends on the speaker and his or her culture and environment: as the culture and environment change, the speaking style changes as well, which is another challenge for a speech emotion recognition system. This chapter introduces speech emotion recognition through a problem overview and the need for the system. Emotional speech recognition aims at automatically identifying the emotional or physical state of a human being from his or her voice. Although emotion detection from speech is a relatively new field of research, it has many potential applications. In human-computer or human-human interaction systems, emotion recognition systems could provide users with improved services by being adaptive to their emotions. The body of work on detecting emotion in speech is quite limited. Currently, researchers are still debating which features influence the recognition of emotion in speech. There is also considerable uncertainty about the best algorithm for classifying emotion, and which emotions to class together.
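
    The baseline this abstract compares against (class-conditional GMMs fitted by EM, then a minimum error rate decision) can be illustrated with a minimal sketch. It is not the Boosted-GMM algorithm and not the authors' implementation. It assumes a single scalar acoustic feature per utterance, synthetic data, and two hypothetical class labels ("calm", "angry"); all function names are illustrative.

```python
import math
import random

def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_em(data, k=2, iters=50, seed=0):
    """Fit a k-component 1-D GMM with plain EM (hypothetical helper)."""
    rng = random.Random(seed)
    means = rng.sample(data, k)  # initialise means from random data points
    centre = sum(data) / len(data)
    var0 = sum((x - centre) ** 2 for x in data) / len(data) + 1e-6
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        resp = []
        for x in data:
            p = [w * gauss_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = sum(r[j] * (x - means[j]) ** 2
                               for r, x in zip(resp, data)) / nj + 1e-6
    return weights, means, variances

def gmm_likelihood(x, gmm):
    """Class-conditional likelihood p(x | class) under a fitted GMM."""
    weights, means, variances = gmm
    return sum(w * gauss_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

def classify(x, class_gmms, priors):
    """Minimum error rate (Bayes) rule: pick the class maximising prior * likelihood."""
    return max(class_gmms,
               key=lambda c: priors[c] * gmm_likelihood(x, class_gmms[c]))

# Synthetic training data standing in for one acoustic feature (an assumption).
rng = random.Random(42)
train = {
    "calm": [rng.gauss(0.0, 0.5) for _ in range(100)]
            + [rng.gauss(1.5, 0.5) for _ in range(100)],
    "angry": [rng.gauss(5.0, 0.5) for _ in range(100)]
             + [rng.gauss(6.5, 0.5) for _ in range(100)],
}
class_gmms = {label: fit_gmm_em(samples) for label, samples in train.items()}
n_total = sum(len(s) for s in train.values())
priors = {label: len(s) / n_total for label, s in train.items()}

print(classify(0.5, class_gmms, priors))
print(classify(5.5, class_gmms, priors))
```

    Real systems would use multi-dimensional features such as MFCC vectors and full or diagonal covariance matrices; the scalar case is used here only to keep the EM updates readable.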

    Gaussian mixture model classifiers for detection and tracking in UAV video streams.

    Masters Degree. University of KwaZulu-Natal, Durban. Manual visual surveillance systems are subject to a high degree of human error and operator fatigue. The automation of such systems often employs detectors, trackers and classifiers as fundamental building blocks. Detection, tracking and classification are especially useful and challenging in Unmanned Aerial Vehicle (UAV) based surveillance systems. Previous solutions have addressed these challenges via complex classification methods. This dissertation proposes less complex Gaussian Mixture Model (GMM) based classifiers that can simplify the process: data is represented as a reduced set of model parameters, and classification is performed in the low-dimensional parameter space. The specification and adoption of GMM based classifiers on the UAV visual tracking feature space forms the principal contribution of the work. This methodology can be generalised to other feature spaces. This dissertation presents two main contributions in the form of submissions to ISI accredited journals. In the first paper, the objectives are demonstrated with a vehicle detector incorporating a two-stage GMM classifier, applied to a single feature space, namely Histogram of Oriented Gradients (HoG). The second paper demonstrates the objectives with a vehicle tracker using colour histograms (in RGB and HSV), with GMM classifiers and a Kalman filter. The proposed works are comparable to related works, with testing performed on benchmark datasets. In the tracking domain for such platforms, tracking alone is insufficient. Adaptive detection and classification can assist in search space reduction, building of knowledge priors and improved target representations. Results show that the proposed approach improves performance and robustness. Findings also indicate potential further enhancements, such as a multi-mode tracker with global and local tracking based on a combination of both papers.

    Speaker Recognition: Advancements and Challenges

    Exploring Language-Independent Emotional Acoustic Features via Feature Selection

    We propose a novel feature selection strategy to discover language-independent acoustic features that tend to be responsible for emotions regardless of language, linguistics, and other factors. Experimental results suggest that the discovered language-independent feature subset yields performance comparable to the full feature set on various emotional speech corpora. Comment: 15 pages, 2 figures, 6 tables