14,372 research outputs found
Emotional State Categorization from Speech: Machine vs. Human
This paper presents our investigations on emotional state categorization from
speech signals with a psychologically inspired computational model against
human performance under the same experimental setup. Based on psychological
studies, we propose a multistage categorization strategy which allows
establishing an automatic categorization model flexibly for a given emotional
speech categorization task. We apply the strategy to the Serbian Emotional
Speech Corpus (GEES) and the Danish Emotional Speech Corpus (DES), where human
performance was reported in previous psychological studies. Our work is the
first attempt to apply machine learning to the GEES corpus where the human
recognition rates were only available prior to our study. Unlike the previous
work on the DES corpus, our work focuses on a comparison to human performance
under the same experimental settings. Our studies suggest that
psychology-inspired systems yield behaviours that, to a great extent, resemble
what humans perceived and their performance is close to that of humans under
the same experimental setup. Furthermore, our work also uncovers some
differences between machine and humans in terms of emotional state recognition
from speech.Comment: 14 pages, 15 figures, 12 table
Speaker recognition utilizing distributed DCT-II based Mel frequency cepstral coefficients and fuzzy vector quantization
In this paper, a new and novel Automatic Speaker Recognition (ASR) system is presented. The new ASR system includes novel feature extraction and vector classification steps utilizing distributed Discrete Cosine Transform (DCT-II) based Mel Frequency Cepstral Coef?cients (MFCC) and Fuzzy Vector Quantization (FVQ). The ASR algorithm utilizes an approach based on MFCC to identify dynamic features that are used for Speaker Recognition (SR)
SPEECH RECOGNITION FOR CONNECTED WORD USING CEPSTRAL AND DYNAMIC TIME WARPING ALGORITHMS
Speech Recognition or Speech Recognizer (SR) has become an important tool for people with physical disabilities when handling Home Automation (HA) appliances. This technology is expected to improve the daily life of the elderly and the disabled so that they are always in control over their lives, and continue to live independently, to learn and stay involved in social life. The goal of the research is to solve the constraints of current Malay SR that is still in its infancy stage where there is limited research in Malay words, especially for HA applications. Since, most of the previous works were confined to wired microphone; this limitation of using wireless microphone type makes it an important area of the research. Research was carried out to develop SR word model for five (5) Malay words and five (5) English words as commands to activate and deactivate home appliances
FPGA-based implementation of speech recognition for robocar control using MFCC
This research proposes a simulation of the logic series of speech recognition on the MFCC (Mel Frequency Spread Spectrum) based FPGA and Euclidean Distance to control the robotic car motion. The speech known would be used as a command to operate the robotic car. MFCC in this study was used in the feature extraction process, while Euclidean distance was applied in the feature classification process of each speech that later would be forwarded to the part of decision to give the control logic in robotic motor. The test that has been conducted showed that the logic series designed was precise here by measuring the Mel Frequency Warping and Power Cepstrum. With the achievement of logic design in this research proven with a comparison between the Matlab computation and Xilinx simulation, it enables to facilitate the researchers to continue its implementation to FPGA hardware
- …