Optimizing Speech Emotion Recognition using Manta-Ray Based Feature Selection
Emotion recognition from audio signals is regarded as a challenging
task in signal processing because it can be viewed as a collection of static and
dynamic classification tasks. Recognition of emotions from speech data has
relied heavily on end-to-end feature extraction and classification using
machine learning models, but the absence of feature selection and
optimization has restrained the performance of these methods. Recent studies
have shown that Mel Frequency Cepstral Coefficients (MFCC) have emerged as
one of the most widely used feature extraction methods, although their very
small feature dimension limits classification accuracy. In this paper,
we propose that concatenating features extracted with different
existing feature extraction methods not only boosts classification
accuracy but also expands the possibilities for efficient feature selection. We
use Linear Predictive Coding (LPC) alongside the MFCC feature extraction
method before merging the features. In addition, we present a novel application
of Manta Ray optimization to speech emotion recognition, which yields
a state-of-the-art result in this field. We have evaluated the performance of
our model using SAVEE and Emo-DB, two publicly available datasets. Our proposed
method outperformed all the existing methods in speech emotion analysis,
reaching classification accuracies of 97.06% and 97.68% on these two datasets,
respectively.
Comment: 10 pages, 8 figures
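The two-stage idea in the abstract, merging MFCC and LPC features and then selecting a subset of the merged vector, can be illustrated with a minimal sketch. This is not the authors' implementation. The feature values, the vector sizes (13 and 12), the `toy_fitness` function, and the plain random search that stands in for Manta Ray optimization are all assumptions made for illustration only.

```python
import random


def concatenate_features(mfcc, lpc):
    """Merge two per-utterance feature vectors into one longer vector."""
    return list(mfcc) + list(lpc)


def apply_mask(features, mask):
    """Keep only the features whose mask bit is 1."""
    return [f for f, keep in zip(features, mask) if keep]


def select_features(n_features, fitness, n_iter=200, seed=0):
    """Search for a binary feature mask that maximizes `fitness`.

    Assumption: a simple random search is used here as a stand-in for a
    metaheuristic such as Manta Ray optimization, which is not implemented.
    """
    rng = random.Random(seed)
    best_mask = [1] * n_features          # start from "keep everything"
    best_score = fitness(best_mask)
    for _ in range(n_iter):
        mask = [rng.randint(0, 1) for _ in range(n_features)]
        if not any(mask):                 # skip the empty subset
            continue
        score = fitness(mask)
        if score > best_score:
            best_mask, best_score = mask, score
    return best_mask, best_score


def toy_fitness(mask):
    """Hypothetical fitness: pretend even-indexed features are informative
    and charge a small penalty for every selected feature."""
    informative = sum(bit for i, bit in enumerate(mask) if i % 2 == 0)
    return informative - 0.5 * sum(mask)


# Placeholder vectors; real values would come from MFCC and LPC extraction.
mfcc = [0.1 * i for i in range(13)]       # assumed 13 MFCC coefficients
lpc = [0.01 * i for i in range(12)]       # assumed 12 LPC coefficients

merged = concatenate_features(mfcc, lpc)
mask, score = select_features(len(merged), toy_fitness)
selected = apply_mask(merged, mask)
```

In a realistic setting the fitness function would more plausibly be the validation accuracy of a classifier trained on the masked features, which is why a larger merged vector gives the selection step more candidates to choose from.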