    Study on Feature Extraction of Speech Emotion Recognition

    A speech emotion recognition system aims to automatically identify the emotion of the speaker from speech. It is a modification of a speech recognition system, which identifies only the speech itself. In this paper, we study feature extraction algorithms for features such as pitch, formant frequency, and MFCC.
    Keywords: feature extraction, pitch, formant frequency, MFCC
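
The abstract lists pitch among the studied features but does not say which estimator is used. As a minimal sketch, assuming a plain autocorrelation method (a common choice, not necessarily the one in the paper), pitch for a single voiced frame could be estimated as follows. The function name `estimate_pitch_autocorr` and the search range of 50 to 400 Hz are illustrative assumptions.

```python
import math


def estimate_pitch_autocorr(samples, sample_rate, fmin=50.0, fmax=400.0):
    """Estimate the pitch (Hz) of one voiced frame by autocorrelation.

    Assumption: the frame is voiced and spans at least two pitch periods.
    fmin and fmax bound the search and are illustrative defaults.
    """
    n = len(samples)
    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(n - 1, int(sample_rate / fmin))
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation at this lag.
        value = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    # The lag with the strongest self-similarity approximates the pitch period.
    return sample_rate / best_lag


# Usage: a synthetic 200 Hz tone sampled at 8 kHz (50 ms frame).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(400)]
pitch_hz = estimate_pitch_autocorr(frame, sample_rate)
```

Real speech would normally need windowing, voicing detection, and some protection against octave errors, none of which this sketch attempts.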

    Dual-level segmentation method for feature extraction enhancement strategy in speech emotion recognition

    The speech segmentation approach can be a significant factor in the overall performance of a Speech Emotion Recognition (SER) system. An utterance may contain more than one perceived emotion, and the boundaries between changes of emotion within an utterance are challenging to determine. Speech segmented with the conventional fixed window does not correspond to the signal changes: because the segment points are arbitrary, the resulting frames are arbitrary as well, and a segment boundary may fall within a sentence or in between emotional changes. This study introduces an improvement to segment-based segmentation on a fixed-window Relative Time Interval (RTI) by using a Signal Change (SC) segmentation approach to locate signal boundaries with respect to signal transitions. A segment-based feature extraction enhancement strategy using a dual-level segmentation method is proposed: RTI-SC segmentation, which builds on the conventional approach. Instead of segmenting the whole utterance at the relative time interval, this study applies peak analysis to obtain segment boundaries defined by the maximum peak value within each temporary RTI segment. In peak selection, over-segmentation might occur due to connections with the input signal, which affects the boundary selection decision. Two approaches to finding the maximum peaks were implemented: first, peak selection by distance allocation, and second, peak selection by the Maximum function. Substituting the temporary RTI segment with a segment aligned to the signal change was intended to better capture high-level statistical features within the signal transition. The signal's prosodic, spectral, and wavelet properties were integrated to form a fine feature set based on the proposed method. 36 low-level descriptors and 12 statistical features, together with their derivatives, were extracted from each segment, resulting in a fixed vector dimension.
Correlation-based Feature Subset Selection (CFS) with the Best First search method was applied for dimensionality reduction before a Support Vector Machine (SVM) with Sequential Minimal Optimization (SMO) was used for classification. The feature fusion constructed with the proposed method was evaluated through speaker-dependent and speaker-independent tests on the EMO-DB and RAVDESS databases. The results indicated that the prosodic and spectral features derived from the dual-level segmentation method offered a higher recognition rate for most speaker-independent tasks, with a significant improvement that reached an overall accuracy of 82.2% (150 features), the highest among the segmentation approaches used in this study. The proposed method outperformed the baseline approach in single-emotion assessment with both the full-dimension and the optimized feature sets. The highest accuracy for each emotion was mostly achieved by the proposed method. On the EMO-DB database, accuracy improved for happy (67.6%), anger (89%), fear (85.5%), and disgust (79.3%), while neutral (91%) and sadness (93.5%) obtained accuracy similar to the baseline method. A 100% accuracy for boredom (female speaker) was observed in the speaker-dependent test, the highest single-emotion result reported in this study.
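
The dual-level idea described above (temporary fixed RTI segments first, then a boundary at the maximum peak inside each one) can be sketched as follows. This is only an illustration of the "Maximum function" variant under stated assumptions: the peak is taken on the absolute amplitude, the temporary segments are equal-length splits, and the names `rti_segments` and `rti_sc_boundaries` are hypothetical. The abstract does not specify these details, and the distance-allocation variant is not shown.

```python
def rti_segments(n_samples, n_segments):
    """First level: split the utterance into equal relative time intervals.

    Returns (start, end) index pairs with end exclusive.
    """
    return [
        (i * n_samples // n_segments, (i + 1) * n_samples // n_segments)
        for i in range(n_segments)
    ]


def rti_sc_boundaries(signal, n_segments):
    """Second level: one boundary per temporary RTI segment.

    Assumption: the boundary is the sample with the largest absolute
    amplitude inside the temporary segment (first one on ties).
    """
    boundaries = []
    for start, end in rti_segments(len(signal), n_segments):
        if end > start:
            peak = max(range(start, end), key=lambda i: abs(signal[i]))
            boundaries.append(peak)
    return boundaries


# Usage: a toy signal with one dominant peak in each quarter.
toy = [0.0] * 100
toy[10], toy[35], toy[60], toy[99] = 1.0, -2.0, 0.5, 3.0
cuts = rti_sc_boundaries(toy, n_segments=4)
```

In practice the peaks would more likely be taken on a smoothed energy envelope than on raw samples, which is one way to limit the over-segmentation the abstract mentions.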

    Improvement business communications in banking by optimization of contact centres

    Given the numerous and significant technological changes that characterize modern business, it has become necessary to improve business communication as well. To improve business communication, contact centres are organized, among other relevant innovations. The organization of a contact centre plays a key role in ensuring its productivity and efficiency, because clients and potential clients use it to judge not only the work of the contact centre but also the work of the bank and its services and products. To optimize the work of a contact centre, the following variables must be identified: the type of contact centre, the technology, the communication channels, and the employees and their training. The required number of employees must then be determined, the employees must be properly scheduled across shifts, and their work must be monitored. Bringing these factors into optimal correlation improves business communication. The first contact centres began to form in the early 1950s, and the many transformations since then have also affected business communication. Business communication can be improved in many ways through different organization of contact centres.

    Speech Emotion Recognition in Acted and Spontaneous Context

    Little attention has been paid so far to the context in which databases used for the study of emotion through the vocal channel are recorded. We therefore propose and evaluate an emotion classification system that focuses on the differences between acted and spontaneous emotional speech, using two databases: SAVEE and IEMOCAP. For this work, we examined wavelet packet energy and entropy features applied to the Mel, Bark, and ERB scales, with a Hidden Markov Model (HMM) as the classifier. Experimental results show that the proposed method is a feasible technique for emotion classification in both acted and spontaneous contexts, and they highlight the performance difference of the system between the two contexts. ERB-scale features give better performance than the other studied features, with a recognition accuracy of 78.75% for the acted context and 50.06% for the spontaneous context.
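
Wavelet packet energy and entropy features of the kind mentioned above can be illustrated with a small sketch. It assumes a Haar wavelet, a full packet tree, and Shannon entropy of the normalised squared coefficients; the abstract does not state the wavelet, the tree depth, or how the tree is mapped to the Mel, Bark, or ERB scale, so none of that is reproduced here. All function names are hypothetical.

```python
import math


def haar_split(x):
    """One Haar analysis step: (approximation, detail), each half as long."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_leaves(x, level):
    """Full wavelet packet tree: split every node at every level.

    Assumption: len(x) is divisible by 2**level. Leaves are returned in
    natural (tree) order, not sorted by frequency.
    """
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def node_energy(coeffs):
    """Energy of one node: sum of squared coefficients."""
    return sum(c * c for c in coeffs)


def node_entropy(coeffs):
    """Shannon entropy (nats) of the normalised squared coefficients."""
    energy = node_energy(coeffs)
    if energy == 0.0:
        return 0.0
    probs = [c * c / energy for c in coeffs]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Usage: a constant frame of 8 samples, decomposed to level 2 (4 leaves).
frame = [1.0] * 8
leaves = wavelet_packet_leaves(frame, level=2)
energies = [node_energy(leaf) for leaf in leaves]
entropies = [node_entropy(leaf) for leaf in leaves]
```

Because the Haar transform is orthonormal, the leaf energies sum to the energy of the input frame, which is a convenient sanity check when building the per-band feature vector.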