5 research outputs found

    Gender dependent word-level emotion detection using global spectral speech features

    In this study, global spectral features extracted at the word and sentence levels are studied for speech emotion recognition. MFCCs (Mel Frequency Cepstral Coefficients) were used as the spectral information for recognition. Global spectral features representing gross statistics, such as the mean of the MFCCs, are used. This study also examines words at different positions (initial, middle and end) in a sentence separately. Word-level feature extraction is used to analyze the emotion recognition performance of words at different positions. Word boundaries are manually identified. Gender-dependent and gender-independent models are also studied to analyze the impact of gender on emotion recognition performance. Berlin’s Emo-DB (Emotional Database) was used as the emotional speech dataset. The performance of different classifiers was also studied: NN (Neural Network), KNN (K-Nearest Neighbor) and LDA (Linear Discriminant Analysis). Anger and neutral emotions were studied. Results showed that using all 13 MFCC coefficients provides better classification results than other combinations of MFCC coefficients for the mentioned emotions. Words at initial and ending positions provide more emotion-specific information than words at the middle position. Gender-dependent models are more efficient than gender-independent models. Moreover, the female model is more efficient than the male model, and female speakers exhibit emotions better than male speakers. In general, NN performs the worst compared to KNN and LDA in classifying anger and neutral. LDA performs better than KNN by almost 15% for the gender-independent model and by almost 25% for the gender-dependent model.
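    The "global spectral feature" idea in the abstract above (one gross statistic per MFCC coefficient, computed over a whole word) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `global_mean_features` and the toy input are assumptions, and the frame-level MFCCs are taken as already extracted by some other tool.

```python
from statistics import fmean


def global_mean_features(frames):
    """Collapse a (num_frames x num_coeffs) MFCC matrix into one mean per coefficient.

    `frames` is a list of per-frame coefficient lists for a single word.
    The result is a fixed-length vector regardless of how long the word is.
    """
    if not frames:
        raise ValueError("need at least one frame")
    num_coeffs = len(frames[0])
    if any(len(frame) != num_coeffs for frame in frames):
        raise ValueError("all frames must have the same number of coefficients")
    return [fmean(frame[i] for frame in frames) for i in range(num_coeffs)]


# Hypothetical toy input: 3 frames x 2 coefficients.
# A real word would carry 13 coefficients per frame, as in the abstract.
word_frames = [[1.0, -2.0], [3.0, 0.0], [2.0, 2.0]]
features = global_mean_features(word_frames)
print(features)
```

    Because every word is reduced to a vector of the same length, vectors from words of different durations can be passed to any standard classifier (the abstract names NN, KNN and LDA).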

    Investigating the Phonetic Expression of Successful Motivation

    Voße J, Wagner P. Investigating the Phonetic Expression of Successful Motivation. In: ExLing 2018. Paris, France; 2018.
    Object of study: Motivation is an essential concept for manifold domains. Whenever a person is confronted with an inconvenient, but necessary task, a stimulation of the person’s inherent motives can ensure its successful performance. As communication is one of the most intuitive ways to create a motivating stimulation, its effect should be noticeable in everyday human-human interaction, but also in more specific domains such as training, teaching or nursing care. So far, the motivational patterns in speech have not been studied intensively. Research progress exists on the rhetoric structure of convincing speeches [1] and on the phonetic expression of related concepts such as charisma [2], persuasion [3] and volition [4], but it is so far unclear if and how these results can be extended to the expression of motivation likewise.
    Methodology: The present study provides a comprehensive acoustic phonetic analysis of motivational speech. We collected, annotated and processed 50 minutes of speech data representing less and more successful degrees of motivation. Based on these, we identified and analyzed a set of phonetic features potentially relevant for motivational impact. The data consists of the audio extracted from 6 motivational YouTube videos, each presented by a different female speaker aged between 16 and 30 years. The aim of these videos is to motivate their audience to engage in sports and to be on a healthy diet. While presenters’ age and gender, video topic and structure as well as upload date are fairly constant, the videos differ in their online ratings. We used online ratings as an indicator to differentiate between more and less successful motivation. This leaves us with 3 videos of less successful (15 minutes), and 3 videos of more successful motivation (35 minutes). The data were force-aligned both on a phone and syllable level, and corrected manually. Interpausal units are used as a measure of utterance segmentation. We hypothesized that the following phonetic features differ significantly between less and more successful levels of motivation, and analyzed them within interpausal units using Praat scripts:
    • Pitch: mean, median, range (log Hertz) and coefficient of variation
    • Intensity: mean, median, range (dB) and coefficient of variation
    • Speaking rate (syllables/second)
    Results and conclusions: A comparison of the parameter distributions of two speakers with different levels of motivation indicates that highly motivational speech is faster, but less variable in terms of tempo. Also, highly motivational speech seems to be characterized by a higher pitch median, a higher pitch range and a more variable pitch in terms of coefficient of variation. Regarding intensity, we observe an overall louder articulation and a less variable pattern for intensity range, which is also supported by findings from the coefficient of variation. In sum, motivational speech appears to behave similarly to charismatic speech, but somewhat more balanced, i.e. more homogeneous, in terms of tempo and intensity variation. We plan to augment this characterization by investigating and comparing the prosodic patterns of individual speakers.
    References:
    [1] Heracleous, L., & Klaering, L. A. (2014). Charismatic leadership and rhetorical competence: An analysis of Steve Jobs’s rhetoric. Group & Organization Management, 39(2), 131-161.
    [2] Niebuhr, O., Voße, J., & Brem, A. (2016). What makes a charismatic speaker? A computer-based acoustic-prosodic analysis of Steve Jobs tone of voice. Computers in Human Behavior, 64, 366-382.
    [3] Redecker, B. (2006). Persuasion und Prosodie: Untersuchung zur Perzeption emotionaler Sprechweisen am Beispiel einer Parfumwerbung (Doctoral dissertation).
    [4] Skutella, L. V., Süssenbach, L., Pitsch, K., & Wagner, P. (2014). The prosody of motivation. First results from an indoor cycling scenario. Elektronische Sprachsignalverarbeitung 2014, 71ff.
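    The per-unit descriptive statistics named in the methodology above (mean, median, range, coefficient of variation, and speaking rate in syllables per second) can be sketched in a few lines. This is only an illustration under stated assumptions: the study used Praat scripts, whereas the helper names `summarize`, `to_log_hz` and `speaking_rate` are hypothetical, the natural logarithm is assumed for "log Hertz", the population standard deviation is assumed for the coefficient of variation, and the sample values are invented.

```python
import math
from statistics import fmean, median, pstdev


def to_log_hz(f0_hz):
    """Convert pitch samples from Hertz to log Hertz (natural log assumed)."""
    return [math.log(f) for f in f0_hz]


def summarize(values):
    """Mean, median, range and coefficient of variation for one interpausal unit."""
    mean = fmean(values)
    return {
        "mean": mean,
        "median": median(values),
        "range": max(values) - min(values),
        # Coefficient of variation: standard deviation relative to the mean.
        "cv": pstdev(values) / mean,
    }


def speaking_rate(num_syllables, duration_s):
    """Speaking rate in syllables per second for one interpausal unit."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return num_syllables / duration_s


# Invented measurements for a single interpausal unit (not data from the study).
stats = summarize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
rate = speaking_rate(12, 3.0)
print(stats, rate)
```

    Comparing the distributions of these per-unit values between the two groups of videos is the kind of comparison the results paragraph describes.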

    The Contribution of Sound Intensity in Vocal Emotion Perception: Behavioral and Electrophysiological Evidence

    Although its role is frequently stressed in acoustic profiles of vocal emotion, sound intensity is frequently regarded as a control parameter in neurocognitive studies of vocal emotion, leaving its role and neural underpinnings unclear. To investigate these issues, we asked participants to rate the anger level of neutral and angry prosodies before and after sound intensity modification in Experiment 1. In Experiment 2, we recorded the electroencephalogram (EEG) for mismatching emotional prosodies with and without sound intensity modification and for matching emotional prosodies while participants performed an emotional feature or sound intensity congruity judgment. Sound intensity modification had a significant effect on the anger ratings for angry prosodies, but not for neutral ones. Moreover, mismatching emotional prosodies, relative to matching ones, induced an enhanced N2/P3 complex and theta band synchronization irrespective of sound intensity modification and task demands. However, mismatching emotional prosodies with reduced sound intensity showed prolonged peak latency and decreased amplitude in the N2/P3 complex and smaller theta band synchronization. These findings suggest that, though it cannot categorically affect the emotionality conveyed in emotional prosodies, sound intensity contributes to emotional significance quantitatively, implying that sound intensity should not simply be taken as a control parameter and that its unique role needs to be specified in vocal emotion studies.

    Affective learning companions : strategies for empathetic agents with real-time multimodal affective sensing to foster meta-cognitive and meta-affective approaches to learning, motivation, and perseverance

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2006. Includes bibliographical references (leaves 93-98).
    This thesis has developed an affective agent research platform that advances the architecture of relational agents and intelligent tutoring systems. The system realizes non-invasive multimodal real-time sensing of elements of user's affective state and couples this ability with an agent capable of supporting learners by engaging in real-time responsive expressivity. The agent mirrors several non-verbal behaviors believed to influence persuasion, liking, and social rapport, and responds to frustration with empathetic or task-support dialogue. Pilot studies involved 60 participants, ages 10-14 years-old, and led to an experiment involving 76 participants, ages 11-13 years-old, engaging in the Towers of Hanoi activity. The system (data collection, architecture, character interaction, and activity presentation) was iteratively tested and refined, and two "mirroring" conditions were developed: "sensor driven non-verbal interactions" and "pre-recorded non-verbal interactions". The development and training of the classifier algorithms showed the ability to predict frustration/help seeking behavior with 79% accuracy across a pilot group of 24 participants. Informed by the theory of optimal experience (Flow) and a parallel theory of a state of non-optimal experience (Stuck), developed in this thesis, the effects of "affective support" and "task support" interventions, through agent dialogue and non-verbal interactions, were evaluated relative to their appropriateness for the learner's affective state. Outcomes were assessed with respect to measures of agent emotional intelligence, social bond, and persuasion, and with respect to learner frustration, perseverance, metacognitive and meta-affective ability, beliefs of one's ability to increase one's own intelligence, and goal-mastery-orientation. A new simple measure of departure dialogue was shown to have a significant relationship with the more lengthy and explicit social bond Working Alliance Inventory survey instrument; its validity was further supported through its use in assessing the social bond relationship with other measures. Over-estimation of the duration of the activity was associated with increased frustration. Gender differences were obtained with girls showing stronger outcomes when presented with affect-support interventions and boys with task-support interventions. Coordinating the character's mirroring with intervention type and learners' frustration was shown to be important.
    by Winslow Burleson. Ph.D.

    Bioaccumulation potential of 'Meeker' and 'Willamette' raspberry (Rubus idaeus L.) fruits towards macro- and microelements and their nutritional evaluation

    Raspberry (Rubus idaeus L.) is the most important type of berry fruit in the Republic of Serbia. The bioaccumulation factor (BF) for the elements detected in the fruits of the raspberry cultivars 'Willamette' and 'Meeker' was calculated to determine their bioaccumulation potential. In addition, the nutritional quality of the fruits in relation to nutritionally essential elements was evaluated and compared with the recommended daily intake. Inductively coupled plasma optical emission spectrometry was used to determine the concentrations of 19 macro- and microelements in the fruits and the soil. Among the analyzed elements, As, Cd, Co, Cr, Li and Mo were below the limit of detection in the fruits of both raspberry cultivars, whereas Na and Ni were detected only in fruits of the 'Meeker' cultivar. All analyzed elements were detected in the soil. The results indicated the high potential of the studied cultivars to accumulate the nutritional elements K and Ca. The two raspberry cultivars showed no substantial differences in the bioaccumulation of most elements. However, two elements (B and Mn) can be singled out: the BF for B in the 'Willamette' fruit was 3 times lower than the BF in the 'Meeker' fruit, whereas the BF for Mn in the 'Willamette' fruit was almost 8 times higher than in the 'Meeker' fruit. Furthermore, the cultivars did not tend to accumulate potentially toxic elements such as Ba, Co, Cu and Ni. The nutritional evaluation revealed that the studied raspberry fruits are a good source of K, Ca, Mg, Fe, Mn and Cu. Based on the BF values, the differences observed in the accumulation of B, Ba, Na, Ni and Mn may be attributed to the characteristics of the cultivars.
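    The abstract above does not state how the bioaccumulation factor was computed. A common convention, assumed here, is the ratio of an element's concentration in the plant tissue to its concentration in the soil, both in the same units. The sketch below follows that assumption; the function name `bioaccumulation_factor` and all concentrations are hypothetical and are not data from the study.

```python
def bioaccumulation_factor(c_fruit, c_soil):
    """BF as fruit concentration divided by soil concentration (assumed definition).

    Both concentrations must be expressed in the same units, e.g. mg/kg.
    """
    if c_soil <= 0:
        raise ValueError("soil concentration must be positive")
    return c_fruit / c_soil


# Hypothetical concentrations in mg/kg, for illustration only.
fruit = {"K": 1500.0, "Mn": 6.0}
soil = {"K": 3000.0, "Mn": 600.0}

bf = {element: bioaccumulation_factor(fruit[element], soil[element]) for element in fruit}
print(bf)
```

    Under this definition, a larger BF means the fruit concentrates more of the element relative to what is present in the soil, which is the sense in which the abstract compares the two cultivars.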