    Cultural dialects of real and synthetic emotional facial expressions

    In this article we discuss aspects of designing facial expressions for virtual humans (VHs) with a specific culture. First we explore the notion of culture and its relevance for applications with a VH. Then we give a general scheme for designing emotional facial expressions and identify the stages where a human is involved, either as a real person with some specific role or as a VH displaying facial expressions. We discuss how the display and the emotional meaning of facial expressions may be measured in objective ways, and how the culture of the displayers and the judges may influence the process of analyzing human facial expressions and evaluating synthesized ones. We review psychological experiments on cross-cultural perception of emotional facial expressions. By identifying the culturally critical issues of data collection and interpretation with both real humans and VHs, we aim to provide a methodological reference and inspiration for further research.

    Speaker-independent emotion recognition exploiting a psychologically-inspired binary cascade classification schema

    In this paper, a psychologically-inspired binary cascade classification schema is proposed for speech emotion recognition. Performance is enhanced because commonly confused pairs of emotions become distinguishable from one another. Extracted features are related to statistics of pitch, formants, and energy contours, as well as spectrum, cepstrum, perceptual and temporal features, autocorrelation, MPEG-7 descriptors, Fujisaki's model parameters, voice quality, jitter, and shimmer. Selected features are fed as input to a K-nearest-neighbor classifier and to support vector machines. Two kernels are tested for the latter: linear and Gaussian radial basis function. The recently proposed speaker-independent experimental protocol is tested on the Berlin emotional speech database for each gender separately. The best emotion recognition accuracy, achieved by support vector machines with a linear kernel, equals 87.7%, outperforming state-of-the-art approaches. Statistical analysis is first carried out with respect to the classifiers' error rates and then to evaluate the information expressed by the classifiers' confusion matrices. © Springer Science+Business Media, LLC 2011
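    The cascade idea is easy to sketch: each node of a small decision tree trains its own binary classifier, so emotions that are commonly confused end up separated by a dedicated split. The Python sketch below assumes a two-level cascade with an arousal-based first split and illustrative emotion labels; only the use of linear-kernel SVMs follows the paper's best-performing configuration, everything else is a hypothetical stand-in.

```python
# Minimal sketch of a binary cascade emotion classifier (scikit-learn).
# Each node splits the remaining emotions in two, so commonly confused
# pairs get their own dedicated binary classifier. The arousal-based
# grouping and the labels below are illustrative, not the paper's schema.
import numpy as np
from sklearn.svm import SVC

class BinaryCascade:
    def __init__(self):
        # Linear-kernel SVMs, matching the best-performing setup reported.
        self.arousal = SVC(kernel="linear")  # split 1: high vs low arousal
        self.high = SVC(kernel="linear")     # split 2a: anger vs joy
        self.low = SVC(kernel="linear")      # split 2b: sadness vs boredom

    def fit(self, X, y):
        # X: (n_samples, n_features) ndarray, y: ndarray of emotion labels
        is_high = np.isin(y, ["anger", "joy"])
        self.arousal.fit(X, is_high)
        self.high.fit(X[is_high], y[is_high])
        self.low.fit(X[~is_high], y[~is_high])
        return self

    def predict(self, X):
        pred = np.empty(len(X), dtype=object)
        is_high = self.arousal.predict(X).astype(bool)
        if is_high.any():
            pred[is_high] = self.high.predict(X[is_high])
        if (~is_high).any():
            pred[~is_high] = self.low.predict(X[~is_high])
        return pred
```

    In practice the splits would be derived from the confusion structure of the data rather than fixed in advance, which is what makes the cascade effective against commonly confused pairs.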

    Evaluation of a transplantation algorithm for expressive speech synthesis

    When designing human-machine interfaces it is important to consider not only the bare-bones functionality but also the ease of use and accessibility they provide. For voice-based interfaces, it has been shown that imbuing synthetic voices with expressiveness significantly increases their perceived naturalness, which in turn helps in building user-friendly interfaces. This paper proposes an adaptation-based expressiveness transplantation system capable of copying the emotions of a source speaker onto any desired target speaker with just a few minutes of read speech and without requiring the recording of additional expressive data. The system was evaluated through a perceptual test with 3 speakers, showing up to an average 52% emotion recognition rate relative to the natural-voice recognition rate, while at the same time keeping good scores in similarity and naturalness.
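    Note that the 52% figure is a relative score: recognition accuracy for the synthesized emotions divided by recognition accuracy for the same emotions in natural speech. A toy calculation with made-up absolute accuracies (the abstract reports only the relative figure) shows how such a number is derived:

```python
# Hypothetical absolute accuracies; only the relative rate is reported.
natural_acc = 0.80     # assumed: 80% of emotions recognized in natural speech
synthetic_acc = 0.416  # assumed: 41.6% recognized in the transplanted voice

relative_rate = synthetic_acc / natural_acc
print(f"Relative emotion recognition rate: {relative_rate:.0%}")  # 52%
```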

    Expression of basic emotions in Estonian parametric text-to-speech synthesis

    The goal of this study was to conduct modelling experiments on the expression of three basic emotions (joy, sadness and anger) in Estonian parametric text-to-speech synthesis, on the basis of both a male and a female voice. For each emotion, three different test models were constructed and presented for evaluation to subjects in perception tests. The test models were based on the emotions' characteristic parameter values, which had been determined from human speech. In synthetic speech, the test subjects recognized the emotion of sadness most accurately and the emotion of joy least accurately. The results showed that, for the synthesized male voice, the model with enhanced parameter values performed best for all three emotions, whereas for the synthetic female voice different emotions called for different models: the model with decreased values was the most suitable for expressing joy, and the model with enhanced values was the most suitable for expressing sadness and anger. Logistic regression was applied to the results of the perception tests to determine the significance and contribution of each acoustic parameter in the emotion models, and any need to adjust the parameter values. Keywords: Estonian, emotions, speech synthesis, acoustic model, speech rate, intensity, pitch
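    A minimal sketch of that final analysis step, assuming synthetic data: each row stands for one perception-test stimulus described by standardized acoustic parameter values, and a binary outcome records whether the listener identified the intended emotion. The column names follow the keyword list (speech rate, intensity, pitch); the data and effect sizes are fabricated for illustration.

```python
# Sketch: logistic regression of emotion recognition on acoustic parameters.
# Data, column names, and effect sizes are fabricated for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "speech_rate": rng.normal(0, 1, n),  # standardized parameter values
    "intensity":   rng.normal(0, 1, n),
    "pitch":       rng.normal(0, 1, n),
})
# Simulated responses: recognition odds rise with intensity, fall with rate.
p = 1 / (1 + np.exp(-(0.8 * df["intensity"] - 0.5 * df["speech_rate"])))
df["recognized"] = (rng.random(n) < p).astype(int)

model = smf.logit("recognized ~ speech_rate + intensity + pitch", df).fit()
print(model.summary())  # coefficients/p-values: each parameter's contribution
```

    Significant coefficients point to the parameters that drive recognition, and their signs suggest whether a parameter's value should be raised or lowered when adjusting an emotion model.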

    A virtual diary companion

    Chatbots and embodied conversational agents show turn-based conversational behaviour. In current research we almost always assume that each utterance of a human conversational partner should be followed by an intelligent and/or empathetic reaction from the chatbot or embodied agent: they are assumed to be alert, trying to please the user. There are other applications, which have not yet received much attention, that require a more patient or relaxed attitude, waiting for the right moment to provide feedback to the human partner. Being able and willing to listen is one of the conditions for being successful. In this paper we offer some observations on listening-behaviour research and introduce one of our applications, the virtual diary companion.