3,785 research outputs found

    Emotion Recognition from Acted and Spontaneous Speech

    This doctoral thesis deals with emotion recognition from speech signals. The thesis is divided into two main parts. The first part describes proposed approaches for emotion recognition using two different multilingual databases of acted emotional speech. The main contributions of this part are a detailed analysis of a large set of acoustic features, new classification schemes for vocal emotion recognition such as “emotion coupling”, and a new method for mapping discrete emotions into a two-dimensional space. The second part of the thesis is devoted to emotion recognition using multilingual databases of spontaneous emotional speech, based on telephone recordings obtained from real call centers. The knowledge gained from experiments with emotion recognition from acted speech was exploited to design a new approach for classifying seven emotional states. The core of the proposed approach is a complex classification architecture based on the fusion of different systems. The thesis also examines the influence of the speaker's emotional state on gender recognition performance and proposes a system for automatic identification of successful phone calls in call centers by means of dialogue features.

    An exploration of sarcasm detection in children with Attention Deficit Hyperactivity Disorder

    This document is the Accepted Manuscript version of the following article: Amanda K. Ludlow, Eleanor Chadwick, Alice Morey, Rebecca Edwards, and Roberto Gutierrez, ‘An exploration of sarcasm detection in children with Attention Deficit Hyperactivity Disorder’, Journal of Communication Disorders, Vol. 70: 25-34, November 2017. Under embargo. Embargo end date: 31 October 2019. The Version of Record is available at doi: https://doi.org/10.1016/j.jcomdis.2017.10.003. The present research explored the ability of children with ADHD to distinguish between sarcasm and sincerity. Twenty-two children with a clinical diagnosis of ADHD were compared with 22 age- and verbal-IQ-matched typically developing children using the Social Inference–Minimal Test from The Awareness of Social Inference Test (TASIT, McDonald, Flanagan, & Rollins, 2002). This test assesses an individual’s ability to interpret naturalistic social interactions containing sincerity, simple sarcasm and paradoxical sarcasm. Children with ADHD demonstrated specific deficits in comprehending paradoxical sarcasm and they performed significantly less accurately than the typically developing children. While there were no significant differences between the children with ADHD and the typically developing children in their ability to comprehend sarcasm based on the speaker’s intentions and beliefs, the children with ADHD were found to be significantly less accurate when basing their decision on the feelings of the speaker, but also on what the speaker had said. Results are discussed in light of difficulties in their understanding of complex cues of social interactions, and non-literal language being symptomatic of children with a clinical diagnosis of ADHD. The importance of pragmatic language skills in their ability to detect social and emotional information is highlighted. Peer reviewed

    Machine Understanding of Human Behavior

    A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. If this prediction is to come true, then next generation computing, which we will call human computing, should be about anticipatory user interfaces that should be human-centered, built for humans based on human models. They should transcend the traditional keyboard and mouse to include natural, human-like interactive functions including understanding and emulating certain human behaviors such as affective and social signaling. This article discusses a number of components of human behavior, how they might be integrated into computers, and how far we are from realizing the front end of human computing, that is, how far we are from enabling computers to understand human behavior.

    Influence of Voice Intonation on Understanding Irony by Polish-Speaking Preschool Children

    The main aim of the presented study was to investigate the influence of voice intonation on the comprehension of ironic utterances in 4- to 6-year-old Polish-speaking children. Eighty-three preschool children were tested with the Irony Comprehension Task (Banasik & Bokus, 2012). In the Irony Comprehension Task, children are presented with stories in which the ironic utterances have been prerecorded, read by professional speakers using an ironic intonation. Half of the subjects performed the regular Irony Comprehension Task while the other half were given a modified version of the task (ironic content was uttered using a non-ironic intonation). Results indicate that children from the ironic intonation group scored higher on the Irony Comprehension Task than children who heard ironic statements uttered using a neutral voice. Ironic voice intonation appeared to be a helpful cue to irony comprehension.

    What's in a voice? Prosody as a test case for the Theory of Mind account of autism

    The human voice conveys a variety of information about people's feelings, emotions and mental states. Some of this information relies on sophisticated Theory of Mind (ToM) skills, whilst other information is simpler and does not require ToM. This variety provides an interesting test case for the ToM account of autism, which would predict greater impairment as ToM requirements increase. In this paper, we draw on psychological and pragmatic theories to classify vocal cues according to the amount of mindreading required to identify them. Children with a high-functioning Autism Spectrum Disorder and matched controls were tested in three experiments where the speakers' state had to be extracted from their vocalizations. Although our results confirm that people with autism have subtle difficulties dealing with vocal cues, they show a pattern of performance that is inconsistent with the view that atypical recognition of vocal cues is caused by impaired ToM.

    Speech with pauses sounds deceptive to listeners with and without hearing impairment

    Purpose: Communication is as much persuasion as it is the transfer of information. This creates a tension between the interests of the speaker and those of the listener as dishonest speakers naturally attempt to hide deceptive speech, and listeners are faced with the challenge of sorting truths from lies. Hearing-impaired listeners in particular may have differing levels of access to the acoustical cues that give away deceptive speech. A greater tendency towards speech pauses has been hypothesised to result from the cognitive demands of lying convincingly. Higher vocal pitch has also been hypothesised to mark the increased anxiety of a dishonest speaker. Method: Listeners with or without hearing impairments heard short utterances from natural conversations, some of which had been digitally manipulated to contain either increased pausing or raised vocal pitch. Listeners were asked to guess whether each statement was a lie in a two-alternative forced-choice task. Participants were also asked explicitly which cues they believed had influenced their decisions. Results: Statements were more likely to be perceived as a lie when they contained pauses, but not when vocal pitch was raised. This pattern held regardless of hearing ability. In contrast, both groups of listeners self-reported using vocal pitch cues to identify deceptive statements, though at lower rates than pauses. Conclusions: Listeners may have only partial awareness of the cues that influence their impression of dishonesty. Hearing-impaired listeners may place greater weight on acoustical cues according to the differing degrees of access provided by hearing aids.
    • 

    corecore