Impaired generalization of speaker identity in the perception of familiar and unfamiliar voices
In two behavioral experiments, we explored how the extraction of identity-related information from familiar and unfamiliar voices is affected by naturally occurring vocal flexibility and variability, introduced by different types of vocalizations and levels of volitional control during production. In a first experiment, participants performed a speaker discrimination task on vowels, volitional (acted) laughter, and spontaneous (authentic) laughter from five unfamiliar speakers. We found that performance was significantly impaired for spontaneous laughter, a vocalization produced under reduced volitional control. We additionally found that the detection of identity-related information fails to generalize across different types of nonverbal vocalizations (e.g., laughter vs. vowels) and across mismatches in volitional control within vocalization pairs (e.g., volitional laughter vs. spontaneous laughter), with performance levels indicating an inability to discriminate between speakers. In a second experiment, we explored whether personal familiarity with the speakers would afford greater accuracy and better generalization of identity perception. Using new stimuli, we largely replicated our previous findings: whereas familiarity afforded a consistent performance advantage for speaker discriminations, the experimental manipulations impaired performance to similar extents for familiar and unfamiliar listener groups. We discuss our findings with reference to prototype-based models of voice processing and suggest potential underlying mechanisms and representations of familiar and unfamiliar voice perception.
Speech, laughter and everything in between: A modulation spectrum-based analysis
Ludusan B, Wagner P. Speech, laughter and everything in between: A modulation spectrum-based analysis. In: Proceedings. 10th International Conference on Speech Prosody 2020. 25-28 May 2020, Tokyo, Japan. ISCA; 2020: 995-999. Laughter and speech-laughs are pervasive phenomena found in conversational speech. Nevertheless, few previous studies have compared their acoustic realization to that of speech. In this work, we investigated the suprasegmental characteristics of these two phenomena in relation to speech, by means of a modulation spectrum analysis. Two types of modulation spectra were considered: one encoding the variation of the envelope of the signal and the other its temporal fine structure. Using a corpus of spontaneous dyadic interactions, we computed the modulation index spectrum and the f0 spectrum of the three classes of vocalizations and fitted separate generalized additive mixed models for them. The results for the former showed a clear separation between speech, on the one hand, and laughter and speech-laughs, on the other, while the f0 spectrum was able to discriminate between all three classes. We conclude with a discussion of the importance of these findings and their implications for laughter detection.
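As a rough illustration of the envelope modulation spectrum discussed above, the sketch below rectifies and smooths a signal to approximate its amplitude envelope, then takes the power spectrum of that envelope. This is a minimal stand-in, not the authors' pipeline: the moving-average envelope, the 32 Hz cutoff, and the 4 Hz test signal are all assumptions for the example.

```python
import numpy as np

def envelope_modulation_spectrum(signal, sr, max_mod_hz=32.0, smooth_ms=3.0):
    """Power spectrum of the slow amplitude envelope of a signal.

    The envelope is approximated by rectifying the waveform and smoothing
    it with a short moving average, which suppresses the carrier but keeps
    slow (syllable-rate) amplitude modulations.
    """
    win = max(1, int(sr * smooth_ms / 1000))
    envelope = np.convolve(np.abs(signal), np.ones(win) / win, mode="same")
    envelope = envelope - envelope.mean()        # drop the DC component
    power = np.abs(np.fft.rfft(envelope)) ** 2
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / sr)
    keep = freqs <= max_mod_hz                   # modulation rates of interest
    return freqs[keep], power[keep]

# toy check: a 440 Hz carrier amplitude-modulated at 4 Hz (roughly syllable rate)
sr = 16000
t = np.arange(sr) / sr
signal = (1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 440 * t)
freqs, power = envelope_modulation_spectrum(signal, sr)
peak_hz = freqs[np.argmax(power)]                # expect a peak near 4 Hz
```

Speech typically shows strong envelope modulation energy around 3-8 Hz (the syllable rate), which is one reason envelope-based spectra can separate speech from less articulated vocalizations such as laughter.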
Real-time magnetic resonance imaging reveals distinct vocal tract configurations during spontaneous and volitional laughter
A substantial body of acoustic and behavioural evidence points to the existence of two broad categories of laughter in humans: spontaneous laughter that is emotionally genuine and somewhat involuntary, and volitional laughter that is produced on demand. In this study, we tested the hypothesis that these are also physiologically distinct vocalisations, by measuring and comparing them using real-time MRI (rtMRI) of the vocal tract. Following Ruch & Ekman (2001), we further predicted that spontaneous laughter should be relatively less speech-like (i.e. less articulate) than volitional laughter. We collected rtMRI data from five adult human participants during spontaneous laughter, volitional laughter, and spoken vowels. We report distinguishable vocal tract shapes during the vocalic portions of these three vocalisation types, where volitional laughs were intermediate between spontaneous laughs and vowels. Inspection of local features within the vocal tract across the different vocalisation types offers some additional support for Ruch and Ekman's (2001) predictions. We discuss our findings in light of a dual-pathway hypothesis for the neural control of human volitional and spontaneous vocal behaviours, identifying tongue shape and velum lowering as potential biomarkers of spontaneous laughter to be investigated in future research.
Speaker Sex Perception from Spontaneous and Volitional Nonverbal Vocalizations.
In two experiments, we explored how speaker sex recognition is affected by vocal flexibility, introduced by volitional and spontaneous vocalizations. In Experiment 1, participants judged speaker sex from two spontaneous vocalizations, laughter and crying, and volitionally produced vowels. Striking effects of speaker sex emerged: for male vocalizations, listeners' performance was significantly impaired for spontaneous vocalizations (laughter and crying) compared to a volitional baseline (repeated vowels), a pattern that was also reflected in longer reaction times for spontaneous vocalizations. Further, performance was less accurate for laughter than crying. For female vocalizations, a different pattern emerged. In Experiment 2, we largely replicated the findings of Experiment 1 using spontaneous laughter, volitional laughter and (volitional) vowels: here, performance for male vocalizations was impaired for spontaneous laughter compared to both volitional laughter and vowels, providing further evidence that differences in volitional control over vocal production may modulate our ability to accurately perceive speaker sex from vocal signals. For both experiments, acoustic analyses showed relationships between stimulus fundamental frequency (F0) and the participants' responses. The higher the F0 of a vocal signal, the more likely listeners were to perceive a vocalization as being produced by a female speaker, an effect that was more pronounced for vocalizations produced by males. We discuss the results in terms of the availability of salient acoustic cues across different vocalizations.
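The reported F0 effect, where a higher fundamental frequency makes a "female" response more likely, has the shape of a logistic psychometric function. A minimal sketch of that mapping, assuming a hypothetical 165 Hz midpoint and an arbitrary slope rather than values fitted to the study's data:

```python
import math

def p_female(f0_hz, midpoint_hz=165.0, slope=0.05):
    """Toy logistic mapping from fundamental frequency (Hz) to the
    probability that a listener labels the voice 'female'.

    midpoint_hz is where the two responses are equally likely; slope
    controls how sharply responses shift around it. Both values are
    illustrative assumptions, not parameters from the study.
    """
    return 1.0 / (1.0 + math.exp(-slope * (f0_hz - midpoint_hz)))

# a low-F0 voice is rarely labelled female, a high-F0 voice usually is
low = p_female(120.0)    # typical male F0 range
high = p_female(220.0)   # typical female F0 range
```

In practice such a function would be fitted per listener (e.g. with logistic regression on trial-level responses), and the study's finding that the F0 effect was stronger for male vocalizations would appear as a steeper fitted slope for those stimuli.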
Laugh machine
The Laugh Machine project aims at endowing virtual agents with the capability to laugh naturally, at the right moment and with the correct intensity, when interacting with human participants. In this report we present the technical development and evaluation of such an agent in one specific scenario: watching TV along with a participant. The agent must be able to react to both the video and the participant's behaviour. A full processing chain has been implemented, integrating components to sense the human behaviours, decide when and how to laugh and, finally, synthesize audiovisual laughter animations. The system was evaluated in its capability to enhance the affective experience of naive participants, with the help of pre- and post-experiment questionnaires. Three interaction conditions were compared: laughter-enabled or not, and reacting to the participant's behaviour or not. Preliminary results (the number of experiments is currently too small to obtain statistically significant differences) show that the interactive, laughter-enabled agent is positively perceived and increases the emotional dimension of the experiment.
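The sense-decide-synthesize chain described above can be caricatured by a toy decision rule. The inputs (`smile_score`, `scene_funniness`), the weights, and the threshold below are invented for illustration; they are not the project's actual decision component.

```python
def decide_laughter(smile_score, scene_funniness, threshold=0.5):
    """Toy 'decide' stage of a laughter-enabled agent.

    smile_score:     sensed participant behaviour in [0, 1] (hypothetical)
    scene_funniness: video-context signal in [0, 1] (hypothetical)
    Returns an (action, intensity) pair that a synthesis stage could
    render as an audiovisual laugh of the given intensity.
    """
    # weight the video context slightly more than the participant signal
    intensity = 0.6 * scene_funniness + 0.4 * smile_score
    if intensity >= threshold:
        return ("laugh", round(intensity, 2))
    return ("idle", 0.0)

# a funny scene plus a smiling participant triggers a strong laugh;
# a flat scene and neutral face keeps the agent idle
action_a = decide_laughter(smile_score=0.9, scene_funniness=0.8)
action_b = decide_laughter(smile_score=0.1, scene_funniness=0.2)
```

A real system of this kind would run continuously on streaming sensor data and also schedule *when* to laugh, not just whether; the sketch only captures the intensity decision in one time step.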
Paralinguistic event detection in children's speech
Paralinguistic events are useful indicators of the affective state of a speaker. In children's speech, these cues are used to form social bonds with caregivers, and they have also been found useful in the very early detection of developmental disorders such as autism spectrum disorder (ASD). Prior work on children's speech has relied on small numbers of subjects without sufficient diversity in the types of vocalizations produced, and the features necessary to understand the production of paralinguistic events are not fully understood. Because no off-the-shelf solution exists for detecting instances of laughter and crying in children's speech, the focus of this thesis is to investigate and develop signal processing algorithms to extract acoustic features and to apply machine learning algorithms to various corpora. Results obtained using baseline spectral and prosodic features indicate that a combination of spectral, prosodic, and dysphonation-related features is needed to detect laughter and whining in toddlers' speech across different age groups and recording environments. Long-term features were found useful for capturing the periodic properties of laughter in adults' and children's speech, and detected instances of laughter with a high degree of accuracy. Finally, the thesis examines the use of multi-modal information, combining acoustic features with computer vision-based smile-related features, to detect instances of laughter and to reduce false positives in adults' and children's speech. The fusion of the two feature sets improved accuracy and recall rates relative to using either modality on its own.
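As a sketch of the kind of baseline frame-level acoustic features mentioned above (not the thesis's actual feature set), the following computes short-term energy, a basic prosodic cue, and zero-crossing rate, a crude spectral cue, per analysis frame. The 25 ms frame and 10 ms hop are conventional assumptions in speech processing.

```python
import numpy as np

def frame_features(signal, sr, frame_ms=25, hop_ms=10):
    """Per-frame short-term energy and zero-crossing rate (ZCR).

    Energy tracks loudness contours (a prosodic cue); ZCR rises with
    high-frequency/noisy content and falls for voiced, periodic frames,
    which is one crude way to separate laughter bursts from silence
    or breathy segments before a classifier is applied.
    """
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    feats = []
    for start in range(0, len(signal) - frame + 1, hop):
        w = signal[start:start + frame]
        energy = float(np.mean(w ** 2))
        # fraction of adjacent sample pairs whose sign differs
        zcr = float(np.mean(np.abs(np.diff(np.sign(w))) > 0))
        feats.append((energy, zcr))
    return np.array(feats)

# toy check on one second of a pure 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(sr) / sr
feats = frame_features(np.sin(2 * np.pi * 440 * t), sr)
```

In a full system, frame-level vectors like these would be stacked with spectral features (e.g. MFCCs) and fed to a classifier; the long-term features the thesis describes would then summarize such frame trajectories over whole laughter episodes.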
Making Sense of Laughter: a comparison of self-reported experience, perception and production in autistic and non-autistic adults
Laughter has primarily been viewed as a positive emotional vocalisation associated with humour and amusement. It is commonly used as a communicative tool in social interaction. Our mentalising network automatically engages in laughter processing to understand other people's laughter. However, autistic individuals often struggle with social communication, driven by difficulty with mentalising. Therefore, this thesis investigated how the self-reported experience, perception and production of laughter differ between non-autistic and autistic adults:
In a questionnaire study, autistic adults reported that, compared to non-autistic adults, they laugh less, enjoy laughter less and find it more difficult to understand other people's laughter. However, they reported laughing on purpose as often as non-autistic adults.
Autistic adults show a different pattern of laughter production relative to non-autistic adults. A multi-level dyadic study found that non-autistic pairs laughed more when interacting with their friend than a stranger, whilst the amount of laughter produced by pairs of one autistic and one non-autistic adult was not affected by the closeness of the relationship.
An explicit processing task found subtle differences between the two groups in differentiating the authenticity of laughter and perceiving its affective properties. Moreover, the addition of laughter increased non-autistic adults' perceived funniness of humorous stimuli, and they found humorous stimuli funnier when paired with genuine than with posed laughs. However, this effect was not consistently observed in autistic adults. A follow-up fMRI study investigated the neural mechanism of implicit laughter processing and how these abilities relate to mentalising ability; subregions in the prefrontal cortex showed greater activation while processing words paired with posed laughter than with real laughter in non-autistic adults but not in autistic adults.
In summary, this thesis demonstrated different patterns of laughter behaviour between autistic and non-autistic adults, including self-reported laughter experience, laughter production in social situations, laughter processing and its underlying neurocognitive mechanism. It extended our current understanding of the social-emotional signature of laughter from non-autistic adults to autistic adults and therefore highlighted the critical role of laughter in social interaction.
The nature and function of human nonverbal vocalisations
Though human nonverbal vocalisations are widespread, their mechanisms and communicative functions have received little scientific consideration. This is despite their close alignment with the vocal communicative systems of primates and other mammals, whose primary function is to signal indexical information relevant to sexual and natural selection processes. In this thesis, I examine human nonverbal vocalisations from an evolutionary perspective, with the central hypothesis that they are functionally and structurally homologous to nonhuman mammal calls, communicating evolutionarily relevant indexical information that is perceived and utilised by listeners. In Chapter 1, I introduce the methodological framework (source-filter theory) necessary to understand the production of vocal signals in mammals, before summarising the information contained within the acoustic structure of nonhuman mammal calls and human speech, and the effects these cues have on both vocaliser and listener. I then examine the current evidence for functional and structural homology between human and nonhuman nonverbal vocalisations. In Chapters 2 to 5, I quantitatively analyse the acoustic structure of a number of nonverbal vocalisations, and perform playback experiments to examine their functional effects on listeners. In Chapters 2 and 3, I investigate whether aggressive roars and distress screams communicate acoustic cues to absolute and relative strength and height. In Chapter 4, I analyse the acoustic structure of pain cries of varying intensity, and conduct playback experiments to explore the acoustic and perceptual correlates of pain. In Chapter 5, I examine whether the fundamental frequency of tennis grunts produced during professional tennis matches is dependent on the sex and body posture of the vocaliser, as well as the progress and outcome of the contest, and whether listeners can infer these cues.
In Chapter 6, I tie these findings together, arguing that the acoustic structure of human nonverbal vocalisations, in continuity with nonhuman mammal vocalisations, has been selected to support the functional communication of indexical and motivational information.