Similarities in face and voice cerebral processing
In this short paper I use a few selected examples to illustrate several compelling similarities in the functional organization of face and voice cerebral processing: (1) the presence of cortical areas selective to face or voice stimuli, also observed in non-human primates and causally related to perception; (2) the coding of face or voice identity using a “norm-based” scheme; (3) personality inferences from faces and voices within the same Trustworthiness–Dominance “social space”.
The sound of trustworthiness: acoustic-based modulation of perceived voice personality
When we hear a new voice we automatically form a "first impression" of the voice owner’s personality; a single word is sufficient to yield ratings that are highly consistent across listeners. Past studies have shown correlations between personality ratings and acoustical parameters of the voice, suggesting a potential acoustical basis for voice personality impressions, but its nature and extent remain unclear. Here we used data-driven voice computational modelling to investigate the link between acoustics and perceived trustworthiness in the single word "hello". Two prototypical voice stimuli were generated based on the acoustical features of voices rated low or high in perceived trustworthiness, respectively, as well as a continuum of stimuli interpolated and extrapolated between these two prototypes. Five hundred listeners provided trustworthiness ratings on the stimuli via an online interface. We observed an extremely tight relationship between trustworthiness ratings and position along the trustworthiness continuum (r = 0.99). Not only were trustworthiness ratings higher for the high- than for the low-trustworthiness prototype, but the difference could be modulated quasi-linearly by reducing or exaggerating the acoustical difference between the prototypes, resulting in a strong caricaturing effect. The f0 trajectory, or intonation, appeared to be a parameter of particular relevance: hellos rated high in trustworthiness were characterized by a high starting f0, a marked decrease at mid-utterance, and a strong final rise. These results demonstrate a strong acoustical basis for voice personality impressions, opening the door to multiple potential applications.
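A minimal sketch of the continuum logic described above, assuming hypothetical acoustic feature vectors and simulated ratings rather than the study's actual voice-morphing pipeline: the two prototypes are interpolated and extrapolated, and mean ratings are correlated with position along the continuum.

```python
# Minimal sketch (not the authors' pipeline): build a morph continuum between
# hypothetical low- and high-trustworthiness acoustic prototypes and correlate
# simulated listener ratings with position along the continuum.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Hypothetical prototype feature vectors (e.g. f0 statistics, formants, spectral tilt).
proto_low = np.array([110.0, 95.0, 120.0, -12.0])    # placeholder values
proto_high = np.array([135.0, 120.0, 160.0, -8.0])   # placeholder values

# Morph weights: 0 = low prototype, 1 = high prototype; values outside [0, 1]
# extrapolate ("caricature") the acoustic difference between the prototypes.
weights = np.linspace(-0.5, 1.5, 9)
continuum = np.array([proto_low + w * (proto_high - proto_low) for w in weights])
print("continuum stimuli (steps x features):", continuum.shape)

# Simulated mean trustworthiness ratings for each continuum step
# (in the study these came from ~500 online listeners).
ratings = 3.5 + 2.0 * weights + rng.normal(0, 0.1, size=weights.size)

r, p = pearsonr(weights, ratings)
print(f"continuum position vs. rating: r = {r:.2f}, p = {p:.3g}")
```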
Judgements of a speaker’s personality are correlated across differing content and stimulus type
It has previously been shown that first impressions of a speaker’s personality, whether accurate or not, can be judged from short utterances of vowels and greetings, as well as from prolonged sentences and readings of complex paragraphs. From these studies, it is established that listeners’ judgements are highly consistent with one another, suggesting that different people judge personality traits in a similar fashion, with three key personality traits being related to measures of valence (associated with trustworthiness), dominance, and attractiveness. Yet, particularly in voice perception, limited research has established the reliability of such personality judgements across stimulus types of varying lengths. Here we investigate whether first impressions of trustworthiness, dominance, and attractiveness of novel speakers are related when a judgement is made on hearing both one word and one sentence from the same speaker. Second, we test whether what is said (the content) influences the stability of personality ratings. Sixty Scottish voices (30 females) were recorded reading two texts: one of ambiguous content and one with socially relevant content. One word (~500 ms) and one sentence (~3000 ms) were extracted from each recording for each speaker. 181 participants (138 females) rated either male or female voices across both content conditions (ambiguous, socially relevant) and both stimulus types (word, sentence) for one of the three personality traits (trustworthiness, dominance, attractiveness). Pearson correlations showed that personality ratings between words and sentences were strongly correlated, with no significant influence of content. In short, when establishing an impression of a novel speaker, judgements of three key personality traits are highly related whether you hear one word or one sentence, irrespective of what the speaker is saying. This finding is consistent with initial personality judgements serving as elucidators of approach or avoidance behaviour, without modulation by time or content. All data and sounds are available on OSF (osf.io/s3cxy).
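A minimal sketch of the word-versus-sentence correlation analysis, using simulated per-speaker mean ratings for a single trait; the variable names and data are illustrative, not the OSF dataset.

```python
# Minimal sketch, assuming per-speaker mean ratings: correlate first-impression
# ratings obtained from a single word with those obtained from a full sentence.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_speakers = 60

# Simulated mean ratings per speaker for one trait and one content condition.
word_ratings = rng.uniform(3, 7, n_speakers)
sentence_ratings = word_ratings + rng.normal(0, 0.5, n_speakers)  # assume strong agreement

r, p = pearsonr(word_ratings, sentence_ratings)
print(f"word vs. sentence ratings: r = {r:.2f}, p = {p:.3g}")
```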
Cracking the social code of speech prosody using reverse correlation
Human listeners excel at forming high-level social representations about each other, even from the briefest of utterances. In particular, pitch is widely recognized as the auditory dimension that conveys most of the information about a speaker's traits, emotional states, and attitudes. While past research has primarily looked at the influence of mean pitch, almost nothing is known about how intonation patterns, i.e., finely tuned pitch trajectories around the mean, may determine social judgments in speech. Here, we introduce an experimental paradigm that combines state-of-the-art voice transformation algorithms with psychophysical reverse correlation and show that two of the most important dimensions of social judgments, a speaker's perceived dominance and trustworthiness, are driven by robust and distinguishing pitch trajectories in short utterances like the word "Hello," which remained remarkably stable whether male or female listeners judged male or female speakers. These findings reveal a unique communicative adaptation that enables listeners to infer social traits regardless of speakers' physical characteristics, such as sex and mean pitch. By characterizing how any given individual's mental representations may differ from this generic code, the method introduced here opens avenues to explore dysprosody and social-cognitive deficits in disorders such as autism spectrum disorder and schizophrenia. In addition, once derived experimentally, these prototypes can be applied to novel utterances, thus providing a principled way to modulate personality impressions in arbitrary speech signals.
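The reverse-correlation logic can be sketched as follows, assuming a simple two-interval forced-choice task with random pitch perturbations and a simulated listener; this illustrates the general method, not the authors' exact implementation.

```python
# Minimal sketch of psychophysical reverse correlation on pitch contours: on each
# trial a listener hears two versions of the same word with random pitch
# perturbations and picks the more trustworthy/dominant one; the "mental prototype"
# is estimated as the mean chosen contour minus the mean rejected contour.
import numpy as np

rng = np.random.default_rng(2)
n_trials, n_segments = 500, 6   # pitch perturbation applied to 6 time segments

# Random pitch perturbations (in cents) for the two intervals of each 2AFC trial.
perturb = rng.normal(0, 70, size=(n_trials, 2, n_segments))

# Simulated listener with an internal template favouring a final rise.
template = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 1.5])
scores = perturb @ template
choices = scores.argmax(axis=1)                       # which interval was picked

chosen = perturb[np.arange(n_trials), choices]
rejected = perturb[np.arange(n_trials), 1 - choices]
kernel = chosen.mean(axis=0) - rejected.mean(axis=0)  # first-order classification image
print("estimated pitch-contour kernel (cents):", np.round(kernel, 1))
```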
Auditory smiles trigger unconscious facial imitation
Smiles, produced by the bilateral contraction of the zygomatic major muscles, are one of the most powerful expressions of positive affect and affiliation, and also one of the earliest to develop [1]. The perception-action loop responsible for the fast and spontaneous imitation of a smile is considered a core component of social cognition [2]. In humans, social interaction is overwhelmingly vocal, and the visual cues of a smiling face co-occur with audible articulatory changes in the speaking voice [3]. Yet remarkably little is known about how such 'auditory smiles' are processed and reacted to. We have developed a voice transformation technique that selectively simulates the spectral signature of phonation with stretched lips. Here we report how we used this technique to study facial reactions to smiled and non-smiled spoken sentences, finding that listeners' zygomatic muscles tracked auditory smile gestures even when listeners did not consciously detect them.
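As a crude illustration of the idea behind such a transformation (not the authors' algorithm): a smile shortens the vocal tract and raises formant frequencies, which can be roughly approximated by stretching the short-time spectrum toward higher frequencies. The file names and warp factor below are placeholders.

```python
# Crude illustration only: warp each short-time magnitude spectrum upward in
# frequency to mimic the raised-formant signature of smiled speech.
# Assumes a mono 16-bit WAV file; "hello.wav" is a placeholder name.
import numpy as np
from scipy.signal import stft, istft
from scipy.io import wavfile

fs, x = wavfile.read("hello.wav")
x = x.astype(np.float64)

f, t, Z = stft(x, fs=fs, nperseg=1024)
factor = 1.05                                  # upward frequency warp (~5%)
warped = np.empty_like(Z)
for i in range(Z.shape[1]):
    mag = np.abs(Z[:, i])
    # sample the magnitude spectrum at lower frequencies, so energy moves upward
    warped_mag = np.interp(f / factor, f, mag, left=0.0, right=0.0)
    warped[:, i] = warped_mag * np.exp(1j * np.angle(Z[:, i]))

_, y = istft(warped, fs=fs, nperseg=1024)
wavfile.write("hello_smiled.wav", fs, (y / np.max(np.abs(y)) * 32767).astype(np.int16))
```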
A neural marker for social bias toward in-group accents
Accents provide information about the speaker's geographical, socio-economic, and ethnic background. Research in applied psychology and sociolinguistics suggests that we generally prefer our own accent to other varieties of our native language and attribute more positive traits to it. Despite the widespread influence of accents on social interactions and on educational and work settings, the neural underpinnings of this social bias toward our own accent, and what may drive it, remain unexplored. We measured brain activity while participants from two different geographical backgrounds listened passively to three English accent types embedded in an adaptation design. Cerebral activity in several regions, including the bilateral amygdalae, revealed a significant interaction between the participants' own accent and the accent they listened to: while repetition of the participants' own accent elicited an enhanced neural response, repetition of the other group's accent resulted in reduced responses classically associated with adaptation. Our findings suggest that increased social relevance of, or greater emotional sensitivity to, in-group accents may underlie the own-accent bias. Our results provide a neural marker for the bias associated with accents and show, for the first time, that the neural response to speech is partly shaped by the geographical background of the listener.
A language-familiarity effect for speaker discrimination without comprehension
The influence of language familiarity upon speaker identification is well established, to such an extent that it has been argued that “Human voice recognition depends on language ability” [Perrachione TK, Del Tufo SN, Gabrieli JDE (2011) Science 333(6042):595]. However, 7-mo-old infants discriminate speakers of their mother tongue better than they do foreign speakers [Johnson EK, Westrek E, Nazzi T, Cutler A (2011) Dev Sci 14(5):1002–1011] despite their limited speech comprehension abilities, suggesting that speaker discrimination may rely on familiarity with the sound structure of one’s native language rather than the ability to comprehend speech. To test this hypothesis, we asked Chinese and English adult participants to rate speaker dissimilarity in pairs of sentences in English or Mandarin that were first time-reversed to render them unintelligible. Even in these conditions a language-familiarity effect was observed: Both Chinese and English listeners rated pairs of native-language speakers as more dissimilar than foreign-language speakers, despite their inability to understand the material. Our data indicate that the language familiarity effect is not based on comprehension but rather on familiarity with the phonology of one’s native language. This effect may stem from a mechanism analogous to the “other-race” effect in face recognition
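The time-reversal manipulation itself is straightforward; a minimal sketch with placeholder file names: reversing the samples renders the sentence unintelligible while largely preserving voice-quality cues such as pitch and timbre.

```python
# Minimal sketch of the stimulus manipulation (file names are placeholders):
# time-reverse a sentence recording to remove intelligibility.
from scipy.io import wavfile

fs, x = wavfile.read("sentence_mandarin.wav")
reversed_x = x[::-1].copy()                     # reverse the samples in time
wavfile.write("sentence_mandarin_reversed.wav", fs, reversed_x)
```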
Hippocampal sclerosis affects fMR-adaptation of lyrics and melodies in songs
Songs constitute a natural combination of lyrics and melodies, but it is unclear whether and how these two song components are integrated during the emergence of a memory trace. Network theories of memory suggest a prominent role of the hippocampus, together with unimodal sensory areas, in the build-up of conjunctive representations. The present study tested the modulatory influence of the hippocampus on neural adaptation to songs in lateral temporal areas. Patients with unilateral hippocampal sclerosis and healthy matched controls were presented with blocks of short songs in which lyrics and/or melodies were varied or repeated in a crossed factorial design. Neural adaptation effects were taken as correlates of incidental emergent memory traces. We hypothesized that hippocampal lesions, particularly in the left hemisphere, would weaken adaptation effects, especially the integration of lyrics and melodies. Results revealed that lateral temporal lobe regions showed weaker adaptation to repeated lyrics as well as a reduced interaction of the adaptation effects for lyrics and melodies in patients with left hippocampal sclerosis. This suggests a deficient build-up of a sensory memory trace for lyrics and a reduced integration of lyrics with melodies, compared to healthy controls. Patients with right hippocampal sclerosis showed a similar profile of results although the effects did not reach significance in this population. We highlight the finding that the integrated representation of lyrics and melodies typically shown in healthy participants is likely tied to the integrity of the left medial temporal lobe. This novel finding provides the first neuroimaging evidence for the role of the hippocampus during repetitive exposure to lyrics and melodies and their integration into a song
Gender differences in the temporal voice areas
There is not only evidence for behavioral differences in voice perception between female and male listeners, but also recent suggestions of differences in neural correlates between genders. The fMRI functional voice localizer (comprising a univariate analysis contrasting stimulation with vocal versus non-vocal sounds) is known to give robust estimates of the temporal voice areas (TVAs). However, there is growing interest in employing multivariate analysis approaches to fMRI data (e.g. multivariate pattern analysis; MVPA). The aim of the current study was to localize voice-related areas in both female and male listeners and to investigate whether brain maps differ depending on the gender of the listener. After a univariate analysis, a random-effects analysis was performed on female (n = 149) and male (n = 123) listeners and contrasts between them were computed. In addition, MVPA with a whole-brain searchlight approach was implemented, and classification maps were entered into a second-level, permutation-based random-effects model using statistical non-parametric mapping (SnPM; Nichols & Holmes, 2002). Gender differences were found only in the MVPA. The identified regions were located in the middle part of the middle temporal gyrus (bilaterally) and the middle superior temporal gyrus (right hemisphere). Our results suggest differences in classifier performance between genders in response to the voice localizer, with higher classification accuracy from local BOLD signal patterns in several temporal-lobe regions in female listeners.
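The core of the searchlight MVPA can be sketched as follows, using simulated local BOLD patterns rather than the study's data: a cross-validated classifier is trained on the pattern of voxels around one location; in a full searchlight this is repeated with the sphere centred on every voxel, and the resulting accuracy maps are entered into second-level permutation statistics (as with SnPM).

```python
# Minimal sketch (hypothetical data shapes, not the study's pipeline): cross-validated
# classification of vocal vs. non-vocal blocks from the local BOLD pattern in one
# searchlight sphere.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)

n_trials, n_voxels_in_sphere = 80, 33                  # one sphere of ~33 voxels
X = rng.normal(size=(n_trials, n_voxels_in_sphere))    # simulated local BOLD patterns
y = np.repeat([0, 1], n_trials // 2)                   # 0 = non-vocal, 1 = vocal blocks

# Cross-validated accuracy for this sphere; repeating this at every voxel yields a
# whole-brain classification map, one per listener, for the group-level comparison.
acc = cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean()
print(f"searchlight accuracy at this voxel: {acc:.2f}")
```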
- …
