Cerebral correlates and statistical criteria of cross-modal face and voice integration
Perception of faces and voices plays a prominent role in human social interaction, making multisensory integration of cross-modal speech a topic of great interest in cognitive neuroscience. How to define potential sites of multisensory integration using functional magnetic resonance imaging (fMRI) is currently under debate, with three statistical criteria frequently used (the super-additive, max and mean criteria). In the present fMRI study, 20 participants were scanned in a block design under three stimulus conditions: dynamic unimodal face, unimodal voice and bimodal face–voice. Using this single dataset, we examined all three statistical criteria in an attempt to define loci of face–voice integration. While the super-additive and mean criteria essentially revealed regions in which one of the unimodal responses was a deactivation, the max criterion appeared stringent and only highlighted the left hippocampus as a potential site of face–voice integration. Psychophysiological interaction analysis showed that connectivity between occipital and temporal cortices increased during bimodal compared to unimodal conditions. We concluded that, when investigating multisensory integration with fMRI, these criteria should be used in conjunction with manipulation of stimulus signal-to-noise ratio and/or cross-modal congruency.
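As an illustration of the three criteria, here is a minimal voxel-wise sketch (not the authors' analysis code; all variable names and values are invented), comparing mean responses to the bimodal condition against the two unimodal responses:

```python
import numpy as np

def integration_criteria(av, aud, vis):
    """Evaluate three common statistical criteria for multisensory
    integration on mean voxel responses: av = bimodal face-voice,
    aud = unimodal voice, vis = unimodal face."""
    super_additive = av > aud + vis            # AV exceeds the sum of unimodal responses
    max_criterion = av > np.maximum(aud, vis)  # AV exceeds the larger unimodal response
    mean_criterion = av > (aud + vis) / 2.0    # AV exceeds the average unimodal response
    return super_additive, max_criterion, mean_criterion

# Three toy voxels; the third has a unimodal deactivation (negative response)
av  = np.array([2.0, 1.2, 0.9])
aud = np.array([1.0, 0.5, -0.4])
vis = np.array([0.8, 0.9, 1.0])
print(integration_criteria(av, aud, vis))
```

The third voxel passes the super-additive and mean tests only because its voice response is a deactivation, yet fails the stricter max test, exactly the pattern the abstract describes.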
The Glasgow Voice Memory Test: Assessing the ability to memorize and recognize unfamiliar voices
One thousand one hundred and twenty subjects, as well as a developmental phonagnosic subject (KH) and age-matched controls, performed the Glasgow Voice Memory Test, which assesses the ability to encode and immediately recognize, through an old/new judgment, both unfamiliar voices (delivered as vowels, making language requirements minimal) and bell sounds. The inclusion of non-vocal stimuli allows the detection of significant dissociations between the two categories (vocal vs. non-vocal stimuli). The distributions of accuracy and sensitivity scores (d’) reflected a wide range of individual differences in voice recognition performance in the population. As expected, KH showed a dissociation between the recognition of voices and bell sounds, her performance being significantly poorer than that of matched controls for voices but not for bells. By providing normative data from a large sample and by testing a developmental phonagnosic subject, we demonstrated that the Glasgow Voice Memory Test, available online and accessible from all over the world, can be a valid screening tool (~5 min) for a preliminary detection of potential cases of phonagnosia and of “super recognizers” for voices.
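The sensitivity scores (d’) mentioned above come from standard signal detection theory applied to the old/new judgments. A minimal sketch of the computation, using an illustrative log-linear correction for extreme rates (a common convention, not necessarily the one used by the test):

```python
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Compute d' from old/new judgment counts (signal detection theory)."""
    # Log-linear correction avoids infinite z-scores at rates of 0 or 1
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Toy example: 20 'old' and 20 'new' voice trials
print(d_prime(hits=16, misses=4, false_alarms=5, correct_rejections=15))
```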
Dissociating task difficulty from incongruence in face-voice emotion integration
In the everyday environment, affective information is conveyed by both the face and the voice. Studies have demonstrated that a concurrently presented voice can alter the way that an emotional face expression is perceived, and vice versa, leading to emotional conflict if the information in the two modalities is mismatched. Additionally, evidence suggests that incongruence of emotional valence activates cerebral networks involved in conflict monitoring and resolution. However, it is currently unclear whether this is due to task difficulty—that incongruent stimuli are harder to categorize—or simply to the detection of mismatching information in the two modalities. The aim of the present fMRI study was to examine the neurophysiological correlates of processing incongruent emotional information, independent of task difficulty. Subjects were scanned while judging the emotion of face-voice affective stimuli. Both the face and voice were parametrically morphed between anger and happiness and then paired in all audiovisual combinations, resulting in stimuli each defined by two separate values: the degree of incongruence between the face and voice, and the degree of clarity of the combined face-voice information. Due to the specific morphing procedure utilized, we hypothesized that the clarity value, rather than the incongruence value, would better reflect task difficulty. Behavioral data revealed that participants integrated face and voice affective information, and that the clarity value, as opposed to the incongruence value, correlated with categorization difficulty. Cerebrally, incongruence was associated with activity in the superior temporal region, an effect that emerged after task difficulty had been accounted for. Overall, our results suggest that activation in the superior temporal region in response to incongruent information cannot be explained simply by task difficulty, and may rather be due to the detection of mismatching information between the two modalities.
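The abstract does not spell out the formulas behind the two stimulus values, but a plausible reading, assuming each face and voice is morphed on a 0 (anger) to 1 (happiness) continuum, is sketched below; both definitions are illustrative assumptions, not the authors' stated procedure:

```python
def stimulus_values(face_morph, voice_morph):
    """Illustrative (not the authors') definitions for a face-voice pair,
    each morphed on a 0 = anger ... 1 = happiness continuum."""
    incongruence = abs(face_morph - voice_morph)    # mismatch between modalities
    combined = (face_morph + voice_morph) / 2.0     # integrated percept
    clarity = abs(combined - 0.5) * 2.0             # 0 = ambiguous, 1 = unambiguous
    return incongruence, clarity

print(stimulus_values(0.9, 0.1))  # highly incongruent, ambiguous overall
print(stimulus_values(0.9, 0.7))  # mildly incongruent, clearly happy
```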
How do you say ‘hello’? Personality impressions from brief novel voices
On hearing a novel voice, listeners readily form personality impressions of that speaker. Accurate or not, these impressions are known to affect subsequent interactions; yet the underlying psychological and acoustical bases remain poorly understood. Furthermore, studies have hitherto focused on extended speech rather than on the instantaneous impressions we obtain from first experience. In this paper, through a mass online rating experiment, 320 participants rated 64 sub-second vocal utterances of the word ‘hello’ on one of 10 personality traits. We show that: (1) personality judgements of brief utterances from unfamiliar speakers are consistent across listeners; (2) a two-dimensional ‘social voice space’ with axes mapping Valence (Trust, Likeability) and Dominance, each driven by differing combinations of vocal acoustics, adequately summarises ratings in both male and female voices; and (3) a positive combination of Valence and Dominance results in increased perceived male vocal Attractiveness, whereas perceived female vocal Attractiveness is largely controlled by increasing Valence. Results are discussed in relation to the rapid evaluation of personality and, in turn, the intent of others, as being driven by survival mechanisms via approach or avoidance behaviours. These findings provide empirical bases for predicting personality impressions from acoustical analyses of short utterances and for generating desired personality impressions in artificial voices.
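A two-dimensional space of this kind is the sort of structure that a dimensionality reduction such as principal component analysis would recover from the trait ratings. The sketch below is illustrative only (random stand-in data, and PCA itself is an assumption rather than the paper's stated method):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical matrix: 64 'hello' recordings x 10 mean personality-trait ratings
rng = np.random.default_rng(0)
ratings = rng.normal(size=(64, 10))  # stand-in for real listener ratings

# A two-component solution approximating a Valence/Dominance 'social voice space'
pca = PCA(n_components=2)
coords = pca.fit_transform(ratings)   # each voice's position in the 2-D space
print(pca.explained_variance_ratio_)  # rating variance captured by the two axes
```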
The effects of stimulus complexity on the preattentive processing of self-generated and nonself voices: an ERP study
The ability to differentiate one's own voice from the voice of somebody else plays a critical role in successful verbal self-monitoring processes and in communication. However, most of the existing studies have only focused on the sensory correlates of self-generated voice processing, whereas the effects of attentional demands and stimulus complexity on self-generated voice processing remain largely unknown. In this study, we investigated the effects of stimulus complexity on the preattentive processing of self and nonself voice stimuli. Event-related potentials (ERPs) were recorded from 17 healthy males who watched a silent movie while ignoring prerecorded self-generated (SGV) and nonself (NSV) voice stimuli, consisting of a vocalization (vocalization category condition: VCC) or of a disyllabic word (word category condition: WCC). All voice stimuli were presented as standard and deviant events in four distinct oddball sequences. The mismatch negativity (MMN) ERP component peaked earlier for NSV than for SGV stimuli. Moreover, when compared with SGV stimuli, the P3a amplitude was increased for NSV stimuli in the VCC only, whereas in the WCC no significant differences were found between the two voice types. These findings suggest differences in the time course of automatic detection of a change in voice identity. In addition, they suggest that stimulus complexity modulates the magnitude of the orienting response to SGV and NSV stimuli, extending previous findings on self-voice processing. This work was supported by Grant Numbers IF/00334/2012, PTDC/PSI-PCL/116626/2010, and PTDC/MHN-PCN/3606/2012, funded by the Fundação para a Ciência e a Tecnologia (FCT, Portugal) and the Fundo Europeu de Desenvolvimento Regional through the European programs Quadro de Referência Estratégico Nacional and Programa Operacional Factores de Competitividade, awarded to A.P.P., and by FCT Doctoral Grant Number SFRH/BD/77681/2011, awarded to T.C.
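For readers unfamiliar with the paradigm, the sketch below generates a passive-oddball trial sequence of the kind described; the 15% deviant probability and sequence length are illustrative assumptions, not the study's parameters:

```python
import random

def oddball_sequence(n_trials=400, p_deviant=0.15, seed=1):
    """Generate a passive-oddball trial sequence in which one voice type
    serves as the frequent standard ('SGV') and the other as the rare
    deviant ('NSV'); proportions here are illustrative only."""
    rng = random.Random(seed)
    return ['NSV' if rng.random() < p_deviant else 'SGV'
            for _ in range(n_trials)]

seq = oddball_sequence()
print(seq[:12], 'deviant rate:', seq.count('NSV') / len(seq))
```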
Summary statistics in auditory perception
Sensory signals are transduced at high resolution, but their structure must be stored in a more compact format. Here we provide evidence that the auditory system summarizes the temporal details of sounds using time-averaged statistics. We measured discrimination of 'sound textures' that were characterized by particular statistical properties, as normally result from the superposition of many acoustic features in auditory scenes. When listeners discriminated examples of different textures, performance improved with excerpt duration. In contrast, when listeners discriminated different examples of the same texture, performance declined with duration, a paradoxical result given that the information available for discrimination grows with duration. These results indicate that once these sounds are of moderate length, the brain's representation is limited to time-averaged statistics, which, for different examples of the same texture, converge to the same values with increasing duration. Such statistical representations produce good categorical discrimination, but limit the ability to discern temporal detail.
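A crude sketch of what 'time-averaged statistics' can mean in practice: the code below computes the mean and variance of the amplitude envelope in a few frequency bands. This is a toy stand-in for the much richer statistic sets used in sound-texture models, and all parameters (band edges, filter order, duration) are illustrative:

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfilt

def texture_stats(x, fs, bands=((100, 400), (400, 1600), (1600, 6400))):
    """Time-averaged texture statistics: mean and variance of the
    amplitude envelope in each frequency band (illustrative only)."""
    stats = []
    for lo, hi in bands:
        sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
        env = np.abs(hilbert(sosfilt(sos, x)))  # band-limited amplitude envelope
        stats.extend([env.mean(), env.var()])
    return np.array(stats)

fs = 16000
noise = np.random.default_rng(0).normal(size=fs * 2)  # 2 s stand-in 'texture'
print(texture_stats(noise, fs))
```

For different excerpts of the same texture, statistics like these converge to the same values as duration grows, which is exactly why same-texture discrimination worsens with longer excerpts.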
The activation of eco-driving mental models: can text messages prime drivers to use their existing knowledge and skills?
Eco-driving campaigns have traditionally assumed that drivers lack the necessary knowledge and skills and that this is something that needs rectifying. Therefore, many support systems have been designed to closely guide drivers and fine-tune their proficiency. However, research suggests that drivers already possess a substantial amount of the necessary knowledge and skills regarding eco-driving. In previous studies, participants used these effectively when they were explicitly asked to drive fuel-efficiently. In contrast, they used their safe driving skills when they were instructed to drive as they would normally. Hence, it is assumed that many drivers choose not to engage purposefully in eco-driving in their everyday lives. The aim of the current study was to investigate the effect of simple, periodic text messages (nine messages in 2 weeks) on drivers’ eco- and safe driving performance. It was hypothesised that provision of eco-driving primes and advice would encourage the activation of drivers' eco-driving mental models, and that comparable safety primes would increase driving safety. For this purpose, a driving simulator experiment was conducted. All participants performed a pre-test drive and were then randomly divided into four groups, which received different interventions. For a period of 2 weeks, one group received text messages with eco-driving primes and another group received safety primes. A third group received advice messages on how to eco-drive. The fourth group were instructed by the experimenter to drive fuel-efficiently, immediately before driving, with no text message intervention. A post-test drive measured behavioural changes in scenarios deemed relevant to eco- and safe driving. The results suggest that the eco-driving prime and advice text messages did not have the desired effect. In comparison, asking drivers to drive fuel-efficiently led to eco-driving behaviours. These outcomes demonstrate the difficulty in changing ingrained habits. Future research is needed to strengthen such messages or activate existing knowledge and skills in other ways, so driver behaviour can be changed in cost-efficient ways.
A neural signature of the unique hues
Since at least the 17th century there has been the idea that there are four simple and perceptually pure “unique” hues: red, yellow, green, and blue, and that all other hues are perceived as mixtures of these four hues. However, sustained scientific investigation has not yet provided solid evidence for a neural representation that separates the unique hues from other colors. We measured event-related potentials elicited by unique hues and the ‘intermediate’ hues in between them. We find a neural signature of the unique hues 230 ms after stimulus onset, at a post-perceptual stage of visual processing. Specifically, the posterior P2 component over the parieto-occipital lobe peaked significantly earlier for the unique than for the intermediate hues (Z = -2.9, p = .004). Now that a neural marker for the unique hues has been identified, fundamental questions about the contributions of neural hardwiring, language and environment to the unique hues can be addressed.
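The reported comparison (Z = -2.9, p = .004) is consistent with a paired nonparametric test on per-participant P2 peak latencies; a hedged sketch with simulated data (sample size, latencies and effect size are all invented):

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(2)
# Hypothetical per-participant P2 peak latencies (ms), not the study's data
unique_hue = rng.normal(228, 8, size=20)
intermediate_hue = unique_hue + rng.normal(5, 6, size=20)  # later on average

stat, p = wilcoxon(unique_hue, intermediate_hue)  # paired signed-rank test
print(stat, p)
```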
Listeners form average-based representations of individual voice identities.
Models of voice perception propose that identities are encoded relative to an abstracted average or prototype. While there is some evidence for norm-based coding when learning to discriminate different voices, little is known about how the representation of an individual's voice identity is formed through variable exposure to that voice. In two experiments, we show evidence that participants form abstracted representations of individual voice identities based on averages, despite having never been exposed to these averages during learning. We created three perceptually distinct voice identities, fully controlling their within-person variability. Listeners first learned to recognise these identities based on ring-shaped distributions located around the perimeter of within-person voice spaces; crucially, these distributions were missing their centres. At test, listeners' accuracy for old/new judgements was higher for stimuli located on an untrained distribution nested around the centre of each ring-shaped distribution compared to stimuli on the trained ring-shaped distribution.
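A minimal sketch of the stimulus geometry, assuming a two-dimensional voice space for simplicity (the study used its own controlled within-person spaces): training items sit on a ring around an identity's centre, and test items lie on an untrained distribution nested inside it.

```python
import numpy as np

def ring_samples(center, radius, n, jitter=0.05, seed=0):
    """Sample points on a ring around an identity's centre in a 2-D
    'voice space' (illustrative geometry only)."""
    rng = np.random.default_rng(seed)
    angles = rng.uniform(0, 2 * np.pi, n)
    radii = radius + rng.normal(0, jitter, n)
    return center + np.column_stack([radii * np.cos(angles),
                                     radii * np.sin(angles)])

center = np.array([0.3, -0.2])
train = ring_samples(center, radius=1.0, n=40)         # trained ring, centre never shown
test = ring_samples(center, radius=0.3, n=20, seed=1)  # untrained inner distribution
print(train.shape, test.shape)
```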
Cerebral processing of voice gender studied using a continuous carryover fMRI design
Normal listeners effortlessly determine a person's gender by voice, but the cerebral mechanisms underlying this ability remain unclear. Here, we demonstrate 2 stages of cerebral processing during voice gender categorization. Using voice morphing along with an adaptation-optimized functional magnetic resonance imaging design, we found that secondary auditory cortex, including the anterior part of the temporal voice areas in the right hemisphere, responded primarily to acoustical distance from the previously heard stimulus. In contrast, a network of bilateral regions involving inferior prefrontal and anterior and posterior cingulate cortex reflected perceived stimulus ambiguity. These findings suggest that voice gender recognition involves neuronal populations along the auditory ventral stream responsible for auditory feature extraction, working in concert with the prefrontal cortex in voice gender perception.
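In a continuous carryover design, the quantity of interest is each stimulus's acoustical distance from the immediately preceding one. A minimal sketch of such a trial-wise regressor, with an invented presentation order on a 0 (one gender) to 1 (the other) morph continuum:

```python
import numpy as np

def carryover_distance(morph_levels):
    """Acoustical distance between each stimulus and the one heard just
    before it, for stimuli on a morph continuum (0..1). Illustrative
    regressor for an adaptation (carryover) fMRI analysis."""
    m = np.asarray(morph_levels, dtype=float)
    dist = np.abs(np.diff(m))
    return np.concatenate([[np.nan], dist])  # first trial has no predecessor

sequence = [0.0, 0.5, 0.25, 1.0, 0.75]  # hypothetical presentation order
print(carryover_distance(sequence))
```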
