7,655 research outputs found

Norm-based coding of voice identity in human auditory cortex

Author: Andics
Baumann
Belin
Bestelmeyer
Bruckert
Campanella
Charest
Fitch
Formisano
Garrido
Grill-Spector
Hillenbrand
Johnston
Kahn
Kikuchi
Kreiman
Latinus
Lavner
Leopold
Lewis
Loffler
Logothetis
Marianne Latinus
Nichols
Papcun
Pascal Belin
Patel
Patricia E.G. Bestelmeyer
Phil McAleer
Rauschecker
Rousselet
Schweinberger
Smith
Titze
Tsao
Valentine
Van Lancker
von Kriegstein
Wilcox
Publication venue: 'Elsevier BV'
Publication date: 23/05/2013

Listeners exploit small interindividual variations around a generic acoustical structure to discriminate and identify individuals from their voice—a key requirement for social interactions. The human brain contains temporal voice areas (TVA) [1] involved in an acoustic-based representation of voice identity [2, 3, 4, 5 and 6], but the underlying coding mechanisms remain unknown. Indirect evidence suggests that identity representation in these areas could rely on a norm-based coding mechanism [4, 7, 8, 9, 10 and 11]. Here, we show by using fMRI that voice identity is coded in the TVA as a function of acoustical distance to two internal voice prototypes (one male, one female)—approximated here by averaging a large number of same-gender voices by using morphing [12]. Voices more distant from their prototype are perceived as more distinctive and elicit greater neuronal activity in voice-sensitive cortex than closer voices—a phenomenon not merely explained by neuronal adaptation [13 and 14]. Moreover, explicit manipulations of distance-to-mean by morphing voices toward (or away from) their prototype elicit reduced (or enhanced) neuronal activity. These results indicate that voice-sensitive cortex integrates relevant acoustical features into a complex representation referenced to idealized male and female voice prototypes. More generally, they shed light on remarkable similarities in cerebral representations of facial and vocal identity

Elsevier - Publisher Connector

Phonetic content influences voice discriminability

Author: Andics A.
McQueen J.
Van Turennout M.
Publication date: 01/01/2007

We present results from an experiment which shows that voice perception is influenced by the phonetic content of speech. Dutch listeners were presented with thirteen speakers pronouncing CVC words with systematically varying segmental content, and they had to discriminate the speakers’ voices. Results show that certain segments help listeners discriminate voices more than other segments do. Voice information can be extracted from every segmental position of a monosyllabic word and is processed rapidly. We also show that although relative discriminability within a closed set of voices appears to be a stable property of a voice, it is also influenced by segmental cues – that is, perceived uniqueness of a voice depends on what that voice says

MPG.PuRe

How do you say ‘hello’? Personality impressions from brief novel voices

Author: A Todorov
AC Little
Alexander Todorov
AW Young
B Yao
BC Jones
C Ferdenzi
C Nass
CA Klofstad
CA Sutherland
CC Tigue
CD Aronovitch
Charles R. Larson
CL Apicella
CP Said
CT Ferrand
CY Olivola
D Rendall
DA Kenny
DA Leopold
DA Puts
DC Funder
DR Feinberg
DS Berry
E Vannoni
FT Passini
G Rhodes
GW Allport
IR Titze
IS Penton-Voak
J Kreiman
J Willis
JH Langlois
JJ Horton
JJ Ohala
JM Montepare
JS Wiggins
JW Lewis
K Grammer
K Miyake
KR Scherer
L Bruckert
L Germine
LA Zebrowitz
LA Zebrowitz-McArthur
LZ McArthur
M Latinus
M Shevlin
M Zuckerman
NJ Lass
NN Oosterhof
O Baumann
P Belin
P Boersma
P Borkenau
Pascal Belin
PE Bestelmeyer
PEG Bestelmeyer
Phil McAleer
R Hassin
RA Page
RM Krauss
RR Mccrae
RS Kramer
S Evans
S Patel
S Rosenberg
SA Collins
SC Verosky
SJ Ko
SM Hughes
ST Fiske
TK Perrachione
V Bruce
V Pivonkova
VX Luevano
WA van Dommelen
WT Fitch
WT Norman
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

On hearing a novel voice, listeners readily form personality impressions of that speaker. Accurate or not, these impressions are known to affect subsequent interactions; yet the underlying psychological and acoustical bases remain poorly understood. Furthermore, hitherto studies have focussed on extended speech as opposed to analysing the instantaneous impressions we obtain from first experience. In this paper, through a mass online rating experiment, 320 participants rated 64 sub-second vocal utterances of the word ‘hello’ on one of 10 personality traits. We show that: (1) personality judgements of brief utterances from unfamiliar speakers are consistent across listeners; (2) a two-dimensional ‘social voice space’ with axes mapping Valence (Trust, Likeability) and Dominance, each driven by differing combinations of vocal acoustics, adequately summarises ratings in both male and female voices; and (3) a positive combination of Valence and Dominance results in increased perceived male vocal Attractiveness, whereas perceived female vocal Attractiveness is largely controlled by increasing Valence. Results are discussed in relation to the rapid evaluation of personality and, in turn, the intent of others, as being driven by survival mechanisms via approach or avoidance behaviours. These findings provide empirical bases for predicting personality impressions from acoustical analyses of short utterances and for generating desired personality impressions in artificial voices

Crossref

HAL AMU

Directory of Open Access Journals

PubMed Central

Enlighten

FigShare

Cracking the social code of speech prosody using reverse correlation

Author: Aucouturier Jean-Julien
Belin Pascal
Burred Juan José
Ponsot Emmanuel
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 01/01/2018

Human listeners excel at forming high-level social representations about each other, even from the briefest of utterances. In particular, pitch is widely recognized as the auditory dimension that conveys most of the information about a speaker's traits, emotional states, and attitudes. While past research has primarily looked at the influence of mean pitch, almost nothing is known about how intonation patterns, i.e., finely tuned pitch trajectories around the mean, may determine social judgments in speech. Here, we introduce an experimental paradigm that combines state-of-the-art voice transformation algorithms with psychophysical reverse correlation and show that two of the most important dimensions of social judgments, a speaker's perceived dominance and trustworthiness, are driven by robust and distinguishing pitch trajectories in short utterances like the word "Hello," which remained remarkably stable whether male or female listeners judged male or female speakers. These findings reveal a unique communicative adaptation that enables listeners to infer social traits regardless of speakers' physical characteristics, such as sex and mean pitch. By characterizing how any given individual's mental representations may differ from this generic code, the method introduced here opens avenues to explore dysprosody and social-cognitive deficits in disorders like autism spectrum and schizophrenia. In addition, once derived experimentally, these prototypes can be applied to novel utterances, thus providing a principled way to modulate personality impressions in arbitrary speech signals

Crossref

HAL AMU

Enlighten

Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady State Vowel Categorization

Author: Ames Heather
Grossberg Stephen
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 24/11/2007

Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. The transformation from speaker-dependent to speaker-independent language representations enables speech to be learned and understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models. National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)

Boston University Institutional Repository (OpenBU)

How Do I Address You? Modelling addressing behavior based on an analysis of a multi-modal corpora of conversational discourse

Author: Akker Rieks op den
Theune Mariët
Publication venue: Society for the Study of Artificial Intelligence and the Simulation of Behaviour (AISB)
Publication date: 01/01/2008

Addressing is a special kind of referring and thus principles of multi-modal referring expression generation will also be basic for generation of address terms and addressing gestures for conversational agents. Addressing is a special kind of referring because of the different (second person instead of object) role that the referent has in the interaction. Based on an analysis of addressing behaviour in multi-party face-to-face conversations (meetings, TV discussions as well as theater plays), we present outlines of a model for generating multi-modal verbal and non-verbal addressing behaviour for agents in multi-party interactions

CiteSeerX

University of Twente Research Information

The sound of trustworthiness: acoustic-based modulation of perceived voice personality

Author: Belin Pascal
Boehme Bibi
McAleer Philip
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 12/10/2017

When we hear a new voice we automatically form a "first impression" of the voice owner’s personality; a single word is sufficient to yield ratings highly consistent across listeners. Past studies have shown correlations between personality ratings and acoustical parameters of voice, suggesting a potential acoustical basis for voice personality impressions, but its nature and extent remain unclear. Here we used data-driven voice computational modelling to investigate the link between acoustics and perceived trustworthiness in the single word "hello". Two prototypical voice stimuli were generated based on the acoustical features of voices rated low or high in perceived trustworthiness, respectively, as well as a continuum of stimuli inter- and extrapolated between these two prototypes. Five hundred listeners provided trustworthiness ratings on the stimuli via an online interface. We observed an extremely tight relationship between trustworthiness ratings and position along the trustworthiness continuum (r = 0.99). Not only were trustworthiness ratings higher for the high- than the low-prototypes, but the difference could be modulated quasi-linearly by reducing or exaggerating the acoustical difference between the prototypes, resulting in a strong caricaturing effect. The f0 trajectory, or intonation, appeared a parameter of particular relevance: hellos rated high in trustworthiness were characterized by a high starting f0 then a marked decrease at mid-utterance to finish on a strong rise. These results demonstrate a strong acoustical basis for voice personality impressions, opening the door to multiple potential applications

HAL AMU

Directory of Open Access Journals

Enlighten

The evolution of auditory contrast

Author: Boersma Paul
Hamann Silke
Publication date: 01/10/2009

This paper reconciles the standpoint that language users do not aim at improving their sound systems with the observation that languages seem to improve their sound systems. Computer simulations of inventories of sibilants show that Optimality-Theoretic learners who optimize their perception grammars automatically introduce a so-called prototype effect, i.e. the phenomenon that the learner’s preferred auditory realization of a certain phonological category is more peripheral than the average auditory realization of this category in her language environment. In production, however, this prototype effect is counteracted by an articulatory effect that limits the auditory form to something that is not too difficult to pronounce. If the prototype effect and the articulatory effect are of a different size, the learner must end up with an auditorily different sound system from that of her language environment. The computer simulations show that, independently of the initial auditory sound system, a stable equilibrium is reached within a small number of generations. In this stable state, the dispersion of the sibilants of the language strikes an optimal balance between articulatory ease and auditory contrast. The important point is that this is derived within a model without any goal-oriented elements such as dispersion constraints

Hochschulschriftenserver - Universität Frankfurt am Main

How many voices did you hear? Natural variability disrupts identity perception from unfamiliar voices

Author: Burston L. F. K.
Garrido L.
Lavan N.
Publication venue: 'Wiley'
Publication date: 01/08/2019

Our voices sound different depending on the context (laughing vs. talking to a child vs. giving a speech), making within‐person variability an inherent feature of human voices. When perceiving speaker identities, listeners therefore need to not only ‘tell people apart’ (perceiving exemplars from two different speakers as separate identities) but also ‘tell people together’ (perceiving different exemplars from the same speaker as a single identity). In the current study, we investigated how such natural within‐person variability affects voice identity perception. Using voices from a popular TV show, listeners, who were either familiar or unfamiliar with this show, sorted naturally varying voice clips from two speakers into clusters to represent perceived identities. Across three independent participant samples, unfamiliar listeners perceived more identities than familiar listeners and frequently mistook exemplars from the same speaker to be different identities. These findings point towards a selective failure in ‘telling people together’. Our study highlights within‐person variability as a key feature of voices that has striking effects on (unfamiliar) voice identity perception. Our findings not only open up a new line of enquiry in the field of voice perception but also call for a re‐evaluation of theoretical models to account for natural variability during identity perception

City Research Online

Crossref

Queen Mary Research Online