139 research outputs found

Auditory smiles trigger unconscious facial imitation

Author: Arias Pablo
Aucouturier Jean-Julien
Belin Pascal
Publication venue: 'Elsevier BV'
Publication date: 23/06/2018

Smiles, produced by the bilateral contraction of the zygomatic major muscles, are one of the most powerful expressions of positive affect and affiliation and also one of the earliest to develop [1]. The perception-action loop responsible for the fast and spontaneous imitation of a smile is considered a core component of social cognition [2]. In humans, social interaction is overwhelmingly vocal, and the visual cues of a smiling face co-occur with audible articulatory changes on the speaking voice [3]. Yet remarkably little is known about how such 'auditory smiles' are processed and reacted to. We have developed a voice transformation technique that selectively simulates the spectral signature of phonation with stretched lips and report here how we have used this technique to study facial reactions to smiled and non-smiled spoken sentences, finding that listeners' zygomatic muscles tracked auditory smile gestures even when they did not consciously detect them.

HAL AMU

Enlighten

The bag-of-frames approach: a not so sufficient model for urban soundscapes

Author: Aucouturier Jean-Julien
Defreville Boris
Lafay Grégoire
Lagrange Mathieu
Publication date: 20/10/2015

The "bag-of-frames" approach (BOF), which encodes audio signals as the long-term statistical distribution of short-term spectral features, is commonly regarded as an effective and sufficient way to represent environmental sound recordings (soundscapes) since its introduction in an influential 2007 article. The present paper describes a conceptual replication of this seminal article using several new soundscape datasets, with results strongly questioning the adequacy of the BOF approach for the task. We show that the good accuracy originally reported with BOF likely results from a particularly thankful dataset with low within-class variability, and that for more realistic datasets, BOF in fact does not perform significantly better than a mere one-point average of the signal's features. Soundscape modeling, therefore, may not be the closed case it was once thought to be. Progress, we argue, could lie in reconsidering the problem of considering individual acoustical events within each soundscape.

arXiv.org e-Print Archive
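
As a rough illustration of the contrast drawn in the bag-of-frames abstract above, the sketch below compares two toy recordings in two ways: by the statistical distribution of their frame-level features, and by a one-point average of those features. It is a simplification under stated assumptions, not the paper's method: each recording is summarized by a single diagonal Gaussian compared with a symmetric Kullback-Leibler divergence, and the names `frame_stats`, `bof_distance` and `mean_distance`, as well as the toy feature values, are hypothetical.

```python
import math

def frame_stats(frames):
    """Per-dimension mean and variance over all frames of one recording."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    variances = [
        # A small floor keeps every variance strictly positive.
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-9)
        for d in range(dims)
    ]
    return means, variances

def bof_distance(frames_a, frames_b):
    """Symmetric KL divergence between two diagonal Gaussians.

    Assumption: one diagonal Gaussian per recording stands in for the
    long-term distribution of short-term features.
    """
    mu_a, var_a = frame_stats(frames_a)
    mu_b, var_b = frame_stats(frames_b)
    total = 0.0
    for ma, va, mb, vb in zip(mu_a, var_a, mu_b, var_b):
        diff2 = (ma - mb) ** 2
        total += 0.5 * ((va + diff2) / vb + (vb + diff2) / va - 2.0)
    return total

def mean_distance(frames_a, frames_b):
    """Euclidean distance between the one-point averages of two recordings."""
    mu_a, _ = frame_stats(frames_a)
    mu_b, _ = frame_stats(frames_b)
    return math.sqrt(sum((ma - mb) ** 2 for ma, mb in zip(mu_a, mu_b)))

# Two toy "recordings" with the same average features but different spread.
steady = [[0.9, 0.1], [1.1, -0.1]]
bursty = [[0.0, 1.0], [2.0, -1.0]]

print("one-point average distance:", mean_distance(steady, bursty))
print("distribution distance:", bof_distance(steady, bursty))
```

In this toy case the one-point average cannot tell the two recordings apart, while the distribution-based distance can. The abstract's finding is that on realistic soundscape datasets this extra information did not translate into significantly better accuracy.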

Cracking the social code of speech prosody using reverse correlation

Author: Aucouturier Jean-Julien
Belin Pascal
Burred Juan José
Ponsot Emmanuel
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 01/01/2018

Human listeners excel at forming high-level social representations about each other, even from the briefest of utterances. In particular, pitch is widely recognized as the auditory dimension that conveys most of the information about a speaker's traits, emotional states, and attitudes. While past research has primarily looked at the influence of mean pitch, almost nothing is known about how intonation patterns, i.e., finely tuned pitch trajectories around the mean, may determine social judgments in speech. Here, we introduce an experimental paradigm that combines state-of-the-art voice transformation algorithms with psychophysical reverse correlation and show that two of the most important dimensions of social judgments, a speaker's perceived dominance and trustworthiness, are driven by robust and distinguishing pitch trajectories in short utterances like the word "Hello," which remained remarkably stable whether male or female listeners judged male or female speakers. These findings reveal a unique communicative adaptation that enables listeners to infer social traits regardless of speakers' physical characteristics, such as sex and mean pitch. By characterizing how any given individual's mental representations may differ from this generic code, the method introduced here opens avenues to explore dysprosody and social-cognitive deficits in disorders like autism spectrum and schizophrenia. In addition, once derived experimentally, these prototypes can be applied to novel utterances, thus providing a principled way to modulate personality impressions in arbitrary speech signals.

Crossref

HAL AMU

Enlighten
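
The psychophysical reverse-correlation idea mentioned in the abstract above can be sketched with a simulated listener. This is a minimal illustration under stated assumptions, not the authors' procedure: the listener is modelled as holding a hidden pitch template, each trial presents two random pitch contours, the listener picks the one that matches the template better, and the estimated kernel is the mean chosen contour minus the mean rejected contour. The function name `simulate_reverse_correlation`, the template values and the trial count are all hypothetical.

```python
import random

def simulate_reverse_correlation(template, n_trials=2000, seed=0):
    """Estimate a listener's hidden template from two-alternative choices.

    Each trial draws two random contours (one Gaussian value per time
    point). The simulated listener picks the contour with the larger dot
    product with the hidden template. The returned kernel is the mean of
    the chosen contours minus the mean of the rejected ones.
    """
    rng = random.Random(seed)
    n = len(template)
    chosen_sum = [0.0] * n
    rejected_sum = [0.0] * n
    for _ in range(n_trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        score_a = sum(x * t for x, t in zip(a, template))
        score_b = sum(x * t for x, t in zip(b, template))
        chosen, rejected = (a, b) if score_a >= score_b else (b, a)
        for i in range(n):
            chosen_sum[i] += chosen[i]
            rejected_sum[i] += rejected[i]
    return [(c - r) / n_trials for c, r in zip(chosen_sum, rejected_sum)]

# Hypothetical hidden template: pitch rising early, falling late.
hidden_template = [0.5, 1.0, 1.0, -0.5, -1.0, -1.0]
kernel = simulate_reverse_correlation(hidden_template)
print([round(k, 2) for k in kernel])
```

With enough trials the recovered kernel follows the shape of the hidden template, even though the experimenter only ever observes which of two random stimuli was chosen.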

One hundred ways to process time, frequency, rate and scale in the central auditory system: a pattern-recognition meta-analysis

Author: Aucouturier Jean-Julien
Hemery Edgar
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2015

The mammalian auditory system extracts features from the acoustic environment based on the responses of spatially distributed sets of neurons in the subcortical and cortical auditory structures. The characteristic responses of these neurons (linearly approximated by their spectro-temporal receptive fields, or STRFs) suggest that auditory representations are formed, as early as in the inferior colliculi, on the basis of a time, frequency, rate (temporal modulations) and scale (spectral modulations) analysis of sound. However, how these four dimensions are integrated and processed in subsequent neural networks remains unclear. In this work, we present a new methodology to generate computational insights into the functional organization of such processes. We first propose a systematic framework to explore more than a hundred different computational strategies proposed in the literature to process the output of a generic STRF model. We then evaluate these strategies on their ability to compute perceptual distances between pairs of environmental sounds. Finally, we conduct a meta-analysis of the dataset of all these algorithms' accuracies to examine whether certain combinations of dimensions and certain ways to treat such dimensions are, on the whole, more computationally effective than others. We present an application of this methodology to a dataset of ten environmental sound categories, in which the analysis reveals that (1) models are most effective when they organize STRF data into frequency groupings, which is consistent with the known tonotopic organization of receptive fields in auditory structures, and that (2) models that treat STRF data as time series are no more effective than models that rely only on summary statistics along time, which corroborates recent experimental evidence on texture discrimination by summary statistics.

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

HAL-MINES ParisTech

Content Management for the Live Music Industry in Virtual Worlds: Challenges and Opportunities

Author: Aucouturier Jean-Julien
Lüthy Marco
Publication venue: The Virtual Worlds Research Consortium
Publication date: 01/01/2013

The real-world music industry is undergoing a transition away from the retailing and distribution of fixed objects (records, files) to the consumption of live, interactive events (concerts, happenings). This development is paralleled with the recent flourishing of live music in virtual worlds, which in many ways could become the epitome of its real-world counterpart: for the artists, virtual concerts are cheap and easy to organize, and can therefore be a viable alternative to performing in the real world; for the music promoter and marketer, virtual concert attendance can be traced and analyzed more easily than in the real world; for the virtual concertgoer, attending concerts that are happening a (virtual) world away is possible with a single click. Taking insights from both a survey among the Second-Life music practitioners and from our own prototype of a live music recommendation system built on top of Second-Life, this article shows that the technical infrastructure of current virtual worlds is not well-suited to the development of the content management tools needed to support this opportunity. We propose several new ways to address these problems, and advocate for their recognition both by the artistic and the technical community.

Journal For Virtual Worlds Research (Texas Digital Library - TDL E-Journals)

Uncovering mental representations of smiled speech using reverse correlation

Author: Arias Pablo
Aucouturier Jean-Julien
Ponsot Emmanuel
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2018

L'interaction sociale en musique [Social interaction in music]

Author: Aucouturier Jean-Julien
Publication venue: APEMU
Publication date: 01/01/2017

Paris. He gave this interview to APÉMu but works with Clément Cannone in the same laboratory. Their research team's website is: http://cream.ircam.fr. Social interaction in music. Jean-Julien AUCOUTURIER. Interview by Mathilde AUGER. You are a researcher at IRCAM. Can you explain what this institution is and what your role in it is? IRCAM is a center that has had a strong computing culture since it was founded 40 years ago by Pierre Boulez. He created it so that composers of contemporary music could work with the new tools of computing and computer sound processing. The team I work with is interested in the human being, in cognition, in the brain, in what could be called the experimental psychology of emotion. Many people around the world are interested in the "psychology of music". Our strength, working at IRCAM, is that we have tools that let us analyze sound computationally and modify it, in other words to study what, in sound, voice or music, creates emotions. One of your research projects is particularly interesting for music education teachers: it concerns improvised duos and the observation of social interactions. Can you tell us more? The setup consisted of placing an instrumentalist in each of two separate studios, wearing headphones that let them hear what the other is playing; one of the two is instructed to improvise with a particular attitude (aggressiveness, empathy, etc.) and the other must improvise in interaction. At the end of the experiment, the second must guess what instruction had been given to the first. This experiment is not only about observing emotions; above all it shows how the understanding of social information is constructed in music; we study this non-verbal language and how it exists in sound.
Our observation tools are computational: we develop software that lets us run experiments and create situations that are very interesting from a scientific point of view but are not usual in everyday human life! This lets us test ideas and then formulate hypotheses: if I modify elements of the sound, how does the listener perceive these changes? Unlike traditional musicology, we are not much interested in analyzing the score. On the other hand, the way things are said through changes in timbre, phrasing and inflection is far more telling. It is interesting to note that our music education classes go somewhat in the same direction: the relationship to the score has somewhat disappeared, and we focus instead on the world of sound in its physical dimensions. My daughter is in cinquième (roughly seventh grade), so I do indeed observe that! In any case, my team is interested in musical cognition and seeks to observe how the brain understands music. It is a fairly recent research topic, dating from the 1950s, initiated by researchers who had a real background in historical musicology; their work was mainly centered on the content of the score, on the conviction that music expresses something through the way it is constructed: a given harmonic progression leads to a given emotion, with the idea of expectations that are fulfilled or not; this reflects the way Western music manipulates the listener. At that time, the only means available to the researcher was to ask the performer to play and then ask them to change one element, a C sharp here rather than a C; what makes our richness today is that we can modify the sound directly: the note, but also its color, its duration, including with changes that would not be possible to make with an instrument

Vers un appareillage auditif personnalisé qui préserve les émotions [Towards personalized hearing aids that preserve emotions]

Author: Aucouturier Jean-Julien
Publication venue: HAL CCSD
Publication date: 01/09/2019

The scientific findings of psychoacoustics and hearing science mainly describe the low-level mechanisms that allow a listener to extract the fundamental dimensions of a sound, such as its pitch or its loudness, as well as the features that make speech intelligible. Preserving these sound features has traditionally, and rightly, been the primary target of hearing aid fitting. However, our everyday use of sound and speech is not limited to detecting and understanding it; it also encompasses a whole range of high-level mechanisms that allow us, for example, to perceive a kind or mocking intonation in the person we are talking to, to recognize the distinctive voice timbre of a loved one, or to appreciate the joyful or sad emotion of a musical melody. While these high-level perceptual mechanisms are now relatively well studied in the visual domain (Jack & Schyns, 2015), we still lack perspective on which sound features need to be preserved to facilitate this kind of processing in hearing-aid users.

Notre voix peut-elle prendre le pouvoir sur nos émotions ? [Can our voice take control of our emotions?]

Author: Aucouturier Jean-Julien
Publication venue: FNAREN
Publication date: 01/01/2017

Sound context modulates perceived vocal emotion

Author: Aucouturier Jean-Julien
Bryant Gregory
Liuni Marco
Ponsot Emmanuel
Publication venue: 'Elsevier BV'
Publication date: 01/03/2020

Many animal vocalizations contain nonlinear acoustic phenomena as a consequence of physiological arousal. In humans, nonlinear features are processed early in the auditory system, and are used to efficiently detect alarm calls and other urgent signals. Yet, high-level emotional and semantic contextual factors likely guide the perception and evaluation of roughness features in vocal sounds. Here we examined the relationship between perceived vocal arousal and auditory context. We presented listeners with nonverbal vocalizations (yells of a single vowel) at varying levels of portrayed vocal arousal, in two musical contexts (clean guitar, distorted guitar) and one non-musical context (modulated noise). As predicted, vocalizations with higher levels of portrayed vocal arousal were judged as more negative and more emotionally aroused than the same voices produced with low vocal arousal. Moreover, both the perceived valence and emotional arousal of vocalizations were significantly affected by both musical and non-musical contexts. These results show the importance of auditory context in judging emotional arousal and valence in voices and music, and suggest that nonlinear features in music are processed similarly to communicative vocal signals.

eScholarship - University of California