Revisiting the two-mass model of the vocal folds
Realistic mathematical modeling of voice production has recently been boosted by applications in fields such as bioprosthetics, high-quality speech synthesis and pathological diagnosis. In this work, we revisit a two-mass model of the vocal folds that includes accurate fluid mechanics for the air passage through the folds and nonlinear properties of the tissue. We present the bifurcation diagram for this system, focusing on the dynamical properties of two regimes of interest: the onset of oscillations and the normal phonation regime. We also provide theoretical support for the nonlinear nature of the elastic properties of the fold tissue by comparing theoretical isofrequency curves with reported experimental data.
Comment: 7 pages, 5 figures
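The two-mass family of models treats each vocal fold as two coupled damped masses driven by a Bernoulli-like pressure on the lower mass. The following is a minimal illustrative sketch in that spirit only: the parameter values, the pressure law and the units are rough assumptions chosen for demonstration, not the equations or values used in the paper.

```python
# Illustrative two-mass vocal-fold sketch: two coupled damped masses, with a
# Bernoulli-like pressure acting on the lower mass while the glottis is open.
# All parameter values are rough assumptions, not those of the paper.

M1, M2 = 0.125, 0.025            # lower/upper masses
K1, K2, KC = 0.08, 0.008, 0.025  # spring and coupling stiffnesses
R1, R2 = 0.02, 0.005             # damping coefficients
A01, A02 = 0.05, 0.05            # rest glottal areas
L = 1.4                          # glottal length

def simulate(ps, t_end=50.0, dt=1e-3, x1=0.01, x2=0.01):
    """Semi-implicit Euler integration; returns final displacements."""
    v1 = v2 = 0.0
    for _ in range(int(t_end / dt)):
        a1, a2 = A01 + L * x1, A02 + L * x2   # glottal areas at each mass
        # pressure on the lower mass: full subglottal pressure ps if the
        # channel above is closed, reduced by a Bernoulli term otherwise
        p1 = ps * (1.0 - (max(min(a1, a2), 0.0) / a1) ** 2) if a1 > 0 else ps
        f1 = p1 * L - K1 * x1 - R1 * v1 - KC * (x1 - x2)
        f2 = -K2 * x2 - R2 * v2 - KC * (x2 - x1)
        v1 += dt * f1 / M1
        v2 += dt * f2 / M2
        x1 += dt * v1
        x2 += dt * v2
    return x1, x2

print(simulate(0.0, t_end=100.0))  # no driving pressure: the folds relax to rest
```

Raising `ps` from zero is how such sketches probe the oscillation onset the abstract refers to: below a threshold pressure the damped system decays to rest, above it the aerodynamic energy input can sustain oscillation.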
Vocal caricatures reveal signatures of speaker identity
What are the features that impersonators select to elicit a speaker’s identity? We built a voice database of
public figures (targets) and imitations produced by professional impersonators. They produced one
imitation based on their memory of the target (caricature) and another one after listening to the target audio
(replica). A set of naive participants then judged identity and similarity of pairs of voices. Identity was better evoked by the caricatures, while replicas were perceived as closer to the targets in terms of voice similarity. We used these data to map the relevant acoustic dimensions for each task. Our results indicate that speaker identity is mainly associated with vocal tract features, while perception of voice similarity is related to vocal fold parameters. We therefore show how acoustic caricatures emphasize identity features at the cost of losing similarity, which allows drawing an analogy with caricatures in the visual space.
Fil: López, Sabrina. Dynamical Systems Lab, IFIBA-Physics Dept., University of Buenos Aires, Pabellón 1, Ciudad Universitaria, CABA 1428EGA, Argentina
Fil: Riera, Pablo. Acoustics and Sound Perception Lab, Universidad Nacional de Quilmes, Roque Sáenz Peña 352, Bernal, Buenos Aires B1876BXD, Argentina
Fil: Assaneo, María Florencia. Dynamical Systems Lab, IFIBA-Physics Dept., University of Buenos Aires, Pabellón 1, Ciudad Universitaria, CABA 1428EGA, Argentina
Fil: Eguía, Manuel. Acoustics and Sound Perception Lab, Universidad Nacional de Quilmes, Roque Sáenz Peña 352, Bernal, Buenos Aires B1876BXD, Argentina
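The abstract separates vocal-fold parameters (e.g., pitch) from vocal-tract features (resonances, or formants). A minimal source-filter sketch of that distinction, with an impulse train standing in for the glottal source and a single two-pole resonance standing in for one formant; this is a toy illustration, not the study's analysis pipeline:

```python
import math

FS = 16000  # sample rate (Hz)

def pulse_train(f0, dur=0.3):
    """Glottal-like impulse train at pitch f0 (a vocal-fold parameter)."""
    period = round(FS / f0)
    return [1.0 if i % period == 0 else 0.0 for i in range(int(dur * FS))]

def resonator(x, fc, bw=100.0):
    """Two-pole resonance at fc: a crude stand-in for one vocal-tract formant."""
    r = math.exp(-math.pi * bw / FS)
    c = 2.0 * r * math.cos(2.0 * math.pi * fc / FS)
    y = [0.0, 0.0]
    for s in x:
        y.append(s + c * y[-1] - r * r * y[-2])
    return y[2:]

def estimate_f0(x, fmin=60.0, fmax=400.0):
    """Autocorrelation pitch estimate: recovers the vocal-fold parameter."""
    lo, hi = int(FS / fmax), int(FS / fmin)
    best_lag = max(range(lo, hi + 1),
                   key=lambda lag: sum(x[i] * x[i + lag]
                                       for i in range(len(x) - lag)))
    return FS / best_lag

voiced = resonator(pulse_train(120.0), fc=700.0)  # pitch 120 Hz, formant 700 Hz
print(round(estimate_f0(voiced)))  # recovers the pitch (120)
```

Changing `f0` alters the fold-side dimension the abstract links to perceived similarity, while changing `fc` alters the tract-side dimension it links to identity.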
The Anatomy of Onomatopoeia
Virtually every human faculty engages with imitation. One of the most natural and unexplored objects for the study of the mimetic elements in language is the onomatopoeia, as it implies an imitation-driven transformation of a sound of nature into a word. Notably, simple sounds are transformed into complex strings of vowels and consonants, making it difficult to identify what is acoustically preserved in this operation. In this work we propose a definition of vocal imitation by which sounds are transformed into the speech elements that minimize their spectral difference within the constraints of the vocal system. To test this definition, we use a computational model that allows us to recover anatomical features of the vocal system from experimental sound data. We explore the vocal configurations that best reproduce non-speech sounds, like striking blows on a door or the sharp sounds generated by pressing light switches or computer mouse buttons. From the anatomical point of view, the configurations obtained are readily associated with co-articulated consonants, and we show perceptual evidence that these consonants are positively associated with the original sounds. Moreover, the vowel-consonant pairs that compose these co-articulations correspond to the most stable syllables found in the knock and click onomatopoeias across languages, suggesting a mechanism by which vocal imitation naturally embeds single sounds into more complex speech structures. Other mimetic forces, such as cross-modal associations between speech and visual categories, have received extensive attention from the scientific community. The present approach helps build a global view of the mimetic forces acting on language and opens a new avenue for the quantitative study of word formation in terms of vocal imitation.
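The proposed definition, transforming a sound into the speech elements that minimize spectral difference under vocal constraints, can be illustrated with a toy search over a small vowel inventory. The formant values and Gaussian envelopes below are invented for illustration; the actual work searches configurations of a computational vocal-system model.

```python
import math

# Rough (F1, F2) formant pairs for three vowels; toy values for illustration.
VOWELS = {"a": (700, 1200), "i": (300, 2300), "u": (350, 800)}
FREQS = range(0, 3000, 25)

def envelope(formants, bw=150.0):
    """Toy spectral envelope: a Gaussian bump at each formant frequency."""
    return [sum(math.exp(-((f - fo) / bw) ** 2) for fo in formants)
            for f in FREQS]

def best_vowel(target_formants):
    """Pick the vowel whose envelope minimizes the spectral difference."""
    target = envelope(target_formants)
    return min(VOWELS, key=lambda v: sum(
        (t - c) ** 2 for t, c in zip(target, envelope(VOWELS[v]))))

# a dull sound with low resonances is best imitated by "u"
print(best_vowel((350, 820)))  # -> u
```

The paper's approach generalizes this idea from a fixed vowel table to the full space of articulatory configurations, which is why co-articulated consonants can emerge as the optimum.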
Dynamics and coexistence in a system with intraguild mutualism
It is a tenet of ecological theory that two competing consumers cannot stably coexist on a single limiting resource in a homogeneous environment. Many mechanisms and processes have since been evoked and studied, empirically and theoretically, to explain species coexistence and the observed biological diversity. Facilitative interactions clearly have the potential to enhance coexistence. Yet, even though mutual facilitation between species of the same guild is widely documented empirically, the subject has received very little theoretical attention. Here, we study one form of intraguild mutualism in the simplest possible community module of one resource and two consumers. We incorporate mutualism as enhanced consumption in the presence of the other consumer. We find that intraguild mutualism can (a) significantly enhance coexistence of consumers, (b) induce cyclic dynamics, and (c) give rise to bistability (a 'joint' Allee effect) and potentially catastrophic collapse of both consumer species.
Fil: Assaneo, María Florencia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Física. Laboratorio de Sistemas Dinámicos; Argentina
Fil: Coutinho, Renato Mendes. Universidade Estadual Paulista Julio de Mesquita Filho; Brasil
Fil: Lin, Yangchen. University of Cambridge; Reino Unido
Fil: Mantilla, Carlos. Universidad de Carabobo; Venezuela
Fil: Lutscher, Frithjof. University of Ottawa; Canadá
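Point (a), mutualism enhancing coexistence, can be sketched with an invasion argument in a generic one-resource, two-consumer model (logistic resource, linear functional response, consumption boosted by the other consumer's abundance). The equations and parameter values below are a minimal illustrative sketch, not the model analyzed in the paper.

```python
def resident_equilibrium(a1=1.0, e=1.0, d=0.3, dt=0.01, t_end=200.0):
    """Euler-integrate resource R and resident consumer C1 to equilibrium
    (logistic resource, linear functional response)."""
    R, C1 = 0.5, 0.5
    for _ in range(int(t_end / dt)):
        dR = R * (1.0 - R) - a1 * R * C1
        dC = C1 * (e * a1 * R - d)
        R, C1 = R + dt * dR, C1 + dt * dC
    return R, C1

def invasion_rate(m, a2=0.9, e=1.0, d=0.3):
    """Per-capita growth of a rare second consumer C2 whose consumption is
    boosted by the resident's abundance (intraguild mutualism strength m)."""
    R, C1 = resident_equilibrium()
    return e * a2 * R * (1.0 + m * C1) - d

print(invasion_rate(0.0))  # negative: without mutualism, C2 is excluded
print(invasion_rate(0.5))  # positive: mutualism lets C2 invade
```

With these toy parameters the inferior competitor C2 cannot invade on its own (its break-even resource level is higher than the resident leaves standing), but a modest mutualistic boost from the resident flips its invasion growth rate positive, opening the door to coexistence.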
The audiovisual structure of onomatopoeias: An intrusion of real-world physics in lexical creation
Sound-symbolic word classes are found in different cultures and languages worldwide. These words are continuously produced to code complex information about events. Here we explore the capacity of creative language to transport complex multisensory information in a controlled experiment, where our participants improvised onomatopoeias from noisy moving objects presented in audio, visual and audiovisual formats. We found that consonants communicate movement types (slide, hit or ring) mainly through the manner of articulation in the vocal tract. Vowels communicate shapes in visual stimuli (spiky or rounded) and sound frequencies in auditory stimuli through the configuration of the lips and tongue. A machine learning model was trained to classify movement types and used to validate generalizations of our results across formats. We ran the classifier on a list of cross-linguistic onomatopoeias: simple actions were correctly classified, while different aspects were selected to build onomatopoeias of complex actions. These results show how the different aspects of complex sensory information are coded and how they interact in the creation of novel onomatopoeias.
Fil: Taitz, Alan. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; Argentina
Fil: Assaneo, María Florencia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; Argentina
Fil: Elisei, Natalia Gabriela. Universidad de Buenos Aires. Facultad de Medicina; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Fil: Tripodi, Monica Noemi. Universidad de Buenos Aires; Argentina
Fil: Cohen, Laurent. Centre National de la Recherche Scientifique; Francia. Universite Pierre et Marie Curie; Francia. Institut National de la Santé et de la Recherche Médicale; Francia
Fil: Sitt, Jacobo Diego. Centre National de la Recherche Scientifique; Francia. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Institut National de la Santé et de la Recherche Médicale; Francia. Universite Pierre et Marie Curie; Francia
Fil: Trevisan, Marcos Alberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; Argentina
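The idea of classifying movement types from the manner of articulation of a word's consonants can be illustrated with a tiny nearest-centroid model. The consonant inventory, manner classes and training words below are invented toy data, not the study's stimuli, features or classifier.

```python
# Toy manner-of-articulation inventory; letters and training words are
# invented for illustration, not the study's data.
MANNER = {"plosive": set("pbtdkg"),
          "fricative": set("fvsz"),
          "sonorant": set("mnlr")}

TRAIN = [("bam", "hit"), ("tock", "hit"),
         ("fsss", "slide"), ("shhh", "slide"),
         ("ding", "ring"), ("ring", "ring")]

def features(word):
    """Count the consonants of each manner class in the word."""
    return [sum(ch in cls for ch in word) for cls in MANNER.values()]

CENTROIDS = {
    label: [sum(col) / len(col)
            for col in zip(*[features(w) for w, l in TRAIN if l == label])]
    for label in {l for _, l in TRAIN}}

def classify(word):
    """Nearest-centroid guess of the movement type an onomatopoeia encodes."""
    f = features(word)
    return min(CENTROIDS, key=lambda lb: sum(
        (a - b) ** 2 for a, b in zip(f, CENTROIDS[lb])))

print(classify("thud"))  # plosive-heavy -> hit
```

A plosive-heavy word lands near the "hit" centroid, a fricative-heavy one near "slide", and a sonorant-heavy one near "ring", mirroring the manner-to-movement mapping the abstract reports.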
Vocal caricatures reveal signatures of speaker identity
What are the features that impersonators select to elicit a speaker’s identity? We built a voice database of public figures (targets) and imitations produced by professional impersonators. They produced one imitation based on their memory of the target (caricature) and another one after listening to the target audio (replica). A set of naive participants then judged identity and similarity of pairs of voices. Identity was better evoked by the caricatures, while replicas were perceived as closer to the targets in terms of voice similarity. We used these data to map the relevant acoustic dimensions for each task. Our results indicate that speaker identity is mainly associated with vocal tract features, while perception of voice similarity is related to vocal fold parameters. We therefore show how acoustic caricatures emphasize identity features at the cost of losing similarity, which allows drawing an analogy with caricatures in the visual space.
Fil: Lopez, Sabrina Laura. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Física. Laboratorio de Sistemas Dinámicos; Argentina
Fil: Riera, Pablo Ernesto. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología. Laboratorio de Acústica y Percepción Sonora; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Fil: Assaneo, María Florencia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Física. Laboratorio de Sistemas Dinámicos; Argentina
Fil: Eguía, Manuel Camilo. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología. Laboratorio de Acústica y Percepción Sonora; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Fil: Sigman, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Física. Laboratorio de Neurociencia Integrativa; Argentina. Universidad Torcuato Di Tella; Argentina
Fil: Trevisan, Marcos Alberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Física. Laboratorio de Sistemas Dinámicos; Argentina
Modeling of the human vocal system and its application to studies of speech perception and production
From a biological point of view, the ability to speak can be split into two intermodulated processes: production and perception. In this work we investigated both from a physical perspective, focusing on the first one. The physical process of voice production relies on the vocal anatomy, composed of two main blocks: the vocal folds and the vocal tract. The folds constitute the acoustic source, specifying the intonation of speech, while the phonetic content (the sounds proper to the language) is determined by the vocal tract dynamics. In this thesis we developed a complete model of voice production: we studied the different dynamical regimes of a detailed mathematical model of the folds and adapted it to a low-dimensional model of the tract. This model makes it possible to synthesize voice by controlling physical parameters of the vocal system. To evaluate the quality of the synthetic voices, we carried out a combination of perceptual and fMRI tests, showing that the synthetic voices are indistinguishable from real ones. Such an articulatory synthesizer, based on the physical processes involved in voice production, allows the study of the perceptual effects of precise variations in anatomical parameters. We used it to show that voice identity is encoded in the relative dimensions of the tract and the folds. Using this experimentally validated model, we addressed two specific questions from biolinguistics and biomimetics. First, we investigated the role of imitation in the generation of onomatopoeias. Although it is widely known that onomatopoeias are based on imitation, it remains unclear which acoustic features are shared between sounds and their onomatopoeias. Using our vocal model, we show that co-articulated consonants are the sounds that best fit the original noises. These vowel-consonant pairs are also the most stable syllables within onomatopoeias across languages, suggesting a mechanism through which vocal imitation associates simple sounds with more complex speech structures. Second, we inquired about the dimension of the vocal motor space controlling speech production. To study this problem we designed an experimental device that allows monitoring three points of the upper vocal tract while speaking. Using this novel tool, we reached a discrete description of the motor coordinates of Spanish vowels and occlusive consonants. These results show the feasibility of controlling the vocal model with direct anatomical measures, synthesizing speech in real time from simple motor gestures produced during vocalization.
Fil: Assaneo, María Florencia. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina
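A discrete description of motor coordinates, as in the thesis, amounts to snapping continuous articulatory measurements onto a small set of vowel targets. The following sketch quantizes a hypothetical two-coordinate gesture (jaw opening, tongue frontness) to its nearest target; the coordinate values are invented for illustration, not the measured Spanish-vowel coordinates.

```python
# Hypothetical discrete motor targets (jaw opening, tongue frontness) in
# normalized units; values are invented for illustration, not measured data.
TARGETS = {"a": (1.0, 0.5), "e": (0.6, 0.8), "i": (0.3, 1.0),
           "o": (0.6, 0.2), "u": (0.3, 0.0)}

def quantize(jaw, tongue):
    """Snap a continuous articulatory measurement to the nearest vowel target."""
    return min(TARGETS, key=lambda v: (TARGETS[v][0] - jaw) ** 2
                                      + (TARGETS[v][1] - tongue) ** 2)

print(quantize(0.95, 0.45))  # a noisy gesture near the "a" target -> a
```

In a real-time pipeline like the one the thesis envisions, each tracked gesture would be quantized this way and the resulting discrete coordinate would drive the articulatory synthesizer.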