
    The effects of larynx height on vowel production are mitigated by the active control of articulators

    The influence of larynx position on vowel articulation is an important topic for understanding speech production, the present-day distribution of linguistic diversity, and the evolution of speech and language in our lineage. We introduce a realistic computer model of the vocal tract, constructed from actual human MRI data, which uses machine learning techniques to learn to control the articulators so as to produce speech sounds that match a given set of target vowels as closely as possible. We systematically vary the vertical position of the larynx and quantify the differences between the target and produced vowels for each position across multiple replications. We find that larynx height does affect the accuracy with which the target vowels are reproduced and the distinctness of the produced vowel system, and that there is a "sweet spot" of larynx positions that is optimal for vowel production, but that even extreme larynx positions do not result in a collapsed or heavily distorted vowel space that would make speech unintelligible. Together with other lines of evidence, our results support the view that the vowel space of human languages is influenced by our larynx position, but that other positions of the larynx may also be fully compatible with speech.
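
    A minimal sketch of the kind of experiment described above, with invented names throughout: toy_vocal_tract is a crude stand-in for the MRI-derived articulatory model, the F1/F2 targets are illustrative, and a generic optimizer plays the role of the machine-learning controller. The loop varies larynx height, fits the articulators to each target vowel, and records the residual formant error.

```python
# Hypothetical sketch only: the vocal tract function and targets below are
# placeholders, not the MRI-based model or data used in the study.
import numpy as np
from scipy.optimize import minimize

TARGET_VOWELS = {                      # illustrative F1/F2 targets in Hz
    "i": np.array([280.0, 2250.0]),
    "a": np.array([750.0, 1250.0]),
    "u": np.array([310.0, 870.0]),
}

def toy_vocal_tract(articulators, larynx_height):
    """Placeholder mapping from articulator settings to (F1, F2) in Hz."""
    a = np.clip(articulators, -1.0, 1.0)             # articulators have a limited range
    base = np.array([500.0, 1500.0])                 # neutral-tract formants
    shift = np.array([400.0 * a[0],                  # tongue/jaw height mainly moves F1
                      900.0 * a[1]])                 # tongue frontness mainly moves F2
    return (base + shift) * (1.0 - 0.05 * larynx_height)  # larynx height rescales the tract

def mean_vowel_error(larynx_height):
    """Fit the articulators to each target vowel and return the mean residual error (Hz)."""
    errors = []
    for target in TARGET_VOWELS.values():
        cost = lambda a, t=target: np.linalg.norm(toy_vocal_tract(a, larynx_height) - t)
        fits = [minimize(cost, x0, method="Nelder-Mead")
                for x0 in (np.zeros(2), np.array([0.5, -0.5]), np.array([-0.5, 0.5]))]
        errors.append(min(fit.fun for fit in fits))
    return float(np.mean(errors))

if __name__ == "__main__":
    for h in np.linspace(-2.0, 2.0, 9):              # candidate larynx positions (arbitrary units)
        print(f"larynx height {h:+.2f}: mean formant error {mean_vowel_error(h):7.2f} Hz")
```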

    Speech Communication

    Contains research objectives and summary of research on six research projects and reports on three research projects. Funding: National Institutes of Health (Grant 5 RO1 NS04332-13); National Institutes of Health (Fellowship 1 F22 MH5825-01); National Institutes of Health (Grant 1 T32 NS07040-01); National Institutes of Health (Fellowship 1 F22 NS007960); National Institutes of Health (Fellowship 1 F22 HD019120); National Institutes of Health (Fellowship 1 F22 HD01919-01); U.S. Army (Contract DAAB03-75-C-0489); National Institutes of Health (Grant 5 RO1 NS04332-12).

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The proceedings of the MAVEBA Workshop, held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images as a support to clinical diagnosis and classification of vocal pathologies. The Workshop has the sponsorship of Ente Cassa Risparmio di Firenze, COST Action 2103, the Biomedical Signal Processing and Control Journal (Elsevier), and the IEEE Biomedical Engineering Society. Special issues of international journals have been, and will be, published collecting selected papers from the conference.

    Early Human Vocalization Development: A Collection of Studies Utilizing Automated Analysis of Naturalistic Recordings and Neural Network Modeling

    Understanding early human vocalization development is a key part of understanding the origins of human communication. What are the characteristics of early human vocalizations and how do they change over time? What mechanisms underlie these changes? This dissertation is a collection of three papers that take a computational approach to addressing these questions, using neural network simulation and automated analysis of naturalistic data. The first paper uses a self-organizing neural network to automatically derive holistic acoustic features characteristic of prelinguistic vocalizations. A supervised neural network is used to classify vocalizations into human-judged categories and to predict the age of the child vocalizing. The study represents a first step toward a data-driven approach to describing infant vocalizations, and its classification performance represents progress toward automated analysis tools for coding infant vocalization types. The second paper is a computational model of early vocal motor learning. It adapts a popular type of neural network, the self-organizing map, to control a vocal tract simulator and to make learning dependent on whether the model's actions are reinforced. The model learns both to control production of sound at the larynx (phonation), an early-developing skill that is a prerequisite for speech, and to produce vowels that gravitate toward the vowels of a target language (either English or Korean) for which it is reinforced. The model provides a computationally specified explanation for how neuromotor representations might be acquired in infancy through the combination of exploration, reinforcement, and self-organized learning. The third paper uses automated analysis to uncover patterns of vocal interaction between child and caregiver that unfold over the course of day-long, fully naturalistic recordings. The participants include 16- to 48-month-old children with and without autism. Results are consistent with the idea that there is a social feedback loop wherein children produce speech-related vocalizations, these are preferentially responded to by adults, and this contingency of adult response shapes future child vocalizations. Differences in components of this feedback loop are observed in autism, as well as across maternal education levels.
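
    A minimal sketch of the reinforcement-gated self-organizing map described in the second paper, under strong simplifying assumptions: toy_synthesizer, the F1/F2 targets, and the reinforcement threshold are invented placeholders rather than the dissertation's actual simulator or training data. The point illustrated is that the map's neighborhood update is applied only on babbling trials that are reinforced.

```python
# Sketch of reinforcement-gated self-organized learning of vowel motor targets.
# All names, numbers, and the synthesizer are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_NODES = 25                                             # SOM nodes on a 5x5 grid
grid = np.array([(i, j) for i in range(5) for j in range(5)], dtype=float)
motor_weights = rng.uniform(-1.0, 1.0, size=(N_NODES, 2))  # motor parameters per node

TARGET_VOWELS = np.array([[280.0, 2250.0],               # illustrative F1/F2 targets (Hz)
                          [750.0, 1250.0],
                          [310.0, 870.0]])

def toy_synthesizer(motor):
    """Placeholder for the vocal tract simulator: motor parameters -> (F1, F2) in Hz."""
    return np.array([500.0 + 400.0 * motor[0], 1500.0 + 900.0 * motor[1]])

def reinforced(formants, threshold=250.0):
    """Caregiver-like reinforcement: reward outputs that land near any target vowel."""
    return np.min(np.linalg.norm(TARGET_VOWELS - formants, axis=1)) < threshold

for step in range(5000):
    node = rng.integers(N_NODES)                         # explore: pick a node to babble with
    motor = motor_weights[node] + rng.normal(0.0, 0.2, size=2)   # add motor noise
    if not reinforced(toy_synthesizer(motor)):
        continue                                         # no reinforcement -> no learning
    # Reinforced trial: standard SOM neighborhood update toward the explored motor values
    lr = 0.3 * np.exp(-step / 2000.0)
    sigma = 1.5 * np.exp(-step / 2000.0)
    dist2 = np.sum((grid - grid[node]) ** 2, axis=1)
    neighborhood = np.exp(-dist2 / (2.0 * sigma ** 2))
    motor_weights += lr * neighborhood[:, None] * (motor - motor_weights)

produced = np.array([toy_synthesizer(m) for m in motor_weights])
print("mean distance to nearest target vowel (Hz):",
      np.mean([np.min(np.linalg.norm(TARGET_VOWELS - f, axis=1)) for f in produced]))
```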

    Expression of gender in the human voice: investigating the “gender code”

    We can easily and reliably identify the gender of an unfamiliar interlocutor over the telephone. This is because our voice is "sexually dimorphic": men typically speak with a lower fundamental frequency (F0: lower pitch) and lower vocal tract resonances (ΔF: a "deeper" timbre) than women. While the biological bases of these differences are well understood, and mostly come down to size differences between men and women, very little is known about the extent to which we can play with these differences to accentuate or de-emphasise our perceived gender, masculinity and femininity in a range of social roles and contexts. The general aim of this thesis is to investigate the behavioural basis of gender expression in the human voice in both children and adults. More specifically, I hypothesise that, on top of the biologically determined sexual dimorphism, humans use a "gender code" consisting of vocal gestures (global F0 and ΔF adjustments) aimed at altering the gender attributes conveyed by their voice. In order to test this hypothesis, I first explore how acoustic variation of sexually dimorphic acoustic cues (F0 and ΔF) relates to physiological differences in pre-pubertal speakers (vocal tract length) and adult speakers (body height and salivary testosterone levels), and show that voice gender variation cannot be solely explained by static, biologically determined differences in the vocal apparatus and body size of speakers. Subsequently, I show that both children and adult speakers can spontaneously modify their voice gender by lowering (raising) F0 and ΔF to masculinise (feminise) their voice, a key ability for the hypothesised control of voice gender. Finally, I investigate the interplay between voice gender expression and social context in relation to cultural stereotypes. I report that listeners spontaneously integrate stereotypical information in the auditory and visual domains to make stereotypical judgments about children's gender, and that adult actors manipulate their gender expression in line with stereotypical gendered notions of homosexuality. Overall, this corpus of data supports the existence of a "gender code" in human nonverbal vocal communication. This "gender code" provides not only a methodological framework with which to empirically investigate variation in voice gender and its role in expressing gender identity, but also a unifying theoretical structure to understand the origins of such variation from both evolutionary and social perspectives.
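
    As an illustration of the two acoustic cues the thesis builds on, the sketch below estimates ΔF (formant spacing) and the corresponding apparent vocal tract length from a set of formant frequencies using the uniform-tube approximation F_i ≈ ((2i − 1)/2)·ΔF. The formant values are invented for illustration, and F0 extraction is assumed to come from a standard pitch tracker rather than being implemented here.

```python
# Hedged illustration: estimate DeltaF and apparent vocal tract length from
# formant frequencies, assuming a uniform tube closed at one end.
# The example formant values are illustrative, not data from the thesis.
import numpy as np

SPEED_OF_SOUND = 35_000.0        # cm/s, approximate value in the vocal tract

def estimate_delta_f(formants_hz):
    """Least-squares fit of DeltaF to F_i ~= ((2i - 1)/2) * DeltaF."""
    i = np.arange(1, len(formants_hz) + 1)
    predictors = (2 * i - 1) / 2.0
    return float(np.sum(predictors * formants_hz) / np.sum(predictors ** 2))

def apparent_vtl_cm(delta_f_hz):
    """Apparent vocal tract length of a uniform tube with formant spacing DeltaF."""
    return SPEED_OF_SOUND / (2.0 * delta_f_hz)

# Illustrative male-like vs female-like formant sets (F1-F4, Hz)
male_like = np.array([500.0, 1500.0, 2500.0, 3500.0])
female_like = np.array([590.0, 1770.0, 2950.0, 4130.0])

for label, formants in [("male-like", male_like), ("female-like", female_like)]:
    df = estimate_delta_f(formants)
    print(f"{label}: DeltaF = {df:.0f} Hz, apparent VTL = {apparent_vtl_cm(df):.1f} cm")
```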

    Sociolinguistic competence and the bilingual's adoption of phonetic variants: auditory and instrumental data from English-Arabic bilinguals

    This study is an auditory and acoustic investigation of the speech production patterns developed by English-Arabic bilingual children. The subjects are three Lebanese children aged five, seven and ten, all born and raised in Yorkshire, England. Monolingual friends of the same age were chosen as controls, and the parents of all bilingual and monolingual children were also taped to obtain a detailed assessment of the sound patterns available in the subjects' environment. The study addresses the question of interaction between the bilingual's phonological systems by calling for a refinement of the notion of a 'phonological system' using insights from recent phonetic and sociolinguistic work on variability in speech (e.g. Docherty, Foulkes, Tillotson, & Watt, 2002; Docherty & Foulkes, 2000; Local, 1983; Pisoni, 1997; Roberts, 1997; Scobbie, 2002). The variables under study include /l/, /r/, and VOT production. These were chosen because their production follows different patterns in English and Arabic that vary according to contextual and dialectal factors. Data were collected using a variety of picture-naming, story-telling, and free-play activities for the children, and reading lists, story-telling, and interviews for the adults. To control for language mode (Grosjean, 1998), the bilinguals were recorded in different language sessions with different interviewers. Results for the monolingual children and adults in this study underline the importance of including controls in any study of bilingual speech development for a better interpretation of the bilinguals' patterns. Input from the adults proved highly variable and at times conflicted with the patterns normally reported in the literature for the variables under study. Results for the bilinguals show that they have developed separate, sociolinguistically appropriate production patterns for each of their languages that are on the whole similar to those of monolinguals but that also reflect the bilinguals' rich socio-phonetic repertoire. The interaction between the bilinguals' languages is mainly restricted to the bilingual mode and is a sign of their developing sociolinguistic competence.

    Rhotics. New Data and Perspectives

    This book provides insight into the patterns of variation and change of rhotics in different languages and from a variety of perspectives. It sheds light on the phonetics, the phonology, the sociolinguistics and the acquisition of /r/-sounds in languages as diverse as Dutch, English, French, German, Greek, Hebrew, Italian, Kuikuro, Malayalam, Romanian, Slovak, Tyrolean and Washili Shingazidja, thus contributing to the discussion on the unity and uniqueness of this group of sounds.