55 research outputs found
Acoustic Space Movement Planning in a Neural Model of Motor Equivalent Vowel Production
Recent evidence suggests that speakers utilize an acoustic-like reference frame for the planning of speech movements. DIVA, a computational model of speech acquisition and motor equivalent speech production, has previously been shown to provide explanations for a wide range of speech production data using a constriction-based reference frame for movement planning. This paper extends the previous work by investigating an acoustic-like planning frame in the DIVA modeling framework. During a babbling phase, the model self-organizes targets in the planning space for each of ten vowels and learns a mapping from desired movement directions in this planning space into appropriate articulator velocities. Simulation results verify that after babbling the model is capable of producing easily recognizable vowel sounds using an acoustic planning space consisting of the formants F1 and F2. The model successfully reaches all vowel targets from any initial vocal tract configuration, even in the presence of constraints such as a blocked jaw. Office of Naval Research (N00014-91-J-4100, N00014-92-J-4015); Air Force Office of Scientific Research (F49620-92-J-0499)
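The core control idea in this abstract, mapping a desired movement direction in acoustic (F1, F2) space to articulator velocities, can be sketched with a Jacobian pseudoinverse. The toy linear forward model, its coefficients, and the three-articulator setup below are illustrative assumptions for the sketch, not the DIVA model itself:

```python
import numpy as np

# Toy stand-in for a vocal-tract forward model: maps 3 articulator
# positions to (F1, F2). A real vocal tract is nonlinear; this linear
# map is an assumption made purely for illustration.
A = np.array([[800.0, -300.0, 150.0],
              [-200.0, 900.0, 400.0]])

def formants(artic):
    return A @ artic

def step_toward_target(artic, target, gain=0.1):
    """Move the articulators so the formants move toward the target.

    The desired direction in acoustic (F1, F2) space is mapped to
    articulator velocities through the pseudoinverse of the forward
    map. With 3 articulators and 2 formants the system is redundant,
    so many articulator movements realize the same acoustic change
    (motor equivalence): blocking one articulator still leaves
    directions that reach the target.
    """
    error = target - formants(artic)        # desired acoustic direction
    J_pinv = np.linalg.pinv(A)              # pseudoinverse of the Jacobian
    return artic + gain * (J_pinv @ error)  # articulator velocity step

# Drive the articulators toward an /i/-like (F1, F2) target in Hz.
artic = np.zeros(3)
target = np.array([300.0, 2300.0])
for _ in range(200):
    artic = step_toward_target(artic, target)
```

Because the pseudoinverse picks one of infinitely many articulator velocities for each acoustic direction, the same scheme reaches the target from any starting configuration, which is the property the simulations above verify.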
Articulatory Tradeoffs Reduce Acoustic Variability During American English /r/ Production
Acoustic and articulatory recordings reveal that speakers utilize systematic articulatory tradeoffs to maintain acoustic stability when producing the phoneme /r/. Distinct articulator configurations used to produce /r/ in various phonetic contexts show systematic tradeoffs between the cross-sectional areas of different vocal tract sections. Analysis of acoustic and articulatory variabilities reveals that these tradeoffs act to reduce acoustic variability, thus allowing large contextual variations in vocal tract shape; these contextual variations in turn apparently reduce the amount of articulatory movement required. These findings contrast with the widely held view that speaking involves a canonical vocal tract shape target for each phoneme. National Institute on Deafness and Other Communication Disorders (1R29-DC02852-02, 5R01-DC01925-04, 1R03-C2576-01); National Science Foundation (IRI-9310518)
Open challenges in understanding development and evolution of speech forms: The roles of embodied self-organization, motivation and active exploration
This article discusses open scientific challenges for understanding the development and evolution of speech forms, as a commentary on Moulin-Frier et al. (2015). Based on the analysis of mathematical models of the origins of speech forms, with a focus on their assumptions, we study the fundamental question of how speech can be formed out of non-speech, at both developmental and evolutionary scales. In particular, we emphasize the importance of embodied self-organization, as well as the role of mechanisms of motivation and active curiosity-driven exploration in speech formation. Finally, we discuss an evolutionary-developmental perspective on the origins of speech.
Training a Vocal Tract Synthesiser to imitate speech using Distal Supervised Learning
Imitation is a powerful mechanism by which both animals and people can learn useful behavior by copying the actions of others. We adopt this approach as a means to control an articulatory speech synthesizer. The goal of our project is to build a system that can learn to mimic speech using its own vocal tract. We approach this task by training an inverse mapping between the synthesizer’s control parameters and their auditory consequences. In this paper we compare the direct estimation of this inverse model with the distal supervised learning scheme proposed by Jordan & Rumelhart (1992). Both of these approaches involve a babbling phase, which is used to learn the auditory consequences of the articulatory controls. We show that both schemes perform well on speech generated by the synthesizer itself, when no normalization is needed, but that distal learning provides slightly better performance on speech generated by a real human subject.
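The distal supervised learning scheme this abstract compares against direct inverse estimation can be sketched in a few lines: babble to collect (articulation, sound) pairs, fit a forward model, then train the inverse model by passing its output through the frozen forward model and descending the error measured in auditory space. The linear synthesizer, its dimensions, and the learning rate below are illustrative assumptions, not the paper's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical articulatory synthesizer: 4 control parameters -> 3
# auditory features. The true mapping is hidden from the learner; the
# linear form is an assumption made for this sketch only.
W_true = 0.5 * rng.normal(size=(3, 4))

def synth(params):
    return params @ W_true.T

# 1. Babbling phase: random articulations paired with their sounds.
P_babble = rng.normal(size=(500, 4))
S_babble = synth(P_babble)

# 2. Fit a forward model s ~= p @ W_fwd.T to the babbling data.
W_fwd = np.linalg.lstsq(P_babble, S_babble, rcond=None)[0].T

# 3. Distal learning (after Jordan & Rumelhart 1992): train an inverse
# model (sound -> params) by pushing its output through the frozen
# forward model and minimizing the error in *auditory* space.
targets = synth(rng.normal(size=(200, 4)))   # reachable target sounds
W_inv = np.zeros((4, 3))
lr = 0.05

def auditory_error(W_inv):
    pred = (targets @ W_inv.T) @ W_fwd.T     # what the inverse would say
    return np.mean((pred - targets) ** 2)

err_before = auditory_error(W_inv)
for _ in range(3000):
    pred = (targets @ W_inv.T) @ W_fwd.T
    # Gradient of the mean squared auditory error w.r.t. W_inv,
    # backpropagated through the frozen forward model W_fwd.
    grad = ((pred - targets) @ W_fwd).T @ targets / len(targets)
    W_inv -= lr * grad
err_after = auditory_error(W_inv)
```

The key contrast with direct inverse estimation is where the error lives: the distal scheme scores the inverse model by how the *sound* comes out, which is what lets it cope with targets, such as another speaker's voice, that the synthesizer can only approximate.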
Long-term and persistent vocal plasticity in adult bats.
Bats exhibit a diverse and complex vocabulary of social communication calls, some of which are believed to be learned during development. This ability to produce learned, species-specific vocalizations - a rare trait in the animal kingdom - requires a high degree of vocal plasticity. Bats live extremely long lives in highly complex and dynamic social environments, which suggests that they might also retain a high degree of vocal plasticity in adulthood, much as humans do. Here, we report persistent vocal plasticity in adult bats (Rousettus aegyptiacus) following exposure to broad-band acoustic perturbation. Our results show that adult bats can not only modify distinct parameters of their vocalizations, but that these changes persist even after noise cessation - in some cases lasting several weeks or months. Combined, these findings underscore the potential importance of bats as a model organism for studies of vocal plasticity, including in adulthood.
KLAIR: A virtual infant for spoken language acquisition research
Recent research into the acquisition of spoken language has stressed the importance of learning through embodied linguistic interaction with caregivers rather than through passive observation. However the necessity of interaction makes experimental work into the simulation of infant speech acquisition difficult because of the technical complexity of building real-time embodied systems. In this paper we present KLAIR: a software toolkit for building simulations of spoken language acquisition through interactions with a virtual infant. The main part of KLAIR is a sensori-motor server that supplies a client machine learning application with a virtual infant on screen that can see, hear and speak. By encapsulating the real-time complexities of audio and video processing within a server that will run on a modern PC, we hope that KLAIR will encourage and facilitate more experimental research into spoken language acquisition through interaction. Copyright © 2009 ISCA
Effect of Visual Input on Vowel Production in English Speakers
This study analyzes whether there should be a visual component to a model of speech perception and production by comparing the jaw opening, advancement, and rounding of American English and non-English vowels in the presence and absence of a visual stimulus. Surprisingly, the visual stimulus did not change jaw opening, but it was found to be a significant factor in participants’ vowel advancement for non-English vowels. This may be explained by lip rounding, but further research is needed to fully understand the impact of visual input on vowel production before it can be applied to language teaching and learning.
Pre-Low Raising in Japanese Pitch Accent
Japanese has been observed to have 2 versions of the H tone, the higher of which is associated with an accented mora. However, the distinction of these 2 versions only surfaces in context but not in isolation, leading to a long-standing debate over whether there is 1 H tone or 2. This article reports evidence that the higher version may result from a pre-low raising mechanism rather than being inherently higher. The evidence is based on an analysis of F0 of words that varied in length, accent condition and syllable structure, produced by native speakers of Japanese at 2 speech rates. The data indicate a clear separation between effects that are due to mora-level preplanning and those that are mechanical. These results are discussed in terms of mechanisms of laryngeal control during tone production, and highlight the importance of articulation as a link between phonology and surface acoustics.
Error Detection and Correction During Object Naming in Individuals with Aphasia
Aphasia is a neurogenic communication disorder that occurs following a left hemisphere stroke and commonly co-occurs with apraxia of speech (AOS). Individuals with aphasia typically make errors in their lexical retrieval and have difficulty detecting and correcting them. While there is ample research on how errors occur, few researchers go as far as to look at error detection and subsequent correction in this population. Given this need for research, we took a pre-existing data set of 23 individuals with aphasia grouped for presence of AOS (nine with comorbid AOS) and coded their spoken responses on the Object Naming subtest of the Western Aphasia Battery-Revised to characterize the types of errors made, as well as whether those errors were detected and corrected. Groups did not differ for total number of errors; however, participants with AOS produced more late-stage errors than the participants without AOS, meaning they made errors that occurred after the level of lemma selection (i.e., phonemic paraphasias and neologisms). In this sample, people with aphasia were generally able to detect their errors, though the presence of AOS impacted their ability to correct them.