841 research outputs found

    The two targets of speech production: Two levels of specification

    Get PDF
    A thread of this work is the difference in how articulatory and perceptual features of phonology are integrated into speech production. This idea emerged during research on, and simulation of, speech synthesizers suitable for a fellow graduate student to adopt in a proposal for an automatic speech acquisition system. While using available synthesizers to produce speech utterances, it became apparent that these two kinds of features correspond to the two levels of input that speech synthesizers accept: tasks and muscle activities. This thesis examines two existing models, each accepting one level of input. The TADA approach [1] maintains that the input to a speech synthesizer should be tasks: specifications of tract variables, such as the locations and degrees of constrictions, as functions of time. More precisely, these tasks are called gestural scores, which are explained later in the paper. Praat [2], on the other hand, takes muscle activities as input: the articulatory input specifications control the lengths and tensions of the muscles rather than the positions of the articulators. After a brief introduction to these two speech synthesizers, their time efficiency and perceptual accuracy are assessed and compared (in Parts I and II) by confronting simulated results from each with real-world sounds. At the end of both parts, suggestions are offered on which category of synthesizer should be adopted for various research concentrations in articulatory phonology.
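
    To make the contrast between the two input levels concrete, the sketch below models them in Python; the class and field names are illustrative assumptions and do not reproduce the actual TADA or Praat interfaces.

        # A minimal sketch of the two input levels discussed above; all names
        # are illustrative, not the actual TADA or Praat APIs.
        from dataclasses import dataclass
        from typing import List

        @dataclass
        class Gesture:
            tract_variable: str      # e.g. "lip aperture", "tongue-tip constriction degree"
            target: float            # constriction degree/location the task aims for
            onset: float             # activation start time (s)
            offset: float            # activation end time (s)

        # Task-level input (TADA-style): a gestural score is a set of
        # overlapping gestures over tract variables, not individual muscles.
        GesturalScore = List[Gesture]

        @dataclass
        class MuscleActivity:
            muscle: str              # e.g. "orbicularis oris", "genioglossus"
            activation: List[float]  # sampled activation (controls length/tension)
            sample_rate: float       # samples per second

        # Muscle-level input (Praat-style): activation trajectories, from
        # which articulator positions emerge via the physical model.
        MuscleScore = List[MuscleActivity]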

    Perceptuo-motor biases in the perceptual organization of the height feature in French vowels

    No full text
    To appear in Acta Acustica. This paper reports on the organization of the perceived vowel space in French. In a previous paper [28], we investigated the implementation of vowel height contrasts along the F1 dimension in French speakers. Here we present results from perceptual identification tests performed by twelve participants who took part in the production experiment reported in the earlier paper. For each subject, the stimuli presented in the identification test were synthesized in two different vowel spaces, corresponding to two different vocal tract lengths. The results showed, first, that the perceived French vowels belonging to similar height degrees were aligned on stable F1 values, independent of place of articulation and roundedness, as was the case for produced vowels. Second, the produced F1 distances between height degrees correlated with the perceived F1 distances. This suggests a link between perceptual and motor phonemic prototypes in the human brain. The results are discussed within the framework of the Perception for Action Control (PACT) theory, in which speech units are considered to be gestures shaped by perceptual processes.
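
    The core analysis, correlating per-subject produced and perceived F1 distances, can be sketched as follows; the values shown are placeholders, not data from the study.

        # A minimal sketch of the correlation analysis described above,
        # assuming per-subject F1 distances (Hz) between adjacent height
        # degrees; the numbers are placeholders, not the study's data.
        from scipy.stats import pearsonr

        produced_f1_dist  = [65.0, 80.0, 72.0, 90.0, 60.0, 85.0]   # production task
        perceived_f1_dist = [60.0, 84.0, 70.0, 95.0, 58.0, 80.0]   # identification task

        r, p = pearsonr(produced_f1_dist, perceived_f1_dist)
        print(f"Pearson r = {r:.2f}, p = {p:.3f}")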

    Physical mechanisms may be as important as brain mechanisms in evolution of speech [Commentary on Ackermann, Hage, & Ziegler, Brain mechanisms of acoustic communication in humans and nonhuman primates: An evolutionary perspective]

    No full text
    We present two arguments why physical adaptations for vocalization may be as important as neural adaptations. First, fine control over vocalization is not easy for physical reasons, and modern humans may be exceptional. Second, we present an example of a gorilla that shows rudimentary voluntary control over vocalization, indicating that some neural control is already shared with great apes.

    Consonant-vowel coarticulation patterns in Swedish and Mandarin

    Get PDF
    This paper reports a cross-linguistic study comparing coarticulation patterns between consonant and vowel (CV) in Mandarin Chinese and Southern Swedish. Kinematic data were collected using Electromagnetic Articulography (EMA) for both languages and were subjected to three types of CV time-lag measurement, based on roughly equivalent landmarks on the lips and tongue and partially adopted from previous studies [1, 2, 3]. We found rather consistent CV coordination patterns in these two typologically different languages with both the velocity-based and the acceleration-based measurements on the lips and the tongue body. The most striking result to emerge from the data is the same effect of gender on the variation of CV coarticulation in both languages, which has not been reported previously. In addition, only when gender was added as a factor did we find language differences in the CV time lags.
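
    A velocity-based CV time-lag measure of the kind described can be sketched as below, assuming one-dimensional EMA position traces for a lip sensor (consonantal gesture) and the tongue body (vocalic gesture); the landmark definition is simplified for illustration.

        # A minimal sketch of a velocity-based CV time-lag measure on EMA
        # traces; landmark choice is simplified relative to the study.
        import numpy as np

        def peak_velocity_time(position, sample_rate):
            """Time (s) of peak absolute velocity of an articulator trace."""
            velocity = np.gradient(np.asarray(position), 1.0 / sample_rate)
            return np.argmax(np.abs(velocity)) / sample_rate

        def cv_time_lag(lip_trace, tongue_trace, sample_rate):
            """Lag (s) between consonantal (lip) and vocalic (tongue) landmarks."""
            return (peak_velocity_time(tongue_trace, sample_rate)
                    - peak_velocity_time(lip_trace, sample_rate))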

    Articulating: the neural mechanisms of speech production

    Full text link
    Speech production is a highly complex sensorimotor task involving tightly coordinated processing across large expanses of the cerebral cortex. Historically, the study of the neural underpinnings of speech suffered from the lack of an animal model. The development of non-invasive structural and functional neuroimaging techniques in the late 20th century has dramatically improved our understanding of the speech network. Techniques for measuring regional cerebral blood flow have illuminated the neural regions involved in various aspects of speech, including feedforward and feedback control mechanisms. In parallel, we have designed, experimentally tested, and refined a neural network model detailing the neural computations performed by specific neuroanatomical regions during speech. Computer simulations of the model account for a wide range of experimental findings, including data on articulatory kinematics and brain activity during normal and perturbed speech. Furthermore, the model is being used to investigate a wide range of communication disorders.
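
    The feedforward/feedback division of labor described above can be illustrated with a minimal control-loop sketch; this is a generic illustration, not the authors' actual model, and the gain and target values are placeholder assumptions.

        # A minimal sketch in the spirit of the feedforward/feedback control
        # scheme described above (not the authors' model): the motor command
        # is a stored feedforward term plus a gain-weighted correction
        # driven by the auditory error.
        import numpy as np

        def motor_command(feedforward, auditory_target, auditory_feedback, gain=0.3):
            """Combine a stored feedforward command with feedback correction."""
            error = auditory_target - auditory_feedback   # e.g. formant mismatch
            return feedforward + gain * error

        # Perturbed-speech simulation: shifting the feedback (as in formant
        # perturbation experiments) yields a compensatory command change.
        ff = np.array([500.0, 1500.0])                    # placeholder targets (Hz)
        print(motor_command(ff, ff, ff + np.array([100.0, 0.0])))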

    Brain mechanisms of acoustic communication in humans and nonhuman primates: An evolutionary perspective

    Get PDF
    Any account of “what is special about the human brain” (Passingham 2008) must specify the neural basis of our unique ability to produce speech and delineate how these remarkable motor capabilities could have emerged in our hominin ancestors. Clinical data suggest that the basal ganglia provide a platform for the integration of primate-general mechanisms of acoustic communication with the faculty of articulate speech in humans. Furthermore, neurobiological and paleoanthropological data point to a two-stage model of the phylogenetic evolution of this crucial prerequisite of spoken language: (i) monosynaptic refinement of the projections of motor cortex to the brainstem nuclei that steer laryngeal muscles, presumably as part of a “phylogenetic trend” associated with increasing brain size during hominin evolution; (ii) subsequent vocal-laryngeal elaboration of cortico-basal ganglia circuitries, driven by human-specific FOXP2 mutations. This concept implies vocal continuity of spoken language evolution at the motor level, elucidating the deep entrenchment of articulate speech in a “nonverbal matrix” (Ingold 1994), which is not accounted for by gestural-origin theories. Moreover, it provides a solution to the question of the adaptive value of the “first word” (Bickerton 2009), since even the earliest and simplest verbal utterances must have increased the versatility of vocal displays afforded by the preceding elaboration of monosynaptic corticobulbar tracts, giving rise to enhanced social cooperation and prestige. At the ontogenetic level, the proposed model assumes age-dependent interactions between the basal ganglia and their cortical targets, similar to vocal learning in some songbirds. In this view, the emergence of articulate speech builds on the “renaissance” of an ancient organizational principle and, hence, may represent an example of “evolutionary tinkering” (Jacob 1977).

    Quantification of vocal tract configuration of older children with Down syndrome: A pilot study

    Get PDF
    Objective: To quantify the vocal tract (VT) lumen of older children with Down syndrome using acoustic reflection (AR) technology. Design: Comparative study. Setting: Vocal tract lab with sound-proof booth. Participants: Ten children (4 males and 6 females), aged 9-17 years, diagnosed with Down syndrome, and ten typically developing children (4 males and 6 females) matched for age, gender, and race. Intervention: Each participant's vocal tract measurements were obtained using an Eccovision Acoustic Pharyngometer. Main outcome measures: Six vocal tract dimensional parameters (oral length, oral volume, pharyngeal length, pharyngeal volume, total vocal tract length, and total vocal tract volume) were measured and compared between the children with Down syndrome and the typically developing children. Results: Children with Down syndrome exhibited smaller oral cavities than the control group (F(1, 18) = 6.55, p = 0.02). They also demonstrated smaller vocal tract volumes (F(1, 18) = 2.58, p = 0.13), although this result was not statistically significant at the 0.05 level. Pharyngeal length, pharyngeal volume, and vocal tract length were not significantly different between the two groups. Conclusion: Children with Down syndrome had smaller oral cavities and smaller vocal tract volumes. No significant differences were found for pharyngeal length, pharyngeal volume, or vocal tract length between the two groups.
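
    The reported group comparisons correspond to one-way ANOVAs with two groups of ten participants, hence F(1, 18); a minimal sketch with placeholder values, not the study's measurements:

        # A minimal sketch of the group comparison reported above: a one-way
        # ANOVA with two groups of n = 10 gives F(1, 18). Values are
        # placeholders, not the study's data.
        from scipy.stats import f_oneway

        oral_volume_ds = [18.2, 20.1, 17.5, 19.0, 16.8, 18.9, 17.2, 19.5, 18.0, 17.7]
        oral_volume_td = [21.0, 22.4, 20.8, 23.1, 21.9, 22.0, 20.5, 23.4, 21.2, 22.8]

        f_stat, p_val = f_oneway(oral_volume_ds, oral_volume_td)
        print(f"F(1, 18) = {f_stat:.2f}, p = {p_val:.3f}")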

    Current trends in British sociophonetics.

    Get PDF

    Acoustic-to-Articulatory Speech Inversion Features for Mispronunciation Detection of /r/ in Child Speech Sound Disorders

    Full text link
    Acoustic-to-articulatory speech inversion could enhance automated clinical mispronunciation detection by providing detailed articulatory feedback unattainable with formant-based mispronunciation detection algorithms; however, it is unclear to what extent a speech inversion system trained on adult speech performs in the context of (1) child and (2) clinical speech. In the absence of an articulatory dataset from children with rhotic speech sound disorders, we show that classifiers trained on tract variables from acoustic-to-articulatory speech inversion meet or exceed the performance of state-of-the-art features when predicting clinician judgment of rhoticity.
    Index Terms: rhotic, speech sound disorder, mispronunciation detection
    Comment: *denotes equal contribution. To appear in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 202
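
    The classification setup can be sketched as below, assuming a matrix of speech-inversion tract-variable features per token and a binary clinician judgment of rhoticity; the data, feature count, and model choice are illustrative assumptions, not the paper's pipeline.

        # A minimal sketch of the classification setup described above;
        # random placeholder data stand in for tract-variable features and
        # clinician labels.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 6))        # 6 tract-variable features per token
        y = rng.integers(0, 2, size=200)     # clinician judgment: rhotic or not

        clf = LogisticRegression(max_iter=1000)
        scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
        print(f"mean AUC = {scores.mean():.2f}")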