
    THE CHILD AND THE WORLD: How Children acquire Language

    Over the last few decades, research into child language acquisition has been revolutionized by the use of ingenious new techniques which allow one to investigate what infants (that is, children not yet able to speak) can in fact perceive when exposed to a stream of speech sound, and the discriminations they can make between different speech sounds, different speech sound sequences and different words. However, on the central features of the mystery, the extraordinarily rapid acquisition of lexicon and complex syntactic structures, little solid progress has been made. The questions being researched are how infants acquire and produce the speech sounds (phonemes) of the community language; how infants find words in the stream of speech; and how they link words to perceived objects or actions, that is, discover meanings. In a recent general review in Nature of children's language acquisition, Patricia Kuhl also asked why we do not learn new languages as easily at 50 as at 5, and why computers have not cracked the human linguistic code. The motor theory of language function and origin makes possible a plausible account of child language acquisition generally, from which answers can also be derived to these further questions. Why computers have so far been unable to 'crack' the language problem becomes apparent in the light of the motor theory account: computers can have no natural relation between words and their meanings; they have no conceptual store to which the network of words is linked, nor do they have the innate aspects of language functioning represented by function words; computers have no direct links between speech sounds and movement patterns; and they do not have the instantly integrated neural patterning underlying thought, as they necessarily operate serially and hierarchically. Adults find the acquisition of a new language much more difficult than children do because they are already neurally committed to the link between the words of their first language and the elements in their conceptual store; a second language being acquired by an adult is in direct competition for neural space with the network structures established for the first language.

    Speaking Rate Effects on Locus Equation Slope

    A locus equation describes a first-order regression fit to a scatter of vowel steady-state frequency values predicting vowel onset frequency values. Locus equation coefficients are often interpreted as indices of coarticulation. Speaking rate variations with a constant consonant–vowel form are thought to induce changes in the degree of coarticulation. In the current work, the hypothesis that locus slope is a transparent index of coarticulation is examined through the analysis of acoustic samples of large-scale, nearly continuous variations in speaking rate. Following the methodological conventions for locus equation derivation, data pooled across ten vowels yield locus equation slopes that are mostly consistent with the hypothesis that locus equations vary systematically with coarticulation. Comparable analyses between different four-vowel pools reveal variations in the locus slope range and changes in locus slope sensitivity to rate change. Analyses across rate but within vowels are substantially less consistent with the locus hypothesis. Taken together, these findings suggest that the practice of vowel pooling exerts a non-negligible influence on locus outcomes. Results are discussed within the context of articulatory accounts of locus equations and the effects of speaking rate change.
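
    For readers unfamiliar with the construct: a locus equation is the regression F2_onset = k * F2_vowel + c, where F2 is the second formant measured at consonant release (onset) and at the vowel steady state, and the slope k is the coarticulation index referred to above. A minimal sketch of deriving one, with illustrative made-up data:

        import numpy as np

        # Hypothetical F2 values (Hz) at vowel steady state and at CV onset
        # for one consonant across several vowel contexts.
        f2_vowel = np.array([2300.0, 1900.0, 1400.0, 1000.0])
        f2_onset = np.array([2050.0, 1800.0, 1450.0, 1200.0])

        # First-order regression: F2_onset = k * F2_vowel + c
        k, c = np.polyfit(f2_vowel, f2_onset, 1)
        print(f"locus slope k = {k:.2f}, intercept c = {c:.0f} Hz")
        # k near 1 -> onset tracks the vowel (strong coarticulation);
        # k near 0 -> fixed consonant locus (weak coarticulation).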

    Control concepts for articulatory speech synthesis

    We present two concepts for the generation of gestural scores to control an articulatory speech synthesizer. Gestural scores are the common input to the synthesizer and constitute an organized pattern of articulatory gestures. The first concept generates the gestures for an utterance using the phonetic transcriptions, phone durations, and intonation commands predicted by the Bonn Open Synthesis System (BOSS) from an arbitrary input text. This concept extends the synthesizer to a text-to-speech synthesis system. The idea of the second concept is to use timing information extracted from Electromagnetic Articulography signals to generate the articulatory gestures. Therefore, it is a concept for the re-synthesis of natural utterances. Finally, application prospects for the presented synthesizer are discussed.
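
    To make the notion of a gestural score concrete, here is a minimal sketch of how such a score could be represented as data (the types, tier names, and the toy syllable are illustrative assumptions, not the synthesizer's actual input format):

        from dataclasses import dataclass

        @dataclass
        class Gesture:
            articulator: str   # tier, e.g. "lips", "tongue-tip", "velum", "glottis"
            target: str        # constriction goal, e.g. "bilabial-closure"
            onset_s: float     # gesture start time (seconds)
            offset_s: float    # gesture end time (seconds)

        # A gestural score is an organized, overlapping pattern of gestures;
        # a toy score for the syllable "ba":
        score = [
            Gesture("lips",        "bilabial-closure", 0.00, 0.12),
            Gesture("glottis",     "voicing",          0.02, 0.30),
            Gesture("tongue-body", "open-/a/",         0.08, 0.30),
        ]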

    Two and three-dimensional visual articulatory models for pronunciation training and for treatment of speech disorders

    Visual articulatory models can be used for visualizing vocal tract articulatory speech movements. This information may be helpful in pronunciation training or in the therapy of speech disorders. To test this hypothesis, speech sound recognition rates were quantified for mute animations of vocalic and consonantal speech movements generated by a 2D and a 3D visual articulatory model. The visually based speech sound recognition test (mimicry test) was performed by two groups of eight children (five to eight years old) matched in age and sex. The children were asked to mimic the visually presented mute speech movement animations for different speech sounds. Recognition rates stay significantly above chance but indicate no significant difference between the two models. Children older than five years are thus capable of interpreting vocal tract articulatory speech sound movements in a speech-adequate way without any preparatory training. The complex 3D display of vocal tract articulatory movements provides no significant advantage over the visually simpler 2D midsagittal displays.

    Generating gestural timing from EMA data using articulatory resynthesis

    As part of ongoing work to integrate an articulatory synthesizer into a modular TTS platform, a method is presented which allows gestural timings to be generated automatically from EMA data. Further work is outlined which will adapt the vocal tract model and phone set to English using new articulatory data and use statistical trajectory models.
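
    The abstract does not spell out the extraction procedure; one common heuristic for deriving gestural timing from EMA trajectories is to delimit a gesture where articulator speed crosses a fixed fraction of its peak. A minimal sketch under that assumption (not necessarily the paper's method):

        import numpy as np

        def gesture_interval(pos, fs, frac=0.2):
            """Estimate gesture onset/offset (in seconds) from a 1-D EMA coil
            trajectory, as the first/last samples where speed exceeds `frac`
            of its peak. Illustrative heuristic only."""
            speed = np.abs(np.gradient(pos)) * fs   # per-second speed
            above = np.flatnonzero(speed >= frac * speed.max())
            return above[0] / fs, above[-1] / fs

        # e.g. with a 200 Hz recording of tongue-tip height:
        # onset_s, offset_s = gesture_interval(tt_height, fs=200)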

    Brittany Bernal - Sensorimotor Adaptation of Vowel Production in Stop Consonant Contexts

    The purpose of this research is to measure the compensatory and adaptive articulatory response to shifted formants in auditory feedback, in order to compare the amount of sensorimotor learning that takes place in speakers saying the words /pep/ and /tet/. These words were chosen to analyze the coarticulatory effects of the voiceless consonants /p/ and /t/ on sensorimotor adaptation of the vowel /e/. The formant perturbations were done using the Audapt software, which takes an input speech sample and plays it back to the speaker in real time via headphones. Formants are high-energy acoustic resonance patterns, measured in hertz, that reflect the positions of the articulators during the production of speech sounds. The two lowest-frequency formants (F1 and F2) can uniquely distinguish among the vowels of American English. For this experiment, Audapt shifted F1 down and F2 up, and speakers who adapt were expected to shift in the opposite direction of the perturbation. The formant patterns and vowel boundaries were analyzed using TF32 and S+ software, which led to conclusions about the adaptive responses. Manipulating auditory feedback by shifting formant values is hypothesized to elicit sensorimotor adaptation, a form of short-term motor learning. The amount of adaptation is expected to be greater for /pep/ than for /tet/ because there is less competition for articulatory placement of the tongue during the production of bilabial consonants. This methodology could be further developed to help those with motor speech disorders remedy their speech errors with much less conscious effort than traditional therapy techniques.
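
    A minimal sketch of how such an adaptive response might be quantified (the numbers and the hold-minus-baseline measure are illustrative assumptions, not the study's analysis pipeline): adaptation shows up as a production shift opposed to the feedback perturbation.

        import numpy as np

        # Hypothetical mean formants (Hz) for /e/ at baseline and late in the
        # perturbation hold phase; the feedback shift was F1 down, F2 up.
        baseline = np.array([450.0, 2100.0])    # [F1, F2]
        hold     = np.array([480.0, 2020.0])    # produced formants after adaptation
        pert_dir = np.array([-1.0, 1.0])        # perturbation direction (F1 down, F2 up)

        shift = hold - baseline                 # production change: [+30, -80] Hz
        adaptation = -np.dot(shift, pert_dir)   # positive -> opposes the perturbation
        print(f"adaptation: {adaptation:.0f} Hz (positive opposes the feedback shift)")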

    Data-Driven Critical Tract Variable Determination for European Portuguese

    Technologies such as real-time magnetic resonance imaging (RT-MRI) can provide valuable information to evolve our understanding of the static and dynamic aspects of speech by contributing to the determination of which articulators are essential (critical) in producing specific sounds and how (gestures). While a visual analysis and comparison of imaging data or vocal tract profiles can already provide relevant findings, the sheer amount of available data demands, and can strongly profit from, unsupervised data-driven approaches. Recent work in this regard has asserted the possibility of determining critical articulators from RT-MRI data by considering a representation of vocal tract configurations based on landmarks placed on the tongue, lips, and velum, yielding meaningful results for European Portuguese (EP). Advancing this previous work to obtain a characterization of EP sounds grounded in Articulatory Phonology, which is important for exploring critical gestures and advancing, for example, articulatory speech synthesis, entails the consideration of a novel set of tract variables. To this end, this article explores critical variable determination considering a vocal tract representation aligned with Articulatory Phonology and the Task Dynamics framework. The overall results, obtained considering data for three EP speakers, show the applicability of this approach and are consistent with existing descriptions of EP sounds.

    A Speech Production Model for Synthesis-by-rule

    Sponsored in part by the National Science Foundation through Grant GN-534.1 from the Office of Science Information Service to the Information Sciences Research Center, The Ohio State University

    Statistical identification of articulatory roles in speech production.

    The human speech apparatus is a rich source of information and offers many cues in the speech signal due to its biomechanical constraints and physiological interdependencies. Coarticulation, a direct consequence of these speech production factors, is one of the main problems affecting the performance of speech systems. Incorporation of production knowledge could potentially benefit speech recognisers and synthesisers. Hand-coded rules and scores derived from the phonological knowledge used by production-oriented models of speech are simple and incomplete representations of the complex speech production process. Statistical models built from measurements of speech articulation fail to identify the cause of constraints. There is a need for building explanatory yet descriptive models of articulation for understanding and modelling the effects of coarticulation. This thesis aims at providing compact descriptive models of realistic speech articulation by identifying and capturing the essential characteristics of human articulators using measurements from electromagnetic articulography. The constraints on articulators during speech production are identified in the form of critical, dependent and redundant roles using entirely statistical and data-driven methods. The critical role captures the maximally constrained, target-driven behaviour of an articulator. The dependent role models the partial constraints due to physiological interdependencies. The redundant role reflects the unconstrained behaviour of an articulator, which is maximally prone to coarticulation. Statistical target models are also obtained as a by-product of the identified roles. The algorithm for the identification of articulatory roles (and estimation of the respective model distributions) for each phone is presented, and the results are critically evaluated. The identified data-driven constraints are compared with the well-known and commonly used constraints derived from the IPA (International Phonetic Alphabet). The identified critical roles were not only in agreement with the place and manner descriptions of each phone but also provided a phoneme-to-phone transformation by capturing language- and speaker-specific behaviour of articulators. The models trained from the identified constraints fitted the phone distributions better (a 40% improvement). The evaluation of the proposed search procedure against an exhaustive search for the identification of roles demonstrated that the proposed approach performs equally well at a much lower computational load. Articulation models built in the planning stage from sparse yet efficient articulatory representations using standard trajectory generation techniques showed some potential in modelling articulatory behaviour. Plenty of scope exists for further developing models of articulation from the proposed framework.
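
    As a conceptual illustration of how a critical role can be flagged statistically (a hypothetical variance-ratio criterion, not the thesis's actual algorithm): an articulator is target-driven for a phone when its position during that phone is tightly constrained relative to its overall spread.

        import numpy as np

        def is_critical(phone_pos, all_pos, ratio=0.25):
            """Flag an articulator dimension as critical for a phone if its
            within-phone variance is far below its global variance.
            Illustrative criterion only, not the thesis's search procedure."""
            return np.var(phone_pos) < ratio * np.var(all_pos)

        # e.g. tongue-tip height from EMA during /t/ vs. across all speech:
        # critical = is_critical(tt_height_during_t, tt_height_all)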