352 research outputs found

    Speaker matters: Natural inter-speaker variation affects 4-month-olds’ perception of audio-visual speech

    First Published September 27, 2019. In the language development literature, studies often make inferences about infants’ speech perception abilities based on their responses to a single speaker. However, there can be significant natural variability across speakers in how speech is produced (i.e., inter-speaker differences). The current study examined whether inter-speaker differences can affect infants’ ability to detect a mismatch between the auditory and visual components of vowels. Using an eye-tracker, 4.5-month-old infants were tested on auditory-visual (AV) matching for two vowels (/i/ and /u/). Critically, infants were tested with two speakers who naturally differed in how distinctively they articulated the two vowels within and across the categories. Only infants who watched and listened to the speaker whose visual articulations of the two vowels were most distinct from one another were sensitive to AV mismatch. This speaker also produced a visually more distinct /i/ as compared to the other speaker. This finding suggests that infants are sensitive to the distinctiveness of AV information across speakers, and that when making inferences about infants’ perceptual abilities, characteristics of the speaker should be taken into account. The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This research was funded by the grant PSI2014-5452-P from the Spanish Ministry of Economy and Competitiveness to M.M. The authors also acknowledge financial support from the ‘Severo Ochoa Program for Centers/Units of Excellence in R&D’ (SEV-2015-490) and from the Basque Government ‘Programa Predoctoral’ to J.P.

    Emergence of the vowel space in very young children with Down syndrome: An exploratory case study

    The current study presents the preliminary results of an investigation into the development of the vowel space in one female child with Down syndrome (DS). Vowel productions at five points in time, ranging from 1;0 to 3;8 years of age, were analysed to produce age-specific F1-F2 vowel plots and to calculate metrics quantifying changes in their size and dimensions. The results show that changes in DS vowel space area and shape are non-systematic, lacking the definite developmental trajectories present in the productions of typically developing children. An explanation of outcomes using the DIVA model of speech acquisition is proposed.

    A computational model of the relationship between speech intelligibility and speech acoustics

    Speech intelligibility measures how much a speaker can be understood by a listener. Traditional measures of intelligibility, such as word accuracy, are not sufficient to reveal the reasons for intelligibility degradation. This dissertation investigates the underlying sources of intelligibility degradation from the perspectives of both the speaker and the listener. Segmental phoneme errors and suprasegmental lexical boundary errors are developed to reveal the perceptual strategies of the listener. A comprehensive set of automated acoustic measures is developed to quantify variations in the acoustic signal from three perceptual aspects: articulation, prosody, and vocal quality. The developed measures have been validated on a dysarthric speech dataset with varying degrees of severity. Multiple regression analysis is employed to show that the developed measures could predict perceptual ratings reliably. The relationship between the acoustic measures and the listening errors is investigated to show the interaction between speech production and perception. The hypothesis is that the segmental phoneme errors are mainly caused by imprecise articulation, while the suprasegmental lexical boundary errors are due to unreliable phonemic information as well as abnormal rhythm and prosody patterns. To test the hypothesis, within-speaker variations are simulated in different speaking modes. Significant changes have been detected in both the acoustic signals and the listening errors. Results of the regression analysis support the hypothesis by showing that changes in the articulation-related acoustic features are important in predicting changes in listening phoneme errors, while changes in both the articulation- and prosody-related features are important in predicting changes in lexical boundary errors.
    Moreover, significant correlation has been achieved in the cross-validation experiment, which indicates that it is possible to predict intelligibility variations from the acoustic signal. Dissertation/Thesis. Doctoral Dissertation, Speech and Hearing Science, 201

    Phonological reduction and intelligibility in task-oriented dialogue


    The production and perception of peripheral geminate/singleton coronal stop contrasts in Arabic

    Gemination is typologically common word-medially but is rare at the periphery of the word (word-initially and -finally). In line with this observation, prior research on production and perception of gemination has focused primarily on medial gemination. Much less is known about the production and perception of peripheral gemination. This PhD thesis reports on comprehensive articulatory, acoustic and perceptual investigations of geminate-singleton contrasts according to the position of the contrast in the word and in the utterance. The production component of the project investigated the articulatory and acoustic features of medial and peripheral gemination of voiced and voiceless coronal stops in Modern Standard Arabic and regional Arabic vernacular dialects, as produced by speakers from two geographically distant countries, Morocco and Lebanon. The perceptual experiment investigated how standard and dialectal Arabic gemination contrasts in each word position were categorised and discriminated by three groups of non-native listeners, each differing in their native language experience with gemination at different word positions. The first experiment used ultrasound and acoustic recordings to address the extent to which word-initial gemination in Moroccan and Lebanese dialectal Arabic is maintained, as well as the articulatory and acoustic variability of the contrast according to the position of the gemination contrast in the utterance (initial vs. medial) and between the two dialects. The second experiment compared the production of word-medial and -final gemination in Modern Standard Arabic as produced by Moroccan and Lebanese speakers.
    The aim of the perceptual experiment was to disentangle the contribution of phonological and phonetic effects of the listeners’ native languages on the categorisation and discrimination of non-lexical Moroccan gemination by three groups of non-native listeners varying in their phonological (native Lebanese group and heritage Lebanese group, for whom Moroccan is unintelligible, i.e., a non-native language) and phonetic-only (native English group) experience with gemination across the three word positions. The findings in this thesis constitute important contributions to the understanding of positional and dialectal effects on the production and perception of gemination contrasts, going beyond medial gemination (which was mainly included as a control) and illuminating in particular the typologically rare peripheral gemination.

    Analysis of speech and tongue motion in normal and post-glossectomy speaker using cine MRI

    Objective: Since the tongue is the oral structure responsible for mastication, pronunciation, and swallowing functions, patients who undergo glossectomy can be affected in various aspects of these functions. The vowel /i/ uses the tongue shape, whereas /u/ uses tongue and lip shapes. The purpose of this study is to investigate the morphological changes of the tongue and the adaptation of pronunciation using cine MRI for speech of patients who undergo glossectomy. Material and Methods: Twenty-three controls (11 males and 12 females) and 13 patients (eight males and five females) volunteered to participate in the experiment. The patients underwent glossectomy surgery for T1 or T2 lateral lingual tumors. The speech tasks “a souk” and “a geese” were spoken by all subjects, providing data for the vowels /u/ and /i/. Cine MRI and speech acoustics were recorded and measured to compare the changes in the tongue with vowel acoustics after surgery. 2D measurements were made of the interlip distance, tongue-palate distance, tongue position (anterior-posterior and superior-inferior), tongue height on the left and right sides, and pharynx size. Vowel formants F1, F2, and F3 were measured. Results: The patients had significantly lower F2/F1 ratios (F=5.911, p=0.018), and lower F3/F1 ratios that approached significance. This was seen primarily in the /u/ data. Patients had flatter tongue shapes than controls, with a greater effect seen in /u/ than /i/. Conclusion: The patients showed complex adaptation motion in order to preserve the acoustic integrity of the vowels, and the tongue modified cavity size relationships to maintain the value of the formant frequencies.

    The articulatory and acoustic characteristics of Polish sibilants and their consequences for diachronic change

    The study is concerned with the relative synchronic stability of three contrastive sibilant fricatives /s ʂ ɕ/ in Polish. Tongue movement data were collected from nine first-language Polish speakers producing symmetrical real and non-word CVCV sequences in three vowel contexts. A Gaussian model was used to classify the sibilants from spectral information in the noise and from formant frequencies at vowel onset. The physiological analysis showed an almost complete separation between /s ʂ ɕ/ on tongue-tip parameters. The acoustic analysis showed that the greater energy at higher frequencies distinguished /s/ in the fricative noise from the other two sibilant categories. The most salient information at vowel onset was for /ɕ/, which also had a strong palatalizing effect on the following vowel. Whereas either the noise or vowel onset was largely sufficient for the identification of /s ɕ/ respectively, both sets of cues were necessary to separate /ʂ/ from /s ɕ/. The greater synchronic instability of /ʂ/ may derive from its high articulatory complexity coupled with its comparatively low acoustic salience. The data also suggest that the relatively late stage of /ʂ/ acquisition by children may come about because of the weak acoustic information in the vowel for its distinction from /s/.