79 research outputs found

    Duration of frication noise required for identification of English fricatives

    This is the publisher's version, also available electronically from http://scitation.aip.org/content/asa/journal/jasa/85/4/10.1121/1.397961. Natural speech consonant–vowel (CV) syllables ([f, s, θ, š, v, z, ð] followed by [i, u, a]) were computer edited to include 20–70 ms of their frication noise in 10-ms steps as measured from their onset, as well as the entire frication noise. These stimuli, and the entire syllables, were presented to 12 subjects for consonant identification. Results show that the listener does not require the entire fricative–vowel syllable in order to correctly perceive a fricative. The required frication duration depends on the particular fricative, ranging from approximately 30 ms for [š, z] to 50 ms for [f, s, v], while [θ, ð] are identified with reasonable accuracy in only the full frication and syllable conditions. Analysis in terms of the linguistic features of voicing, place, and manner of articulation revealed that fricative identification in terms of place of articulation is much more affected by a decrease in frication duration than identification in terms of voicing and manner of articulation.
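
    For illustration, a minimal sketch of how such gated stimuli could be excised from a digitized CV syllable, assuming the frication onset time has already been measured by hand; the soundfile library and the output naming scheme are illustrative choices, not the editing procedure actually used in the study:

        import soundfile as sf  # assumed available for reading/writing WAV files

        def gate_frication(wav_path, onset_s, out_prefix, durations_ms=range(20, 71, 10)):
            """Excise the first 20-70 ms of frication noise (10-ms steps) from a CV syllable.

            onset_s is the hand-measured time, in seconds, of frication onset in the file.
            """
            signal, sr = sf.read(wav_path)
            onset = int(round(onset_s * sr))
            for dur in durations_ms:
                n_samples = int(round(sr * dur / 1000.0))
                sf.write(f"{out_prefix}_{dur}ms.wav", signal[onset:onset + n_samples], sr)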

    Contributions of semantic and facial information to perception of non-sibilant fricatives

    This is the author's accepted manuscript. The original publication is available at http://jslhr.pubs.asha.org/article.aspx?articleid=1781385. Most studies have been unable to identify reliable acoustic cues for the recognition of the English nonsibilant fricatives /f, v, θ, ð/. The present study was designed to test the extent to which the perception of these fricatives by normal-hearing adults is based on other sources of information, namely, linguistic context and visual information. In Experiment 1, target words beginning with /f/, /θ/, /s/, or /ʃ/ were preceded by either a semantically congruous or incongruous precursor sentence. Results showed an effect of linguistic context on the perception of the distinction between /f/ and /θ/ and on the acoustically more robust distinction between /s/ and /ʃ/. In Experiment 2, participants identified syllables consisting of the fricatives /f, v, θ, ð/ paired with the vowels /i, a, u/. Three conditions were contrasted: Stimuli were presented with (a) both auditory and visual information, (b) auditory information alone, or (c) visual information alone. When errors in terms of voicing were ignored in all 3 conditions, results indicated that perception of these fricatives is as good with visual information alone as with both auditory and visual information combined, and better than for auditory information alone. These findings suggest that accurate perception of nonsibilant fricatives derives from a combination of acoustic, linguistic, and visual information.

    American Chinese learners’ acquisition of L2 Chinese affricates /ts/ and /tsh/

    This is the publisher's version, also available electronically from http://scitation.aip.org/content/asa/journal/poma/18/1/10.1121/1.4798223. Many studies on L2 speech learning have focused on testing the L1 transfer hypothesis. In general, L2 phonemes were found to be merged with similar L1 phonemes to different degrees (Flege 1995). Few studies have examined whether non-phonemic phonetic categories in L1 help or block the formation of new phonetic categories in L2. The current study examined the effect of the L1 English consonantal clusters [ts] and [dz] on learning the L2 Chinese affricates /ts/ and /tsh/. We studied duration and center of gravity (COG) of the Chinese affricates /ts/ and /tsh/ produced by native Chinese speakers, novice American learners of Chinese, and advanced learners. In terms of duration, both learner groups showed a contrast between L2 /ts/ and /tsh/, similar to native Chinese speakers' production. For COG, however, only the advanced learner group showed a contrast between L2 /ts/ and /tsh/ similar to native speakers' production, while the novice learner group did not show a COG difference between the two L2 affricates. The results suggest early acquisition of the durational contrast between the L2 Chinese affricates and later acquisition of the COG contrast between the two affricates.
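
    The abstract does not give the analysis settings, but spectral center of gravity is conventionally the amplitude-weighted mean frequency of the frication spectrum; a minimal sketch under that assumption (power-spectrum weighting and a Hamming window are illustrative choices):

        import numpy as np

        def center_of_gravity(frame, sr):
            """First spectral moment (COG, in Hz) of a windowed frication frame."""
            spectrum = np.abs(np.fft.rfft(frame * np.hamming(len(frame))))
            freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
            power = spectrum ** 2  # weight each frequency bin by its power
            return float(np.sum(freqs * power) / np.sum(power))

        # Duration is simply the frication offset minus onset, e.g. in milliseconds:
        # duration_ms = 1000.0 * (offset_s - onset_s)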

    Effects of tone on the three-way laryngeal distinction in Korean: An acoustic and aerodynamic comparison of the Seoul and South Kyungsang dialects

    This is the publisher's version, made available with the permission of the publisher. The three-way laryngeal distinction among voiceless Korean stops has been well documented for the Seoul dialect. The present study compares the acoustic and aerodynamic properties of this stop series between two dialects, non-tonal Seoul and tonal South Kyungsang Korean. Sixteen male Korean speakers (eight from Seoul and eight from Kyungsang) participated. Measures collected included VOT, f0 at vowel onset, H1-H2, and air pressure and airflow. The presence versus absence of lexical pitch accent affects both the acoustic and aerodynamic properties. First, Seoul speakers use a combination of f0 and VOT to distinguish the three-way contrast of Korean stops, while Kyungsang speakers mainly use VOT. Second, the presence of lexical pitch for Kyungsang speakers makes f0 an unreliable acoustic cue for the three Korean stops. Third, dialectal differences in VOT to mark the three-way distinction support the notion of a diachronic transition whereby VOT differences between the lenis and aspirated stops in Seoul Korean have been decreasing over the past 50 years. Finally, the aerodynamic results make it possible to postulate the articulatory state of the glottis, indicating a positive correlation with acoustic parameters. Based on the acoustic and aerodynamic results, phonological representations of Korean stops for the tonal and non-tonal dialects are suggested.

    Acoustic characteristics of clearly spoken English fricatives

    This is the publisher's version, also available electronically from http://scitation.aip.org/content/asa/journal/jasa/125/6/10.1121/1.2990715. Speakers can adopt a speaking style that allows them to be understood more easily in difficult communication situations, but few studies have examined the acoustic properties of clearly produced consonants in detail. This study attempts to characterize the adaptations in the clear production of American English fricatives in a carefully controlled range of communication situations. Ten female and ten male talkers produced fricatives in vowel-fricative-vowel contexts in both a conversational and a clear style that was elicited by means of simulated recognition errors in feedback received from an interactive computer program. Acoustic measurements were taken for spectral, amplitudinal, and temporal properties known to influence fricative recognition. Results illustrate that (1) there were consistent overall style effects, several of which (consonant duration, spectral peak frequency, and spectral moments) were consistent with previous findings and a few of which (notably consonant-to-vowel intensity ratio) were not; (2) specific acoustic modifications in clear productions of fricatives were influenced by the nature of the recognition errors that prompted the productions and were consistent with efforts to emphasize potentially misperceived contrasts both within the English fricative inventory and based on feedback from the simulated listener; and (3) talkers differed widely in the types and magnitude of all modifications.
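
    One of the measured properties, the consonant-to-vowel intensity ratio, can be expressed as the difference in RMS level (in dB) between the fricative noise and the adjacent vowel; the sketch below assumes that definition, since the abstract does not spell out the exact computation:

        import numpy as np

        def rms_db(samples):
            """RMS amplitude of a waveform segment, in dB (arbitrary reference)."""
            return 20.0 * np.log10(np.sqrt(np.mean(np.square(samples))))

        def consonant_vowel_ratio(fricative, vowel):
            """Consonant-to-vowel intensity ratio: fricative level minus vowel level (dB)."""
            return rms_db(fricative) - rms_db(vowel)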

    What information is necessary for speech categorization? Harnessing variability in the speech signal by integrating cues computed relative to expectations

    This is the author's accepted manuscript. This article may not exactly replicate the final version published in the APA journal. It is not the copy of record. The original publication is available at http://psycnet.apa.org/index.cfm?fa=search.displayrecord&uid=2011-05323-001. Most theories of categorization emphasize how continuous perceptual information is mapped to categories. However, equally important are the informational assumptions of a model: the type of information subserving this mapping. This is crucial in speech perception, where the signal is variable and context dependent. This study assessed the informational assumptions of several models of speech categorization, in particular, the number of cues that are the basis of categorization and whether these cues represent the input veridically or have undergone compensation. We collected a corpus of 2,880 fricative productions (Jongman, Wayland, & Wong, 2000) spanning many talker and vowel contexts and measured 24 cues for each. A subset was also presented to listeners in an 8AFC phoneme categorization task. We then trained a common classification model based on logistic regression to categorize the fricative from the cue values and manipulated the information in the training set to contrast (a) models based on a small number of invariant cues, (b) models using all cues without compensation, and (c) models in which cues underwent compensation for contextual factors. Compensation was modeled by computing cues relative to expectations (C-CuRE), a new approach to compensation that preserves fine-grained detail in the signal. Only the compensation model achieved accuracy similar to listeners and showed the same effects of context. Thus, even simple categorization metrics can overcome the variability in speech when sufficient information is available and compensation schemes like C-CuRE are employed.
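
    As described, C-CuRE recodes each cue as its deviation from what would be expected given contextual factors such as talker and vowel. A minimal sketch of that residualization followed by logistic-regression categorization; the linear-regression implementation of the "expectations" and the scikit-learn calls are assumptions for illustration, not the authors' code:

        import numpy as np
        from sklearn.linear_model import LinearRegression, LogisticRegression

        def one_hot(labels):
            """Dummy-code one categorical factor (e.g., talker or vowel)."""
            cats, idx = np.unique(labels, return_inverse=True)
            out = np.zeros((len(labels), len(cats)))
            out[np.arange(len(labels)), idx] = 1.0
            return out

        def c_cure(cues, *factors):
            """Express each cue relative to expectations: regress it on the
            contextual factors and keep only the residual."""
            design = np.hstack([one_hot(f) for f in factors])
            residuals = np.empty_like(cues, dtype=float)
            for j in range(cues.shape[1]):
                expected = LinearRegression().fit(design, cues[:, j]).predict(design)
                residuals[:, j] = cues[:, j] - expected
            return residuals

        # Hypothetical usage: X is an (n_tokens, 24) matrix of raw cue values,
        # talker and vowel are label arrays, y holds the eight fricative categories.
        # clf = LogisticRegression(max_iter=2000).fit(c_cure(X, talker, vowel), y)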

    Speaker normalization in the perception of Mandarin Chinese tones

    This is the publisher's version, also available electronically from http://scitation.aip.org/content/asa/journal/jasa/102/3/10.1121/1.420092. This study investigated speaker normalization in perception of Mandarin tone 2 (mid-rising) and tone 3 (low-falling-rising) by examining listeners’ use of F0 range as a cue to speaker identity. Two speakers were selected such that tone 2 of the low-pitched speaker and tone 3 of the high-pitched speaker occurred at equivalent F0 heights. Production and perception experiments determined that turning point (the inflection point of the tone) and ΔF0 (the difference in F0 between onset and turning point) distinguished the two tones. Three tone continua, varying in either turning point, ΔF0, or both acoustic dimensions, were then appended to a natural precursor phrase from each of the two speakers. Results showed identification shifts such that identical stimuli were identified as low tones in the high precursor condition but as high tones in the low precursor condition. Stimuli varying in turning point showed no significant shift, suggesting that listeners normalize only when the precursor varies in the same dimension as the stimuli. The magnitude of the shift was greater for stimuli varying only in ΔF0 than for stimuli varying in both turning point and ΔF0, indicating that normalization effects are reduced for stimuli that more closely match natural speech.
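
    A minimal sketch of how the two cues could be read off an extracted F0 contour, assuming the turning point can be taken as the contour's F0 minimum (a simplification; the abstract only defines ΔF0 as the difference between F0 at onset and F0 at the turning point):

        import numpy as np

        def turning_point_and_delta_f0(times, f0):
            """Return (time of turning point, ΔF0) for a tone 2 / tone 3 contour.

            times, f0: 1-D arrays over the voiced portion of the syllable.
            """
            i = int(np.argmin(f0))  # assumed: turning point = F0 minimum of the contour
            return float(times[i]), float(f0[0] - f0[i])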

    Method for the location of burst-onset spectra in the auditory-perceptual space: A study of place of articulation in voiceless stop consonants

    This is the publisher's version, also available electronically from http://scitation.aip.org/content/asa/journal/jasa/89/2/10.1121/1.1894648. A method for distinguishing burst onsets of voiceless stop consonants in terms of place of articulation is described. Four speakers produced the voiceless stops in word-initial position in six vowel contexts. A metric was devised to extract the characteristic burst-friction components at burst onset. The burst-friction components, derived from the metric as sensory formants, were then transformed into log frequency ratios and plotted as points in an auditory-perceptual space (APS). In the APS, each place of articulation was seen to be associated with a distinct region, or target zone. The metric was then applied to a test set of words with voiceless stops preceding ten different vowel contexts as produced by eight new speakers. The present method of analyzing voiceless stops in English enabled us to distinguish place of articulation in these new stimuli with 70% accuracy.
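
    The abstract leaves the metric itself unspecified; the sketch below only illustrates the final steps of turning burst-friction "sensory formants" into log frequency ratios and assigning a point to a place-of-articulation region. The pairing of formants into ratios and the nearest-centroid assignment are assumptions standing in for the paper's target zones:

        import numpy as np

        def aps_point(sf1, sf2, sf3):
            """Map three burst-friction sensory formants (Hz) to APS coordinates
            as log frequency ratios (assumed pairing, for illustration only)."""
            return np.array([np.log10(sf2 / sf1), np.log10(sf3 / sf2)])

        def classify_place(point, zone_centroids):
            """Assign a burst onset to the place whose zone centroid is nearest.

            zone_centroids: dict mapping 'labial' / 'alveolar' / 'velar' to APS points.
            """
            return min(zone_centroids, key=lambda place: np.linalg.norm(point - zone_centroids[place]))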

    Obituary: Wendy Herd (1973-2020)

    The Effect of Instructed Second Language Learning on the Acoustic Properties of First Language Speech

    This paper reports on a comprehensive phonetic study of American classroom learners of Russian, investigating the influence of the second language (L2) on the first language (L1). Russian and English productions of 20 learners were compared to 18 English monolingual controls, focusing on the acoustics of word-initial and word-final voicing. The results demonstrate that learners’ Russian was acoustically different from their English, with shorter voice onset times (VOTs) in [−voice] stops, longer prevoicing in [+voice] stops, more [−voice] stops with short-lag VOTs, and more [+voice] stops with prevoicing, indicating a degree of successful L2 pronunciation learning. Crucially, learners also demonstrated an L1 phonetic change compared to monolingual English speakers. Specifically, the VOT of learners’ initial English voiceless stops was shortened, indicating assimilation with Russian, while the frequency of prevoicing in learners’ English was decreased, indicating dissimilation from Russian. Word-finally, the duration of preceding vowels, stop closures, frication, and voicing during consonantal constriction all demonstrated drift towards Russian norms of word-final voicing neutralization. The study confirms that L2-driven phonetic changes in L1 are possible even in L1-immersed classroom language learners, challenging the role of reduced L1 use and highlighting the plasticity of the L1 phonetic system.
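
    For reference, the VOT categories mentioned above can be operationalized as in the sketch below; the sign convention (negative VOT = prevoicing) is standard, while the 35-ms short-lag boundary is an illustrative assumption rather than a value taken from the study:

        def vot_category(vot_ms, short_lag_max_ms=35.0):
            """Classify a stop token by voice onset time (ms).

            Negative VOT means voicing begins before the release burst (prevoicing);
            the short-lag / long-lag boundary here is an assumed illustrative value.
            """
            if vot_ms < 0:
                return "prevoiced"
            return "short lag" if vot_ms <= short_lag_max_ms else "long lag"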