
    The Phonetics of VOT and Tone Interaction in Cantonese

    This study investigates the possible effects of lexical tone on Voice Onset Time (VOT) in Cantonese, a tonal language with a two-way contrast between short-lag (voiceless unaspirated) and long-lag (voiceless aspirated) stops. VOT was measured as the time interval between the stop burst and the onset of voicing for the following vowel. The recorded speech of 6 native speakers each producing 10 repetitions of 20 different words contrasting in aspiration and tone was analyzed. Tokens from each individual subject were divided into two sets for the purpose of comparison. The first set involved a comparison between the effects of a high-level 55 tone and a mid-level 33 tone. Results showed no significant VOT differences unless aspirated and unaspirated stops were examined separately. In this case, only the aspirated stops showed a significant difference with the 33 tone associated with higher VOT. The second set of stimuli compared the effects of 4 different phonemic tone categories (55, 25, 33, and 21) on VOT. Results show that words beginning with a lower tonal onset (and thus the 25 and 21 tones) correlated with higher VOT than words beginning with a higher tonal onset (the 55 and 33 tones)

    Disentangling the effects of phonation and articulation: Hemispheric asymmetries in the auditory N1m response of the human brain

    BACKGROUND: The cortical activity underlying the perception of vowel identity has typically been addressed by manipulating the first and second formant frequency (F1 & F2) of the speech stimuli. These two values, originating from articulation, are already sufficient for the phonetic characterization of vowel category. In the present study, we investigated how the spectral cues caused by articulation are reflected in cortical speech processing when combined with phonation, the other major part of speech production manifested as the fundamental frequency (F0) and its harmonic integer multiples. To study the combined effects of articulation and phonation we presented vowels with either high (/a/) or low (/u/) formant frequencies which were driven by three different types of excitation: a natural periodic pulseform reflecting the vibration of the vocal folds, an aperiodic noise excitation, or a tonal waveform. The auditory N1m response was recorded with whole-head magnetoencephalography (MEG) from ten human subjects in order to resolve whether brain events reflecting articulation and phonation are specific to the left or right hemisphere of the human brain. RESULTS: The N1m responses for the six stimulus types displayed a considerable dynamic range of 115–135 ms, and were elicited faster (~10 ms) by the high-formant /a/ than by the low-formant /u/, indicating an effect of articulation. While excitation type had no effect on the latency of the right-hemispheric N1m, the left-hemispheric N1m elicited by the tonally excited /a/ was some 10 ms earlier than that elicited by the periodic and the aperiodic excitation. The amplitude of the N1m in both hemispheres was systematically stronger to stimulation with natural periodic excitation. Also, stimulus type had a marked (up to 7 mm) effect on the source location of the N1m, with periodic excitation resulting in more anterior sources than aperiodic and tonal excitation. 
    CONCLUSION: The auditory brain areas of the two hemispheres exhibit differential tuning to natural speech signals, observable already in the passive recording condition. The variations in the latency and strength of the auditory N1m response can be traced back to the spectral structure of the stimuli. More specifically, the combined effects of the harmonic comb structure originating from the natural voice excitation caused by the fluctuating vocal folds and the location of the formant frequencies originating from the vocal tract lead to asymmetric behaviour of the left and right hemisphere

    Transfer Effect of Speech-sound Learning on Auditory-motor Processing of Perceived Vocal Pitch Errors

    Speech perception and production are intimately linked. There is evidence that speech motor learning results in changes to auditory processing of speech. Whether speech motor control benefits from perceptual learning in speech, however, remains unclear. This event-related potential study investigated whether speech-sound learning can modulate the processing of feedback errors during vocal pitch regulation. Mandarin speakers were trained to perceive five Thai lexical tones while learning to associate pictures with spoken words over 5 days. Before and after training, participants produced sustained vowel sounds while they heard their vocal pitch feedback unexpectedly perturbed. As compared to the pre-training session, the magnitude of vocal compensation significantly decreased for the control group, but remained consistent for the trained group at the post-training session. However, the trained group had smaller and faster N1 responses to pitch perturbations and exhibited enhanced P2 responses that correlated significantly with their learning performance. These findings indicate that the cortical processing of vocal pitch regulation can be shaped by learning new speech-sound associations, suggesting that perceptual learning in speech can produce transfer effects that facilitate the neural mechanisms underlying the online monitoring of auditory feedback regarding vocal production

    VOT and F0 in Zulu Dental Clicks and Alveolar Plosives

    The present study investigated the contrast in Voice Onset Time (VOT) and Fundamental Frequency (F0) between different varieties of dental clicks and alveolar plosives. The study attempts to clarify the commonalities and dissimilarities between the two through a set of recordings with a language consultant, a native speaker of Zulu. The findings showed few discrepancies with existing data, falling in line with Strazny’s report on tonal depression in Zulu (2003) and Hanson’s voiceless and voiced dichotomy in F0 at vowel onset (2009). However, there was a certain amount of deviation from Midtlyng’s report on the effects of Speech Rate and Place of Articulation on VOT data (2011)

    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output

    Individual differences in the discrimination of novel speech sounds: effects of sex, temporal processing, musical and cognitive abilities

    This study examined whether rapid temporal auditory processing, verbal working memory capacity, non-verbal intelligence, executive functioning, musical ability and prior foreign language experience predicted how well native English speakers (N = 120) discriminated Norwegian tonal and vowel contrasts as well as a non-speech analogue of the tonal contrast and a native vowel contrast presented over noise. Results confirmed a male advantage for temporal and tonal processing, and also revealed that temporal processing was associated with both non-verbal intelligence and speech processing. In contrast, effects of musical ability on non-native speech-sound processing and of inhibitory control on vowel discrimination were not mediated by temporal processing. These results suggest that individual differences in non-native speech-sound processing are to some extent determined by temporal auditory processing ability, in which males perform better, but are also determined by a host of other abilities that are deployed flexibly depending on the characteristics of the target sounds

    Prosodic focus in Vietnamese

    This paper reports on pilot work on the expression of Information Structure in Vietnamese and argues that Focus in Vietnamese is exclusively expressed prosodically: there are no specific focus markers, and the language uses phonology to express intonational emphasis in similar ways to languages like English or German. The exploratory data indicate that (i) focus is prosodically expressed while word order remains constant, (ii) listeners show good recoverability of the intended focus structure, and (iii) there is a trading relationship between several phonetic parameters (duration, f0, amplitude) involved in signalling prosodic (acoustic) emphasis

    Asymmetric discrimination of non-speech tonal analogues of vowels

    Published in final edited form as: J Exp Psychol Hum Percept Perform. 2019 February; 45(2): 285–300. doi:10.1037/xhp0000603. Directional asymmetries reveal a universal bias in vowel perception favoring extreme vocalic articulations, which lead to acoustic vowel signals with dynamic formant trajectories and well-defined spectral prominences due to the convergence of adjacent formants. The present experiments investigated whether this bias reflects speech-specific processes or general properties of spectral processing in the auditory system. Toward this end, we examined whether analogous asymmetries in perception arise with non-speech tonal analogues that approximate some of the dynamic and static spectral characteristics of naturally-produced /u/ vowels executed with more versus less extreme lip gestures. We found a qualitatively similar but weaker directional effect with two-component tones varying in both the dynamic changes and proximity of their spectral energies. In subsequent experiments, we pinned down the phenomenon using tones that varied in one or both of these two acoustic characteristics. We found comparable asymmetries with tones that differed exclusively in their spectral dynamics, and no asymmetries with tones that differed exclusively in their spectral proximity or both spectral features. We interpret these findings as evidence that dynamic spectral changes are a critical cue for eliciting asymmetries in non-speech tone perception, but that the potential contribution of general auditory processes to asymmetries in vowel perception is limited. Accepted manuscript

    Context effects on second-language learning of tonal contrasts.

    Studies of lexical tone learning generally focus on monosyllabic contexts, while reports of phonetic learning benefits associated with input variability are based largely on experienced learners. This study trained inexperienced learners on Mandarin tonal contrasts to test two hypotheses regarding the influence of context and variability on tone learning. The first hypothesis was that increased phonetic variability of tones in disyllabic contexts makes initial tone learning more challenging in disyllabic than monosyllabic words. The second hypothesis was that the learnability of a given tone varies across contexts due to differences in tonal variability. Results of a word learning experiment supported both hypotheses: tones were acquired less successfully in disyllables than in monosyllables, and the relative difficulty of disyllables was closely related to contextual tonal variability. These results indicate limited relevance of monosyllable-based data on Mandarin learning for the disyllabic majority of the Mandarin lexicon. Furthermore, in the short term, variability can diminish learning; its effects are not necessarily beneficial but dependent on acquisition stage and other learner characteristics. These findings thus highlight the importance of considering contextual variability and the interaction between variability and type of learner in the design, interpretation, and application of research on phonetic learning

    Consonant and Tone Interaction in Cantonese

    In this presentation, I discuss results from a statistical analysis of the acoustic properties of the speech of six native speakers of Cantonese. The particular research question investigated was whether or not tone in Cantonese has an effect on a property known as Voice Onset Time (VOT), a measurement of the duration of what are known as "stop" consonants, and whether this effect is purely a phonetic consequence of how tonal distinctions are produced or whether these effects are mediated by abstract tonal categories. Previous research examining the relationship between tone and VOT in other languages (including Mandarin, Hakka, Shanghainese, Taiwanese, Mazatec, and Kera) has shown mixed results. The study presented here contributes to this research literature by adding Cantonese to the list of tonal languages that have already been investigated. Results from an ANOVA test on the Cantonese data showed that there is a statistically significant effect (