    Intonational Features of Local and Global Discourse Structure

    We present results of a study of the relationship between intonational features including pitch range, timing, and amplitude and aspects of discourse structure defined in terms of Grosz and Sidner's (1986) model of discourse. We compare structural labelings of AP news text with prosodic/acoustic features examined from recordings of the same text read by a professional newscaster. We find significant correlations between prosodic/acoustic characteristics and both local and global aspects of discourse structure identified by our labelers. Our results have applications for speech synthesis and, potentially, for speech recognition. Engineering and Applied Science

    Asymmetrical cognitive load imposed by processing native and non-native speech

    Intonation affects information processing and comprehension. Previous research has found that some international teaching assistants (ITAs) fail to exploit English intonation, potentially posing processing difficulties for students who are native English speakers. However, researchers have also found that non-native listeners find it easier to process sentences spoken by a non-native speaker with a shared language background, an effect known as the interlanguage speech intelligibility benefit (ISIB). Therefore, how the classroom speech of native-speaker teaching assistants (NSTAs) and ITAs affects the processing, comprehension, and attitudes of listeners with different language backgrounds needs further investigation. Using a dual-task paradigm, a comprehension questionnaire, and an attitudinal questionnaire, the present study investigates how the pronunciation and intonation of an NSTA and an ITA affect native English speakers’ and Mandarin-speaking English learners’ processing and comprehension of a lecture, and their attitudes towards the two instructors. The study found processing advantages when the listeners shared the L1 of the speaker, but overall lecture comprehension and attitudes were unaffected. These findings support and extend prior research on ITAs’ intonational patterns and on ISIB. They also have implications for research on the teaching of English pronunciation to non-native instructors.

    Discourse structure and information structure : interfaces and prosodic realization

    In this paper we review the current state of research on the discourse structure (DS) / information structure (IS) interface. This field has received a lot of attention from discourse semanticists and pragmatists, and has made substantial progress in recent years. We summarize the relevant studies. In addition, we look at DS/IS interaction at a different level, that of phonetics. It is known that both information structure and discourse structure can be realized prosodically, but the phonetic interaction between the prosodic devices they employ has hardly ever been discussed in this context. We think that a proper consideration of this aspect of DS/IS interaction would enrich our understanding of the phenomenon, and hence we formulate some related research-programmatic positions.

    Prosodic focus in Vietnamese

    This paper reports on pilot work on the expression of Information Structure in Vietnamese and argues that Focus in Vietnamese is exclusively expressed prosodically: there are no specific focus markers, and the language uses phonology to express intonational emphasis in similar ways to languages like English or German. The exploratory data indicate that (i) focus is prosodically expressed while word order remains constant, (ii) listeners show good recoverability of the intended focus structure, and (iii) there is a trading relationship between the several phonetic parameters (duration, f0, amplitude) involved in signalling prosodic (acoustic) emphasis.

    Prosodic Event Recognition using Convolutional Neural Networks with Context Information

    This paper demonstrates the potential of convolutional neural networks (CNN) for detecting and classifying prosodic events on words, specifically pitch accents and phrase boundary tones, from frame-based acoustic features. Typical approaches use not only feature representations of the word in question but also its surrounding context. We show that adding position features indicating the current word benefits the CNN. In addition, this paper discusses the generalization from a speaker-dependent modelling approach to a speaker-independent setup. The proposed method is simple and efficient and yields strong results not only in speaker-dependent but also speaker-independent cases. Comment: Interspeech 2017, 4 pages, 1 figure