
    Discourse structure and information structure : interfaces and prosodic realization

    In this paper we review the current state of research on the interface between discourse structure (DS) and information structure (IS). This field has received a lot of attention from discourse semanticists and pragmatists, and has made substantial progress in recent years. We summarize the relevant studies. In addition, we look at DS/IS interaction at a different level, that of phonetics. It is known that both information structure and discourse structure can be realized prosodically, but the phonetic interaction between the prosodic devices they employ has hardly ever been discussed in this context. We think that a proper consideration of this aspect of DS/IS interaction would enrich our understanding of the phenomenon, and hence we formulate some related research-programmatic positions.

    Developing the modelling of Swedish prosody in spontaneous dialogue

    The main goal of our current research is the development of the Swedish prosody model. In our analysis of discourse and dialogue intonation we are exploiting model-based resynthesis. By comparing synthesized default and fine-tuned pitch contours for the dialogues under study, we are able to isolate relevant intonation patterns. This analysis of intonation is related to an independent modelling of topic structure consisting of lexical-semantic analysis and text segmentation. Some results from our model-based acoustic analysis are presented, and the implementation in text-to-speech synthesis is discussed.

    Structuring information through gesture and intonation

    Face-to-face communication is multimodal. In unscripted spoken discourse we can observe the interaction of several “semiotic layers”, modalities of information such as syntax, discourse structure, gesture, and intonation. We explore the role of gesture and intonation in structuring and aligning information in spoken discourse through a study of the co-occurrence of pitch accents and gestural apices. Metaphorical spatialization through gesture also plays a role in conveying the contextual relationships between the speaker, the government and other external forces in a naturally occurring political speech setting.

    Using term clouds to represent segment-level semantic content of podcasts

    Spoken audio, like any time-continuous medium, is notoriously difficult to browse or skim without the support of an interface providing semantically annotated jump points to signal to the user where to listen in. Creation of time-aligned metadata by human annotators is prohibitively expensive, motivating the investigation of representations of segment-level semantic content based on transcripts generated by automatic speech recognition (ASR). This paper examines the feasibility of using term clouds to provide users with a structured representation of the semantic content of podcast episodes. Podcast episodes are visualized as a series of sub-episode segments, each represented by a term cloud derived from an ASR transcript. The quality of segment-level term clouds is measured quantitatively, and their utility is investigated using a small-scale user study based on human-labeled segment boundaries. Since the segment-level clouds generated from ASR transcripts prove useful, we examine an adaptation of text tiling techniques to speech in order to generate segments as part of a completely automated indexing and structuring system for browsing spoken audio. Results demonstrate that the generated segments are comparable with human-selected segment boundaries.

    From image to text to speech : The effects of speech prosody on information sequencing in audio description

    Given the extensive body of research in audio description – the verbal-vocal description of visual or audiovisual content for visually impaired audiences – it is striking how little attention has been paid thus far to the spoken dimension of audio description and its para-linguistic, prosodic aspects. This article complements the previous research into how audio description speech is received by partially sighted audiences by analyzing how it is performed vocally. We study the audio description of pictorial art, and one aspect of prosody is examined in detail: pitch, and the segmentation of information in relation to it. We analyze this relation in a corpus of audio-described pictorial art in Finnish by combining phonetic measurements of the pitch with discourse analysis of the information segmentation. Previous studies have already shown that a sentence-initial high pitch acts as a discourse-structuring device in interpreting. Our study shows that the same applies to audio description. In addition, our study suggests that there is a relationship between the scale of the rise in pitch and the scale of the topical transition. That is, when the topical transition is clear, the rise of pitch level between the beginnings of two consecutive spoken sentences is large. Analogously, when the topical transition is small, the change of the sentence-initial pitch level is also rather small. Peer reviewed.

    Pragmatics and Prosody

    Most of the papers collected in this book resulted from presentations and discussions undertaken during the V Lablita Workshop that took place at the Federal University of Minas Gerais, Brazil, on August 23-25, 2011. The workshop was held in conjunction with the II Brazilian Seminar on Pragmatics and Prosody. The guiding themes for the joint event were illocution, modality, attitude, information patterning and speech annotation. Thus, all papers presented here are concerned with theoretical and methodological issues related to the study of speech. The papers in this volume take different theoretical orientations, which are mirrored in the methodological designs of the studies pursued. However, all papers are based on the analysis of actual speech, be it from corpora or from experimental contexts trying to emulate natural speech. Prosody is the keyword that emerges from all the papers in this publication, which indicates the high standing of this category in studies geared towards understanding the major elements that constitute the structuring of speech.