5,167 research outputs found

    Multi-Tier Annotations in the Verbmobil Corpus

    In very large and diverse scientific projects, where groups as different as linguists and engineers work with different intentions on the same signal data or its orthographic transcript and annotate new, valuable information, it is not easy to build a homogeneous corpus. We describe how this can be achieved, given that some of these annotations have not been updated properly or are based on erroneous or deliberately changed versions of the base transcription. We used an algorithm similar to dynamic programming to detect differences between the transcription on which an annotation depends and the reference transcription for the whole corpus. These differences are automatically mapped onto a set of repair operations for the transcriptions, such as splitting compound words and merging neighbouring words. The correction of the annotation is then carried out on the basis of these operations. Whether a correction can be carried out automatically or has to be made manually depends on the type of annotation as well as on the position and nature of the difference. Finally, we present an investigation in which we exploit the multi-tier annotations of the Verbmobil corpus to find out how breathing is correlated with prosodic-syntactic boundaries and dialog acts.
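The difference-detection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it swaps in Python's standard `difflib.SequenceMatcher` for the alignment, and the function name `repair_operations`, the operation labels, and the example tokens are all hypothetical. It only recognises a split or merge when the changed region contains nothing else.

```python
from difflib import SequenceMatcher


def repair_operations(annotated, reference):
    """Compare two token lists and label each difference.

    Returns a list of (operation, annotated_tokens, reference_tokens).
    'split' and 'merge' are hypothetical labels for the repair operations
    named in the abstract; any other difference keeps difflib's own tag
    ('replace', 'delete' or 'insert').
    """
    ops = []
    # autojunk=False disables difflib's popularity heuristic for long inputs.
    matcher = SequenceMatcher(a=annotated, b=reference, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old, new = annotated[i1:i2], reference[j1:j2]
        if tag == "replace" and len(old) == 1 and "".join(new) == old[0]:
            # One annotated token corresponds to several reference tokens.
            ops.append(("split", old, new))
        elif tag == "replace" and len(new) == 1 and "".join(old) == new[0]:
            # Several annotated tokens correspond to one reference token.
            ops.append(("merge", old, new))
        else:
            # Anything else would need type-specific or manual handling.
            ops.append((tag, old, new))
    return ops


if __name__ == "__main__":
    annotated = ["am", "montag", "nachmittag"]
    reference = ["am", "montagnachmittag"]
    print(repair_operations(annotated, reference))
```

Differences that fall through to the generic branch correspond to the cases the abstract says may have to be fixed manually, depending on the annotation type.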

    Il mito dell’isocronia moraica in giapponese. Un’analisi quantitativa basata su corpora orali (The myth of moraic isochrony in Japanese: a quantitative analysis based on spoken corpora)

    Pike (1945) classified the world's languages into two types of rhythmic/prosodic patterns: stress-timed and syllable-timed. According to this classification, stress-timed languages, like English and German, tend to have isochronous interstress intervals, while syllable-timed languages, like Italian and Spanish, tend to have equal syllable duration. Ladefoged (1975) added the mora-timed type, in which isochrony is maintained at the level of the mora, a sub-syllabic constituent that includes either onset and nucleus, or a coda. Japanese is often referred to as a mora-timed language (Otake 2015): the mora is the psychological prosodic unit in spoken language, and the metric unit of traditional poetry (Bloch 1950). The syllabaries, in which each grapheme corresponds to a mora, make this prosodic segmentation clear. However, previous experimental studies have claimed that the mora is not a perfectly isochronous unit (Warner and Arai 2001). The aim of this paper is to present the rhythmic-prosodic system of the Japanese language, giving a precise description of its prosodic units (the mora and the syllable), and to provide empirical quantitative data on the duration of the mora in spontaneous Japanese. The dataset used in the present study is a portion of the Corpus of Spontaneous Japanese called Core, consisting of about 45 hours of extensively annotated speech. The variation of the average duration of the mora has been analysed on the basis of linguistic parameters, such as the typology of mora and the phonotactic structure of the word in which it is included, and of extra-linguistic parameters, such as the typology of speech

    Prosodic description: An introduction for fieldworkers

    This article provides an introductory tutorial on prosodic features such as tone and accent for researchers working on little-known languages. It specifically addresses the needs of non-specialists and thus does not presuppose knowledge of the phonetics and phonology of prosodic features. Instead, it intends to introduce the uninitiated reader to a field often shied away from because of its (in part real, but in part also just imagined) complexities. It consists of a concise overview of the basic phonetic phenomena (section 2) and the major categories and problems of their functional and phonological analysis (sections 3 and 4). Section 5 gives practical advice for documenting and analyzing prosodic features in the field. National Foreign Language Resource Cente

    A Multimodal Analysis of Vocal and Visual Backchannels in Spontaneous Dialogs

    Backchannels (BCs) are short vocal and visual listener responses that signal attention, interest, and understanding to the speaker. Previous studies have investigated BC prediction in telephone-style dialogs from prosodic cues. In contrast, we consider spontaneous face-to-face dialogs. The additional visual modality allows speaker and listener to monitor each other's attention continuously, and we hypothesize that this affects the BC-inviting cues. In this study, we investigate how gaze, in addition to prosody, can cue BCs. Moreover, we focus on the type of BC performed, with the aim of finding out whether vocal and visual BCs are invited by similar cues. In contrast to telephone-style dialogs, we do not find rising/falling pitch to be a BC-inviting cue. However, in a face-to-face setting, gaze appears to cue BCs. In addition, we find that mutual gaze occurs significantly more often during visual BCs. Moreover, vocal BCs are more likely to be timed during pauses in the speaker's speech.

    Tagging Prosody and Discourse Structure in Elicited Spontaneous Speech

    This paper motivates and describes the annotation and analysis of prosody and discourse structure for several large spoken language corpora. The annotation schemata are of two types: tags for prosody and intonation, and tags for several aspects of discourse structure. The choice of the particular tagging schema in each domain is based in large part on the insights they provide in corpus-based studies of the relationship between discourse structure and the accenting of referring expressions in American English. We first describe these results and show that the same models account for the accenting of pronouns in an extended passage from one of the Speech Warehouse hotel-booking dialogues. We then turn to corpora described in Venditti [Ven00], which adapts the same models to Tokyo Japanese. Japanese is interesting to compare to English, because accent is lexically specified and so cannot mark discourse focus in the same way. Analyses of these corpora show that local pitch range expansion serves the analogous focusing function in Japanese. The paper concludes with a section describing several outstanding questions in the annotation of Japanese intonation which corpus studies can help to resolve. Work reported in this paper was supported in part by a grant from the Ohio State University Office of Research, to Mary E. Beckman and co-principal investigators on the OSU Speech Warehouse project, and by an Ohio State University Presidential Fellowship to Jennifer J. Venditti.

    Information structural notions and the fallacy of invariant correlates

    In a first step, definitions of the irreducible information structural categories are given, and in a second step, it is shown that there are no invariant phonological or otherwise grammatical correlates of these categories. In other words, phonology, syntax, and morphology are unable to define information structure. It is a common mistake to assume that information structural categories are expressed by invariant grammatical correlates, be they syntactic, morphological or phonological. Rather, grammatical cues help speaker and hearer to sort out which element carries which information structural role, and only in this sense are the grammatical correlates of information structure important. Languages display variation as to the role of grammar in enhancing categories of information structure, and this variation reflects the variation found in the ‘normal’ syntax and phonology of languages.