931 research outputs found
Study on the phonetic pitch movement of the accentual phrase in Korean read speech
International audienceThe minor prosodic unit in Korean language, generally called Accentual Phrase, is usually defined by its syntactic or phonological cue. This article tries to analyze the correlation between the phonetic pitch movement and the accentual phrase boundary by means of pattern extraction and probabilistic prediction
The Discourse Function of Final Rises in French dialogues
We report the results of an empirical study which aims to describe the discourse function of rises at right edge intonation boundaries in French. A Map-Task corpus containing two dialogues was annotated for IP boundaries and pitch transition points with the INTSINT intonational alphabet. The transcripts of the dialogues were labeled for dialogue structure and dialogue acts, using form and function tags. The relation between rises at IP boundaries with types of dialogue acts and topic shifts was statistically evaluated. As expected, the results show a positive correlation between rises and polar questions and between rises and discourse topic openings. Interestingly, the second correlation was stronger than the first, suggesting that the association of rises with topic openings is not simply due to the effect of questions as introducing new topics
Optimizing the automatic functional annotation of English intonation
One of the fundamental aims of prosodic analysis is to provide a reliable means of extracting functional information (what prosody contributes to meaning) directly from prosodic form. It has been argued that an explicit model of the mapping from prosodic function to prosodic form could provide an objective way of approaching this task. In this presentation we look specifically at some of the problems of optimizing this mapping in order to extract the functional information automatically from the formal representation, hence ultimately directly from the acoustic data
Automatic prosodic analysis for computer aided pronunciation teaching
Correct pronunciation of spoken language requires the appropriate modulation of acoustic characteristics of speech to convey linguistic information at a suprasegmental level. Such prosodic modulation is a key aspect of spoken language and is an important component of foreign language learning, for purposes of both comprehension and intelligibility. Computer aided pronunciation teaching involves automatic analysis of the speech of a non-native talker in order to provide a diagnosis of the learner's performance in comparison with the speech of a native talker. This thesis describes research undertaken to automatically analyse the prosodic aspects of speech for computer aided pronunciation teaching. It is necessary to describe the suprasegmental composition of a learner's speech in order to characterise significant deviations from a native-like prosody, and to offer some kind of corrective diagnosis. Phonological theories of prosody aim to describe the suprasegmental composition of speech..
Recommended from our members
Chapter 2: The Original ToBI System and the Evolution of the ToBI Framework
In this chapter, the authors will try to identify the essential properties of a ToBI framework annotation system by describing the development and design of the original ToBI conventions. In this description, the authors will overview the general phonological theory and the specific theory of Mainstream American English intonation and prosody that the authors decided to incorporate in the original ToBI tags. The authors will also state the practical principles that led us to make the decisions that the authors did. The chapter is organised as follows. Section 2.2 briefly chronicles how the MAE_ToBI system came into being. Section 2.3 briefly describes the consensus account of English intonation and prosody on which the MAE_ToBI system is based. Section 2.4 catalogues the different components of a MAE_ToBI transcription and lists the salient rules which constrain the relationships between different components. This section also expands upon the theoretical foundations and practical consequences of adopting the general structure of multiple labelling tiers, and particularly the separation of the labels for tones from the labels for indexing prosodic boundary strength. Section 2.5 then describes some of the extensions of the basic ToBI tiers that have been adopted by some sites. This section also compares our decisions about the number of tiers and about inter-tier constraints with the analogous decisions for some of the other ToBI systems described in this book. Section 2.6 discusses the status of the symbolic labels relative to the continuous phonetic records that are also an obligatory component of the MAE_ToBI transcription. Section 2.7 then closes by listing several open research questions that the authors would like to see addressed by MAE_ToBI users and the larger ToBI community
Strategies for Representing Tone in African Writing Systems
Tone languages provide some interesting challenges for the designers of new orthographies.
One approach is to omit tone marks, just as stress is not marked in English (zero marking).
Another approach is to do phonemic tone analysis and then make heavy use of diacritic
symbols to distinguish the `tonemes' (exhaustive marking). While orthographies based on
either system have been successful, this may be thanks to our ability to manage inadequate
orthographies rather than to any intrinsic advantage which is afforded by one or the other
approach. In many cases, practical experience with both kinds of orthography in sub-Saharan
Africa has shown that people have not been able to attain the level of reading and writing
fluency that we know to be possible for the orthographies of non-tonal languages. In some
cases this can be attributed to a sociolinguistic setting which does not favour vernacular
literacy. In other cases, the orthography itself might be to blame. If the orthography of a tone
language is difficult to user or to learn, then a good part of the reason, I believe, is that the
designer either has not paid enough attention to the function of tone in the language, or has
not ensured that the information encoded in the orthography is accessible to the ordinary
(non-linguist) user of the language. If the writing of tone is not going to continue to be a
stumbling block to literacy efforts, then a fresh approach to tone orthography is required, one
which assigns high priority to these two factors.
This article describes the problems with orthographies that use too few or too many tone
marks, and critically evaluates a wide range of creative intermediate solutions. I review the
contributions made by phonology and reading theory, and provide some broad methodological
principles to guide someone who is seeking to represent tone in a writing system. The tone
orthographies of several languages from sub-Saharan Africa are presented throughout the
article, with particular emphasis on some tone languages of Cameroon
AUTOMATIC PROSODY GENERATION IN A TEXT-TO-SPEECH SYSTEM FOR HEBREW
The paper presents the module for automatic prosody generation within a system for automatic synthesis of high-quality speech based on arbitrary text in Hebrew. The high quality of synthesis is due to the high accuracy of automatic prosody generation, enabling the introduction of elements of natural sentence prosody of Hebrew. Automatic morphological annotation of text is based on the application of an expert algorithm relying on transformational rules. Syntactic-prosodic parsing is also rule based, while the generation of the acoustic representation of prosodic features is based on classification and regression trees. A tree structure generated during the training phase enables accurate prediction of the acoustic representatives of prosody, namely, durations of phonetic segments as well as temporal evolution of fundamental frequency and energy. Such an approach to automatic prosody generation has lead to an improvement in the quality of synthesized speech, as confirmed by listening tests
Extending automatic transcripts in a unified data representation towards a prosodic-based metadata annotation and evaluation
This paper describes a framework that extends automatic speech transcripts in order to accommodate relevant information coming from manual transcripts, the speech signal itself, and other resources, like lexica. The proposed framework automatically collects, relates, computes, and stores all relevant information together in a self-contained data source, making it possible to easily provide a wide range of interconnected information suitable for speech analysis, training, and evaluating a number of automatic speech processing tasks. The main goal of this framework is to integrate different linguistic and paralinguistic layers of knowledge for a more complete view of their representation and interactions in several domains and languages. The processing chain is composed of two main stages, where the first consists of integrating the relevant manual annotations in the speech recognition data, and the second consists of further enriching the previous output in order to accommodate prosodic information. The described framework has been used for the identification and analysis of structural metadata in automatic speech transcripts. Initially put to use for automatic detection of punctuation marks and for capitalization recovery from speech data, it has also been recently used for studying the characterization of disfluencies in speech. It was already applied to several domains of Portuguese corpora, and also to English and Spanish Broadcast News corpora
- …