5 research outputs found
Prosody Modelling in Concept-to-Speech Generation: Methodological Issues
We explore three issues for the development of concept-to-speech (CTS) systems. We identify information available in a language-generation system that has the potential to impact prosody; investigate the role played by different corpora in CTS prosody modelling; and explore different methodologies for learning how linguistic features impact prosody. Our major focus is on the comparison of two machine learning methodologies: generalized rule induction and memory-based learning. We describe this work in the context of multimedia abstract generation of intensive care (MAGIC) data, a system that produces multimedia briefings on the status of patients who have just undergone a bypass operation.
Prosodic Correlates Of Referent Status
Prosodic correlates of 6 referent status taxonomies and 3 distance-from-last-mention heuristics were investigated on both the acoustic and the symbolic (ToBI) level in a corpus of short news reports read by 6 professional newsreaders. Symbolic correlates are found mainly for pronouns, acoustic correlates for nouns and proper names. However, both the form and the extent of these correlates vary considerably between speakers. 1 INTRODUCTION 1.1 What is given? Prosody can have many functions, among these signalling "new" information. It is frequently assumed that discourse-new information tends to be accented, and discourse-old information to be deaccented. But what exactly is this givenness? A popular operationalization is that entities which have already been mentioned in the discourse are given, all others new [11, 5]. Frequently, an item is also regarded as new if the distance to the last mention is more than one sentence. However, givenness of information and givenness of discourse r..
Expressivity in TTS from Semantics and Pragmatics
In this paper we present ongoing work to produce an expressive TTS reader that can be used in both text and dialogue applications. The system, called SPARSAR, has so far been used to read (English) poetry, but it can now be applied to any text. The text is fully analyzed at the phonetic and phonological level and at the syntactic and semantic level. In addition, the system has access to a restricted list of typical pragmatically marked phrases and expressions that convey specific discourse functions and speech acts and need specialized intonational contours. The text is transformed into poem-like structures, where each line corresponds to a Breath Group that is semantically and syntactically consistent. Stanzas correspond to paragraph boundaries. Analogical parameters are related to ToBI theoretical indices, but their number is doubled. In this paper, we concentrate on short stories and fables.
Constituent-based Accent Prediction
Near-perfect automatic accent assignment is attainable for citation-style speech, but better computational models are needed to predict accent in extended, spontaneous discourses. This paper presents an empirically motivated theory of the discourse focusing nature of accent in spontaneous speech. Hypotheses based on this theory lead to a new approach to accent prediction, in which patterns of deviation from citation form accentuation, defined at the constituent or noun phrase level, are automatically learned from an annotated corpus. Machine learning experiments on 1031 noun phrases from eighteen spontaneous direction-giving monologues show that accent assignment can be significantly improved by up to 4%-6% relative to a hypothetical baseline system that would produce only citation-form accentuation, giving error rate reductions of 11%-25%.
The Computational Processing of Intonational Prominence: A Functional Prosody Perspective
Intonational prominence, or accent, is a fundamental prosodic feature that is said to contribute to discourse meaning. This thesis outlines a new, computational theory of the discourse interpretation of prominence, from a FUNCTIONAL PROSODY perspective. Functional prosody makes the following two important assumptions: first, there is an aspect of prominence interpretation that centrally concerns discourse processes, namely the discourse focusing nature of prominence; and second, the role of prominence in language processing in general, and discourse processing in particular, is not essentially separate from the processing of other grammatical, nonprosodic information. This thesis develops a computational theory of prominence interpretation by explaining how prominence serves as an inference cue in discourse processing. Prominence signals changes in the attentional status of entities in a discourse model, while nonprominence signals that the realized entities are already in discourse focus. Evidence for the new theory is provided by distributional analysis of a spontaneous narrative monologue. New discourse processing algorithms that integrate form of expression, grammatical function and intonational prominence information for reference resolution and attentional state modeling show how the principles of the theory may be applied in SPEECH UNDERSTANDING systems. Finally, aspects of the new theory are explored in accent prediction experiments on a corpus of spontaneous and read direction-giving monologues. Machine learning is used to investigate the extent to which the analyzed higher-level linguistic features associated with prominence may combine with lower-level features that are known to influence accent assignment. 
Original constituent-based accent prediction experiments attempt to bootstrap off of established knowledge about citation-form accenting, and begin to develop an understanding of how the examined features of discourse context may be integrated into accent assignment systems for text-to-speech synthesis.
Engineering and Applied Science