286 research outputs found
Intonational Features of Local and Global Discourse Structure
We present results of a study of the relationship between intonational features, including pitch range, timing, and amplitude, and aspects of discourse structure defined in terms of Grosz and Sidner's (1986) model of discourse. We compare structural labelings of AP news text with prosodic/acoustic features examined from recordings of the same text read by a professional newscaster. We find significant correlations between prosodic/acoustic characteristics and both local and global aspects of discourse structure identified by our labelers. Our results have applications for speech synthesis and, potentially, for speech recognition.
Engineering and Applied Science
Empirical Studies on the Disambiguation of Cue Phrases
Cue phrases are linguistic expressions such as now and well that function as explicit indicators of the structure of a discourse. For example, now may signal the beginning of a subtopic or a return to a previous topic, while well may mark subsequent material as a response to prior material, or as an explanatory comment. However, while cue phrases may convey discourse structure, each also has one or more alternate uses. While incidentally may be used sententially as an adverbial, for example, the discourse use initiates a digression. Although distinguishing discourse and sentential uses of cue phrases is critical to the interpretation and generation of discourse, the question of how speakers and hearers accomplish this disambiguation is rarely addressed. This paper reports results of empirical studies on discourse and sentential uses of cue phrases, in which both text-based and prosodic features were examined for disambiguating power. Based on these studies, it is proposed that discourse versus sentential usage may be distinguished by intonational features, specifically, pitch accent and prosodic phrasing. A prosodic model that characterizes these distinctions is identified. This model is associated with features identifiable from text analysis, including orthography and part of speech, to permit the application of the results of the prosodic analysis to the generation of appropriate intonational features for discourse and sentential uses of cue phrases in synthetic speech.
On the Correlation between Energy and Pitch Accent in Read English Speech
In this paper, we describe a set of experiments that examine the correlation between energy and pitch accent. We tested the discriminative power of the energy component of frequency sub-bands with a variety of frequencies and bandwidths on read speech spoken by four native speakers of Standard American English, using an analysis-by-classification approach. We found that the frequency region most robust to speaker differences is between 2 and 20 bark. Across all speakers, using only energy features, we were able to predict pitch accent in read speech with an accuracy of 81.9%.
A Framework for Eliciting Emotional Speech: Capitalizing on the Actor's Process
This paper offers an approach and a theoretical framework for eliciting emotional speech using actors. The framework is developed by connecting the goal-based model of emotion proposed by Abelson [1], the work of appraisal theorists, and an approach to the actor's technical process widely used in the professional theater and taught in modern conservatories. In doing so, we hope to address some of the difficulties currently encountered in the use of acted speech in emotion research.
Using Prosody and Phonotactics in Arabic Dialect Identification
While Modern Standard Arabic is the formal spoken and written language of the Arab world, dialects are the major communication mode for everyday life; identifying a speaker's dialect is thus critical to speech processing tasks such as automatic speech recognition, as well as speaker identification. We examine the role of prosodic features (intonation and rhythm) across four Arabic dialects (Gulf, Iraqi, Levantine, and Egyptian) for the purpose of automatic dialect identification. We show that prosodic features can significantly improve identification over a purely phonotactic-based approach, with an identification accuracy of 86.33% for 2m utterances.
The Meaning of Intonational Contours in the Interpretation of Discourse
Recent investigations of the contribution that intonation makes to overall utterance and discourse interpretation promise new sources of information for the investigation of long-time concerns in NLP. In Hirschberg & Pierrehumbert 1986 we proposed that intonational features such as phrasing, accent placement, pitch range, and tune represent important sources of information about the attentional and intentional structures of discourse. In this paper we examine the particular contribution of choice of tune, or intonational contour, to discourse interpretation.
Acoustic/Prosodic and Lexical Correlates of Charismatic Speech
Charisma, the ability to command authority on the basis of personal qualities, is more difficult to define than to identify. How do charismatic leaders such as Fidel Castro or Pope John Paul II attract and retain their followers? We present results of an analysis of subjective ratings of charisma from a corpus of American political speech. We identify the associations between charisma ratings and ratings of other personal attributes. We also examine acoustic/prosodic and lexical features of this speech and correlate these with charisma ratings.
V-Measure: A conditional entropy-based external cluster evaluation
We present V-measure, an external entropy-based cluster evaluation measure. V-measure provides an elegant solution to many problems that affect previously defined cluster evaluation measures, including 1) dependence on clustering algorithm or data set, 2) the "problem of matching", where the clustering of only a portion of data points is evaluated, and 3) accurate evaluation and combination of two desirable aspects of clustering, homogeneity and completeness. We compare V-measure to a number of popular cluster evaluation measures and demonstrate that it satisfies several desirable properties of clustering solutions, using simulated clustering results. Finally, we use V-measure to evaluate two clustering tasks: document clustering and pitch accent type clustering.
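As a rough illustration of how homogeneity and completeness can be combined, the sketch below uses the commonly cited formulation of V-measure: homogeneity h = 1 - H(C|K)/H(C), completeness c = 1 - H(K|C)/H(K), and V = (1 + beta) * h * c / (beta * h + c), where C are the reference classes and K the cluster assignments. The abstract itself does not state these formulas, so they are an assumption here, as are the function names and the default beta = 1.

```python
from collections import Counter
from math import log


def _entropy(labels):
    """Shannon entropy H(labels), in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def _conditional_entropy(targets, given):
    """Conditional entropy H(targets | given), in nats."""
    n = len(targets)
    joint = Counter(zip(given, targets))
    given_counts = Counter(given)
    return -sum(
        (c / n) * log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def v_measure(classes, clusters, beta=1.0):
    """Return (homogeneity, completeness, v) for two parallel label lists.

    Assumed formulation (not given in the abstract):
      homogeneity  = 1 - H(C|K) / H(C)   (1 if H(C) == 0)
      completeness = 1 - H(K|C) / H(K)   (1 if H(K) == 0)
      v            = (1 + beta) * h * c / (beta * h + c)
    """
    if len(classes) != len(clusters) or not classes:
        raise ValueError("classes and clusters must be non-empty and equal length")

    h_c = _entropy(classes)
    h_k = _entropy(clusters)
    homogeneity = 1.0 if h_c == 0 else 1.0 - _conditional_entropy(classes, clusters) / h_c
    completeness = 1.0 if h_k == 0 else 1.0 - _conditional_entropy(clusters, classes) / h_k

    denom = beta * homogeneity + completeness
    v = 0.0 if denom == 0 else (1.0 + beta) * homogeneity * completeness / denom
    return homogeneity, completeness, v


if __name__ == "__main__":
    # Hypothetical toy labels: a perfect clustering up to cluster renaming.
    print(v_measure([0, 0, 1, 1], [1, 1, 0, 0]))
    # Everything in one cluster: complete, but not homogeneous.
    print(v_measure([0, 0, 1, 1], [0, 0, 0, 0]))
```

Under this formulation the score does not depend on how cluster labels are named, and putting every point into one cluster gives full completeness but zero homogeneity, so the combined score is zero.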