
Combining Multiple Knowledge Sources for Dialogue Segmentation in Multimedia Archives

By Pei-Yun Hsueh and Johanna D. Moore


Automatic segmentation is important for making multimedia archives comprehensible, and for developing downstream information retrieval and extraction modules. In this study, we explore approaches that can segment multiparty conversational speech by integrating various knowledge sources (e.g., words, audio and video recordings, speaker intention and context). In particular, we evaluate the performance of a Maximum Entropy approach, and examine the effectiveness of multimodal features on the task of dialogue segmentation. We also provide a quantitative account of the effect of using ASR transcription as opposed to human transcripts.

Year: 2007

