6 research outputs found
Recommended from our members
Automatic Summarization of Broadcast News using Structural Features
We present a method for summarizing broadcast news that is not affected by word errors in an automatic speech recognition transcription, using information about the structure of the news program. We construct a directed graphical model to represent the probability distribution and dependencies among the structural features which we train by finding the values of parameters of the conditional probability tables. We then rank segments of the test set and extract the highest ranked ones as a summary. We present the procedure and preliminary test results
Recommended from our members
Soundbite Detection in Broadcast News Domain
In this paper, we present results of a study designed to identify SOUNDBITES in Broadcast News. We describe a Conditional Random Field-based model for the detection of these included speech segments uttered by individuals who are interviewed or who are the subject of a news story. Our goal is to identify direct quotations in spoken corpora which can be directly attributable to particular individuals, as well as to associate these soundbites with their speakers. We frame soundbite detection as a binary classification problem in which each turn is categorized either as a soundbite or not. We use lexical, acoustic/prosodic and structural features on a turn level to train a CRF. We performed a 10-fold cross validation experiment in which we obtained an accuracy of 67.4 % and an F-measure of 0.566 which is 20.9 % and 38.6 % higher than a chance baseline. Index Terms: soundbite detection, speaker roles, speech summarization, information extraction
Blending Sentence Optimization Weights of Unsupervised Approaches for Extractive Speech Summarization
AbstractThis paper evaluates the performance of two unsupervised approaches, Maximum Marginal Relevance (MMR) and concept-based global optimization framework for speech summarization. Automatic summarization is very useful techniques that can help the users browse a large amount of data. This study focuses on automatic extractive summarization on multi-dialogue speech corpus. We propose improved methods by blending each unsupervised approach at sentence level. Sentence level information is leveraged to improve the linguistic quality of selected summaries. First, these scores are used to filter sentences for concept extraction and concept weight computation. Second, we pre-select a subset of candidate summary sentences according to their sentence weights. Last, we extend the optimization function to a joint optimization of concept and sentence weights to cover both important concepts and sentences. Our experimental results show that these methods can improve the system performance comparing to the concept-based optimization baseline for both human transcripts and ASR output. The best scores are achieved by combining all three approaches, which are significantly better than the baseline system
A Probabilistic model of meetings that combines words and discourse features
(c) 2008 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.This is the author's accepted version of this article. The final published version can be found here: http://dx.doi.org/10.1109/TASL.2008.92586