
    Semi-Supervised Extractive Speech Summarization via Co-Training Algorithm

    Supervised methods for extractive speech summarization require a large training set, and summary annotation is often expensive and time-consuming. In this paper, we exploit semi-supervised approaches to leverage unlabeled data. In particular, we investigate the co-training algorithm for the task of extractive meeting summarization. Compared with text summarization, the speech summarization task has a unique characteristic: its features naturally split into two sets, textual features and prosodic/acoustic features. This characteristic makes co-training an appropriate approach for semi-supervised speech summarization. Our experiments on the ICSI meeting corpus show that, by utilizing the unlabeled data, the co-training algorithm significantly improves summarization performance when only a small amount of labeled data is available. Index Terms: extractive meeting summarization, co-training, semi-supervised learning
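
The co-training idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which classifiers, confidence measure, or selection schedule the paper uses, so the nearest-centroid classifier, the distance-margin confidence, and the names `co_train`, `view_a`, `view_b`, `rounds`, and `per_round` are all assumptions made for illustration. Each sentence is described by two feature vectors (one textual view, one prosodic/acoustic view), and the classifier trained on each view takes turns labeling the unlabeled sentences it is most confident about, adding them to a shared labeled pool.

```python
def centroid_fit(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class.
    (Stand-in classifier, assumed for illustration only.)"""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def centroid_predict(centroids, x):
    """Return (label, confidence). Confidence is the gap between the
    squared distances to the two nearest centroids (0.0 if one class)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)), label)
        for label, c in centroids.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def co_train(view_a, view_b, labels, rounds=5, per_round=2):
    """Co-training over two feature views of the same sentences.

    view_a[i], view_b[i]: feature vectors of sentence i in each view
    (e.g. textual and prosodic/acoustic). labels[i] is 1 for a summary
    sentence, 0 otherwise, or None if the sentence is unlabeled.
    Returns a copy of labels with pseudo-labels filled in.
    """
    labels = list(labels)
    for _ in range(rounds):
        for view in (view_a, view_b):  # the two views take turns
            labeled = [i for i, t in enumerate(labels) if t is not None]
            unlabeled = [i for i, t in enumerate(labels) if t is None]
            if not unlabeled:
                return labels
            model = centroid_fit([view[i] for i in labeled],
                                 [labels[i] for i in labeled])
            # Rank unlabeled sentences by this view's confidence.
            scored = [(centroid_predict(model, view[i]), i) for i in unlabeled]
            scored.sort(key=lambda s: s[0][1], reverse=True)
            # Add the most confident predictions to the shared labeled pool.
            for (label, _), i in scored[:per_round]:
                labels[i] = label
    return labels
```

Calling `co_train(textual_features, prosodic_features, labels)` with a handful of labeled sentences and `None` for the rest returns a label list in which the unlabeled entries have been filled by whichever view was most confident in each turn; a final summarizer could then be trained on the enlarged set.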