378 research outputs found

    Intonational Phrases for Speech Summarization

    Get PDF
    Extractive speech summarization approaches select relevant segments of spoken documents and concatenate them to generate a summary. The extraction unit chosen, whether a sentence, syntactic constituent, or other segment, has a significant impact on the overall quality and fluency of the summary. Even though sentences tend to be the choice of most the extractive speech summarizers, in this paper, we present the results of an empirical study indicating that intonational phrases are better units of extraction for summarization. Our study compared four types of input segmentation: sentences, two pause-based segmentation, and intonational phrases (IP). We found that IPs are the best candidates for extractive summarization, improving over the second highest-performing approach, sentence-based summarization, by 8.2% F-measure

    A comparison of feature and semantic-based summarization algorithms for Turkish

    Get PDF
    Akyokuş, Selim (Dogus Author) -- Conference full title: International Symposium on Innovations in Intelligent Systems and Applicaitons, 21-24June 2010, Kayseri & Cappadocia,TURKEY.In this paper we analyze the performances of a feature-based and two semantic-based text summarization algorithms on a new Turkish corpus. The feature-based algorithm uses the statistical analysis of paragraphs, sentences, words and formal clues found in documents, whereas the two semanticbased algorithms employ Latent Semantic Analysis (LSA) approach which enables the selection of the most important sentences in a semantic way. Performance evaluation is conducted by comparing automatically generated summaries with manual summaries generated by a human summarizer. This is the first study that applies LSA based algorithms to Turkish text summarization and its results are promising
    corecore