
    HMM Content Model for TAC2010 Summarization Challenge

    We present the HITS submission for the 2010 TAC Guided Summarization Task. We focus on the main multi-document summarization task rather than the update task. We implement a baseline extractive summarization system from the literature (Barzilay and Lee, 2004), which uses a Hidden Markov Model to assign content (topic) labels to sentences, predicts which topics are most likely to appear in the summary, and constructs the summaries from sentences belonging to these topics. We find that this model performs worse than the results reported in previous work would suggest. The difference may be attributable to the changes we made to the algorithm to accommodate the multi-document summarization task and to the lack of human-annotated domains for the training data.
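The pipeline described in the abstract (label sentences with HMM topic states, then keep sentences whose topics are likely to appear in a summary) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `viterbi` and `select_sentences`, the argument layout, and the idea of passing in ready-made log-probabilities are all assumptions made for illustration. Parameter estimation is omitted entirely; the sketch only shows decoding and selection once the parameters are available.

```python
def viterbi(emission_logp, trans_logp, start_logp):
    """Return the most likely topic-state sequence for one document.

    All arguments are hypothetical inputs for this sketch:
    emission_logp[t][s]: log P(sentence t | topic s)
    trans_logp[r][s]:    log P(topic s | previous topic r)
    start_logp[s]:       log P(first topic is s)
    """
    n_states = len(start_logp)
    # Best log-score of any path ending in each state at the first sentence.
    scores = [start_logp[s] + emission_logp[0][s] for s in range(n_states)]
    backptrs = []
    for t in range(1, len(emission_logp)):
        new_scores, ptrs = [], []
        for s in range(n_states):
            # Pick the predecessor state that maximizes the path score into s.
            best_prev = max(range(n_states),
                            key=lambda r: scores[r] + trans_logp[r][s])
            ptrs.append(best_prev)
            new_scores.append(scores[best_prev] + trans_logp[best_prev][s]
                              + emission_logp[t][s])
        scores = new_scores
        backptrs.append(ptrs)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: scores[s])
    path = [state]
    for ptrs in reversed(backptrs):
        state = ptrs[state]
        path.append(state)
    path.reverse()
    return path


def select_sentences(sentences, topics, topic_in_summary_prob, max_sentences):
    """Keep the sentences whose topics are most likely to occur in a summary.

    topic_in_summary_prob[s] is an assumed, pre-estimated probability that
    topic s appears in a summary. Selected sentences keep document order.
    """
    ranked = sorted(range(len(sentences)),
                    key=lambda i: -topic_in_summary_prob[topics[i]])
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]
```

In use, `viterbi` would be run per document to obtain one topic label per sentence, and `select_sentences` would then be called with those labels. How to pool candidates across the documents of a multi-document cluster is left open here, since the abstract does not specify it.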