HMM Content Model for TAC2010 Summarization Challenge

Authors: Darla Magdalene Shockley, Michael Strube
Publication date: 01/01/2011

We present the HITS submission for the 2010 TAC Guided Summarization Task. We focus on the main multi-document summarization task rather than the update task. We implement a baseline extractive summarization system from the literature (Barzilay and Lee, 2004), which uses a Hidden Markov Model to assign content (topic) labels to sentences, predicts which topics are most likely to appear in the summary, and constructs the summaries from these topics. We find that this model performs worse than expected given the results reported in previous work. The difference may be attributable to the changes we made to the algorithm to accommodate the multi-document summarization task and to the lack of human-annotated domains for the training data.
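
The pipeline outlined in the abstract (label sentences with topics via an HMM, then build a summary from the topics most likely to appear in summaries) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `viterbi` and `select_sentences`, the two-topic setup, and all probabilities are hypothetical, and emission scores are taken as given rather than estimated from per-topic language models.

```python
import math


def viterbi(emission_logp, trans_logp, start_logp):
    """Return the most likely topic label for each sentence.

    emission_logp[t][k]: log P(sentence t | topic k)   (assumed given)
    trans_logp[j][k]:    log P(topic k | previous topic j)
    start_logp[k]:       log P(first sentence has topic k)
    """
    n_topics = len(start_logp)
    scores = [start_logp[k] + emission_logp[0][k] for k in range(n_topics)]
    backptrs = []
    for t in range(1, len(emission_logp)):
        new_scores, ptrs = [], []
        for k in range(n_topics):
            # Best previous topic for landing in topic k at sentence t.
            best_prev = max(range(n_topics),
                            key=lambda j: scores[j] + trans_logp[j][k])
            ptrs.append(best_prev)
            new_scores.append(scores[best_prev] + trans_logp[best_prev][k]
                              + emission_logp[t][k])
        scores = new_scores
        backptrs.append(ptrs)
    # Trace back from the best final topic.
    state = max(range(n_topics), key=lambda k: scores[k])
    path = [state]
    for ptrs in reversed(backptrs):
        state = ptrs[state]
        path.append(state)
    path.reverse()
    return path


def select_sentences(labels, topic_summary_prob, max_sentences):
    """Pick sentences whose topics are most likely to appear in a summary.

    topic_summary_prob[k] is a hypothetical estimate of how often topic k
    occurs in reference summaries. Selected indices keep document order.
    """
    ranked = sorted(range(len(labels)),
                    key=lambda i: -topic_summary_prob[labels[i]])
    return sorted(ranked[:max_sentences])


# Toy example: four sentences, two topics, made-up probabilities.
log = math.log
start = [log(0.9), log(0.1)]
trans = [[log(0.7), log(0.3)], [log(0.2), log(0.8)]]
emis = [[log(0.8), log(0.2)], [log(0.6), log(0.4)],
        [log(0.1), log(0.9)], [log(0.2), log(0.8)]]

labels = viterbi(emis, trans, start)
chosen = select_sentences(labels, topic_summary_prob=[0.3, 0.7], max_sentences=2)
```

In this toy setup the decoder labels the first two sentences with topic 0 and the last two with topic 1, and the selector keeps the two sentences from the topic with the higher summary probability. A multi-document setting, as in the task described above, would additionally need a way to merge or order sentences drawn from several source documents, which this sketch does not attempt.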

CiteSeerX