
    The THU Summarization Systems at TAC 2010

    The TAC 2010 Guided Summarization task requires participants to generate coherent summaries guided by predefined categories and aspects. In this paper, we present our two extractive summarization systems. In the first system, we employ a topic model, Labeled LDA, to model the aspects. The correspondence between the aspects and the topics in Labeled LDA is established by identifying indicative words for each aspect. After training and inference with Labeled LDA, we obtain the salience scores of concepts (named entities and bigrams) from the topic-concept distributions. We then use an Integer Linear Programming (ILP) based maximal coverage method to generate summaries. The second system also uses ILP and maximal coverage for sentence extraction, but obtains the salience of concepts with a pairwise learning-to-rank algorithm, RankNet. Its training samples are constructed from the human-annotated Pyramid data.
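The maximal coverage step shared by both systems can be illustrated with a small sketch. It is not the authors' implementation. It assumes the usual concept-coverage objective: choose sentences so that the total salience of the distinct concepts they cover is as large as possible, subject to a summary length limit. In place of an ILP solver, the sketch uses exhaustive search, which finds the same optimum on tiny inputs but does not scale. The function name `select_summary`, the example sentences, and the concept weights are all hypothetical.

```python
from itertools import combinations


def select_summary(sentences, concept_weights, max_words):
    """Return (chosen sentence texts, covered weight) for the best subset.

    sentences: list of (text, set_of_concepts) pairs.
    concept_weights: dict mapping concept -> salience score.
    max_words: length budget for the summary, in whitespace-separated words.

    Assumption: each concept counts once no matter how many selected
    sentences contain it (the maximal coverage objective).
    """
    best_score, best_subset = 0.0, ()
    indices = range(len(sentences))
    # Exhaustive search over all subsets: a stand-in for the ILP solver.
    for size in range(1, len(sentences) + 1):
        for subset in combinations(indices, size):
            length = sum(len(sentences[i][0].split()) for i in subset)
            if length > max_words:
                continue  # violates the length constraint
            covered = set().union(*(sentences[i][1] for i in subset))
            score = sum(concept_weights.get(c, 0.0) for c in covered)
            if score > best_score:
                best_score, best_subset = score, subset
    return [sentences[i][0] for i in best_subset], best_score


# Hypothetical toy input: three sentences and hand-picked concept weights.
sentences = [
    ("The storm hit the coast on Monday", {"storm", "coast"}),
    ("Thousands were evacuated from the coast", {"evacuated", "coast"}),
    ("Officials said the storm caused flooding", {"storm", "flooding"}),
]
weights = {"storm": 3.0, "coast": 1.0, "evacuated": 2.0, "flooding": 2.5}

summary, score = select_summary(sentences, weights, max_words=12)
print(summary, score)
```

With a 12-word budget, only the second and third sentences fit together, and they cover all four concepts. In either system the weights would come from the salience model (Labeled LDA or RankNet) rather than being set by hand.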