
    Genre identification and goal-focused summarization

    In this paper, we present a novel technique that first identifies a document's genre and then uses that genre to produce summaries tailored to a user's information-seeking needs, such as a plot or opinion summary of a movie review. We call this genre-oriented goal-focused summarization. We create a test corpus to determine genre classification accuracy for 16 genres, and examine how three machine learning algorithms (Random Forests, SVM light and Naïve Bayes) perform with various amounts of training data. Results show that Random Forests outperforms SVM light and Naïve Bayes. The genre tag is used to inform a downstream summarization engine. We define types of summaries for 7 genres, create a ground truth corpus and analyze the results of genre-oriented goal-focused summarization, showing that this type of user-based summarization requires different algorithms than the leading sentence baseline, which is known to perform well in the case of news articles.
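
For readers unfamiliar with the leading sentence baseline mentioned in the abstract, the following is a minimal sketch of the idea: take the first few sentences of a document as its summary. The abstract does not describe the authors' implementation, so the function names, the regex-based sentence splitter, and the default of three sentences are all assumptions made for illustration.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (an assumption, not the paper's method):
    break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def lead_baseline(text, n=3):
    """Leading sentence baseline: return the first n sentences as the summary.
    The default n=3 is a hypothetical choice."""
    return " ".join(split_sentences(text)[:n])


if __name__ == "__main__":
    review = (
        "The film opens in a small coastal town. A stranger arrives by boat. "
        "Trouble follows quickly. I found the pacing slow but the acting superb."
    )
    print(lead_baseline(review, n=2))
```

On a movie review like the one above, the leading sentences tend to describe the plot, so an opinion summary would need to select sentences from elsewhere in the text. This is consistent with the abstract's claim that genre-oriented summaries call for algorithms other than this baseline.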