33 research outputs found

    Inferring Strategies for Sentence Ordering in Multidocument News Summarization

    The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. While sentence ordering for single-document summarization can be determined from the ordering of sentences in the input article, this is not the case for multidocument summarization, where summary sentences may be drawn from different input articles. In this paper, we propose a methodology for studying the properties of ordering information in the news genre and describe experiments done on a corpus of multiple acceptable orderings we developed for the task. Based on these experiments, we implemented a strategy for ordering information that combines constraints from chronological order of events and topical relatedness. Evaluation of our augmented algorithm shows a significant improvement of the ordering over two baseline strategies.

    Summarization of Films and Documentaries Based on Subtitles and Scripts

    We assess the performance of generic text summarization algorithms applied to films and documentaries, using the well-known behavior of summarization of news articles as reference. We use three datasets: (i) news articles, (ii) film scripts and subtitles, and (iii) documentary subtitles. Standard ROUGE metrics are used for comparing generated summaries against news abstracts, plot summaries, and synopses. We show that the best performing algorithms are LSA, for news articles and documentaries, and LexRank and Support Sets, for films. Despite the different nature of films and documentaries, their relative behavior is in accordance with that obtained for news articles. Comment: 7 pages, 9 tables, 4 figures, submitted to Pattern Recognition Letters (Elsevier)

    IMPROVING SUMMARY QUALITY USING SUMMARIZING STRATEGIES FROM MULTIPLE SOURCE TEXTS

    The aim of this study was to examine students' summarizing strategies in writing summaries from multiple texts. A total of 30 students, drawn from high-ability and low-ability groups based on TOEIC scores, took part in a summary-writing task used for data collection. The participants were taught summary writing using concept mapping strategies. The students' summarizing strategies were collected through a think-aloud protocol and a questionnaire. This study suggests that explicit instruction in the use of concept mapping strategies during summarizing was crucial to improving the quality of the students' summaries. It also indicates some differences in the strategies employed by the two groups. Information about the successful strategies employed by high-ability students in writing a summary can be used to teach low-ability students.

    USING CONCEPT MAP SOFTWARE TO IMPROVE SUMMARY WRITING QUALITY

    This study suggests that conceptual summarizing of multiple sources is key to deep learning and useful for language learning strategies. The participants were 68 students from three classes taking an upper-intermediate reading course at the English Department of State Polytechnic of Sriwijaya in Palembang. The t test indicated that the students who received the strategic summarizing intervention earned significantly higher marks after eight reading and summarizing measures using a computer-based concept map. Data collected from a questionnaire and a focus group discussion showed that, while summarizing from multiple sources using a computer-based concept map, the students employed prereading, conceptual reading, and summarizing strategies. We noted student activities that stressed various aspects of conceptual learning. It is suggested that conceptual summarizing of multiple sources using a computer-based concept map be used in English reading because it allows students to process information from sources, without which they are not able to organize main ideas into a coherent summary. It also helps students become familiar with the textual structure of English academic summary writing.

    Automatic summary evaluation. ROUGE modifications

    Nowadays there is no common approach to summary evaluation. Manual evaluation is expensive and subjective, and it is not applicable in real time or on a large corpus. Widely used approaches involve little human effort and assume comparison with a set of reference summaries. We tried to overcome drawbacks of existing metrics, such as ignoring redundant information, synonyms, and sentence ordering. Our method combines edit distance, ROUGE-SU, and a trigram similarity measure enriched by weights for different parts of speech and synonyms. Since nouns provide the most valuable information, each sentence is mapped into a set of nouns. If the normalized intersection of any pair is greater than a predefined threshold, the sentences are penalized. For extracts there is no need to analyze sentence structure, but sentence ordering is crucial. Sometimes it is impossible to compare sentence order with a gold standard. Therefore similarity between adjacent sentences may be used as a measure of text coherence. Chronological constraint violation should be penalized. Relevance score and readability assessment may be combined in the F-measure. Machine learning can be applied to choose the best parameter values.
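
    The redundancy check described in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not say how the intersection is normalized, so the sketch assumes division by the size of the smaller noun set, and the function names, the 0.5 threshold, and the pre-extracted noun sets are all hypothetical.

```python
from itertools import combinations


def normalized_overlap(a: set, b: set) -> float:
    """Shared nouns divided by the smaller set's size (an assumed normalization)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def redundant_pairs(noun_sets: list, threshold: float = 0.5) -> list:
    """Return index pairs of sentences whose noun overlap exceeds the threshold."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(noun_sets), 2)
        if normalized_overlap(a, b) > threshold
    ]


# Hypothetical noun sets; in practice they would come from a POS tagger.
noun_sets = [
    {"government", "budget", "tax"},
    {"budget", "tax", "minister"},
    {"storm", "coast"},
]
print(redundant_pairs(noun_sets))
```

    Each returned pair marks two summary sentences that would be penalized as redundant under these assumptions.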

    Learning to Order Facts for Discourse Planning in Natural Language Generation

    This paper presents a machine learning approach to discourse planning in natural language generation. More specifically, we address the problem of learning the most natural ordering of facts in discourse plans for a specific domain. We discuss our methodology and how it was instantiated using two different machine learning algorithms. A quantitative evaluation performed in the domain of museum exhibit descriptions indicates that our approach performs significantly better than manually constructed ordering rules. Being retrainable, the resulting planners can be ported easily to other similar domains, without requiring language technology expertise. Comment: 8 pages, 4 figures, 1 table

    A Query Focused Multi Document Automatic Summarization


    Evaluating Centering for Information Ordering Using Corpora

    In this article we discuss several metrics of coherence defined using centering theory and investigate the usefulness of such metrics for information ordering in automatic text generation. We estimate empirically which is the most promising metric and how useful this metric is using a general methodology applied on several corpora. Our main result is that the simplest metric (which relies exclusively on NOCB transitions) sets a robust baseline that cannot be outperformed by other metrics which make use of additional centering-based features. This baseline can be used for the development of both text-to-text and concept-to-text generation systems.
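
    A NOCB-based ordering score of the kind mentioned in this abstract could look roughly like the sketch below. It assumes that a NOCB transition is one where two adjacent utterances mention no entity in common, and that an ordering with fewer such transitions is preferred; the entity sets and the function name are hypothetical, and the exhaustive search is only practical for very short texts.

```python
from itertools import permutations


def count_nocb(ordering) -> int:
    """Count adjacent utterance pairs that share no entity (assumed NOCB definition)."""
    return sum(1 for prev, cur in zip(ordering, ordering[1:]) if not (prev & cur))


# Hypothetical utterances, each reduced to the set of entities it mentions.
utterances = [
    frozenset({"vase", "painter"}),
    frozenset({"museum"}),
    frozenset({"painter", "museum"}),
]

# Brute-force search for an ordering with the fewest NOCB transitions.
best = min(permutations(utterances), key=count_nocb)
print(count_nocb(utterances), count_nocb(best))
```

    Here the given order contains one NOCB transition, while moving the third utterance between the other two removes it.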

    Summary of a Topical Forum FAQ Based on the Chinese Composition Structure

    An automatic multiple-document summarization system for producing frequently asked questions (FAQ) of a topical forum can, in theory, save forum Webmasters a great deal of time. This work addresses the summary composition issue of a previous work by proposing a structured presentation based on a four-part pattern of traditional Chinese articles. The results of the experiment show that the enhanced system with both domain-terminology corpus methods produced a significantly better summary presentation than the original system. Recall and precision performance indices and user evaluations are also presented and discussed to show their practical implications.