Inferring Strategies for Sentence Ordering in Multidocument News Summarization
The problem of organizing information for multidocument summarization so that
the generated summary is coherent has received relatively little attention.
While sentence ordering for single document summarization can be determined
from the ordering of sentences in the input article, this is not the case for
multidocument summarization where summary sentences may be drawn from different
input articles. In this paper, we propose a methodology for studying the
properties of ordering information in the news genre and describe experiments
done on a corpus of multiple acceptable orderings we developed for the task.
Based on these experiments, we implemented a strategy for ordering information
that combines constraints from chronological order of events and topical
relatedness. Evaluation of our augmented algorithm shows a significant
improvement in ordering quality over two baseline strategies.
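The combination of chronological and topical constraints described above can be illustrated with a toy sketch. The data model, field names, and tie-breaking scheme here are invented for illustration and are not the paper's actual algorithm:

```python
from datetime import date

# Hypothetical sentence records: (text, publication_date, topic_id).
sentences = [
    ("A quake struck the coast.", date(2001, 3, 1), "event"),
    ("Rescue teams arrived.", date(2001, 3, 2), "response"),
    ("Aid groups sent supplies.", date(2001, 3, 3), "response"),
    ("Aftershocks continued.", date(2001, 3, 2), "event"),
]

def order_sentences(sents):
    """Order chronologically while keeping topically related
    sentences adjacent (grouped into topic blocks)."""
    # Primary key: date of the earliest sentence on each topic,
    # so a topic block starts at its first event and stays together.
    first_date = {}
    for _, d, topic in sents:
        if topic not in first_date or d < first_date[topic]:
            first_date[topic] = d
    return sorted(sents, key=lambda s: (first_date[s[2]], s[2], s[1]))
```

Sorting on a composite key of this kind is one simple way to make topical relatedness dominate over strict date order when the two conflict.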
Summarization of Films and Documentaries Based on Subtitles and Scripts
We assess the performance of generic text summarization algorithms applied to
films and documentaries, using the well-known behavior of summarization of news
articles as reference. We use three datasets: (i) news articles, (ii) film
scripts and subtitles, and (iii) documentary subtitles. Standard ROUGE metrics
are used for comparing generated summaries against news abstracts, plot
summaries, and synopses. We show that the best performing algorithms are LSA,
for news articles and documentaries, and LexRank and Support Sets, for films.
Despite the different nature of films and documentaries, their relative
behavior is in accordance with that obtained for news articles.
Comment: 7 pages, 9 tables, 4 figures, submitted to Pattern Recognition Letters (Elsevier)
IMPROVING SUMMARY QUALITY USING SUMMARIZING STRATEGIES FROM MULTIPLE SOURCE TEXTS
The aim of this study was to examine students' summarizing strategies in writing summaries from multiple texts. A total of 30 students, drawn from high-ability and low-ability groups based on TOEIC scores, were invited to participate; data were collected using a summary writing task. The participants were taught summary writing using concept mapping strategies, and their summarizing strategies were captured through think-aloud protocols and a questionnaire. This study suggests that explicit instruction in the use of concept mapping strategies during summarizing was crucial to improving the quality of their summaries. It also indicates some differences in the strategies employed by students in the two groups. Information about the successful strategies employed by high-ability students in writing a summary can be used to teach low-ability students.
USING CONCEPT MAP SOFTWARE TO IMPROVE SUMMARY WRITING QUALITY
This study suggests that conceptual summarizing of multiple sources is key to deep learning and a useful language learning strategy. The participants were 68 students from three classes in an upper-intermediate reading course at the English Department of the State Polytechnic of Sriwijaya in Palembang. A t-test indicated that the students who received the strategic summarizing intervention scored significantly higher after eight reading and summarizing measures using a computer-based concept map. Data collected from a questionnaire and focus group discussions showed that, when summarizing from multiple sources with a computer-based concept map, the students employed prereading, conceptual reading, and summarizing strategies, and their activities stressed various aspects of conceptual learning. It is suggested that conceptual summarizing of multiple sources using computer-based concept maps be used in English reading instruction, because it allows students to process information from sources, without which they are unable to organize main ideas into a coherent summary. It also helps students become familiar with the textual structure of English academic summary writing.
Automatic Summary Evaluation: ROUGE Modifications
Nowadays there is no common approach to summary evaluation. Manual evaluation is expensive and subjective, and it is not applicable in real time or on a large corpus. Widely used approaches involve little human effort and assume comparison with a set of reference summaries. We try to overcome drawbacks of existing metrics, such as ignoring redundant information, synonyms, and sentence ordering. Our method combines edit distance, ROUGE-SU, and a trigram similarity measure, enriched with weights for different parts of speech and for synonyms. Since nouns provide the most valuable information, each sentence is mapped to a set of nouns; if the normalized intersection of any pair of these sets is greater than a predefined threshold, the sentences are penalized. When producing extracts there is no need to analyze sentence structure, but sentence ordering is crucial. Sometimes it is impossible to compare sentence order with a gold standard; in that case, similarity between adjacent sentences may be used as a measure of text coherence. Chronological constraint violations should be penalized. The relevance score and readability assessment may be combined in an F-measure, and machine learning can be applied to choose the best parameter values.
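The noun-overlap redundancy penalty described above can be sketched as follows. The noun extraction here is a toy stand-in (a stoplist heuristic); a real system would use a part-of-speech tagger, and the threshold value is arbitrary:

```python
STOP = {"the", "a", "an", "is", "are", "was", "of", "in", "on", "and"}

def noun_set(sentence):
    # Toy heuristic: treat non-stopword tokens as candidate nouns.
    return {w.lower().strip(".,") for w in sentence.split()
            if w.lower().strip(".,") not in STOP}

def redundancy_penalty(sentences, threshold=0.5):
    """Count sentence pairs whose normalized noun-set overlap
    (Jaccard similarity) exceeds the threshold; each such pair
    incurs a penalty of 1."""
    sets = [noun_set(s) for s in sentences]
    penalty = 0
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            union = sets[i] | sets[j]
            if union and len(sets[i] & sets[j]) / len(union) > threshold:
                penalty += 1
    return penalty
```

Normalizing the intersection by the union keeps the score in [0, 1], so a single threshold works regardless of sentence length.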
Learning to Order Facts for Discourse Planning in Natural Language Generation
This paper presents a machine learning approach to discourse planning in
natural language generation. More specifically, we address the problem of
learning the most natural ordering of facts in discourse plans for a specific
domain. We discuss our methodology and how it was instantiated using two
different machine learning algorithms. A quantitative evaluation performed in
the domain of museum exhibit descriptions indicates that our approach performs
significantly better than manually constructed ordering rules. Being
retrainable, the resulting planners can be ported easily to other similar
domains, without requiring language technology expertise.
Comment: 8 pages, 4 figures, 1 table
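One simple way to learn an ordering of facts from example orderings is to count pairwise precedences in the training data and sort new fact sets accordingly. This sketch is only an illustration of the idea; the fact-type names and the counting scheme are invented, not the paper's actual learners:

```python
from collections import Counter
from functools import cmp_to_key

# Toy training data: each example is an ordered list of fact types,
# e.g. for museum exhibit descriptions (names are hypothetical).
training_orders = [
    ["creation-period", "painted-by", "current-location"],
    ["creation-period", "painted-by", "exhibit-story"],
    ["creation-period", "current-location", "exhibit-story"],
]

def learn_precedence(orders):
    """Count how often fact type a precedes fact type b."""
    prec = Counter()
    for order in orders:
        for i, a in enumerate(order):
            for b in order[i + 1:]:
                prec[(a, b)] += 1
    return prec

def order_facts(facts, prec):
    """Sort facts so that learned precedences are respected:
    a comes before b if a preceded b more often in training."""
    def cmp(a, b):
        return prec[(b, a)] - prec[(a, b)]
    return sorted(facts, key=cmp_to_key(cmp))
```

Because the planner is retrained from example orderings rather than hand-written rules, porting it to a new domain only requires new training data, which matches the portability claim in the abstract.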
Evaluating Centering for Information Ordering Using Corpora
In this article we discuss several metrics of coherence defined using centering theory and investigate the usefulness of such metrics for information ordering in automatic text generation. We estimate empirically which is the most promising metric and how useful this metric is using a general methodology applied on several corpora. Our main result is that the simplest metric (which relies exclusively on NOCB transitions) sets a robust baseline that cannot be outperformed by other metrics which make use of additional centering-based features. This baseline can be used for the development of both text-to-text and concept-to-text generation systems.
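The NOCB baseline can be sketched in a few lines: an ordering is scored by the number of adjacent sentence pairs that share no entity (a NOCB transition), and orderings with fewer such transitions are preferred. Entity annotation is given here as plain sets for illustration; the sentences and entities are invented:

```python
def nocb_count(ordering):
    """Count adjacent sentence pairs with no entity in common
    (NOCB transitions); lower is more coherent."""
    return sum(
        1 for prev, cur in zip(ordering, ordering[1:])
        if not (prev & cur)
    )

# Each sentence is represented only by its set of mentioned entities.
coherent = [{"museum", "statue"}, {"statue", "artist"}, {"artist"}]
scrambled = [{"museum", "statue"}, {"artist"}, {"statue", "artist"}]
```

That such a minimal count is hard to beat with richer centering features is exactly the article's main result.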
Summary of a Topical Forum FAQ Based on the Chinese Composition Structure
An automatic multiple-document summarization system that produces the frequently asked questions (FAQ) of a topical forum can, in theory, save forum webmasters a great deal of time. This work addresses the summary composition issue of a previous work by proposing a structured presentation based on the four-part pattern of traditional Chinese articles. The experimental results show that the system enhanced with both domain-terminology corpus methods produced a significantly better summary presentation than the original system. Recall and precision performance indices and user evaluations are also presented and discussed to show their practical implications.