Inferring Strategies for Sentence Ordering in Multidocument News Summarization
The problem of organizing information for multidocument summarization so that
the generated summary is coherent has received relatively little attention.
While sentence ordering for single document summarization can be determined
from the ordering of sentences in the input article, this is not the case for
multidocument summarization where summary sentences may be drawn from different
input articles. In this paper, we propose a methodology for studying the
properties of ordering information in the news genre and describe experiments
done on a corpus of multiple acceptable orderings we developed for the task.
Based on these experiments, we implemented a strategy for ordering information
that combines constraints from chronological order of events and topical
relatedness. Evaluation of our augmented algorithm shows a significant
improvement in ordering quality over two baseline strategies.
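The combination of chronological and topical constraints described above can be illustrated with a toy sketch. The data model, field names, and tie-breaking scheme here are invented for illustration and are not the paper's actual algorithm:

```python
from datetime import date

# Hypothetical sentence records: (text, publication_date, topic_id).
sentences = [
    ("A quake struck the coast.", date(2001, 3, 1), "event"),
    ("Rescue teams arrived.", date(2001, 3, 2), "response"),
    ("Aid groups sent supplies.", date(2001, 3, 3), "response"),
    ("Aftershocks continued.", date(2001, 3, 2), "event"),
]

def order_sentences(sents):
    """Order chronologically while keeping topically related
    sentences adjacent (grouped into topic blocks)."""
    # Primary key: date of the earliest sentence on each topic,
    # so a topic block starts at its first event and stays together.
    first_date = {}
    for _, d, topic in sents:
        if topic not in first_date or d < first_date[topic]:
            first_date[topic] = d
    return sorted(sents, key=lambda s: (first_date[s[2]], s[2], s[1]))
```

Sorting on a composite key of this kind is one simple way to make topical relatedness dominate over strict date order when the two conflict.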
Summarization of Films and Documentaries Based on Subtitles and Scripts
We assess the performance of generic text summarization algorithms applied to
films and documentaries, using the well-known behavior of summarization of news
articles as reference. We use three datasets: (i) news articles, (ii) film
scripts and subtitles, and (iii) documentary subtitles. Standard ROUGE metrics
are used for comparing generated summaries against news abstracts, plot
summaries, and synopses. We show that the best performing algorithms are LSA,
for news articles and documentaries, and LexRank and Support Sets, for films.
Despite the different nature of films and documentaries, their relative
behavior is in accordance with that obtained for news articles.
Comment: 7 pages, 9 tables, 4 figures, submitted to Pattern Recognition Letters (Elsevier)
IMPROVING SUMMARY QUALITY USING SUMMARIZING STRATEGIES FROM MULTIPLE SOURCE TEXTS
The aim of this study was to examine students' summarizing strategies in writing summaries from multiple texts. A total of 30 students, drawn from high-ability and low-ability groups based on TOEIC scores, were invited to participate; data were collected using a summary writing task. The participants were taught summary writing using concept mapping strategies, and their summarizing strategies were captured through think-aloud protocols and a questionnaire. This study suggests that explicit instruction in the use of concept mapping strategies during summarizing was crucial to improving the quality of their summaries. It also indicates some differences in the strategies employed by students in the two groups. Information about the successful strategies employed by high-ability students in writing a summary can be used to teach low-ability students.
USING CONCEPT MAP SOFTWARE TO IMPROVE SUMMARY WRITING QUALITY
This study suggests that conceptual summarizing of multiple sources is key to deep learning and a useful language learning strategy. The participants were 68 students from three classes in an upper-intermediate reading course at the English Department of the State Polytechnic of Sriwijaya in Palembang. A t-test indicated that the students who received the strategic summarizing intervention scored significantly higher after eight reading and summarizing measures using a computer-based concept map. Data collected from a questionnaire and focus group discussions showed that, when summarizing from multiple sources with a computer-based concept map, the students employed prereading, conceptual reading, and summarizing strategies, and their activities stressed various aspects of conceptual learning. It is suggested that conceptual summarizing of multiple sources using computer-based concept maps be used in English reading instruction, because it allows students to process information from sources, without which they are unable to organize main ideas into a coherent summary. It also helps students become familiar with the textual structure of English academic summary writing.
Automatic Summary Evaluation: ROUGE Modifications
Nowadays there is no common approach to summary evaluation. Manual evaluation is expensive and subjective, and it is not applicable in real time or on a large corpus. Widely used approaches involve little human effort and assume comparison with a set of reference summaries. We try to overcome drawbacks of existing metrics, such as ignoring redundant information, synonyms, and sentence ordering. Our method combines edit distance, ROUGE-SU, and a trigram similarity measure, enriched with weights for different parts of speech and for synonyms. Since nouns provide the most valuable information, each sentence is mapped to a set of nouns; if the normalized intersection of any pair of these sets is greater than a predefined threshold, the sentences are penalized. When producing extracts there is no need to analyze sentence structure, but sentence ordering is crucial. Sometimes it is impossible to compare sentence order with a gold standard; in that case, similarity between adjacent sentences may be used as a measure of text coherence. Chronological constraint violations should be penalized. The relevance score and readability assessment may be combined in an F-measure, and machine learning can be applied to choose the best parameter values.
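The noun-overlap redundancy penalty described above can be sketched as follows. The noun extraction here is a toy stand-in (a stoplist heuristic); a real system would use a part-of-speech tagger, and the threshold value is arbitrary:

```python
STOP = {"the", "a", "an", "is", "are", "was", "of", "in", "on", "and"}

def noun_set(sentence):
    # Toy heuristic: treat non-stopword tokens as candidate nouns.
    return {w.lower().strip(".,") for w in sentence.split()
            if w.lower().strip(".,") not in STOP}

def redundancy_penalty(sentences, threshold=0.5):
    """Count sentence pairs whose normalized noun-set overlap
    (Jaccard similarity) exceeds the threshold; each such pair
    incurs a penalty of 1."""
    sets = [noun_set(s) for s in sentences]
    penalty = 0
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            union = sets[i] | sets[j]
            if union and len(sets[i] & sets[j]) / len(union) > threshold:
                penalty += 1
    return penalty
```

Normalizing the intersection by the union keeps the score in [0, 1], so a single threshold works regardless of sentence length.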
Learning to Order Facts for Discourse Planning in Natural Language Generation
This paper presents a machine learning approach to discourse planning in
natural language generation. More specifically, we address the problem of
learning the most natural ordering of facts in discourse plans for a specific
domain. We discuss our methodology and how it was instantiated using two
different machine learning algorithms. A quantitative evaluation performed in
the domain of museum exhibit descriptions indicates that our approach performs
significantly better than manually constructed ordering rules. Being
retrainable, the resulting planners can be ported easily to other similar
domains, without requiring language technology expertise.
Comment: 8 pages, 4 figures, 1 table
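One simple way to learn an ordering of facts from example orderings is to count pairwise precedences in the training data and sort new fact sets accordingly. This sketch is only an illustration of the idea; the fact-type names and the counting scheme are invented, not the paper's actual learners:

```python
from collections import Counter
from functools import cmp_to_key

# Toy training data: each example is an ordered list of fact types,
# e.g. for museum exhibit descriptions (names are hypothetical).
training_orders = [
    ["creation-period", "painted-by", "current-location"],
    ["creation-period", "painted-by", "exhibit-story"],
    ["creation-period", "current-location", "exhibit-story"],
]

def learn_precedence(orders):
    """Count how often fact type a precedes fact type b."""
    prec = Counter()
    for order in orders:
        for i, a in enumerate(order):
            for b in order[i + 1:]:
                prec[(a, b)] += 1
    return prec

def order_facts(facts, prec):
    """Sort facts so that learned precedences are respected:
    a comes before b if a preceded b more often in training."""
    def cmp(a, b):
        return prec[(b, a)] - prec[(a, b)]
    return sorted(facts, key=cmp_to_key(cmp))
```

Because the planner is retrained from example orderings rather than hand-written rules, porting it to a new domain only requires new training data, which matches the portability claim in the abstract.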
Evaluating Centering for Information Ordering Using Corpora
In this article we discuss several metrics of coherence defined using centering theory and investigate the usefulness of such metrics for information ordering in automatic text generation. We estimate empirically which is the most promising metric and how useful this metric is using a general methodology applied on several corpora. Our main result is that the simplest metric (which relies exclusively on NOCB transitions) sets a robust baseline that cannot be outperformed by other metrics which make use of additional centering-based features. This baseline can be used for the development of both text-to-text and concept-to-text generation systems.
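The NOCB baseline can be sketched in a few lines: an ordering is scored by the number of adjacent sentence pairs that share no entity (a NOCB transition), and orderings with fewer such transitions are preferred. Entity annotation is given here as plain sets for illustration; the sentences and entities are invented:

```python
def nocb_count(ordering):
    """Count adjacent sentence pairs with no entity in common
    (NOCB transitions); lower is more coherent."""
    return sum(
        1 for prev, cur in zip(ordering, ordering[1:])
        if not (prev & cur)
    )

# Each sentence is represented only by its set of mentioned entities.
coherent = [{"museum", "statue"}, {"statue", "artist"}, {"artist"}]
scrambled = [{"museum", "statue"}, {"artist"}, {"statue", "artist"}]
```

That such a minimal count is hard to beat with richer centering features is exactly the article's main result.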
Summary of a Topical Forum FAQ Based on the Chinese Composition Structure
An automatic multiple-document summarization system that produces the frequently asked questions (FAQ) of a topical forum can, in theory, save forum webmasters a great deal of time. This work addresses the summary composition issue of a previous work by proposing a structured presentation based on the four-part pattern of traditional Chinese articles. The experimental results show that the system enhanced with both domain-terminology corpus methods produced a significantly better summary presentation than the original system. Recall and precision performance indices and user evaluations are also presented and discussed to show their practical implications.