804 research outputs found

Beyond Stemming and Lemmatization: Ultra-stemming to Improve Automatic Text Summarization

Author: Torres-Moreno Juan-Manuel
Publication date: 14/09/2012

In Automatic Text Summarization, preprocessing is an important phase to reduce the space of textual representation. Classically, stemming and lemmatization have been widely used for normalizing words. However, even when normalization is applied to large texts, the curse of dimensionality can degrade the performance of summarizers. This paper describes a new method for normalization of words to further reduce the space of representation. We propose to reduce each word to its initial letters, as a form of Ultra-stemming. The results show that Ultra-stemming not only preserves the content of summaries produced by this representation, but can often dramatically improve the performance of the systems. Summaries on trilingual corpora were evaluated automatically with Fresa. Results confirm an increase in performance, regardless of the summarizer system used. Comment: 22 pages, 12 figures, 9 tables
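The truncation idea described in this abstract can be illustrated with a minimal sketch. The function name `ultra_stem`, the prefix-length parameter `n`, and the regular-expression tokenizer are assumptions made for illustration; they are not taken from the paper, which may tokenize and truncate differently.

```python
import re


def ultra_stem(text, n=1):
    """Reduce every word to its first n letters.

    Hypothetical helper: the tokenizer (runs of Unicode letters) and the
    default n=1 are illustrative assumptions, not the paper's settings.
    """
    # Lowercase, then keep only runs of letters (no digits or underscores).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Truncate each word to its initial n letters.
    return [w[:n] for w in words]


def vocabulary_size(tokens):
    """Number of distinct terms, i.e. the size of the representation space."""
    return len(set(tokens))


sample = "Summaries summarize the summarized documents"
full = ultra_stem(sample, n=100)   # n larger than any word: no truncation
short = ultra_stem(sample, n=1)    # keep only the first letter

print(full, vocabulary_size(full))
print(short, vocabulary_size(short))
```

On this toy sentence the vocabulary shrinks from five distinct words to three distinct initials, which is the kind of reduction in the representation space the abstract refers to. Whether such aggressive truncation preserves summary quality is the empirical question the paper evaluates.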

arXiv.org e-Print Archive

CiteSeerX

Accurate user directed summarization from existing tools

Author: Sanderson M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/1998

This paper describes a set of experimental results produced from the TIPSTER SUMMAC initiative on user-directed summaries: document summaries generated in the context of an information need expressed as a query. The summarizer that was evaluated was based on a set of existing statistical techniques that had been applied successfully to the INQUERY retrieval system. The techniques proved to have a wider utility, however, as the summarizer was one of the better performing systems in the SUMMAC evaluation. The design of this summarizer is presented with a range of evaluations: both those provided by SUMMAC and a set of preliminary, more informal evaluations that examined additional aspects of the summaries. Amongst other conclusions, the results reveal that users can judge the relevance of documents from their summary almost as accurately as if they had had access to the document’s full text.

CiteSeerX

Crossref

White Rose Research Online

Text Summarization Techniques: A Brief Survey

Authors: Allahyari Mehdi; Assefi Mehdi; Gutierrez Juan B.; Kochut Krys; Pouriyeh Seyedamin; Safaei Saeid; Trippe Elizabeth D.
Publication date: 01/01/2017

In recent years, there has been an explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods. Comment: Some of the reference formats have been updated

arXiv.org e-Print Archive

Georgia Southern University: Digital Commons@Georgia Southern

Politeness and bias in dialogue summarization: two exploratory studies

Authors: Carvalho Ariadne M.B.R.; Piwek Paul; Roman Norton
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2006

In this chapter, two empirical pilot studies on the role of politeness in dialogue summarization are described. In these studies, a collection of four dialogues was used. Each dialogue was automatically generated by the NECA system and the politeness of the dialogue participants was systematically manipulated. Subjects were divided into groups who had to summarize the dialogues from a particular dialogue participant’s point of view or the point of view of an impartial observer. In the first study, there were no other constraints. In the second study, the summarizers were restricted to summaries whose length did not exceed 10% of the number of words in the dialogue that was being summarized. Amongst other things, it was found that the politeness of the interaction is included more often in summaries of dialogues that deviate from what would be considered normal or unmarked. A comparison of the results of the two studies suggests that the extent to which politeness is reported is not affected by how long a summary is allowed to be. It was also found that the point of view of the summarizer influences which information is included in the summary and how it is presented. This finding did not seem to be affected by the constraint in our second study on the summary length.

Open Research Online (The Open University)