Search CORE

1,758 research outputs found

Text Summarization Techniques: A Brief Survey

Author: Allahyari Mehdi
Assefi Mehdi
Gutierrez Juan B.
Kochut Krys
Pouriyeh Seyedamin
Safaei Saeid
Trippe Elizabeth D.
Publication venue
Publication date: 01/01/2017
Field of study

In recent years, there has been a explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods.Comment: Some of references format have update

arXiv.org e-Print Archive

Georgia Southern University: Digital Commons@Georgia Southern

A Novel ILP Framework for Summarizing Content with High Lexical Variety

Author: Almeida
Celikyilmaz
Conroy
DIANE LITMAN
FEI LIU
Goodfellow
Li
Li
Luo
Luo
Luo
Martins
Mazumder
Mosteller
Narayan
Qian
Ren
Tarnpradab
Wang
WENCAN LUO
Wilson
Xiong
ZITAO LIU
Publication venue
Publication date: 25/07/2018
Field of study

Summarizing content contributed by individuals can be challenging, because people make different lexical choices even when describing the same events. However, there remains a significant need to summarize such content. Examples include the student responses to post-class reflective questions, product reviews, and news articles published by different news agencies related to the same events. High lexical diversity of these documents hinders the system's ability to effectively identify salient content and reduce summary redundancy. In this paper, we overcome this issue by introducing an integer linear programming-based summarization framework. It incorporates a low-rank approximation to the sentence-word co-occurrence matrix to intrinsically group semantically-similar lexical items. We conduct extensive experiments on datasets of student responses, product reviews, and news documents. Our approach compares favorably to a number of extractive baselines as well as a neural abstractive summarization system. The paper finally sheds light on when and why the proposed framework is effective at summarizing content with high lexical variety.Comment: Accepted for publication in the journal of Natural Language Engineering, 201

arXiv.org e-Print Archive

Crossref

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and Context-Aware Auto-Encoders

Author: Huang Xuanjing
Jiang Zhuoren
Kang Yangyang
Lin Jun
Liu Xiaozhong
Sun Changlong
Zhang Qi
Zhao Lujun
Zou Yicheng
Publication venue
Publication date: 18/05/2021
Field of study

Automatic chat summarization can help people quickly grasp important information from numerous chat messages. Unlike conventional documents, chat logs usually have fragmented and evolving topics. In addition, these logs contain a quantity of elliptical and interrogative sentences, which make the chat summarization highly context dependent. In this work, we propose a novel unsupervised framework called RankAE to perform chat summarization without employing manually labeled data. RankAE consists of a topic-oriented ranking strategy that selects topic utterances according to centrality and diversity simultaneously, as well as a denoising auto-encoder that is carefully designed to generate succinct but context-informative summaries based on the selected utterances. To evaluate the proposed method, we collect a large-scale dataset of chat logs from a customer service environment and build an annotated set only for model evaluation. Experimental results show that RankAE significantly outperforms other unsupervised methods and is able to generate high-quality summaries in terms of relevance and topic coverage.Comment: Accepted by AAAI 2021, 9 page

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Scientific Opinion Summarization: Meta-review Generation with Checklist-guided Iterative Introspection

Author: Chan Hou Pong
Ji Heng
Sidhu Mankeerat
Wang Lu
Zeng Qi
Publication venue
Publication date: 13/11/2023
Field of study

Opinions in the scientific domain can be divergent, leading to controversy or consensus among reviewers. However, current opinion summarization datasets mostly focus on product review domains, which do not account for this variability under the assumption that the input opinions are non-controversial. To address this gap, we propose the task of scientific opinion summarization, where research paper reviews are synthesized into meta-reviews. To facilitate this task, we introduce a new ORSUM dataset covering 10,989 paper meta-reviews and 40,903 paper reviews from 39 conferences. Furthermore, we propose the Checklist-guided Iterative Introspection (CGI

^2

) approach, which breaks down the task into several stages and iteratively refines the summary under the guidance of questions from a checklist. We conclude that (1) human-written summaries are not always reliable since many do not follow the guidelines, and (2) the combination of task decomposition and iterative self-refinement shows promising discussion involvement ability and can be applied to other complex text generation using black-box LLM

arXiv.org e-Print Archive

Data Mining Oriented Automatic Scientific Documents Summarization

Author: Kalyani BJD
Shahanaz Shaik
Wankhede Jaishri
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 04/05/2023
Field of study

The scientific research process usually begins with an examination of the advanced, which may include voluminous publications. Summarizing scientific articles can assist researchers in their research by speeding up the research process. The summary of scientific articles differs from the abstract text in general due to its specific structure and the inclusion of cited sentences. Most of the important information in scientific articles is presented in tables, statistics, and algorithm pseudocode. These features, however, rarely appear in the standard text. Therefore, a number of methods that consider the value of the structure of a scientific article have been suggested that improve the standard of the produced summary. This paper makes use of clustering algorithms to handle CL- SciSumm 2020 and longsumm 2020 tasks for summarization of scientific documents. There are three well-known clustering algorithms that are employed to tackle CL- SciSumm 2020 and LongSumm 2020 tasks, and several sentences recording functions, with textual deduction, are used to retrieved phrases from each cluster to generate summary

International Journal on Recent and Innovation Trends in Computing and Communication