32 research outputs found

    Toward abstractive multi-document summarization using submodular function-based framework, sentence compression and merging

    Automatic multi-document summarization is the process of generating a summary that contains the most important information from multiple documents. In this thesis, we design an automatic multi-document summarization system using different abstraction-based methods and submodularity. Our proposed model treats summarization as a budgeted submodular function maximization problem. The model integrates three important measures of a summary, namely importance, coverage, and non-redundancy, and we design a submodular function for each of them. In addition, we integrate sentence compression and sentence merging. When evaluated on the DUC 2004 data set, our generic summarizer outperforms state-of-the-art summarization systems in terms of ROUGE-1 recall and F1-measure. For query-focused summarization, we use the DUC 2007 data set, on which our system achieves statistically similar results to several well-established methods in terms of the ROUGE-2 measure.
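The budgeted greedy selection underlying this kind of model can be sketched as follows. This is a minimal illustration, not the thesis's actual objective: the coverage-style gain function, the sentence costs, and the similarity matrix are all assumptions introduced for the example.

```python
# Sketch: budgeted greedy maximization of a submodular summary score.
# The marginal-gain function below (incremental coverage of the corpus)
# is illustrative; the thesis combines importance, coverage, and
# non-redundancy functions instead.

def coverage_gain(selected, candidate, sim):
    """Marginal coverage gain: how much better `candidate` covers each
    sentence j than anything already selected does."""
    return sum(
        max(0.0, sim[candidate][j] - max((sim[s][j] for s in selected), default=0.0))
        for j in range(len(sim))
    )

def greedy_budgeted(sentences, costs, sim, budget):
    """Greedily add the sentence with the best cost-scaled marginal gain
    until no affordable sentence improves the summary."""
    selected, spent = [], 0.0
    remaining = set(range(len(sentences)))
    while remaining:
        best, best_ratio = None, 0.0
        for i in remaining:
            if spent + costs[i] > budget:
                continue  # would exceed the summary-length budget
            ratio = coverage_gain(selected, i, sim) / costs[i]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            break
        selected.append(best)
        spent += costs[best]
        remaining.remove(best)
    return selected
```

Because the gain function has diminishing returns, this kind of greedy procedure enjoys a constant-factor approximation guarantee for budgeted submodular maximization.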

    Semi-extractive multi-document summarization

    In this thesis, I design a Maximum Coverage problem with KnaPsack constraint (MCKP) based model for extractive multi-document summarization. The model integrates three measures to detect important sentences: Coverage, which rewards sentences according to how well they represent the whole document set; Relevance, which favors sentences related to the given query; and Compression, which rewards concise sentences. To generate a summary, I apply an efficient and scalable greedy algorithm. The algorithm has a near-optimal solution when its scoring functions are monotone non-decreasing and submodular. I use the DUC 2007 dataset to evaluate the proposed method. Investigating the results with the ROUGE package shows improvement over two closely related works. The experimental results illustrate that integrating compression into the MCKP-based model, applying semantic similarity measures for the Relevance measure, and defining all scoring functions as monotone submodular functions lead to better generated summaries.
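The near-optimality guarantee mentioned above hinges on the scoring functions being monotone and submodular. One common way to build such a coverage function is saturation, sketched below; the specific formula and the similarity matrix are assumptions for illustration, not the thesis's exact functions.

```python
# Sketch: a saturated-coverage score, a standard construction that is
# monotone non-decreasing and submodular. The thesis's actual Coverage,
# Relevance, and Compression functions may be defined differently.

def saturated_coverage(selected, sim, alpha=0.5):
    """f(S) = sum_j min(C_j(S), alpha * C_j(V)), where C_j(S) is the
    total similarity of sentence j to the selected set S. Capping at a
    fraction of full coverage gives diminishing returns."""
    n = len(sim)
    total = 0.0
    for j in range(n):
        cov_S = sum(sim[i][j] for i in selected)      # coverage of j by S
        cov_V = sum(sim[i][j] for i in range(n))      # coverage of j by everything
        total += min(cov_S, alpha * cov_V)
    return total
```

Monotonicity (adding a sentence never lowers the score) and submodularity (marginal gains shrink as the summary grows) can be checked directly on small inputs, and together they are what licenses the greedy algorithm's approximation bound.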

    Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and Context-Aware Auto-Encoders

    Automatic chat summarization can help people quickly grasp important information from numerous chat messages. Unlike conventional documents, chat logs usually have fragmented and evolving topics. In addition, these logs contain many elliptical and interrogative sentences, which make chat summarization highly context-dependent. In this work, we propose a novel unsupervised framework called RankAE to perform chat summarization without employing manually labeled data. RankAE consists of a topic-oriented ranking strategy that selects topic utterances according to centrality and diversity simultaneously, as well as a denoising auto-encoder that is carefully designed to generate succinct but context-informative summaries based on the selected utterances. To evaluate the proposed method, we collect a large-scale dataset of chat logs from a customer service environment and build an annotated set only for model evaluation. Experimental results show that RankAE significantly outperforms other unsupervised methods and is able to generate high-quality summaries in terms of relevance and topic coverage. (Accepted by AAAI 2021, 9 pages.)
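Selecting utterances for centrality and diversity simultaneously can be sketched with a simple MMR-style trade-off. This is an illustrative stand-in under an assumed similarity matrix and trade-off weight, not RankAE's actual topic-oriented ranking strategy.

```python
# Sketch: MMR-style utterance selection balancing centrality (average
# similarity to all utterances) against redundancy with what is already
# selected. Illustrative only; RankAE's ranking is more involved.

def rank_utterances(sim, k, lam=0.7):
    """Pick k utterances, trading off centrality vs. diversity via lam."""
    n = len(sim)
    centrality = [sum(row) / n for row in sim]
    selected = []
    candidates = set(range(n))
    while candidates and len(selected) < k:
        def score(i):
            # penalize similarity to the most similar already-picked utterance
            redundancy = max((sim[i][s] for s in selected), default=0.0)
            return lam * centrality[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

On a log with two topic clusters, the diversity term pushes the second pick into the cluster the first pick did not cover, which mirrors the stated goal of covering fragmented, evolving topics.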