
    Generating Aspect-oriented Multi-document Summarization with Event-Aspect Model

    In this paper, we propose a novel approach to the automatic generation of aspect-oriented summaries from multiple documents. We first develop an event-aspect LDA model to cluster sentences into aspects. We then use an extended LexRank algorithm to rank the sentences in each cluster, and Integer Linear Programming for sentence selection. Key features of our method include the automatic grouping of semantically related sentences and sentence ranking based on an extension of the random walk model. We also implement a new sentence compression algorithm that uses dependency trees instead of parse trees. We compare our method with four baseline methods. Quantitative evaluation based on the ROUGE metric demonstrates the effectiveness and advantages of our method.
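
    The abstract above combines several components (event-aspect LDA clustering, an extended LexRank ranker, ILP selection, dependency-tree compression). As an illustration of the graph-based ranking step only, the sketch below implements plain LexRank: term-frequency cosine similarity, a thresholded sentence graph, and damped power iteration. It is not the authors' extended random-walk model, and the example sentences, the 0.1 similarity threshold, and the damping factor are arbitrary choices.

```python
# Minimal LexRank-style sentence ranking: cosine-similarity graph over
# term-frequency vectors plus damped power iteration. Vanilla LexRank,
# not the paper's extended random-walk model.
import numpy as np
from collections import Counter

def tf_vectors(sentences, vocab):
    index = {w: i for i, w in enumerate(vocab)}
    vecs = np.zeros((len(sentences), len(vocab)))
    for r, sent in enumerate(sentences):
        for w, c in Counter(sent.lower().split()).items():
            vecs[r, index[w]] = c
    return vecs

def lexrank(sentences, threshold=0.1, damping=0.85, iters=50):
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    X = tf_vectors(sentences, vocab)
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    sim = X @ X.T                                  # cosine similarity matrix
    adj = (sim >= threshold).astype(float)         # keep edges above the threshold
    np.fill_diagonal(adj, 0.0)
    P = adj / (adj.sum(axis=1, keepdims=True) + 1e-12)   # row-stochastic transitions
    n = len(sentences)
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):                         # damped power iteration
        scores = (1 - damping) / n + damping * (P.T @ scores)
    return scores

sents = ["The quake struck at dawn.",
         "Rescue teams reached the town by noon.",
         "The quake damaged hundreds of homes at dawn."]
print(sorted(zip(lexrank(sents), sents), reverse=True)[0][1])
```

    In a full pipeline along the lines of the abstract, such scores would be computed per aspect cluster and combined with length and redundancy constraints in the ILP selection step.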

    Chunking and Extracting Text Content for Mobile Learning: A Query-focused Summarizer Based on Relevance Language Model

    ICALT is a top-tier international conference in educational technology with an excellent academic reputation and a very high level of academic performance. At the conference, I presented a short paper reporting the results of my current research on text summarization for mobile learning, and I discussed my research with outstanding scholars from other research groups. I received much positive feedback and many useful suggestions from the conference participants, which I believe will give me significant directions for further scholarly work. By attending such a high-quality conference, I can obtain advanced knowledge in academic research that will directly benefit my work at Athabasca University. In short, this A&PDF activity is very helpful to my research and professional development at Athabasca University.

    Millions of text and multimedia items published on the Web could be shared as learning content. However, mobile learners often find it difficult to extract content that is useful for learning, and manually creating content not only requires a huge effort from teachers but also creates barriers to reusing content that has already been created for e-Learning. In this paper, a text-based content summarizer is introduced as an approach to help mobile learners retrieve and process information more quickly by aligning the size of text-based content to the characteristics of various mobile devices. In this work, probabilistic language modeling techniques are integrated into an extractive text summarization system to provide automatic summary generation for mobile learning. Experimental results show that our solution is a proper and efficient approach to helping mobile learners summarize important content quickly.
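
    As a rough illustration of how a relevance language model can drive query-focused extraction, the sketch below scores each sentence by the likelihood of the query under a Dirichlet-smoothed unigram model of that sentence, then greedily fills a small character budget standing in for a mobile screen. The smoothing constant, the budget, and the toy documents are assumptions for the example, not the paper's actual formulation or parameters.

```python
# Query-likelihood scoring of sentences under Dirichlet-smoothed unigram
# language models, followed by greedy selection up to a character budget
# meant to mimic a small mobile display. Generic sketch only.
import math
from collections import Counter

def score(sentence, query, coll_counts, coll_len, mu=200.0):
    sent_counts = Counter(sentence.lower().split())
    sent_len = sum(sent_counts.values())
    logp = 0.0
    for q in query.lower().split():
        p_coll = (coll_counts.get(q, 0) + 1) / (coll_len + len(coll_counts) + 1)
        p = (sent_counts.get(q, 0) + mu * p_coll) / (sent_len + mu)  # Dirichlet smoothing
        logp += math.log(p)
    return logp

def summarize(sentences, query, char_budget=160):
    coll_counts = Counter(w for s in sentences for w in s.lower().split())
    coll_len = sum(coll_counts.values())
    ranked = sorted(sentences,
                    key=lambda s: score(s, query, coll_counts, coll_len),
                    reverse=True)
    summary, used = [], 0
    for s in ranked:                      # greedy fill until the budget is hit
        if used + len(s) <= char_budget:
            summary.append(s)
            used += len(s)
    return " ".join(summary)

docs = ["Photosynthesis converts light energy into chemical energy.",
        "The conference was held in Rome.",
        "Chlorophyll absorbs light for photosynthesis."]
print(summarize(docs, "photosynthesis light"))
```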

    Automatic Text Document Summarization Based on a Query Using a Bidirectional Long Short Term Memory Network

    Query-focused summarization, or automatic text summarization based on a query, is a research area in natural language processing that aims to produce a short document, or summary, from a collection of long documents, where the generated summary must be relevant to a given query. To date, various deep learning methods have been used to generate summaries from single or multiple documents using both abstractive and extractive approaches. In this study, we use a Bidirectional Long Short Term Memory Network (Bi-LSTM) to generate a query-based summary from several documents with an extractive approach. Bi-LSTM is a deep learning method that is often used for text classification. The dataset we use is the DUC 2005-2007 dataset, which is commonly used in text summarization. Based on our experiments, Bi-LSTM is able to produce good summaries, as shown by scores of ROUGE-1 = 43.53, ROUGE-2 = 11.40, and ROUGE-L = 18.67.
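
    The following sketch shows the general shape of a Bi-LSTM sentence selector of the kind described above: token ids for a query-plus-sentence pair are embedded, encoded with a bidirectional LSTM, and mapped to a probability of inclusion in the summary. The vocabulary size, dimensions, and random toy batch are placeholders; DUC preprocessing, training on relevance labels, and ROUGE evaluation are omitted.

```python
# Minimal Bi-LSTM sentence scorer (PyTorch): encode a query+sentence token
# sequence bidirectionally and output an inclusion probability.
# Illustrative sketch only; dimensions and data are made up.
import torch
import torch.nn as nn

class BiLSTMSelector(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, 1)        # forward + backward states

    def forward(self, token_ids):
        emb = self.embed(token_ids)                    # (batch, seq, embed_dim)
        _, (h, _) = self.lstm(emb)                     # h: (2, batch, hidden_dim)
        h_cat = torch.cat([h[0], h[1]], dim=-1)        # concatenate both directions
        return torch.sigmoid(self.out(h_cat)).squeeze(-1)

model = BiLSTMSelector()
batch = torch.randint(1, 5000, (4, 20))                # 4 query+sentence pairs, 20 tokens
print(model(batch))                                    # inclusion probabilities
```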

    Text Summarization Techniques: A Brief Survey

    In recent years, there has been an explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods. Comment: the format of some references has been updated

    A Bayesian Method to Incorporate Background Knowledge during Automatic Text Summarization

    In order to summarize a document, it is often useful to have a background set of documents from the domain to serve as a reference for determining new and important information in the input document. We present a model based on Bayesian surprise which provides an intuitive way to identify surprising information from a summarization input with respect to a background corpus. Specifically, the method quantifies the degree to which pieces of information in the input change one's beliefs about the world represented in the background. We develop systems for generic and update summarization based on this idea. Our method provides competitive content selection performance, with particular advantages in the update task, where systems are given a small and topical background corpus.
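
    One common way to make Bayesian surprise concrete (shown below as a sketch, not necessarily the paper's exact parameterization) is to put a Dirichlet prior over word probabilities estimated from the background corpus, update it with the word counts of a candidate text unit, and take the KL divergence from posterior to prior as the unit's surprise score. The pseudo-count and toy background text here are arbitrary.

```python
# Bayesian surprise of a text unit as KL( posterior Dirichlet || prior Dirichlet ),
# where the prior comes from background word counts and the posterior adds the
# unit's word counts. Generic sketch of the idea.
import numpy as np
from scipy.special import gammaln, digamma
from collections import Counter

def kl_dirichlet(alpha, beta):
    """KL( Dir(alpha) || Dir(beta) )."""
    a0, b0 = alpha.sum(), beta.sum()
    return (gammaln(a0) - gammaln(alpha).sum()
            - gammaln(b0) + gammaln(beta).sum()
            + ((alpha - beta) * (digamma(alpha) - digamma(a0))).sum())

def surprise(sentence, vocab, background_counts, pseudo=0.01):
    prior = np.array([background_counts.get(w, 0) + pseudo for w in vocab])
    update = Counter(sentence.lower().split())
    posterior = prior + np.array([update.get(w, 0) for w in vocab])
    return kl_dirichlet(posterior, prior)

background = Counter("the markets were calm and trading was steady".split())
vocab = sorted(set(background) | {"flash", "crash", "halted", "trading"})
print(surprise("flash crash halted trading", vocab, background))  # higher surprise
print(surprise("the markets were calm", vocab, background))       # lower surprise
```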

    Reinforced Extractive Summarization with Question-Focused Rewards

    We investigate a new training paradigm for extractive summarization. Traditionally, human abstracts are used to derive gold-standard labels for extraction units. However, the labels are often inaccurate, because human abstracts and source documents cannot be easily aligned at the word level. In this paper we convert human abstracts to a set of Cloze-style comprehension questions. System summaries are encouraged to preserve salient source content useful for answering questions and to share common words with the abstracts. We use reinforcement learning to explore the space of possible extractive summaries and introduce a question-focused reward function to promote concise, fluent, and informative summaries. Our experiments show that the proposed method is effective, surpassing state-of-the-art systems on the standard summarization dataset. Comment: 7 pages
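
    The toy training step below illustrates the reinforcement-learning recipe in spirit: a small policy scores sentences, a keep/skip mask is sampled, the reward counts how many Cloze-style answers appear in the kept sentences, and the sampled log-likelihood is scaled by that reward (REINFORCE). The scorer, features, reward, and data are stand-ins, not the paper's architecture or exact reward function.

```python
# Schematic REINFORCE step for extractive selection with a question-focused
# reward. Everything here is a toy stand-in for the method described above.
import torch
import torch.nn as nn

sentences = ["The deal was signed in Geneva.",
             "Officials praised the agreement.",
             "Weather was mild that week."]
cloze_answers = ["geneva", "agreement"]       # answers derived from the abstract

features = torch.randn(len(sentences), 16)    # pretend sentence encodings
scorer = nn.Linear(16, 1)                     # tiny policy network
optimizer = torch.optim.Adam(scorer.parameters(), lr=1e-2)

def reward(mask):
    kept = " ".join(s.lower() for s, m in zip(sentences, mask) if m)
    return sum(a in kept for a in cloze_answers) / len(cloze_answers)

for step in range(100):
    probs = torch.sigmoid(scorer(features)).squeeze(-1)   # keep probabilities
    dist = torch.distributions.Bernoulli(probs)
    mask = dist.sample()                                   # sampled extract
    r = reward(mask.tolist())
    loss = -(r * dist.log_prob(mask).sum())                # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(torch.sigmoid(scorer(features)).squeeze(-1))         # learned keep probabilities
```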

    Human-in-the-Loop Schema Induction

    Schema induction builds a graph representation explaining how events unfold in a scenario. Existing approaches have been based on information retrieval (IR) and information extraction (IE), often with limited human curation. We demonstrate a human-in-the-loop schema induction system powered by GPT-3. We first describe the different modules of our system, including prompting to generate schematic elements, manual editing of those elements, and conversion of those elements into a schema graph. By qualitatively comparing our system to previous ones, we show that our system not only transfers to new domains more easily than previous approaches but also reduces the effort of human curation thanks to our interactive interface. Comment: 10 pages, ACL 2023 demo track
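
    The sketch below mimics the generate-edit-convert loop at a toy scale: a stubbed language-model call proposes "event -> event" pairs for a scenario, a placeholder edit step stands in for human curation, and the result is turned into a directed schema graph with networkx. The prompt stub, the example events, and the edit rule are all made up for illustration; the real GPT-3 prompting and interactive interface are not reproduced.

```python
# Toy generate -> edit -> convert loop for schema induction.
import networkx as nx

def propose_schema_elements(scenario):
    # Stand-in for a language-model call that returns "event -> event" lines.
    return ["suspect plans attack -> suspect acquires weapon",
            "suspect acquires weapon -> attack occurs",
            "attack occurs -> police investigate"]

def human_edit(lines):
    # In the real system a curator edits these in an interactive UI;
    # here we simply drop one line as an example of manual curation.
    return [l for l in lines if "plans" not in l]

def to_schema_graph(lines):
    g = nx.DiGraph()
    for line in lines:
        src, dst = (part.strip() for part in line.split("->"))
        g.add_edge(src, dst, relation="before")
    return g

graph = to_schema_graph(human_edit(propose_schema_elements("attack scenario")))
print(list(graph.edges(data=True)))
```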