Search CORE

7 research outputs found

Multi-document Summarization Based on Sentence Clustering Improved Using Topic Words

Author: Arifin A. Z. (Agus)
Kurniawardhani A. (Arrie)
Lukmana I. (Indra)
Purwitasari D. (Diana)
Swanjaya D. (Daniel)
Publication venue: Sepuluh Nopember Institute of Technology
Publication date: 01/07/2014

Information in the form of news text has become one of the most important commodities of the information age. A great deal of news is produced every day, but these stories often present the same contextual content in different narratives. A method is therefore needed to gather this information into a simple summary. The subtasks involved in multi-document summarization include sentence extraction, topic detection, representative-sentence extraction, and representative sentences. In this paper, we propose a new method for representing sentences based on the keywords of the text's topics, obtained using Latent Dirichlet Allocation (LDA). The method consists of three basic steps. First, we cluster the sentences in the document set using Similarity Histogram Clustering (SHC). Next, the clusters are ranked by cluster importance. Finally, the representative sentences are selected according to the topics identified by LDA. The proposed method was tested on the DUC2004 dataset. The results show averages of 0.3419 and 0.0766 for ROUGE-1 and ROUGE-2, respectively. In addition, from the reader's perspective, the proposed method presents the representative sentences in a coherent, well-ordered arrangement, which makes the summary easier to understand and reduces the time needed to read it.
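The ROUGE-1 and ROUGE-2 figures reported in this abstract are n-gram overlap scores between a system summary and a reference summary. As a rough illustration only, the sketch below computes ROUGE-N recall for a single reference using whitespace tokenization; the function names `ngrams` and `rouge_n_recall` are hypothetical, and the official ROUGE toolkit applies further options (stemming, stop-word handling, multiple references) that are not modeled here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram matches divided by reference n-grams.

    Assumption: lowercased whitespace tokenization and one reference only.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram can be matched at most as often as it
    # appears in the candidate (clipped counting).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    system = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n_recall(system, gold, n=1))
    print(rouge_n_recall(system, gold, n=2))
```

In this toy pair, five of the six reference unigrams and three of the five reference bigrams are matched, which is why ROUGE-2 is typically much lower than ROUGE-1 for the same summary.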

Neliti

MULTI-DOCUMENT SUMMARIZATION BASED ON SENTENCE CLUSTERING IMPROVED USING TOPIC WORDS

Author: Arifin Agus Zainal
Kurniawardhani Arrie
Lukmana Indra
Purwitasari Diana
Swanjaya Daniel
Publication venue: 'Lembaga Penelitian dan Pengabdian kepada Masyarakat ITS'
Publication date: 01/07/2014

Crossref

Directory of Open Access Journals

JUTI: Jurnal Ilmiah Teknologi Informasi

Multi-document summarization based on atomic semantic events and their temporal relationships

Author: Uddin Md Mohsin
Publication venue: Department of Mathematics and Computer Science
Publication date: 01/01/2014

Automatic multi-document summarization (MDS) is the process of extracting the most important information, such as events and entities, from multiple natural language texts focused on the same topic. We extract all types of semantic atomic information and feed them to a topic model to experiment with their effects on a summary. We design a coherent summarization system by taking into account the relative positions of sentences in the original text. Our generic MDS system has outperformed the best recent multi-document summarization system in DUC 2004 in terms of ROUGE-1 recall and f1-measure. Our query-focused summarization system achieves a statistically similar result to the state-of-the-art unsupervised system for the DUC 2007 query-focused MDS task in the ROUGE-2 recall measure. Update summarization is a new form of MDS in which novel yet salient sentences are chosen as summary sentences, based on the assumption that the user has already read a given set of documents. In this thesis, we present an event-based update summarization in which novelty is detected from the temporal ordering of events and saliency is ensured by event and entity distribution. To our knowledge, no other study has deeply investigated the effects of the novelty information acquired from the temporal ordering of events (assuming that a sentence contains one or more events) in the domain of update MDS. Our update MDS system has outperformed the state-of-the-art update MDS system in terms of the ROUGE-2 and ROUGE-SU4 recall measures. Our MDS systems also generate quality summaries, which are manually evaluated based on popular evaluation criteria

OPUS: Open Uleth Scholarship - University of Lethbridge Research Repository

Toward abstractive multi-document summarization using submodular function-based framework, sentence compression and merging

Author: Tanvee Moin Mahmud
University of Lethbridge. Faculty of Arts and Science
Publication venue: 'University of Central Missouri, Department of Mathematics and Computer Science'
Publication date: 01/01/2016

Automatic multi-document summarization is a process of generating a summary that contains the most important information from multiple documents. In this thesis, we design an automatic multi-document summarization system using different abstraction-based methods and submodularity. Our proposed model considers summarization as a budgeted submodular function maximization problem. The model integrates three important measures of a summary - namely importance, coverage, and non-redundancy, and we design a submodular function for each of them. In addition, we integrate sentence compression and sentence merging. When evaluated on the DUC 2004 data set, our generic summarizer has outperformed the state-of-the-art summarization systems in terms of ROUGE-1 recall and f1-measure. For query-focused summarization, we used the DUC 2007 data set where our system achieves statistically similar results to several well-established methods in terms of the ROUGE-2 measure
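To make the phrase "budgeted submodular function maximization" concrete, here is a minimal sketch of a cost-scaled greedy selection under a word budget. It is not the thesis's system: the objective used here (number of distinct words covered, a standard monotone submodular function) is an assumed stand-in for the importance, coverage, and non-redundancy functions the abstract mentions but does not define, and the name `greedy_budgeted_summary` is hypothetical.

```python
def greedy_budgeted_summary(sentences, budget):
    """Greedily pick sentences under a word budget.

    Assumed objective: count of distinct words covered by the summary.
    At each step the sentence with the best gain-per-word ratio that
    still fits the budget is added.
    """
    tokenized = [s.lower().split() for s in sentences]
    covered = set()   # words already present in the summary
    chosen = []       # indices of selected sentences
    used = 0          # words spent so far
    remaining = set(range(len(sentences)))

    while remaining:
        best, best_ratio = None, 0.0
        for i in sorted(remaining):  # sorted for deterministic tie-breaking
            cost = len(tokenized[i])
            if cost == 0 or used + cost > budget:
                continue
            # Marginal gain shrinks as coverage grows (diminishing returns).
            gain = len(set(tokenized[i]) - covered)
            ratio = gain / cost
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            break  # nothing fits or nothing adds new words
        chosen.append(best)
        covered |= set(tokenized[best])
        used += len(tokenized[best])
        remaining.remove(best)

    # Return the selection in original document order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    docs = ["a b c", "a b", "d e", "a b c d e f g h"]
    print(greedy_budgeted_summary(docs, budget=5))
```

In the toy input, the sentence "a b" is never selected once "a b c" is in the summary, because its marginal gain drops to zero; this is how a coverage-style objective discourages redundancy without a separate penalty term.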

OPUS: Open Uleth Scholarship - University of Lethbridge Research Repository

The theory of extended topic and its application in information retrieval

Author: Yin Ling
Publication venue
Publication date: 01/09/2012

University of Brighton Research Portal