
70 research outputs found

Detecting (Un)Important Content for Single-Document News Summarization

Authors: Bao Forrest Sheng, Nenkova Ani, Yang Yinfei
Publication date: 01/01/2017

We present a robust approach for detecting intrinsic sentence importance in news, by training on two corpora of document-summary pairs. When used for single-document summarization, our approach, combined with the "beginning of document" heuristic, outperforms a state-of-the-art summarizer and the beginning-of-article baseline in both automatic and manual evaluations. These results represent an important advance because, in the absence of cross-document repetition, single-document summarizers for news have not been able to consistently outperform the strong beginning-of-article baseline. Comment: Accepted by EACL 201

arXiv.org e-Print Archive

Crossref

Assessing the Quality of Automatic Summarization for Peer Review in Education

Authors: Christopher Maynard, Edward F Gehringer, Ferry Pramudianto, Tarun Chhabra
Publication date: 11/04/2020

Technology-supported peer review has drawn considerable interest from educators and researchers. It encourages active learning and gives students timely feedback and multiple perspectives on their work. Currently, online peer review systems allow a student's work to be reviewed by a handful of their peers. While this is a good way to obtain a high degree of confidence, reading a large amount of feedback can be overwhelming. Our observations show that students even ignore some feedback when there is too much of it. In this work, we try to automatically summarize the feedback by extracting similar content mentioned by the reviewers, which captures the strengths and weaknesses of the work. We evaluate different automatic summarization algorithms and summary lengths on an educational peer review dataset, which was rated by a human. In general, the students found that medium-size generated summaries (5-10 sentences) encapsulate the context of the reviews, convey the intent of the reviews, and help them judge the quality of the work.

CiteSeerX

Automatic Summarization with Information Extraction for Clustered News Documents (Peringkasan Otomatis dengan Ekstraksi Informasi untuk Dokumen Berita Ter-cluster)

Authors: Ilyas R. (Ridwan), Umbara F. (Fajri)
Publication venue: Faculty of Computer Science, Sriwijaya University
Publication date: 01/01/2016

The openness and ease of access to information have made the amount of available information very large. The abundance of information about the same subject causes information overload. This problem arises in many domains, such as news, scientific documents, and social media. A system is needed that can help users obtain complete news, which can be achieved by building an automatic summarization system. This study proposes establishing a standard sequence of stages for news summarization, with dynamic configuration for each task (clustering, information extraction, and summarization). By building a summarization system that spans clustering, information extraction, and summarization, the resulting summaries are expected to be coherent, complete, and highly readable.

Neliti

Automatic Multiple Document Text Summarization using Wordnet and Agility Tool

Author: Naresh Kumar
Publication venue: Global Journals Inc. (US)
Publication date: 29/03/2014

The number of web pages on the World Wide Web is increasing very rapidly. Consequently, search engines such as Google, AltaVista, and Bing return a long list of URLs to the end user, and it becomes very difficult to review and analyze each web page manually. That's why automatic text summarization is used to condense the source text into a shorter version while preserving its information content and overall meaning. This paper proposes an automatic multi-document text summarization technique called AMDTSWA, which allows the end user to select multiple URLs and generate their summarized results in parallel. AMDTSWA makes use of concept-based segmentation, the HTML DOM tree, and concept block formation. Content similarity is determined by calculating sentence scores, and useful information is extracted to generate a comparative summary. The proposed approach is implemented using ASP.Net and gives good results.

Global Journal of Computer Science and Technology (GJCST)

Automatic Text Summarization Using Latent Drichlet Allocation (LDA) for Document Clustering

Authors: Azhari Azhari, Dewi Ika Novita, Firdausillah Fahri, Hastuti Khafiizh, Hidayat Erwin Yudi
Publication venue: Universitas Ahmad Dahlan, Kampus 3
Publication date: 01/12/2015

In this paper, we present Latent Dirichlet Allocation (LDA) in automatic text summarization to improve accuracy in document clustering. The experiments involve 398 data items from public blog articles, obtained using the Python Scrapy crawler and scraper. The clustering steps in this research are preprocessing, automatic document compression using the feature method, automatic document compression using LDA, word weighting, and the clustering algorithm. The results show that automatic document summarization with LDA reaches 72% in LDA 40%, compared to the traditional k-means method, which only reaches 66%.

International Journal of Advances in Intelligent Informatics

Directory of Open Access Journals

International Journal of Advances in Intelligent Informatics (IJAIN)

Utilizing microblogs for improving automatic news highlights extraction

Authors: GAO Wei, WEI Zhongyu
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/08/2014

Institutional Knowledge at Singapore Management University

Extractive multi document summarization using harmony search algorithm

Authors: Abass Haithem Kareem, Ali Zuhair Hussein, Fadel Elham, Hussein Ahmed Kawther
Publication venue: Universitas Ahmad Dahlan
Publication date: 01/02/2021

The exponential growth of information on the internet makes it difficult for users to find valuable information. Text summarization is a process for overcoming this problem. An adequate summary must have wide coverage, high diversity, and high readability. In this article, a new method for multi-document summarization is proposed, based on a harmony search algorithm that optimizes coverage, diversity, and readability. On the benchmark dataset from the Text Analysis Conference (TAC-2011), the ROUGE package was used to measure the effectiveness of the proposed model. The calculated results support the effectiveness of the proposed approach.

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System