23,568 research outputs found

A Study of Realtime Summarization Metrics

Author: Diaz Fernando
Ekstrand-Abueg Matthew
McCreadie Richard
Pavlu Virgil
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

Unexpected news events, such as natural disasters or other human tragedies, create a large volume of dynamic text data from official news media as well as less formal social media. Automatic real-time text summarization has become an important tool for quickly transforming this overabundance of text into clear, useful information for end-users, including affected individuals, crisis responders, and interested third parties. Despite the importance of real-time summarization systems, their evaluation is not well understood, as classic methods for text summarization are inappropriate for real-time and streaming conditions. The TREC 2013-2015 Temporal Summarization (TREC-TS) track was one of the first evaluation campaigns to tackle the challenges of real-time summarization evaluation, introducing new metrics, a ground-truth generation methodology, and a dataset. In this paper, we present a study of the TREC-TS track evaluation methodology, with the aim of documenting its design, analyzing its effectiveness, and identifying improvements and best practices for the evaluation of temporal summarization systems.

Crossref

Enlighten

Enhancing Biomedical Text Summarization Using Semantic Relation Extraction

Author: A Clarke
AR Aronson
CY Lin
D Cutting
G Erkan
Hongfei Lin
HP Edmundson
HP Luhn
J Carbonell
J Zhan
K McKeown
KS Jones
LH Reeve
M Fiszman
O Bodenreider
R Mihalcea
S Teufel
T Rindflesch
TC Rindflesch
TE Workman
W Hersh
X Ling
X Wan
Yanpeng Li
Ying Xu
Yue Shang
Zhihao Yang
Publication venue: Public Library of Science
Publication date: 01/01/2011

Automatic text summarization for a biomedical concept can help researchers efficiently get the key points of a topic from a large amount of biomedical literature. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract the semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For the relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate a text summary using an information-retrieval-based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves performance and that our results are better than those of the MEAD system, a well-known tool for text summarization.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

RTSUM: Relation Triple-based Interpretable Summarization with Multi-level Salience Visualization

Author: Cho Seonglae
Cho Yonggi
Jang Myungha
Lee Dongha
Lee HoonJae
Yeo Jinyoung
Publication date: 20/10/2023

In this paper, we present RTSUM, an unsupervised summarization framework that uses relation triples as the basic unit for summarization. Given an input document, RTSUM first selects salient relation triples via multi-level salience scoring and then generates a concise summary from the selected relation triples using a text-to-text language model. On the basis of RTSUM, we also develop a web demo of an interpretable summarization tool that provides fine-grained interpretations alongside the output summary. With support for customization options, our tool visualizes the salience of textual units at three distinct levels: sentences, relation triples, and phrases. The code is publicly available. Comment: 8 pages, 2 figures

arXiv.org e-Print Archive

Summarizing Text for Indonesian Language by Using Latent Dirichlet Allocation and Genetic Algorithm

Author: Aprilia V. R. (Vivi)
Meiliana M. (Meiliana)
Rukmana P. (Pitri)
Silvia S. (Silvia)
Suhartono D. (Derwin)
Wongso R. (Rini)
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/08/2014

The number of documents, especially electronic ones, is steadily increasing, which makes managing them less effective and less efficient. Automatic text summarization addresses this problem by producing summaries of text documents. The goal of this research is to produce a tool that summarizes documents in Bahasa Indonesia (the Indonesian language) and satisfies the user's need for relevant and consistent summaries. The algorithm scores sentence features using Latent Dirichlet Allocation, with a Genetic Algorithm determining the sentence feature weights. It is evaluated on summarization speed, precision, recall, F-measure, and some subjective evaluations. Extractive summaries of the original text documents can represent the important information in a single document in Bahasa Indonesia, and they are produced faster than by manual summarization. The best F-measure is 0.556926 (with a precision of 0.53448 and a recall of 0.58134) at a summary ratio of 30%.
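The reported F-measure can be checked against the reported precision and recall. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not say which variant was used, but the reported figures agree with this formula. The function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
best_f = f_measure(0.53448, 0.58134)
print(f"{best_f:.6f}")  # 0.556926
```

Because F1 is a harmonic mean, it always lies between precision and recall and sits closer to the smaller of the two.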

Neliti

Summarizing Text for Indonesian Language by Using Latent Dirichlet Allocation and Genetic Algorithm

Author: Meiliana
Silvia
Aprilia Vivi Regina
Rukmana Pitri
Suhartono Derwin
Wongso Rini
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 20/08/2014

The number of documents, especially electronic ones, is steadily increasing, which makes managing them less effective and less efficient. Automatic text summarization addresses this problem by producing summaries of text documents. The goal of this research is to produce a tool that summarizes documents in Bahasa Indonesia (the Indonesian language) and satisfies the user's need for relevant and consistent summaries. The algorithm scores sentence features using Latent Dirichlet Allocation, with a Genetic Algorithm determining the sentence feature weights. It is evaluated on summarization speed, precision, recall, F-measure, and some subjective evaluations. Extractive summaries of the original text documents can represent the important information in a single document in Bahasa Indonesia, and they are produced faster than by manual summarization. The best F-measure is 0.556926 (with a precision of 0.53448 and a recall of 0.58134) at a summary ratio of 30%.

Proceeding of the Electrical Engineering Computer Science and Informatics

Summarizing Text Using Lexical Chains

Author: Pooja Jain, Sachin Jain
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/04/2016

Automatic text summarization plays an important role in information retrieval and text classification, and it provides the best solution to the information overload problem. Text summarization is the process of reducing the size of a text while preserving its information content. Given the size and number of documents available on the Internet and from other sources, the need for a highly efficient tool that produces usable summaries is clear. We present an improved algorithm based on lexical chain computation, one that makes lexical chains computationally feasible for the user. Using these lexical chains, the user can generate a summary that is more effective than those of the available solutions and closer to a human-generated summary.

International Journal on Recent and Innovation Trends in Computing and Communication

Text Summarization with K-Means Method

Author: Firdaus Ari
Rodiah Desty
Yusliani Novi
Publication venue: Fakultas Ilmu Komputer Universitas Sriwijaya
Publication date: 01/07/2021

Text summarization automatically generates a short form of a text that contains the important information the user needs. In this study, text summarization was performed on Indonesian news using the K-Means method. The news articles were taken from CNN Indonesia, with no restriction on topic. K-Means is used to group the weighted sentences of a news article into two clusters: summary sentences and non-summary sentences. The initial centroids are the sentence with the largest weight and the sentence with the smallest weight. The method was tested on 50 Indonesian news articles, and its feasibility was assessed using a questionnaire. K-Means successfully summarized the news to an average of 27.3% of the original length and received an 87% "good summary" rating from the questionnaire respondents.
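The clustering step described above can be sketched as a one-dimensional, two-cluster K-Means over sentence weights, seeded with the largest and smallest weight. This is a minimal illustration, not the authors' implementation: the function name `two_cluster_kmeans`, the toy weights, and the assumption that each sentence has a single scalar weight are all hypothetical, and the abstract does not say how the weights are computed.

```python
def two_cluster_kmeans(weights, max_iter=100):
    """Split sentence weights into a 'summary' and a 'non-summary' cluster.

    Returns the indices of the sentences assigned to the summary cluster.
    Initial centroids are the largest and the smallest weight.
    """
    if not weights:
        return []
    hi, lo = max(weights), min(weights)  # initial centroids
    summary = []
    for _ in range(max_iter):
        # Assignment step: each sentence joins the nearer centroid.
        summary = [i for i, w in enumerate(weights)
                   if abs(w - hi) <= abs(w - lo)]
        rest = [i for i in range(len(weights)) if i not in summary]
        # Update step: each centroid becomes the mean of its cluster.
        new_hi = sum(weights[i] for i in summary) / len(summary)
        new_lo = sum(weights[i] for i in rest) / len(rest) if rest else lo
        if new_hi == hi and new_lo == lo:
            break  # centroids stopped moving: converged
        hi, lo = new_hi, new_lo
    return summary


# Hypothetical sentence weights for a five-sentence article.
toy_weights = [0.9, 0.2, 0.8, 0.1, 0.3]
selected = two_cluster_kmeans(toy_weights)
print(selected)  # [0, 2]
```

Seeding the centroids with the extreme weights makes the result deterministic, unlike random initialization, and ensures the highest-weighted sentence starts in the summary cluster.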

Sriwijaya Journal of Informatics and Applications (SJIA)

Automatic Arabic Text Summarization System (AATSS) Based on Semantic Feature Extraction

Author: Abu Kwaik Kathrein Abdel Jawad
Publication venue: The Islamic University College Journal
Publication date: 01/01/2011

Recently, the amount of information available on the web has increased the need for an effective and powerful tool that summarizes text automatically. For English and European languages, intensive work has been done with high performance, and research on them now looks toward multi-document and multi-language summarization. Arabic, however, still suffers from the little attention and research devoted to it in this field. In our research, we propose a model that automatically summarizes Arabic text using text extraction. The approach involves several steps: preprocessing the text, extracting a set of features from the sentences, classifying the sentences with a scoring method, ranking the sentences, and finally generating an extractive summary. The main difference between our proposed system and other Arabic summarization systems is that ours considers semantics, entity objects such as names and places, and similarity factors. The proposed system has been applied to the news domain using a dataset obtained from the Falesteen newspaper. Manual evaluation techniques are used to evaluate and test the system. The proposed method achieves 86.5% similarity between system and human summaries. A comparative study between our proposed system and the Sakhr Arabic online summarization system has been conducted; the results show that our proposed system outperforms the Sakhr system.
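The pipeline steps listed in the abstract (preprocess, extract features, score, rank, extract) can be sketched generically as follows. This is only an illustration of the overall shape of such a pipeline: the two features (average word frequency and sentence position), the 0.7/0.3 weights, and the function name `summarize` are assumptions made here, and they stand in for the semantic, entity, and similarity features of the actual system, which the abstract does not specify.

```python
import re
from collections import Counter


def summarize(text, n_sentences=2):
    """Toy extractive summarizer: score, rank, and extract sentences."""
    # 1) Preprocess: split into sentences, then into lowercase tokens.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)
                 if s.strip()]
    tokenized = [re.findall(r"\w+", s.lower()) for s in sentences]
    freq = Counter(tok for toks in tokenized for tok in toks)

    # 2) Feature extraction and 3) scoring (hypothetical features/weights).
    scores = []
    for pos, toks in enumerate(tokenized):
        avg_freq = sum(freq[t] for t in toks) / len(toks) if toks else 0.0
        position = 1.0 / (pos + 1)  # earlier sentences score higher
        scores.append(0.7 * avg_freq + 0.3 * position)

    # 4) Rank sentences by score and keep the top n.
    top = sorted(range(len(sentences)), key=lambda i: scores[i],
                 reverse=True)[:n_sentences]

    # 5) Generate the extract, restoring the original sentence order.
    return [sentences[i] for i in sorted(top)]


example = ("Cats chase mice. Mice eat cheese. "
           "The weather is mild today. Cats and mice both like cheese.")
summary = summarize(example, n_sentences=2)
print(summary)
```

Restoring document order in the final step keeps the extract readable, since ranking by score alone would shuffle the selected sentences.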

Institutional Repository of the Islamic University of Gaza