23,568 research outputs found

    A Study of Realtime Summarization Metrics

    Get PDF
    Unexpected news events, such as natural disasters or other human tragedies, create a large volume of dynamic text data from official news media as well as less formal social media. Automatic real-time text summarization has become an important tool for quickly transforming this overabundance of text into clear, useful information for end-users including affected individuals, crisis responders, and interested third parties. Despite the importance of real-time summarization systems, their evaluation is not well understood as classic methods for text summarization are inappropriate for real-time and streaming conditions. The TREC 2013-2015 Temporal Summarization (TREC-TS) track was one of the first evaluation campaigns to tackle the challenges of real-time summarization evaluation, introducing new metrics, ground-truth generation methodology and dataset. In this paper, we present a study of TREC-TS track evaluation methodology, with the aim of documenting its design, analyzing its effectiveness, as well as identifying improvements and best practices for the evaluation of temporal summarization systems

    Enhancing Biomedical Text Summarization Using Semantic Relation Extraction

    Get PDF
    Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from large amount of biomedical literature efficiently. In this paper, we present a method for generating text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and our results are better than the MEAD system, a well-known tool for text summarization

    RTSUM: Relation Triple-based Interpretable Summarization with Multi-level Salience Visualization

    Full text link
    In this paper, we present RTSUM, an unsupervised summarization framework that utilizes relation triples as the basic unit for summarization. Given an input document, RTSUM first selects salient relation triples via multi-level salience scoring and then generates a concise summary from the selected relation triples by using a text-to-text language model. On the basis of RTSUM, we also develop a web demo for an interpretable summarizing tool, providing fine-grained interpretations with the output summary. With support for customization options, our tool visualizes the salience for textual units at three distinct levels: sentences, relation triples, and phrases. The codes,are publicly available.Comment: 8 pages, 2 figure

    Summarizing Text for Indonesian Language by Using Latent Dirichlet Allocation and Genetic Algorithm

    Full text link
    The number of documents progressively increases especially for the electronic one. This degrades effectivity and efficiency in managing them. Therefore, it is a must to manage the documents. Automatic text summarization is able to solve by producing text document summaries. The goal of the research is to produce a tool to summarize documents in Bahasa: Indonesian Language. It is aimed to satisfy the user's need of relevant and consistent summaries. The algorithm is based on sentence features scoring by using Latent Dirichlet Allocation and Genetic Algorithm for determining sentence feature weights. It is evaluated by calculating summarization speed, precision, recall, F-measure, and some subjective evaluations. Extractive summaries from the original text documents can represent important information from a single document in Bahasa with faster summarization speed compared to manual process. Best F-measure value is 0,556926 (with precision of 0.53448 and recall of 0.58134) and summary ratio of 30%

    Summarizing Text for Indonesian Language by Using Latent Dirichlet Allocation and Genetic Algorithm

    Get PDF
    The number of documents progressively increases especially for the electronic one. This degrades effectivity and efficiency in managing them. Therefore, it is a must to manage the documents. Automatic text summarization is able to solve by producing text document summaries. The goal of the research is to produce a tool to summarize documents in Bahasa: Indonesian Language. It is aimed to satisfy the user’s need of relevant and consistent summaries. The algorithm is based on sentence features scoring by using Latent Dirichlet Allocation and Genetic Algorithm for determining sentence feature weights. It is evaluated by calculating summarization speed, precision, recall, F-measure, and some subjective evaluations. Extractive summaries from the original text documents can represent important information from a single document in Bahasa with faster summarization speed compared to manual process. Best F-measure value is 0,556926 (with precision of 0.53448 and recall of 0.58134) and summary ratio of 30%

    Summarizing Text Using Lexical Chains

    Get PDF
    The current technology of automatic text summarization imparts an important role in the information retrieval and text classification, and it provides the best solution to the information overload problem. And the text summarization is a process of reducing the size of a text while protecting its information content. When taking into consideration the size and number of documents which are available on the Internet and from the other sources, the requirement for a highly efficient tool on which produces usable summaries is clear. We present a better algorithm using lexical chain computation. The algorithm one which makes lexical chains a computationally feasible for the user. And using these lexical chains the user will generate a summary, which is much more effective compared to the solutions available and also closer to the human generated summary

    Text Summarization with K-Means Method

    Get PDF
    Text Summarization is a tool used to generate a short form of text that contains important information that is needed by the user automatically. In this study, Text Summarization was conducted on Indonesian news using K-Means method. The news is taken from CNN Indonesia with a free topic. K-Means is used to classify sentences that already have weight in the news with 2 clusters, namely text summaries and not text summaries. The initial centroid is selected based on the sentence with the largest value and the sentence with the smallest value. The test conducted on Indonesian news with a total 50 news and tested for feasibility using a questionnaire. K-Means was successfully summarizing the news with an average 27.3 % of original news length and gain 87% good summarize based on respondents from questionnaire

    Automatic Arabic Text Summarization System (AATSS) Based on Semantic Feature Extraction

    Get PDF
    Recently, one of the problems arisen due to the amount of information and it’s availability on the web, is the increased need for effective and powerful tool to automatically summarize text. For English and European languages an intensive works have been done with high performance and nowadays they look forward to multi-document and multi-language summarization. However, Arabic language still suffers from the little attentions and research done in this filed. In our research we propose a model to automatically summarize Arabic text using text extraction. Various steps are involved in the approach: preprocessing text, extract set of feature from sentences, classify sentence based on scoring method, ranking sentences and finally generate an extract summary. The main difference between our proposed system and other Arabic summarization systems are the consideration of semantics, entity objects such as names and places, and similarity factors in our proposed system. The proposed system has been applied on news domain using a dataset obtained from Falesteen newspaper. Manual evaluation techniques are used to evaluate and test the system. The results obtained by the proposed method achieve 86.5% similarity between the system and human summarization. A comparative study between our proposed system and Sakhr Arabic online summarization system has been conducted. The results show that our proposed system outperforms the Shakr system
    • …
    corecore