
1,578 research outputs found

Rolling-contact bearing reference summary

Authors: Cox D. B.; Dufrane K. F.; Glaeser W. A.; Kannel J. W.

Design and performance of rolling contact bearing

NASA Technical Reports Server

Question-Answering Approach to Evaluate Legal Summaries

Authors: Ashley Kevin; Xu Huihui
Publication date: 26/09/2023

Traditional evaluation metrics like ROUGE compare lexical overlap between the reference and generated summaries without taking argumentative structure into account, which is important for legal summaries. In this paper, we propose a novel legal summarization evaluation framework that utilizes GPT-4 to generate a set of question-answer pairs that cover main points and information in the reference summary. GPT-4 is then used to generate answers based on the generated summary for the questions from the reference summary. Finally, GPT-4 grades the answers from the reference summary and the generated summary. We examined the correlation between GPT-4 grading and human grading. The results suggest that this question-answering approach with GPT-4 can be a useful tool for gauging the quality of the summary.

arXiv.org e-Print Archive

SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization

Authors: Eger Steffen; Gao Yang; Zhao Wei
Publication date: 01/01/2020

We study unsupervised multi-document summarization evaluation metrics, which require neither human-written reference summaries nor human annotations (e.g. preferences, ratings, etc.). We propose SUPERT, which rates the quality of a summary by measuring its semantic similarity with a pseudo reference summary, i.e. selected salient sentences from the source documents, using contextualized embeddings and soft token alignment techniques. Compared to the state-of-the-art unsupervised evaluation metrics, SUPERT correlates better with human ratings by 18-39%. Furthermore, we use SUPERT as rewards to guide a neural-based reinforcement learning summarizer, yielding favorable performance compared to the state-of-the-art unsupervised summarizers. All source code is available at https://github.com/yg211/acl20-ref-free-eval. Comment: ACL 202
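The reference-free idea in this abstract (score a summary against a pseudo reference built from the source documents, using soft token alignment) can be illustrated with a minimal Python sketch. This is not the SUPERT implementation. Two loud assumptions: character-trigram count vectors stand in for contextualized embeddings, and the first k sentences of each document stand in for the "selected salient sentences". All function names are hypothetical.

```python
import math
from collections import Counter


def embed(token):
    """Stand-in token 'embedding' (assumption): character-trigram counts."""
    padded = f"#{token.lower()}#"
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def soft_align(src_tokens, tgt_tokens):
    """Average, over src tokens, of the best similarity to any tgt token."""
    if not src_tokens or not tgt_tokens:
        return 0.0
    tgt_vecs = [embed(t) for t in tgt_tokens]
    best = [max(cosine(embed(s), v) for v in tgt_vecs) for s in src_tokens]
    return sum(best) / len(best)


def pseudo_reference(documents, k):
    """Assumption: treat the first k sentences of each document as salient."""
    sentences = []
    for doc in documents:
        sentences.extend([s.strip() for s in doc.split(".") if s.strip()][:k])
    return sentences


def reference_free_score(summary, documents, k=2):
    """F1 of soft-aligned precision and recall against the pseudo reference."""
    ref_tokens = " ".join(pseudo_reference(documents, k)).split()
    sum_tokens = summary.replace(".", " ").split()
    precision = soft_align(sum_tokens, ref_tokens)
    recall = soft_align(ref_tokens, sum_tokens)
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0
```

With a real sentence encoder in place of `embed`, the same structure would compare meaning rather than spelling; the trigram stand-in only keeps the sketch dependency-free.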

arXiv.org e-Print Archive

TUbiblio

Crossref

Software Engineering Laboratory (SEL) relationships, models, and management rules

Authors: Decker William; Hendrick Robert; Valett Jon D.

This document captures over 50 individual Software Engineering Laboratory (SEL) research results, extracted from a review of published SEL documentation, that can be applied directly to managing software development projects. Four basic categories of results are defined and discussed: environment profiles, relationships, models, and management rules. In each category, research results are presented as a single page that summarizes the individual result, lists potential uses of the result by managers, and references the original SEL documentation where the result was found. The document serves as a concise reference summary of applicable research for SEL managers.

NASA Technical Reports Server

TGSum: Build Tweet Guided Multi-Document Summarization Dataset

Authors: Cao Ziqiang; Chen Chengyao; Li Sujian; Li Wenjie; Wei Furu; Zhou Ming
Publication date: 26/11/2015

The development of summarization research has been significantly hampered by the costly acquisition of reference summaries. This paper proposes an effective way to automatically collect large scales of news-related multi-document summaries with reference to social media's reactions. We utilize two types of social labels in tweets, i.e., hashtags and hyper-links. Hashtags are used to cluster documents into different topic sets. Also, a tweet with a hyper-link often highlights certain key points of the corresponding document. We synthesize a linked document cluster to form a reference summary which can cover most key points. To this aim, we adopt the ROUGE metrics to measure the coverage ratio, and develop an Integer Linear Programming solution to discover the sentence set reaching the upper bound of ROUGE. Since we allow summary sentences to be selected from both documents and high-quality tweets, the generated reference summaries could be abstractive. Both informativeness and readability of the collected summaries are verified by manual judgment. In addition, we train a Support Vector Regression summarizer on DUC generic multi-document summarization benchmarks. With the collected data as extra training resource, the performance of the summarizer improves a lot on all the test sets. We release this dataset for further research. Comment: 7 pages, 1 figure in AAAI 201
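The sentence-selection step described in this abstract (pick the sentence set whose ROUGE coverage of the linked tweets is highest) can be approximated with a short Python sketch. The abstract states that the authors use Integer Linear Programming; the code below swaps in a simple greedy search instead, so it is an illustration of the objective and not the paper's solver. Function names and the `max_sentences` budget are hypothetical.

```python
from collections import Counter


def unigram_coverage(selected, tweets):
    """ROUGE-1 recall of the selected sentences against the pooled tweets."""
    target = Counter(w for t in tweets for w in t.lower().split())
    have = Counter(w for s in selected for w in s.lower().split())
    total = sum(target.values())
    # Counter intersection clips each word to the smaller of the two counts.
    return sum((target & have).values()) / total if total else 0.0


def greedy_select(candidates, tweets, max_sentences):
    """Greedy stand-in for the ILP: add the sentence that raises coverage most."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < max_sentences:
        current = unigram_coverage(selected, tweets)
        best = max(remaining, key=lambda s: unigram_coverage(selected + [s], tweets))
        if unigram_coverage(selected + [best], tweets) <= current:
            break  # no remaining sentence adds coverage
        selected.append(best)
        remaining.remove(best)
    return selected
```

A greedy search is not guaranteed to reach the ROUGE upper bound that an exact ILP solution would find; it is used here only because it needs no solver dependency.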

arXiv.org e-Print Archive

CiteSeerX

The Hong Kong Polytechnic University Pao Yue-kong Library

Association for the Advancement of Artificial Intelligence: AAAI Publications

Abstractive text summarization using Pre-Trained Language Model "Text-to-Text Transfer Transformer (T5)"

Authors: Hayaty Mardhiya; Itsnaini Qurrota A’yuna; Jabari Nidal A.M; Putra Andriyan Dwi
Publication venue: Universitas Muslim Indonesia
Publication date: 07/04/2023

Automatic Text Summarization (ATS) applies text processing to help humans produce summaries or key points from large numbers of documents. We focus on Indonesian because few NLP resources exist for the language. This paper uses a transformer-based Pre-Trained Language Model (PLTM), namely T5 (Text-to-Text Transfer Transformer), which was previously pre-trained on a larger dataset. Evaluation compared ROUGE (Recall-Oriented Understudy for Gisting Evaluation) scores between the reference summary and the model summary. Experiments with the pre-trained t5-base model (220M parameters), fine-tuned on an Indonesian news dataset, yielded relatively high ROUGE values: ROUGE-1 = 0.68, ROUGE-2 = 0.61, and ROUGE-L = 0.65. Although the scores were high, the resulting model is not yet satisfactory because its summaries were not sufficiently abstractive. We also found several errors in the reference summaries of the dataset used.
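For readers unfamiliar with the three scores reported in this abstract, the following is a minimal Python sketch of ROUGE-1, ROUGE-2, and ROUGE-L as F1 scores over whitespace tokens. It is a simplified illustration and not the evaluation script used in the paper: stemming, stopword handling, and multi-sentence ROUGE-L variants are left out, and the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(matches, ref_total, cand_total):
    """F1 from a match count and the two totals; 0.0 when undefined."""
    if matches == 0 or ref_total == 0 or cand_total == 0:
        return 0.0
    recall = matches / ref_total
    precision = matches / cand_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(reference, candidate, n):
    """ROUGE-N F1: clipped n-gram overlap between reference and candidate."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())
    return f1(overlap, sum(ref.values()), sum(cand.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    return f1(lcs_length(ref, cand), len(ref), len(cand))
```

Because all three scores count surface overlap with the reference, a largely extractive summary can score well, which is consistent with the abstract's remark that high ROUGE values did not mean the model abstracted well.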

ILKOM Jurnal Ilmiah (Fakultas Ilmu Komputer, Universitas Muslim Indonesia)