Generating (Factual?) Narrative Summaries of RCTs: Experiments with Neural Multi-Document Summarization
We consider the problem of automatically generating a narrative biomedical
evidence summary from multiple trial reports. We evaluate modern neural models
for abstractive summarization of relevant article abstracts from systematic
reviews previously conducted by members of the Cochrane collaboration, using
the Authors' Conclusions section of the review abstract as our target. We enlist
medical professionals to evaluate generated summaries, and we find that modern
summarization systems yield consistently fluent and relevant synopses, but that
they are not always factual. We propose new approaches that capitalize on
domain-specific models to inform summarization, e.g., by explicitly demarcating
snippets of inputs that convey key findings, and emphasizing the reports of
large and high-quality trials. We find that these strategies modestly improve
the factual accuracy of generated summaries. Finally, we propose a new method
for automatically evaluating the factuality of generated narrative evidence
syntheses using models that infer the directionality of reported findings.
Comment: 11 pages, 2 figures. Accepted for presentation at the 2021 AMIA Informatics Summit.
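The proposed factuality evaluation can be illustrated with a small sketch: infer the direction of the reported finding (e.g., increase, decrease, no effect) in both the generated and the reference summary, and score agreement. The keyword classifier below is a toy stand-in for the learned directionality model described in the abstract; all function names are illustrative.

```python
# Sketch: score factuality of generated evidence summaries by checking
# whether they report the same finding *direction* as the reference.
# The keyword-based classifier is a toy stand-in for a learned model.

def infer_direction(summary: str) -> str:
    """Map a narrative summary to a coarse finding direction."""
    text = summary.lower()
    if any(w in text for w in ("reduce", "decrease", "lower")):
        return "decrease"
    if any(w in text for w in ("increase", "improve", "raise")):
        return "increase"
    return "no_effect"

def direction_agreement(generated: list[str], reference: list[str]) -> float:
    """Fraction of summary pairs whose inferred directions match."""
    matches = sum(
        infer_direction(g) == infer_direction(r)
        for g, r in zip(generated, reference)
    )
    return matches / len(reference)

gen = ["The drug reduced mortality.", "No clear effect was observed."]
ref = ["Treatment lowered mortality risk.", "Evidence shows improved outcomes."]
print(direction_agreement(gen, ref))  # 0.5: first pair agrees, second does not
```

The appeal of this design is that agreement on direction is a much stricter factuality signal than n-gram overlap: a fluent summary that flips the finding scores zero.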
A Survey on Biomedical Text Summarization with Pre-trained Language Models
The exponential growth of biomedical texts such as biomedical literature and
electronic health records (EHRs), provides a big challenge for clinicians and
researchers to access clinical information efficiently. To address the problem,
biomedical text summarization has been proposed to support clinical information
retrieval and management, aiming at generating concise summaries that distill
key information from single or multiple biomedical documents. In recent years,
pre-trained language models (PLMs) have been the de facto standard of various
natural language processing tasks in the general domain. Most recently, PLMs
have been further investigated in the biomedical field and brought new insights
into the biomedical text summarization task. In this paper, we systematically
summarize recent advances that explore PLMs for biomedical text summarization,
to help understand recent progress, challenges, and future directions. We
categorize PLMs-based approaches according to how they utilize PLMs and what
PLMs they use. We then review available datasets, recent approaches and
evaluation metrics of the task. We finally discuss existing challenges and
promising future directions. To facilitate the research community, we line up
open resources including available datasets, recent approaches, codes,
evaluation metrics, and the leaderboard in a public project:
https://github.com/KenZLuo/Biomedical-Text-Summarization-Survey/tree/master.
Comment: 19 pages, 6 figures, TKDE under review.
Leveraging GPT-4 for Food Effect Summarization to Enhance Product-Specific Guidance Development via Iterative Prompting
Food effect summarization from New Drug Application (NDA) is an essential
component of product-specific guidance (PSG) development and assessment.
However, manual summarization of food effect from extensive drug application
review documents is time-consuming, which creates a need for automated
methods. Recent advances in large language models (LLMs) such as ChatGPT and
GPT-4 have demonstrated great potential in improving the effectiveness of
automated text summarization, but their accuracy in summarizing food effects
for PSG assessment remains unclear. In this study, we
introduce a simple yet effective approach, iterative prompting, which allows
one to interact with ChatGPT or GPT-4 more effectively and efficiently through
multi-turn interaction. Specifically, we propose a three-turn iterative
prompting approach to food effect summarization in which the keyword-focused
and length-controlled prompts are respectively provided in consecutive turns to
refine the quality of the generated summary. We conduct a series of extensive
evaluations, ranging from automated metrics to FDA professionals and even
evaluation by GPT-4, on 100 NDA review documents selected over the past five
years. We observe that the summary quality is progressively improved throughout
the process. Moreover, we find that GPT-4 performs better than ChatGPT, as
evaluated by FDA professionals (43% vs. 12%) and GPT-4 (64% vs. 35%).
Importantly, the FDA professionals unanimously rated 85% of the summaries
generated by GPT-4 as factually consistent with the gold reference summary, a
finding further supported by GPT-4's own rating of 72% consistency. These
results strongly suggest a great potential for GPT-4 to draft food effect
summaries that could be reviewed by FDA professionals, thereby improving the
efficiency of PSG assessment cycle and promoting the generic drug product
development.
Comment: 22 pages, 6 figures.
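The three-turn iterative prompting procedure can be sketched as a simple multi-turn loop: an initial summarization request, then a keyword-focused refinement, then a length-controlled refinement. Everything below is illustrative, not the paper's exact prompts or API; `ask_model` stands in for a chat-completion call (e.g., to GPT-4), and the example keywords are assumed for demonstration.

```python
# Sketch: three-turn iterative prompting for summary refinement.
# Turn 1 asks for an initial summary; turn 2 refines it around key terms;
# turn 3 enforces a length constraint. `ask_model` is a placeholder for a
# real chat API call.
from typing import Callable

def iterative_summarize(
    document: str,
    ask_model: Callable[[list], str],
    keywords: list,
    max_words: int = 150,
) -> str:
    messages = [{"role": "user",
                 "content": f"Summarize the food effect findings:\n{document}"}]
    refinements = [
        "Revise the summary to cover these keywords: " + ", ".join(keywords),
        f"Shorten the summary to at most {max_words} words.",
    ]
    summary = ""
    for turn in range(3):
        summary = ask_model(messages)          # one model call per turn
        messages.append({"role": "assistant", "content": summary})
        if turn < 2:                           # queue the next refinement
            messages.append({"role": "user", "content": refinements[turn]})
    return summary

# Usage with a stub model that records how much context it sees each turn:
calls = []
def stub(messages):
    calls.append(len(messages))
    return f"summary after turn {len(calls)}"

result = iterative_summarize("(document text)", stub, ["AUC", "Cmax"])
print(result)  # summary after turn 3
```

Keeping the full message history in `messages` is the point of the design: each refinement turn sees both the prior summary and the new instruction, so the model edits rather than regenerates from scratch.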
CrossSum: Beyond English-Centric Cross-Lingual Abstractive Text Summarization for 1500+ Language Pairs
We present CrossSum, a large-scale cross-lingual abstractive summarization
dataset comprising 1.7 million article-summary samples in 1500+ language pairs.
We create CrossSum by aligning identical articles written in different
languages via cross-lingual retrieval from a multilingual summarization
dataset. We propose a multi-stage data sampling algorithm to effectively train
a cross-lingual summarization model capable of summarizing an article in any
target language. We also propose LaSE, a new metric for automatically
evaluating model-generated summaries, which shows a strong correlation with
ROUGE. Performance on both ROUGE and LaSE indicates that pretrained models fine-tuned
on CrossSum consistently outperform baseline models, even when the source and
target language pairs are linguistically distant. To the best of our knowledge,
CrossSum is the largest cross-lingual summarization dataset and the first
that does not rely solely on English as the pivot language. We are releasing
the dataset, alignment and training scripts, and the models to spur future
research on cross-lingual abstractive summarization. The resources can be found
at https://github.com/csebuetnlp/CrossSum
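The cross-lingual retrieval step described above can be sketched with a nearest-neighbor match over multilingual article embeddings. The 2-D vectors, article IDs, and threshold below are toy assumptions; in practice the embeddings would come from a multilingual encoder, and the real pipeline's alignment logic is more involved.

```python
# Sketch: align articles across languages by cosine similarity of their
# (assumed precomputed) multilingual embeddings. Each source article is
# paired with its most similar target-language article above a threshold.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def align(src_embs: dict, tgt_embs: dict, threshold: float = 0.9) -> dict:
    """Map each source article id to its most similar target article id."""
    pairs = {}
    for sid, sv in src_embs.items():
        best_id, best_sim = None, threshold
        for tid, tv in tgt_embs.items():
            sim = cosine(sv, tv)
            if sim > best_sim:
                best_id, best_sim = tid, sim
        if best_id is not None:                # skip unmatched articles
            pairs[sid] = best_id
    return pairs

en = {"en-1": [1.0, 0.0], "en-2": [0.0, 1.0]}
bn = {"bn-7": [0.98, 0.05], "bn-9": [0.1, 0.99]}
print(align(en, bn))  # {'en-1': 'bn-7', 'en-2': 'bn-9'}
```

The threshold matters: without it, every source article would be forced onto some target article, polluting the dataset with false pairs; with it, only confidently identical articles become cross-lingual article-summary samples.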