401 research outputs found
Automatic Extraction of Useful Information from Food-Health Articles related to Diabetes, Cardiovascular Disease and Cancer
Food-health articles (FHA) contain invaluable information for health promotion. However, extracting this information manually is challenging because of the length and number of articles published each year. Automatic text summarization efficiently identifies useful information across large bodies of text, which in turn speeds up the delivery of useful information from FHA. This research investigates the performance of statistical-based summarization and graphical-based unsupervised learning summarization in extracting useful information from FHA related to diabetes, cardiovascular disease and cancer. Various combinations of the introduction, result and conclusion sections of three hundred articles were collected, preprocessed and used to evaluate the two types of summarization technique. Generated summaries are compared to the original abstracts using two measures. The first quantifies the similarity of the generated summary to the abstract. The second gauges how well the generated summary and the article abstract cover the article sections. Overall, the experiment showed that the automatically generated summaries are not comparable to the human-written abstracts found in FHA. There is room for improvement: the highest similarity between a generated summary and the written abstract was 52-57%, and the sentence scoring used for summarization could be optimized for various domains.
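The evaluation idea in the abstract above (score sentences, build an extractive summary, then measure its similarity to the written abstract) can be sketched as follows. This is only an illustrative sketch: the abstract does not say which scoring scheme or similarity measure the authors used, so the word-frequency sentence scoring, the bag-of-words cosine similarity, and the function names `tokenize`, `summarize` and `cosine_similarity` are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def summarize(text, n=1):
    """Hypothetical frequency-based extractive summarizer (an assumption,
    not the paper's method): keep the n sentences whose words are, on
    average, most frequent in the whole text, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return " ".join(sentences[i] for i in sorted(ranked[:n]))


def cosine_similarity(a, b):
    """Bag-of-words cosine similarity between two texts (assumed measure)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Toy article body and abstract, invented purely for illustration.
    body = ("Fiber lowers glucose. Fiber lowers glucose in diabetes patients. "
            "Weather was nice.")
    abstract = "Dietary fiber lowers glucose in patients with diabetes."
    summary = summarize(body, n=1)
    print(summary)
    print(round(cosine_similarity(summary, abstract), 2))
```

A similarity close to 1 would mean the generated summary shares most of its vocabulary with the abstract; in practice, stop-word removal or stemming would likely be applied before scoring.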
Generating (Factual?) Narrative Summaries of RCTs: Experiments with Neural Multi-Document Summarization
We consider the problem of automatically generating a narrative biomedical
evidence summary from multiple trial reports. We evaluate modern neural models
for abstractive summarization of relevant article abstracts from systematic
reviews previously conducted by members of the Cochrane Collaboration, using
the authors' conclusions section of the review abstract as our target. We enlist
medical professionals to evaluate generated summaries, and we find that modern
summarization systems yield consistently fluent and relevant synopses, but that
they are not always factual. We propose new approaches that capitalize on
domain-specific models to inform summarization, e.g., by explicitly demarcating
snippets of inputs that convey key findings, and emphasizing the reports of
large and high-quality trials. We find that these strategies modestly improve
the factual accuracy of generated summaries. Finally, we propose a new method
for automatically evaluating the factuality of generated narrative evidence
syntheses using models that infer the directionality of reported findings.
Comment: 11 pages, 2 figures. Accepted for presentation at the 2021 AMIA
Informatics Summi
Streamlining Literature Reviews Using an Automatic and Flexible Data Gathering and Classification Platform
Literature reviews are a crucial but time-consuming and complex task in scientific research. As such, interest in automating this process using machine learning techniques has increased over the last few years. In this paper, we present a method of streamlining the writing of literature reviews by automating several aspects of the process using Maestro v2023, an automatic and flexible data gathering and classification platform. Maestro v2023 is a revamped version of the original Maestro platform, designed to be modular and configurable, allowing users in an organization to create search contexts that automatically gather and classify data for them. We analyze the work related to literature review automation and suggest how Maestro can contribute to this field, demonstrating how the system was used to streamline our own literature review process, as well as to aid us in formulating the abstract and extracting keywords relevant to this paper.
Text summarization in the biomedical domain: A systematic review of recent research
The amount of information available to clinicians and clinical researchers is growing exponentially. Text summarization condenses information in an attempt to enable users to find and understand relevant source texts more quickly and effortlessly. In recent years, substantial research has been conducted to develop and evaluate various summarization techniques in the biomedical domain. The goal of this study was to systematically review recently published research on the summarization of textual documents in the biomedical domain.
Neural Natural Language Processing for Long Texts: A Survey of the State-of-the-Art
The adoption of Deep Neural Networks (DNNs) has greatly benefited Natural
Language Processing (NLP) during the past decade. However, the demands of long
document analysis are quite different from those of shorter texts, while the
ever-increasing size of documents uploaded online renders automated
understanding of long texts a critical area of research. This article has two
goals: a) it overviews the relevant neural building blocks, thus serving as a
short tutorial, and b) it surveys the state-of-the-art in long document NLP,
mainly focusing on two central tasks: document classification and document
summarization. Sentiment analysis for long texts is also covered, since it is
typically treated as a particular case of document classification.
Additionally, this article discusses the main challenges, issues and current
solutions related to long document NLP. Finally, the relevant, publicly
available, annotated datasets are presented, in order to facilitate further
research.
Comment: 53 pages, 2 figures, 171 citation