Evidence Inference 2.0: More Data, Better Models
How do we most effectively treat a disease or condition? Ideally, we could
consult a database of evidence gleaned from clinical trials to answer such
questions. Unfortunately, no such database exists; clinical trial results are
instead disseminated primarily via lengthy natural language articles. Perusing
all such articles would be prohibitively time-consuming for healthcare
practitioners; they instead tend to depend on manually compiled systematic
reviews of medical literature to inform care.
NLP may speed this process up, and eventually facilitate immediate consult of
published evidence. The Evidence Inference dataset was recently released to
facilitate research toward this end. This task entails inferring the
comparative performance of two treatments, with respect to a given outcome,
from a particular article (describing a clinical trial) and identifying
supporting evidence. For instance: Does this article report that chemotherapy
performed better than surgery for five-year survival rates of operable cancers?
In this paper, we collect additional annotations to expand the Evidence
Inference dataset by 25%, provide stronger baseline models, systematically
inspect the errors that these make, and probe dataset quality. We also release
an abstract-only (as opposed to full-text) version of the task for rapid model
prototyping. The updated corpus, documentation, and code for new baselines and
evaluations are available at http://evidence-inference.ebm-nlp.com/.
Comment: Accepted as a workshop paper at BioNLP. Updated results from SciBERT to BioMed-RoBERTa.
Generating (Factual?) Narrative Summaries of RCTs: Experiments with Neural Multi-Document Summarization
We consider the problem of automatically generating a narrative biomedical
evidence summary from multiple trial reports. We evaluate modern neural models
for abstractive summarization of relevant article abstracts from systematic
reviews previously conducted by members of the Cochrane Collaboration, using
the authors' conclusions section of the review abstract as our target. We enlist
medical professionals to evaluate generated summaries, and we find that modern
summarization systems yield consistently fluent and relevant synopses, but that
they are not always factual. We propose new approaches that capitalize on
domain-specific models to inform summarization, e.g., by explicitly demarcating
snippets of inputs that convey key findings, and emphasizing the reports of
large and high-quality trials. We find that these strategies modestly improve
the factual accuracy of generated summaries. Finally, we propose a new method
for automatically evaluating the factuality of generated narrative evidence
syntheses using models that infer the directionality of reported findings.
Comment: 11 pages, 2 figures. Accepted for presentation at the 2021 AMIA Informatics Summit.
Appraising the Potential Uses and Harms of LLMs for Medical Systematic Reviews
Medical systematic reviews play a vital role in healthcare decision making
and policy. However, their production is time-consuming, limiting the
availability of high-quality and up-to-date evidence summaries. Recent
advancements in large language models (LLMs) offer the potential to
automatically generate literature reviews on demand, addressing this issue.
However, LLMs sometimes generate inaccurate (and potentially misleading) texts
by hallucination or omission. In healthcare, this can make LLMs unusable at
best and dangerous at worst. We conducted 16 interviews with international
systematic review experts to characterize the perceived utility and risks of
LLMs in the specific context of medical evidence reviews. Experts indicated
that LLMs can assist in the writing process by drafting summaries, generating
templates, distilling information, and cross-checking information. They also
raised concerns regarding confidently composed but inaccurate LLM outputs and
other potential downstream harms, including decreased accountability and
proliferation of low-quality reviews. Informed by this qualitative analysis, we
identify criteria for rigorous evaluation of biomedical LLMs aligned with
domain expert views.
Comment: 18 pages, 2 figures, 8 tables. Accepted as an EMNLP 2023 main paper.
Understanding Clinical Trial Reports: Extracting Medical Entities and Their Relations
The best evidence concerning comparative treatment effectiveness comes from
clinical trials, the results of which are reported in unstructured articles.
Medical experts must manually extract information from articles to inform
decision-making, which is time-consuming and expensive. Here we consider the
end-to-end task of both (a) extracting treatments and outcomes from full-text
articles describing clinical trials (entity identification), and (b) inferring
the reported results for the former with respect to the latter (relation
extraction). We introduce new data for this task, and evaluate models that have
recently achieved state-of-the-art results on similar tasks in Natural Language
Processing. We then propose a new method motivated by how trial results are
typically presented that outperforms these purely data-driven baselines.
Finally, we run a fielded evaluation of the model with a non-profit seeking to
identify existing drugs that might be re-purposed for cancer, showing the
potential utility of end-to-end evidence extraction systems.
Compression of glycolide-h4 to 6 GPa
This study details the structural characterisation of glycolide-h4 as a function of pressure to 6 GPa using neutron powder diffraction on the PEARL instrument at the ISIS Neutron and Muon Source. Glycolide-h4, rather than its deuterated isotopologue, was used in this study owing to the difficulty of deuteration. The low background afforded by Zirconia-Toughened Alumina (ZTA) anvils nevertheless enabled data suitable for structural analysis to be collected up to a pressure of 5 GPa. Glycolide-h4 undergoes a reconstructive phase transition at 0.15 GPa to the previously identified form II, which is stable to 6 GPa.
The low recombining pericentromeric region of barley restricts gene diversity and evolution but not gene expression
The low-recombining pericentromeric region of the barley genome contains roughly a quarter of the genes of the species, embedded in low-recombining DNA that is rich in repeats and repressive chromatin signatures. We have investigated the effects of pericentromeric-region residency on the expression, diversity and evolution of these genes. We observe no significant difference in average transcript level or developmental RNA specificity between the barley pericentromeric region and the rest of the genome. In contrast, all of the evolutionary parameters studied here show evidence of compromised gene evolution in this region. First, genes within the pericentromeric region of wild barley show reduced diversity and significantly weakened purifying selection compared with the rest of the genome. Second, gene duplicates (ohnolog pairs) derived from the cereal whole-genome duplication event ca. 60 Mya have been completely eliminated from the barley pericentromeric region. Third, local gene duplication in the pericentromeric region is reduced by 29% relative to the rest of the genome. Thus, the pericentromeric region of barley is a permissive environment for gene expression but has restricted gene evolution in a sizeable fraction of barley's genes.
Question answering systems for health professionals at the point of care - a systematic review
Effect of wavefront aberrations on a focused plenoptic imaging system: a wave optics simulation approach