42 research outputs found
The 2022 n2c2/UW Shared Task on Extracting Social Determinants of Health
Objective: The n2c2/UW SDOH Challenge explores the extraction of social
determinant of health (SDOH) information from clinical notes. The objectives
include the advancement of natural language processing (NLP) information
extraction techniques for SDOH and clinical information more broadly. This
paper presents the shared task, data, participating teams, performance results,
and considerations for future work.
Materials and Methods: The task used the Social History Annotated Corpus
(SHAC), which consists of clinical text with detailed event-based annotations
for SDOH events such as alcohol, drug, tobacco, employment, and living
situation. Each SDOH event is characterized through attributes related to
status, extent, and temporality. The task includes three subtasks related to
information extraction (Subtask A), generalizability (Subtask B), and learning
transfer (Subtask C). In addressing this task, participants utilized a range of
techniques, including rules, knowledge bases, n-grams, word embeddings, and
pretrained language models (LM).
Results: A total of 15 teams participated, and the top teams utilized
pretrained deep learning LM. The top team across all subtasks used a
sequence-to-sequence approach achieving 0.901 F1 for Subtask A, 0.774 F1
Subtask B, and 0.889 F1 for Subtask C.
Conclusions: Similar to many NLP tasks and domains, pretrained LM yielded the
best performance, including generalizability and learning transfer. An error
analysis indicates extraction performance varies by SDOH, with lower
performance achieved for conditions, like substance use and homelessness, that
increase health risks (risk factors) and higher performance achieved for
conditions, like substance abstinence and living with family, that reduce
health risks (protective factors)
Generalizing through Forgetting -- Domain Generalization for Symptom Event Extraction in Clinical Notes
Symptom information is primarily documented in free-text clinical notes and
is not directly accessible for downstream applications. To address this
challenge, information extraction approaches that can handle clinical language
variation across different institutions and specialties are needed. In this
paper, we present domain generalization for symptom extraction using
pretraining and fine-tuning data that differs from the target domain in terms
of institution and/or specialty and patient population. We extract symptom
events using a transformer-based joint entity and relation extraction method.
To reduce reliance on domain-specific features, we propose a domain
generalization method that dynamically masks frequent symptoms words in the
source domain. Additionally, we pretrain the transformer language model (LM) on
task-related unlabeled texts for better representation. Our experiments
indicate that masking and adaptive pretraining methods can significantly
improve performance when the source domain is more distant from the target
domain
ACI-BENCH: a Novel Ambient Clinical Intelligence Dataset for Benchmarking Automatic Visit Note Generation
Recent immense breakthroughs in generative models such as in GPT4 have
precipitated re-imagined ubiquitous usage of these models in all applications.
One area that can benefit by improvements in artificial intelligence (AI) is
healthcare. The note generation task from doctor-patient encounters, and its
associated electronic medical record documentation, is one of the most arduous
time-consuming tasks for physicians. It is also a natural prime potential
beneficiary to advances in generative models. However with such advances,
benchmarking is more critical than ever. Whether studying model weaknesses or
developing new evaluation metrics, shared open datasets are an imperative part
of understanding the current state-of-the-art. Unfortunately as clinic
encounter conversations are not routinely recorded and are difficult to
ethically share due to patient confidentiality, there are no sufficiently large
clinic dialogue-note datasets to benchmark this task. Here we present the
Ambient Clinical Intelligence Benchmark (ACI-BENCH) corpus, the largest dataset
to date tackling the problem of AI-assisted note generation from visit
dialogue. We also present the benchmark performances of several common
state-of-the-art approaches
LeafAI: query generator for clinical cohort discovery rivaling a human programmer
Objective: Identifying study-eligible patients within clinical databases is a
critical step in clinical research. However, accurate query design typically
requires extensive technical and biomedical expertise. We sought to create a
system capable of generating data model-agnostic queries while also providing
novel logical reasoning capabilities for complex clinical trial eligibility
criteria.
Materials and Methods: The task of query creation from eligibility criteria
requires solving several text-processing problems, including named entity
recognition and relation extraction, sequence-to-sequence transformation,
normalization, and reasoning. We incorporated hybrid deep learning and
rule-based modules for these, as well as a knowledge base of the Unified
Medical Language System (UMLS) and linked ontologies. To enable data-model
agnostic query creation, we introduce a novel method for tagging database
schema elements using UMLS concepts. To evaluate our system, called LeafAI, we
compared the capability of LeafAI to a human database programmer to identify
patients who had been enrolled in 8 clinical trials conducted at our
institution. We measured performance by the number of actual enrolled patients
matched by generated queries.
Results: LeafAI matched a mean 43% of enrolled patients with 27,225 eligible
across 8 clinical trials, compared to 27% matched and 14,587 eligible in
queries by a human database programmer. The human programmer spent 26 total
hours crafting queries compared to several minutes by LeafAI.
Conclusions: Our work contributes a state-of-the-art data model-agnostic
query generation system capable of conditional reasoning using a knowledge
base. We demonstrate that LeafAI can rival a human programmer in finding
patients eligible for clinical trials
Identifying Patients with Pneumonia from Free-Text Intensive Care Unit Reports
Abstract Clinical research studying critical illness phenotypes relies on the identification of clinical syndromes defined by consensus definitions. Pneumonia is a prime example. Historically, identifying pneumonia has required manual chart review, which is a time and resource intensive process. The overall research goal of our work is to develop automated approaches that accurately identify critical illness phenotypes. In this paper, we describe our approach to the identification of pneumonia from electronic medical records, present our preliminary results, and describe future steps