Development of natural language processing tools to support determination of federal disability benefits in the U.S.
The disability benefits programs administered by the US Social Security Administration (SSA) receive between 2 and 3 million new applications each year. Adjudicators manually review hundreds of evidence pages per case to determine eligibility based on financial, medical, and functional criteria. Natural Language Processing (NLP) technology is uniquely suited to support this adjudication work and is a critical component of an ongoing inter-agency collaboration between SSA and the National Institutes of Health. This NLP work provides resources and models for document ranking, named entity recognition, and terminology extraction in order to automatically identify documents and reports pertinent to a case, and to allow adjudicators to search for and locate desired information quickly. In this paper, we describe our vision for how NLP can impact SSA’s adjudication process, present the resources and models that have been developed, and discuss some of the benefits and challenges of working with large-scale government data and its specific properties in the functional domain.
Automated Coding of Under-Studied Medical Concept Domains: Linking Physical Activity Reports to the International Classification of Functioning, Disability, and Health
Linking clinical narratives to standardized vocabularies and coding systems
is a key component of unlocking the information in medical text for analysis.
However, many domains of medical concepts lack well-developed terminologies
that can support effective coding of medical text. We present a framework for
developing natural language processing (NLP) technologies for automated coding
of under-studied types of medical information, and demonstrate its
applicability via a case study on physical mobility function. Mobility is a
component of many health measures, from post-acute care and surgical outcomes
to chronic frailty and disability, and is coded in the International
Classification of Functioning, Disability, and Health (ICF). However, mobility
and other types of functional activity remain under-studied in medical
informatics, and neither the ICF nor commonly-used medical terminologies
capture functional status terminology in practice. We investigated two
data-driven paradigms, classification and candidate selection, to link
narrative observations of mobility to standardized ICF codes, using a dataset
of clinical narratives from physical therapy encounters. Features derived from
recent language models and word embeddings were used with established
machine learning models and a novel deep learning approach, achieving a macro
F1 score of 84% on linking mobility activity reports to ICF codes. Both
classification and candidate selection approaches present distinct strengths
for automated coding in under-studied domains, and we highlight that the
combination of (i) a small annotated data set; (ii) expert definitions of codes
of interest; and (iii) a representative text corpus is sufficient to produce
high-performing automated coding systems. This study has implications for the
ongoing growth of NLP tools for a variety of specialized applications in
clinical care and research.
Comment: Updated final version, published in Frontiers in Digital Health,
https://doi.org/10.3389/fdgth.2021.620828. 34 pages (23 text + 11
references); 9 figures, 2 tables
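
The abstract above reports a macro F1 score for linking mobility reports to ICF codes. As a reminder of what that metric measures, here is a minimal sketch of macro-averaged F1 in plain Python. It is not the authors' evaluation code; the function name `macro_f1` and the example labels are assumptions chosen for illustration, and the label strings merely resemble ICF mobility codes.

```python
from collections import defaultdict


def macro_f1(gold, pred):
    """Macro-averaged F1: the unweighted mean of per-label F1 scores.

    `gold` and `pred` are equal-length sequences of label strings.
    This is an illustrative helper, not code from the paper.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    labels = set(gold) | set(pred)
    if not labels:
        return 0.0

    tp = defaultdict(int)  # true positives per label
    fp = defaultdict(int)  # false positives per label
    fn = defaultdict(int)  # false negatives per label
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical gold and predicted labels for four mobility reports.
gold = ["d450", "d450", "d455", "d460"]
pred = ["d450", "d455", "d455", "d460"]
print(round(macro_f1(gold, pred), 4))
```

Because every label contributes equally to the average, rare codes weigh as much as frequent ones, which is one reason macro averaging is a common choice when some codes have few annotated examples.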