Multimodal Machine Learning for Automated ICD Coding
This study presents a multimodal machine learning model to predict ICD-10
diagnostic codes. We developed separate machine learning models that can handle
data from different modalities, including unstructured text, semi-structured
text and structured tabular data. We further employed an ensemble method to
integrate all modality-specific models to generate ICD-10 codes. Key evidence
was also extracted to make our prediction more convincing and explainable. We
used the Medical Information Mart for Intensive Care III (MIMIC-III) dataset
to validate our approach. For ICD code prediction, our best-performing model
(micro-F1 = 0.7633, micro-AUC = 0.9541) significantly outperforms other
baseline models including TF-IDF (micro-F1 = 0.6721, micro-AUC = 0.7879) and
Text-CNN model (micro-F1 = 0.6569, micro-AUC = 0.9235). For interpretability,
our approach achieves a Jaccard Similarity Coefficient (JSC) of 0.1806 on text
data and 0.3105 on tabular data, where well-trained physicians achieve 0.2780
and 0.5002, respectively. Comment: Machine Learning for Healthcare 201
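The interpretability numbers above are Jaccard Similarity Coefficients between the evidence the model extracts and the evidence physicians select. A minimal sketch of that metric (the evidence tokens below are invented for illustration, not taken from MIMIC-III):

```python
def jaccard_similarity(predicted, reference):
    """Jaccard Similarity Coefficient: |A ∩ B| / |A ∪ B| over two sets."""
    a, b = set(predicted), set(reference)
    if not a and not b:
        return 1.0  # both empty: treat as perfect agreement
    return len(a & b) / len(a | b)

# Hypothetical evidence spans for one ICD code
model_evidence = {"fever", "cough", "infiltrate"}
physician_evidence = {"fever", "infiltrate", "hypoxia", "dyspnea"}
print(jaccard_similarity(model_evidence, physician_evidence))  # 2/5 = 0.4
```

A JSC of 0.18 on text thus means the model's extracted evidence overlaps physicians' selections on roughly a fifth of the union of tokens.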
Predicting Multiple ICD-10 Codes from Brazilian-Portuguese Clinical Notes
ICD coding from electronic clinical records is a manual, time-consuming and
expensive process. Code assignment is, however, an important task for billing
purposes and database organization. While many works have studied the problem
of automated ICD coding from free text using machine learning techniques, most
use records in the English language, especially from the MIMIC-III public
dataset. This work presents results for a dataset with Brazilian Portuguese
clinical notes. We develop and optimize a Logistic Regression model, a
Convolutional Neural Network (CNN), a Gated Recurrent Unit Neural Network and a
CNN with Attention (CNN-Att) for prediction of diagnosis ICD codes. We also
report our results for the MIMIC-III dataset, which outperform previous work
among models of the same families, as well as the state of the art. Compared to
MIMIC-III, the Brazilian Portuguese dataset contains far fewer words per
document, when only discharge summaries are used. We experiment concatenating
additional documents available in this dataset, achieving a great boost in
performance. The CNN-Att model achieves the best results on both datasets, with
micro-averaged F1 score of 0.537 on MIMIC-III and 0.485 on our dataset with
additional documents. Comment: Accepted at BRACIS 202
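The micro-averaged F1 reported above pools true positives, false positives, and false negatives across all labels and documents before computing precision and recall, so frequent codes dominate the score. A minimal sketch of the metric (the label matrices are toy data):

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over binary multi-label matrices (lists of 0/1 rows)."""
    pairs = [(t, p) for row_t, row_p in zip(y_true, y_pred)
             for t, p in zip(row_t, row_p)]
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

# Toy example: 2 documents, 3 ICD labels each
y_true = [[1, 0, 1], [0, 1, 1]]
y_pred = [[1, 0, 0], [0, 1, 1]]
print(micro_f1(y_true, y_pred))  # precision 1.0, recall 0.75 -> 6/7
```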
Multi-label natural language processing to identify diagnosis and procedure codes from MIMIC-III inpatient notes
In the United States, administrative costs, which include services for medical
coding and billing, account for 25% of hospital spending, or more than 200
billion dollars. With the increasing number of patient records, manual
assignment of these codes is overwhelming, time-consuming and
error-prone, causing billing errors. Natural language processing can automate
the extraction of codes/labels from unstructured clinical notes, which can aid
human coders to save time, increase productivity, and verify medical coding
errors. Our objective is to identify appropriate diagnosis and procedure codes
from clinical notes by performing multi-label classification. We used
de-identified data of critical care patients from the MIMIC-III database and
subset the data to select the ten (top-10) and fifty (top-50) most common
diagnoses and procedures, which covers 47.45% and 74.12% of all admissions
respectively. We implemented state-of-the-art Bidirectional Encoder
Representations from Transformers (BERT) to fine-tune the language model on 80%
of the data and validated on the remaining 20%. The model achieved an overall
accuracy of 87.08%, an F1 score of 85.82%, and an AUC of 91.76% for top-10
codes. For the top-50 codes, our model achieved an overall accuracy of 93.76%,
an F1 score of 92.24%, and AUC of 91%. When compared to previously published
research, our model outperforms in predicting codes from the clinical text. We
discuss approaches to generalize the knowledge discovery process of our
MIMIC-BERT to other clinical notes. This can help human coders to save time,
prevent backlogs, and additional costs due to coding errors. Comment: This is a shortened version of the Capstone Project that was accepted
by the Faculty of Indiana University, in partial fulfillment of the
requirements for the degree of Master of Science in Health Informatic
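Subsetting to the top-10 or top-50 most common diagnoses and procedures, as described above, amounts to counting code frequencies over admissions and encoding each admission as a binary vector over the retained labels. A hedged sketch of that preprocessing step (the ICD-9-style codes are invented examples):

```python
from collections import Counter

def top_k_label_matrix(admissions, k):
    """Keep the k most common codes and encode each admission as a 0/1 vector."""
    counts = Counter(code for codes in admissions for code in codes)
    vocab = [code for code, _ in counts.most_common(k)]
    index = {code: i for i, code in enumerate(vocab)}
    matrix = []
    for codes in admissions:
        row = [0] * k
        for code in codes:
            if code in index:  # codes outside the top-k are dropped
                row[index[code]] = 1
        matrix.append(row)
    return vocab, matrix

# Invented ICD-9-style codes for illustration only
admissions = [["401.9", "428.0"], ["401.9"], ["250.00", "401.9"]]
vocab, matrix = top_k_label_matrix(admissions, k=2)
print(vocab, matrix)  # ['401.9', '428.0'] [[1, 1], [1, 0], [1, 0]]
```

The resulting binary matrix is the multi-label target a fine-tuned BERT classifier would be trained against.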
Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt
Automatic International Classification of Diseases (ICD) coding aims to
assign multiple ICD codes to a medical note with an average of 3,000+ tokens.
This task is challenging due to the high-dimensional space of multi-label
assignment (155,000+ ICD code candidates) and the long-tail challenge: many
ICD codes are assigned infrequently, yet these infrequent codes are clinically
important. This study addresses the long-tail challenge by transforming this
multi-label classification task into an autoregressive generation task.
Specifically, we first introduce a novel pretraining objective to generate free
text diagnoses and procedures using the SOAP structure, the medical logic
physicians use for note documentation. Second, instead of directly predicting
the high-dimensional space of ICD codes, our model generates lower-dimensional
text descriptions, from which ICD codes are then inferred. Third, we designed
a novel prompt template for multi-label classification. We evaluate our
Generation with Prompt model with the benchmark of all code assignment
(MIMIC-III-full) and few shot ICD code assignment evaluation benchmark
(MIMIC-III-few). Experiments on MIMIC-III-few show that our model performs with
a macro F1 of 30.2, which substantially outperforms the previous MIMIC-III-full
SOTA model (macro F1 4.3) and the model specifically designed for the few/zero-shot
setting (macro F1 18.7). Finally, we design a novel ensemble learner, a cross
attention reranker with prompts, to integrate previous SOTA and our best
few-shot coding predictions. Experiments on MIMIC-III-full show that our
ensemble learner substantially improves both macro and micro F1, from 10.4 to
14.6 and from 58.2 to 59.1, respectively. Comment: To appear in AAAI202
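The second step above, generating a lower-dimensional text description and then inferring the code, can be approximated by matching the generated text against official code descriptions. A toy sketch using token overlap (the code-to-description mapping here is illustrative and abbreviated, not the paper's actual method):

```python
def infer_code(generated, code_descriptions):
    """Map generated free text to the code whose description overlaps it most."""
    gen_tokens = set(generated.lower().split())

    def overlap(item):
        _code, desc = item
        return len(gen_tokens & set(desc.lower().split()))

    return max(code_descriptions.items(), key=overlap)[0]

# Illustrative, abbreviated ICD-10 descriptions
code_descriptions = {
    "J18.9": "pneumonia unspecified organism",
    "I10": "essential primary hypertension",
    "E11.9": "type 2 diabetes mellitus without complications",
}
print(infer_code("patient with unspecified pneumonia", code_descriptions))  # J18.9
```

The appeal of this indirection is that text descriptions share vocabulary across codes, so rare codes can be reached through language the model has seen for frequent ones.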
Hierarchical Label-wise Attention Transformer Model for Explainable ICD Coding
International Classification of Diseases (ICD) coding plays an important role
in systematically classifying morbidity and mortality data. In this study, we
propose a hierarchical label-wise attention Transformer model (HiLAT) for the
explainable prediction of ICD codes from clinical documents. HiLAT firstly
fine-tunes a pretrained Transformer model to represent the tokens of clinical
documents. We subsequently employ a two-level hierarchical label-wise attention
mechanism that creates label-specific document representations. These
representations are in turn used by a feed-forward neural network to predict
whether a specific ICD code is assigned to the input clinical document of
interest. We evaluate HiLAT using hospital discharge summaries and their
corresponding ICD-9 codes from the MIMIC-III database. To investigate the
performance of different types of Transformer models, we develop
ClinicalplusXLNet, which conducts continual pretraining from XLNet-Base using
all the MIMIC-III clinical notes. The experiment results show that the F1
scores of the HiLAT+ClinicalplusXLNet outperform the previous state-of-the-art
models for the top-50 most frequent ICD-9 codes from MIMIC-III. Visualisations
of attention weights present a potential explainability tool for checking the
face validity of ICD code predictions.
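The label-wise attention described above gives each ICD code its own attention distribution over token representations, producing one document vector per label. A minimal pure-Python sketch of that mechanism (toy vectors; the real model attends over learned Transformer features):

```python
import math

def labelwise_attention(tokens, label_queries):
    """For each label query, softmax-attend over token vectors and return a
    label-specific document representation (weighted sum of tokens)."""
    dim = len(tokens[0])
    docs = []
    for q in label_queries:
        # dot-product attention scores, one per token
        scores = [sum(qi * ti for qi, ti in zip(q, t)) for t in tokens]
        m = max(scores)  # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        docs.append([sum(w * t[d] for w, t in zip(weights, tokens))
                     for d in range(dim)])
    return docs

# Toy token representations (3 tokens, dim 2) and 2 label queries
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
labels = [[2.0, 0.0], [0.0, 2.0]]
reps = labelwise_attention(tokens, labels)
print(reps)
```

Each per-label vector then feeds a small feed-forward classifier that decides whether that code applies; the attention weights themselves are what the visualisations above inspect.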
Assigning Diagnosis Codes Using Medication History
Diagnosis assignment is the process of assigning disease codes to patients. Automatic diagnosis assignment has the potential to validate code assignments, correct erroneous codes, and complete registers. Previous methods build on text-based techniques utilizing medical notes but are inapplicable in the absence of these notes. We propose using patients' medication data to assign diagnosis codes. We present a proof-of-concept study using medical data from an American dataset (MIMIC-III) and Danish nationwide registers to train a machine-learning-based model that predicts an extensive collection of diagnosis codes at multiple levels of aggregation over a disease hierarchy. We further suggest a specialized loss function designed to utilize the innate hierarchical nature of the disease hierarchy. We evaluate the proposed method on a subset of 567 disease codes. Moreover, we investigate the technique's generalizability and transferability by (1) training and testing models on the same subsets of disease codes over the two medical datasets and (2) training models on the American dataset while evaluating them on the Danish dataset, respectively. Results demonstrate the proposed method can correctly assign diagnosis codes at multiple levels of aggregation from the disease hierarchy over the American dataset, with a recall of 70.0% and precision of 69.48% for the top-10 assigned codes, making it comparable to text-based techniques. Furthermore, the specialized loss function performs consistently better than the non-hierarchical state-of-the-art version. Moreover, results suggest the proposed method is language- and dataset-agnostic, with initial indications of transferability over subsets of disease codes.
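The specialized hierarchical loss mentioned above can be sketched as a per-code binary cross-entropy plus a weighted term on parent-level aggregates, so a prediction that misses the exact leaf code still earns partial credit for landing in the right branch of the disease hierarchy. A hedged illustration (the two-level hierarchy, probabilities, and weight are invented, not the paper's):

```python
import math

def bce(p, t):
    """Binary cross-entropy for one probability/target pair."""
    return -(t * math.log(p) + (1 - t) * math.log(1 - p))

def hierarchical_loss(pred, target, parent, level_weight):
    """Leaf-level BCE summed over codes, plus BCE on parent-level
    aggregates (a parent counts as predicted/true if any child is)."""
    leaf = sum(bce(pred[c], target[c]) for c in pred)
    # aggregate children into parents by max-pooling probabilities/targets
    agg_p, agg_t = {}, {}
    for c in pred:
        par = parent[c]
        agg_p[par] = max(agg_p.get(par, 0.0), pred[c])
        agg_t[par] = max(agg_t.get(par, 0), target[c])
    parent_term = sum(bce(agg_p[g], agg_t[g]) for g in agg_p)
    return leaf + level_weight * parent_term

# Invented two-level ICD-9-style hierarchy
parent = {"250.00": "250", "250.01": "250", "401.9": "401"}
pred = {"250.00": 0.9, "250.01": 0.1, "401.9": 0.2}
target = {"250.00": 1, "250.01": 0, "401.9": 0}
print(hierarchical_loss(pred, target, parent, level_weight=0.5))
```

The same pattern extends to deeper hierarchies by repeating the aggregation per level with its own weight, which is how a loss can reward coarse-level correctness while still optimizing leaf-level assignment.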
- …