There are several opportunities for automation in healthcare that can improve
clinician throughput. One example is assistive tools that document diagnosis
codes as clinicians write notes. We study the automation of medical code
prediction using curriculum learning, a training strategy for machine
learning models that gradually increases the difficulty of the learning tasks.
One of the challenges in curriculum learning is the design of curricula, that
is, the sequential design of tasks that gradually increase in difficulty. We
propose Hierarchical Curriculum Learning (HiCu), an
algorithm that uses graph structure in the space of outputs to design curricula
for multi-label classification. We create curricula for multi-label
classification models that predict ICD diagnosis and procedure codes from
natural language descriptions of patients. By leveraging the hierarchy of ICD
codes, which groups diagnosis codes based on various organ systems in the human
body, we find that our proposed curricula improve the generalization of neural
network-based predictive models across recurrent, convolutional, and
transformer-based architectures. Our code is available at
https://github.com/wren93/HiCu-ICD.

Comment: To appear at Machine Learning for Healthcare Conference (MLHC 2022).
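
To illustrate how a label hierarchy can yield a curriculum for multi-label classification, the following is a minimal sketch, not the HiCu implementation. It assumes, purely for illustration, that coarser labels can be obtained by truncating a code to a prefix; the function names `coarsen` and `curriculum_labels`, the chosen depths, and the sample codes are all hypothetical. Each stage maps every patient's code set to labels at one depth, from coarse (easy) to fine (hard).

```python
from typing import List, Set


def coarsen(code: str, depth: int) -> str:
    """Truncate a dotted code to its first `depth` characters, ignoring the dot.

    Assumption: prefix truncation stands in for moving up the code hierarchy.
    """
    return code.replace(".", "")[:depth]


def curriculum_labels(
    patient_codes: List[Set[str]], depths: List[int]
) -> List[List[Set[str]]]:
    """Return one label set per patient for each curriculum stage.

    Stages follow the order of `depths`, so increasing depths give
    progressively finer (harder) multi-label targets.
    """
    return [
        [{coarsen(code, depth) for code in codes} for codes in patient_codes]
        for depth in depths
    ]


# Hypothetical example: two patients with made-up code sets.
patients = [{"428.0", "428.1", "250.00"}, {"250.01"}]
stages = curriculum_labels(patients, depths=[1, 3, 5])
```

In this sketch a model would be trained on `stages[0]` first, then on each later stage in turn, with the final stage using the full codes as targets.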