The COVID-19 pandemic has imposed a heavy burden on healthcare systems
worldwide and caused huge social disruption and economic loss. Many deep
learning models have been proposed to conduct clinical predictive tasks such as
mortality prediction for COVID-19 patients in intensive care units using
Electronic Health Record (EHR) data. Despite their initial success in certain
clinical applications, there is currently a lack of benchmarking results that
would enable a fair comparison and help select the optimal model for clinical
use. Furthermore, there is a discrepancy between the formulation of traditional
prediction tasks and real-world clinical practice in intensive care. To fill
these gaps, we propose two clinical prediction tasks, Outcome-specific
length-of-stay prediction and Early mortality prediction for COVID-19 patients
in intensive care units. The two tasks are adapted from the naive
length-of-stay and mortality prediction tasks to reflect clinical
practice for COVID-19 patients. We propose fair, detailed, open-source
data-preprocessing pipelines and evaluate 17 state-of-the-art predictive models
on the two tasks, including 5 machine learning models, 6 basic deep learning models
and 6 deep learning predictive models specifically designed for EHR data. We
provide benchmarking results using data from two real-world COVID-19 EHR
datasets. One dataset is publicly available without any inquiry, and the
other can be accessed on request. We provide fair, reproducible
benchmarking results for both tasks. We deploy all experiment results and models
on an online platform. We also allow clinicians and researchers to upload their
data to the platform and get quick prediction results using our trained models.
We hope our efforts can further facilitate deep learning and machine learning
research for COVID-19 predictive modeling.

Comment: Junyi Gao, Yinghao Zhu and Wenqing Wang contributed equally