COVID-19 has a spectrum of disease severity, ranging from asymptomatic to
requiring hospitalization. Understanding the mechanisms driving disease
severity is crucial for developing effective treatments and reducing mortality
rates. One way to gain such understanding is to use a multi-class classification
framework, in which patients' biological features are used to predict their
severity classes. In this severity classification problem, it is beneficial to
prioritize the identification of more severe classes and control the
"under-classification" errors, in which patients are misclassified into less
severe categories. The Neyman-Pearson (NP) classification paradigm was
developed to prioritize a designated type of error. However, current NP
procedures either handle only binary classification or do not provide
high-probability control of the prioritized errors in multi-class classification.
Here, we propose a hierarchical NP (H-NP) framework and an umbrella algorithm
that generally adapts to popular classification methods and controls the
under-classification errors with high probability. On an integrated collection
of single-cell RNA-seq (scRNA-seq) datasets for 864 patients, we explore
different featurization approaches and demonstrate the efficacy of the H-NP
algorithm in controlling the under-classification errors regardless of the
featurization used. Beyond
COVID-19 severity classification, the H-NP algorithm applies generally to
multi-class classification problems in which the classes have a priority order.
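
To make the notion of an under-classification error concrete, the following minimal Python sketch computes, for each true severity class, the fraction of patients predicted into a strictly less severe class. It is an illustration only, not the H-NP algorithm. The function name `under_classification_rates`, the integer encoding (larger integer means more severe), and the toy labels are all assumptions introduced here.

```python
def under_classification_rates(y_true, y_pred, n_classes):
    """Per-class rate of predictions that fall into a strictly less severe class.

    Assumption (not from the source): classes are encoded as integers
    0..n_classes-1, with a larger integer meaning a more severe class.
    """
    rates = {}
    for k in range(n_classes):
        # Predictions for the samples whose true class is k.
        preds = [p for t, p in zip(y_true, y_pred) if t == k]
        if not preds:
            rates[k] = None  # class absent from the sample; rate undefined
            continue
        # Under-classification: predicted class is less severe than the true one.
        rates[k] = sum(p < k for p in preds) / len(preds)
    return rates


# Hypothetical toy labels, for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 0, 1, 2, 1, 0, 2]
rates = under_classification_rates(y_true, y_pred, n_classes=3)
```

Under this encoding the least severe class can never be under-classified, so its rate is always zero. An NP-type procedure would aim to keep the rates for the more severe classes below user-specified levels with high probability, which a plain empirical count like this one does not guarantee.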