The relative case fatality rates (CFRs) between groups and countries are key
measures of relative risk that guide policy decisions regarding scarce medical
resource allocation during the ongoing COVID-19 pandemic. In the middle of an
active outbreak, when surveillance data is the primary source of information,
estimating these quantities involves compensating for competing biases in time
series of deaths, cases, and recoveries. These include time- and severity-
dependent reporting of cases as well as time lags in observed patient outcomes.
In the context of COVID-19 CFR estimation, we survey such biases and their
potential significance. Further, we analyze theoretically the effect of certain
biases, like preferential reporting of fatal cases, on naive estimators of CFR.
We provide a partial correction to these naive estimators that
accounts for time lag and imperfect reporting of deaths and recoveries. We show
that collection of randomized data by testing the contacts of infectious
individuals regardless of the presence of symptoms would mitigate bias by
limiting the covariance between diagnosis and death. Our analysis is
supplemented by theoretical and numerical results and a simple and fast
open-source codebase at https://github.com/aangelopoulos/cfr-covid-19.
Comment: Harvard Data Science Review (2020) article available at
https://hdsr.mitpress.mit.edu/pub/y9vc2u3
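
To make the time-lag bias mentioned in the abstract concrete, the following minimal sketch compares two commonly used naive CFR estimators on the same surveillance snapshot. The function names and the counts are hypothetical illustrations, not taken from the article or its codebase, and neither function is the corrected estimator the article proposes.

```python
def naive_cfr(cum_deaths: float, cum_cases: float) -> float:
    """Cumulative deaths divided by cumulative reported cases.

    During an active outbreak this tends to be pulled downward by time lag,
    because many reported cases have not yet reached an outcome.
    """
    if cum_cases <= 0:
        raise ValueError("cum_cases must be positive")
    return cum_deaths / cum_cases


def resolved_cfr(cum_deaths: float, cum_recoveries: float) -> float:
    """Cumulative deaths divided by cases with a known outcome.

    This avoids counting unresolved cases, but it is sensitive to
    under-reported or slowly reported recoveries.
    """
    resolved = cum_deaths + cum_recoveries
    if resolved <= 0:
        raise ValueError("no resolved cases")
    return cum_deaths / resolved


# Hypothetical snapshot (illustrative numbers only, not real data).
cases, deaths, recoveries = 1000, 20, 180

print(f"deaths / cases:                {naive_cfr(deaths, cases):.3f}")
print(f"deaths / (deaths + recoveries): {resolved_cfr(deaths, recoveries):.3f}")
```

On this made-up snapshot the two estimators differ by a factor of five, which illustrates why lag and outcome-reporting assumptions matter when comparing CFRs across groups or countries.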