In light of the outbreak of COVID-19, analyzing and measuring human mobility
has become increasingly important. A wide range of studies have explored
spatiotemporal trends, examined associations with other variables,
evaluated non-pharmacologic interventions (NPIs), and predicted or simulated
COVID-19 spread using mobility data. Despite the benefits of publicly available
mobility data, a key question remains unanswered: do models that use mobility
data perform equitably across demographic groups? We hypothesize that bias
in the mobility data used to train the predictive models might lead to unfairly
less accurate predictions for certain demographic groups. To test our
hypothesis, we applied two mobility-based COVID-19 infection prediction models at
the county level in the United States using SafeGraph data, and correlated
model performance with sociodemographic traits. Our findings reveal
a systematic bias in model performance with respect to certain demographic
characteristics. Specifically, the models tend to favor large, highly educated,
wealthy, young, urban, and non-black-dominated counties. We hypothesize that
the mobility data currently used by many predictive models tends to capture
less information about older, poorer, non-white, and less educated regions,
which in turn negatively impacts the accuracy of COVID-19 predictions in
these regions. Ultimately, this study points to the need for improved data
collection and sampling approaches that allow for an accurate representation of
the mobility patterns across demographic groups.

Comment: 24 pages, 4 figures, 2 tables
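
As an illustration of the kind of analysis described above (correlating county-level model performance with sociodemographic traits), the following is a minimal sketch and not the study's actual pipeline. It assumes Pearson correlation as the measure, which the abstract does not specify. The function names (`pearson`, `error_trait_correlations`), the field names (`abs_error`, `median_income`, `pct_over_65`), and all values are hypothetical toy inputs, not data or results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def error_trait_correlations(counties, error_key, trait_keys):
    """Correlate per-county prediction error with each sociodemographic trait.

    `counties` is a list of dicts, one per county. A positive coefficient
    means the model tends to be less accurate where the trait is higher.
    """
    errors = [county[error_key] for county in counties]
    return {
        trait: pearson([county[trait] for county in counties], errors)
        for trait in trait_keys
    }


# Hypothetical toy records (assumption: invented for illustration only).
toy_counties = [
    {"id": "county_a", "abs_error": 0.30, "median_income": 40_000, "pct_over_65": 22.0},
    {"id": "county_b", "abs_error": 0.25, "median_income": 48_000, "pct_over_65": 19.0},
    {"id": "county_c", "abs_error": 0.18, "median_income": 61_000, "pct_over_65": 16.0},
    {"id": "county_d", "abs_error": 0.12, "median_income": 75_000, "pct_over_65": 13.0},
]

correlations = error_trait_correlations(
    toy_counties, "abs_error", ["median_income", "pct_over_65"]
)
for trait, r in correlations.items():
    print(f"{trait}: r = {r:.3f}")
```

In a sketch like this, a negative coefficient for an income-type trait would indicate larger errors in poorer counties. A rank-based measure such as Spearman correlation may be preferable when traits are heavily skewed across counties.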