The coronavirus disease (COVID-19), caused by the SARS-CoV-2 virus, was
declared a pandemic by the World Health Organization (WHO) in March 2020.
Currently, there are no vaccines or treatments that have been approved after
clinical trials. Social distancing measures, including travel bans, school
closures, and quarantines applied to countries or regions, are being used to limit
the spread of the disease and the demand on the healthcare infrastructure. The
seclusion of groups and individuals has led to limited access to accurate
information. To keep the South African public informed, the Minister of Health
makes daily announcements. These announcements report the
confirmed COVID-19 cases and include the age, gender, and travel history of
people who have tested positive for the disease. Additionally, the South
African National Institute for Communicable Diseases updates a daily
infographic summarising the number of tests performed, confirmed cases,
mortality rate, and the regions affected. However, patient ages and other
detailed data regarding transmission are shared only in the daily
announcements and not on the updated infographic. To disseminate this
information, the Data Science for Social Impact research group at the
University of Pretoria, South Africa, has worked on curating and applying
publicly available data in a computer-readable form so that the information
can be shared with the public through both a data repository and a dashboard.
Through collaborative practices, a variety of challenges related to publicly
available data in South Africa came to the fore. These include shortcomings in
the accessibility, integrity, and management of data shared between
governmental departments and the South African public. In this paper, we share
solutions to these problems, using a publicly available data repository
and dashboard as a case study.

Comment: Accepted for publication in the Data Science Journal