Use of Available Data To Inform The COVID-19 Outbreak in South Africa: A Case Study
The coronavirus disease (COVID-19), caused by the SARS-CoV-2 virus, was
declared a pandemic by the World Health Organization (WHO) in March 2020.
Currently, there are no vaccines or treatments that have been approved after
clinical trials. Social distancing measures, including travel bans, school
closure, and quarantine applied to countries or regions are being used to limit
the spread of the disease and the demand on the healthcare infrastructure. The
seclusion of groups and individuals has led to limited access to accurate
information. To update the public, especially in South Africa, announcements
are made by the minister of health daily. These announcements narrate the
confirmed COVID-19 cases and include the age, gender, and travel history of
people who have tested positive for the disease. Additionally, the South
African National Institute for Communicable Diseases updates a daily
infographic summarising the number of tests performed, confirmed cases,
mortality rate, and the regions affected. However, the age of the patient and
other nuanced data regarding the transmission is only shared in the daily
announcements and not on the updated infographic. To disseminate this
information, the Data Science for Social Impact research group at the
University of Pretoria, South Africa, has worked on curating and applying
publicly available data in a way that is computer-readable so that information
can be shared with the public through both a data repository and a dashboard.
Through collaborative practices, a variety of challenges related to publicly
available data in South Africa came to the fore. These include shortcomings in
the accessibility, integrity, and data management practices between
governmental departments and the South African public. In this paper, solutions
to these problems will be shared by using a publicly available data repository
and dashboard as a case study.
Comment: Accepted for publication in the Data Science Journal
Improving short text classification through global augmentation methods
We study the effect of different approaches to text augmentation. To do this
we use 3 datasets that include social media and formal text in the form of news
articles. Our goal is to provide insights for practitioners and researchers on
making choices for augmentation for classification use cases. We observe that
Word2vec-based augmentation is a viable option when one does not have access to
a formal synonym model (like WordNet-based augmentation). The use of
mixup further improves the performance of all text-based augmentations and
reduces the effects of overfitting on a tested deep learning model. Round-trip
translation with a translation service proves to be harder to use due to cost
and as such is less accessible for both normal and low resource use-cases.
Comment: Final version published in CD-MAKE 2020: Machine Learning and
Knowledge Extraction pp 385-39
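The mixup step named in the abstract above can be illustrated with a minimal sketch. It assumes the standard formulation (a convex combination of two feature vectors and their one-hot labels, with the weight drawn from a Beta distribution); the function name `mixup`, the default `alpha=0.2`, and the toy vectors are illustrative assumptions, not details taken from the paper.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two examples; features and one-hot labels share one weight.

    Hypothetical helper for illustration only, not the authors' code.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x_mix = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Toy usage: two 3-dimensional text embeddings with one-hot class labels.
features, label, weight = mixup(
    [0.2, 0.9, 0.1], [1.0, 0.0],
    [0.7, 0.3, 0.5], [0.0, 1.0],
    rng=random.Random(0),
)
```

The mixed label stays a valid probability distribution because both label vectors are combined with the same weight.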
Semi-supervised learning approaches for predicting South African political sentiment for local government elections
This study aims to understand the South African political context
by analysing the sentiments shared on Twitter during the local
government elections. The analysis focused on the discussions
around four predominant political parties: ANC, DA, EFF and
ActionSA. A semi-supervised, graph-based technique was used to
label the large volume of accessible Twitter data and classify
tweets into negative and positive sentiment. The tweets expressing negative
sentiment were further analysed through latent topic extraction
to uncover hidden topics of concern associated with each of the
political parties. Our findings demonstrated that the general sentiment
across South African Twitter users is negative towards all
four predominant parties with the worst negative sentiment among
users projected towards the current ruling party, ANC, relating to
concerns centered around corruption, incompetence and loadshedding.
ABSA (who sponsor the UP ABSA Data Science Chair) and the National Research Foundation, South Africa.
https://www.acm.org/publications/icpsam2023
Computer Science
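The abstract above does not say which graph-based technique was used for labelling. As a rough sketch of the general idea, the example below runs a generic label propagation over a weighted similarity graph: a few hand-labelled tweets act as fixed seeds and the remaining nodes take the weighted average score of their neighbours. The function name, the graph, and the weights are hypothetical.

```python
def propagate_labels(edges, seeds, n_nodes, n_iter=20):
    """Spread seed sentiment scores (+1 / -1) over an undirected graph.

    Generic label propagation for illustration; not the paper's method.
    edges: iterable of (node_a, node_b, weight); seeds: {node: +1 or -1}.
    """
    neighbours = {i: [] for i in range(n_nodes)}
    for a, b, w in edges:
        neighbours[a].append((b, w))
        neighbours[b].append((a, w))

    scores = [float(seeds.get(i, 0.0)) for i in range(n_nodes)]
    for _ in range(n_iter):
        updated = []
        for i in range(n_nodes):
            if i in seeds:  # labelled nodes keep their seed score
                updated.append(float(seeds[i]))
                continue
            total = sum(w for _, w in neighbours[i])
            if total == 0:  # isolated node: nothing to average
                updated.append(scores[i])
            else:
                weighted = sum(w * scores[j] for j, w in neighbours[i])
                updated.append(weighted / total)
        scores = updated

    return ["positive" if s > 0 else "negative" if s < 0 else "unlabelled"
            for s in scores]


# Toy usage: four tweets in a chain, the two end tweets are hand-labelled.
labels = propagate_labels(
    edges=[(0, 1, 1.0), (1, 2, 0.1), (2, 3, 1.0)],
    seeds={0: 1, 3: -1},
    n_nodes=4,
)
```

In practice the edge weights would come from some similarity measure between tweets, which is an assumption here rather than something the abstract specifies.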
An Intelligent Multi-Agent Recommender System for Human Capacity Building
This paper presents a Multi-Agent approach to the problem of recommending
training courses to engineering professionals. The recommendation system is
built as a proof of concept and limited to the electrical and mechanical
engineering disciplines. Through user modelling and data collection from a
survey, collaborative filtering recommendation is implemented using intelligent
agents. The agents work together in recommending meaningful training courses
and updating the course information. The system uses a user's profile and
keywords from courses to rank courses. A ranking accuracy for courses of 90% is
achieved while flexibility is achieved using an agent that retrieves
information autonomously using data mining techniques from websites. This
manner of recommendation is scalable and adaptable. Further improvements can be
made using clustering and recording user feedback.
Comment: Proceedings of the 14th IEEE Mediterranean Electrotechnical
Conference, 2008, pages 909 to 91