12 research outputs found

    What Are People Asking About COVID-19? A Question Classification Dataset

    We present COVID-Q, a set of 1,690 questions about COVID-19 from 13 sources, which we annotate into 15 question categories and 207 question clusters. The most common questions in our dataset asked about transmission, prevention, and societal effects of COVID, and we found that many questions that appeared in multiple sources were not answered by any FAQ websites of reputable organizations such as the CDC and FDA. We post our dataset publicly at https://github.com/JerryWei03/COVID-Q. For classifying questions into 15 categories, a BERT baseline scored 58.1% accuracy when trained on 20 examples per category, and for a question clustering task, a BERT + triplet loss baseline achieved 49.5% accuracy. We hope COVID-Q can help either for direct use in developing applied systems or as a domain-specific resource for model evaluation. Comment: Published in Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 202

    CovidTracker: A comprehensive Covid-related social media dataset for NLP tasks

    The Covid-19 pandemic presented an unprecedented global public health emergency, and concomitantly an unparalleled opportunity to investigate public responses to adverse social conditions. The widespread ability to post messages to social media platforms provided an invaluable outlet for such an outpouring of public sentiment, including not only expressions of social solidarity, but also the spread of misinformation and misconceptions around the effect and potential risks of the pandemic. This archive of message content therefore represents a key resource in understanding public responses to health crises, analysis of which could help to inform public policy interventions to better respond to similar events in future. We present a benchmark database of public social media postings from the United Kingdom related to the Covid-19 pandemic for academic research purposes, along with some initial analysis, including a taxonomy of key themes organised by keyword. This release supports the findings of a research study funded by the Scottish Government Chief Scientists' Office that aims to investigate social sentiment in order to understand the response to public health measures implemented during the pandemic

    Is There a Relationship Between New Media and the Number of Recorded Covid-19 Cases in Ghana? No! Evidence from a Content Analysis of the Twitter Posts of Key Ghanaian State Actors in the First Year of the Pandemic

    The COVID-19 pandemic led to a public health crisis which, aside from causing the deaths of more than four hundred and fifty thousand people (WHO, 2020a), has also disrupted the way people live by forcing us to change how we work, attend school, and conduct our social lives. Governments worldwide devised many strategies to help slow the spread of the virus and reduce its impact on the economy and people's livelihoods. Even though social media platforms played a key role in information dissemination and awareness creation in relation to the novel coronavirus, it is unknown whether the activity of key government social media accounts has any relationship with the number of recorded cases. The researchers used a quantitative content analysis strategy to analyse the posts of 5 key Ghanaian government accounts on Twitter between 11th March 2020 and 11th March 2021 in relation to certain COVID-19 keywords. The researchers found that no correlation exists between the Twitter posts of key government accounts and the number of recorded COVID-19 cases in Ghana. The study also shows that the lowest number of COVID-19-related tweets was posted in December 2020, the month of the Ghanaian elections, whereas the highest number was posted in March 2020, the month in which the first case was detected in Ghana. The researchers conclude that even though social media can contribute to crisis and emergency communications, social media alone as a crisis communication strategy may not be enough and must be paired with other traditional forms of communication such as radio. Keywords: Social Media, Crises Communications, Twitter, Covid 19, Pandemic Communications DOI: 10.7176/NMMC/104-01 Publication date: January 31st 202

    COVID-19 datasets : a brief overview

    The outbreak of the COVID-19 pandemic affects lives and socio-economic development around the world. The impact of the pandemic has motivated researchers from different domains to find effective solutions to diagnose, prevent, and estimate the pandemic and relieve its adverse effects. Numerous COVID-19 datasets have been built from these studies and are available to the public. These datasets can be used for disease diagnosis and case prediction, speeding up the solution of problems caused by the pandemic. To meet the needs of researchers to understand various COVID-19 datasets, we examine and provide an overview of them. We organise the majority of these datasets into three categories based on the category of applications, i.e., time-series, knowledge base, and media-based datasets. Organising COVID-19 datasets into appropriate categories can help researchers keep their focus on methodology rather than the datasets. In addition, applications and COVID-19 datasets suffer from a series of problems, such as privacy and quality. We discuss these issues as well as the potential of COVID-19 datasets. © 2022, ComSIS Consortium. All rights reserved

    A Unified Contrastive Transfer Framework with Propagation Structure for Boosting Low-Resource Rumor Detection

    The truth is significantly hampered by massive rumors that spread along with breaking news or popular topics. Since there is sufficient corpus gathered from the same domain for model training, existing rumor detection algorithms show promising performance on yesterday's news. However, due to a lack of training data and prior expert knowledge, they are poor at spotting rumors concerning unforeseen events, especially those propagated in different languages (i.e., low-resource regimes). In this paper, we propose a unified contrastive transfer framework to detect rumors by adapting the features learned from well-resourced rumor data to that of the low-resourced. More specifically, we first represent rumor circulated on social media as an undirected topology, and then train a Multi-scale Graph Convolutional Network via a unified contrastive paradigm. Our model explicitly breaks the barriers of the domain and/or language issues, via language alignment and a novel domain-adaptive contrastive learning mechanism. To enhance the representation learning from a small set of target events, we reveal that rumor-indicative signal is closely correlated with the uniformity of the distribution of these events. We design a target-wise contrastive training mechanism with three data augmentation strategies, capable of unifying the representations by distinguishing target events. Extensive experiments conducted on four low-resource datasets collected from real-world microblog platforms demonstrate that our framework achieves much better performance than state-of-the-art methods and exhibits a superior capacity for detecting rumors at early stages. Comment: A significant extension of the first contrastive approach for low-resource rumor detection (arXiv:2204.08143)

    Leveraging Twitter data to analyze the virality of Covid-19 tweets: a text mining approach

    As the novel coronavirus spreads across the world, work, pleasure, entertainment, social interactions, and meetings have shifted online. The conversations on social media have spiked, and given the uncertainties and new policies, COVID-19 remains the trending topic on all such platforms, including Twitter. This research explores the factors that affect COVID-19 content-sharing by Twitter users. The analysis was conducted using more than 57,000 tweets that mentioned COVID-19 and related keywords. The tweets were subjected to Natural Language Processing (NLP) techniques such as Topic modelling, Named Entity-Relationship, Emotion & Sentiment analysis, and Linguistic feature extraction. These methods generated features that could help explain the retweet count of the tweets. The results indicate that tweets with named entities (person, organisation, and location), expression of negative emotions (anger, disgust, fear, and sadness), reference to mental health, optimistic content, and greater length have higher chances of being shared (retweeted). On the other hand, tweets with more hashtags and user mentions are less likely to be shared

    Developing a mental health index using a machine learning approach: Assessing the impact of mobility and lockdown during the COVID-19 pandemic

    Governments worldwide have implemented stringent restrictions to curtail the spread of the COVID-19 pandemic. Although beneficial to physical health, these preventive measures could have a profound detrimental effect on the mental health of the population. This study focuses on the impact of lockdowns and mobility restrictions on mental health during the COVID-19 pandemic. We first develop a novel mental health index based on the analysis of data from over three million global tweets using the Microsoft Azure machine learning approach. The computed mental health index scores are then regressed on the lockdown strictness index and Google mobility index using fixed-effects ordinary least squares (OLS) regression. The results reveal that the reduction in workplace mobility, reduction in retail and recreational mobility, and increase in residential mobility (confinement to the residence) have harmed mental health. However, restrictions on mobility to parks, grocery stores, and pharmacy outlets were found to have no significant impact. The proposed mental health index provides a path for theoretical and empirical mental health studies using social media. [Abstract copyright: © 2022 Elsevier Inc. All rights reserved.]

    Social media mining under the COVID-19 context: Progress, challenges, and opportunities

    Social media platforms allow users worldwide to create and share information, forging vast sensing networks that allow information on certain topics to be collected, stored, mined, and analyzed in a rapid manner. During the COVID-19 pandemic, extensive social media mining efforts have been undertaken to tackle COVID-19 challenges from various perspectives. This review summarizes the progress of social media data mining studies in the COVID-19 contexts and categorizes them into six major domains, including early warning and detection, human mobility monitoring, communication and information conveying, public attitudes and emotions, infodemic and misinformation, and hatred and violence. We further document essential features of publicly available COVID-19 related social media data archives that will benefit research communities in conducting replicable and reproducible studies. In addition, we discuss seven challenges in social media analytics associated with their potential impacts on derived COVID-19 findings, followed by our visions for the possible paths forward in regard to social media-based COVID-19 investigations. This review serves as a valuable reference that recaps social media mining efforts in COVID-19 related studies and provides future directions along which the information harnessed from social media can be used to address public health emergencies