445 research outputs found

    Systematic review on the prevalence, frequency and comparative value of adverse events data in social media

    Aim: The aim of this review was to summarize the prevalence, frequency and comparative value of information on the adverse events of healthcare interventions from user comments and videos in social media. Methods: A systematic review of assessments of the prevalence or type of information on adverse events in social media was undertaken. Sixteen databases and two internet search engines were searched in addition to handsearching, reference checking and contacting experts. The results were sifted independently by two researchers. Data extraction and quality assessment were carried out by one researcher and checked by a second. The quality assessment tool was devised in-house and a narrative synthesis of the results followed. Results: From 3064 records, 51 studies met the inclusion criteria. The studies assessed over 174 social media sites with discussion forums (71%) being the most popular. The overall prevalence of adverse events reports in social media varied from 0.2% to 8% of posts. Twenty-nine studies compared the results from searching social media with using other data sources to identify adverse events. There was general agreement that a higher frequency of adverse events was found in social media and that this was particularly true for ‘symptom’ related and ‘mild’ adverse events. Those adverse events that were under-represented in social media were laboratory-based and serious adverse events. Conclusions: Reports of adverse events are identifiable within social media. However, there is considerable heterogeneity in the frequency and type of events reported, and the reliability or validity of the data has not been thoroughly evaluated

    Data and systems for medication-related text classification and concept normalization from Twitter: insights from the Social Media Mining for Health (SMM4H)-2017 shared task

    Objective: We executed the Social Media Mining for Health (SMM4H) 2017 shared tasks to enable the community-driven development and large-scale evaluation of automatic text processing methods for the classification and normalization of health-related text from social media. An additional objective was to publicly release manually annotated data. Materials and Methods: We organized 3 independent subtasks: automatic classification of self-reports of 1) adverse drug reactions (ADRs) and 2) medication consumption, from medication-mentioning tweets, and 3) normalization of ADR expressions. Training data consisted of 15 717 annotated tweets for (1), 10 260 for (2), and 6650 ADR phrases and identifiers for (3); and exhibited typical properties of social-media-based health-related texts. Systems were evaluated using 9961, 7513, and 2500 instances for the 3 subtasks, respectively. We evaluated performances of classes of methods and ensembles of system combinations following the shared tasks. Results: Among 55 system runs, the best system scores for the 3 subtasks were 0.435 (ADR class F1-score) for subtask-1, 0.693 (micro-averaged F1-score over two classes) for subtask-2, and 88.5% (accuracy) for subtask-3. Ensembles of system combinations obtained best scores of 0.476, 0.702, and 88.7%, outperforming individual systems. Discussion: Among individual systems, support vector machines and convolutional neural networks showed high performance. Performance gains achieved by ensembles of system combinations suggest that such strategies may be suitable for operational systems relying on difficult text classification tasks (eg, subtask-1). Conclusions: Data imbalance and lack of context remain challenges for natural language processing of social media text. Annotated data from the shared task have been made available as reference standards for future studies (http://dx.doi.org/10.17632/rxwfb3tysd.1).
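
    The abstract above reports a single-class F1-score and a micro-averaged F1-score over two classes. The following is a minimal sketch of how such a score can be computed, not the shared task's evaluation script; the function name `micro_f1` and the integer class labels are hypothetical choices made for illustration.

```python
def micro_f1(gold, pred, classes):
    """Micro-averaged F1 restricted to `classes`.

    True positives, false positives, and false negatives are pooled
    over the listed classes before precision and recall are combined.
    With a single class this reduces to the ordinary class-level F1.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = fp = fn = 0
    for c in classes:
        for g, p in zip(gold, pred):
            if p == c and g == c:
                tp += 1  # predicted c, and c is correct
            elif p == c:
                fp += 1  # predicted c, but the gold label differs
            elif g == c:
                fn += 1  # gold label is c, but something else was predicted
    denom = 2 * tp + fp + fn
    # F1 = 2PR / (P + R) simplifies to 2TP / (2TP + FP + FN)
    return 2 * tp / denom if denom else 0.0


# Made-up labels (1, 2, 3), purely for illustration.
gold = [1, 2, 3, 1, 2, 3]
pred = [1, 2, 1, 3, 2, 3]
score_two_classes = micro_f1(gold, pred, classes=(1, 2))
score_one_class = micro_f1(gold, pred, classes=(1,))
```

    Pooling the counts before computing the score means frequent classes weigh more than rare ones, which matters for imbalanced data of the kind the abstract describes.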

    Challenges and opportunities for mining adverse drug reactions: perspectives from pharma, regulatory agencies, healthcare providers and consumers

    Monitoring drug safety is a central concern throughout the drug life cycle. Information about toxicity and adverse events is generated at every stage of this life cycle, and stakeholders have a strong interest in applying text mining and artificial intelligence (AI) methods to manage the ever-increasing volume of this information. Recognizing the importance of these applications and the role of challenge evaluations to drive progress in text mining, the organizers of BioCreative VII (Critical Assessment of Information Extraction in Biology) convened a panel of experts to explore ‘Challenges in Mining Drug Adverse Reactions’. This article is an outgrowth of the panel; each panelist has highlighted specific text mining application(s), based on their research and their experiences in organizing text mining challenge evaluations. While these highlighted applications only sample the complexity of this problem space, they reveal both opportunities and challenges for text mining to aid in the complex process of drug discovery, testing, marketing and post-market surveillance. Stakeholders are eager to embrace natural language processing and AI tools to help in this process, provided that these tools can be demonstrated to add value to stakeholder workflows. This creates an opportunity for the BioCreative community to work in partnership with regulatory agencies, pharma and the text mining community to identify next steps for future challenge evaluations. M.K.: This work was supported in part through the collaboration between the Spanish Plan for the Advancement of Language Technology (Plan TL) and the Barcelona Supercomputing Center; we also acknowledge the 2020 Proyectos de I+D+i - RTI Tipo A (PID2020-119266RA-I00) for support. Ö.U.: This study was supported in part by the National Library of Medicine under Award Number R15LM013209 and R13LM013127. Peer Reviewed. Postprint (published version)

    The Healthy States of America: Creating a Health Taxonomy with Social Media

    Since the uptake of social media, researchers have mined online discussions to track the outbreak and evolution of specific diseases or chronic conditions such as influenza or depression. To broaden the set of diseases under study, we developed a Deep Learning tool for Natural Language Processing that extracts mentions of virtually any medical condition or disease from unstructured social media text. With that tool at hand, we processed Reddit and Twitter posts, analyzed the clusters of the two resulting co-occurrence networks of conditions, and discovered that they correspond to well-defined categories of medical conditions. This resulted in the creation of the first comprehensive taxonomy of medical conditions automatically derived from online discussions. We validated the structure of our taxonomy against the official International Statistical Classification of Diseases and Related Health Problems (ICD-11), finding matches of our clusters with 20 official categories, out of 22. Based on the mentions of our taxonomy's sub-categories on Reddit posts geo-referenced in the U.S., we were then able to compute disease-specific health scores. As opposed to counts of disease mentions or counts with no knowledge of our taxonomy's structure, we found that our disease-specific health scores are causally linked with the officially reported prevalences of 18 conditions

    Detecting adherence to the recommended childhood vaccination schedule from user-generated content in a US parenting forum

    Vaccine hesitancy is considered one of the leading causes of the resurgence of vaccine preventable diseases. A non-negligible minority of parents does not fully adhere to the recommended vaccination schedule, leading their children to be partially immunized and at higher risk of contracting vaccine preventable diseases. Here, we leverage more than one million comments of 201,986 users posted from March 2008 to April 2019 on the public online forum BabyCenter US to learn more about such parents. For 32% with geographic location, we find the number of mapped users for each US state resembling the census population distribution with good agreement. We employ Natural Language Processing to identify 6884 and 10,131 users expressing their intention of following the recommended and alternative vaccination schedule, respectively RSUs and ASUs. From the analysis of their activity on the forum we find that ASUs have distinctly different interests and previous experiences with vaccination than RSUs. In particular, ASUs are more likely to follow groups focused on alternative medicine, are two times more likely to have experienced adverse events following immunization, and to mention more serious adverse reactions such as seizure or developmental regression. Content analysis of comments shows that the resources most frequently shared by both groups point to governmental domains (.gov). Finally, network analysis shows that RSUs and ASUs communicate with each other (indicating the absence of echo chambers), however with the latter group being more endogamic and favoring interactions with other ASUs. While our findings are limited to the specific platform analyzed, our approach may provide additional insights for the development of campaigns targeting parents on digital platforms. Postprint (published version)

    Public Discourse Against Masks in the COVID-19 Era: Infodemiology Study of Twitter Data

    Background: Despite scientific evidence supporting the importance of wearing masks to curtail the spread of COVID-19, wearing masks has stirred up a significant debate, particularly on social media. Objective: This study aimed to investigate the topics associated with the public discourse against wearing masks in the United States. We also studied the relationship between the anti-mask discourse on social media and the number of new COVID-19 cases. Methods: We collected a total of 51,170 English tweets between January 1, 2020, and October 27, 2020, by searching for hashtags against wearing masks. We used machine learning techniques to analyze the data collected. We investigated the relationship between the volume of tweets against mask-wearing and the daily volume of new COVID-19 cases using a Pearson correlation analysis between the two time series. Results: The results and analysis showed that social media could help identify important insights related to wearing masks. The results of topic mining identified 10 categories or themes of user concerns dominated by (1) constitutional rights and freedom of choice; (2) conspiracy theory, population control, and big pharma; and (3) fake news, fake numbers, and fake pandemic. Altogether, these three categories represent almost 65% of the volume of tweets against wearing masks. The relationship between the volume of tweets against wearing masks and newly reported COVID-19 cases depicted a strong correlation wherein the rise in the volume of negative tweets led the rise in the number of new cases by 9 days. Conclusions: These findings demonstrated the potential of mining social media for understanding the public discourse about public health issues such as wearing masks during the COVID-19 pandemic. The results emphasized the relationship between the discourse on social media and the potential impact on real events such as changing the course of the pandemic.
    Policy makers are advised to proactively address public perception and work on shaping this perception through raising awareness, debunking negative sentiments, and prioritizing early policy intervention toward the most prevalent topics.
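
    The abstract above describes a Pearson correlation between two daily time series in which one series leads the other by a number of days. The following is a minimal sketch of one way to search for such a lead, assuming two equal-length daily series; the function names `pearson` and `best_lag` and the sample numbers are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def best_lag(leading, lagging, max_lag):
    """Return (lag, r) with the highest r between leading[t] and lagging[t + lag].

    Assumes both series cover the same days, one value per day.
    """
    best = None
    for lag in range(max_lag + 1):
        x = leading[: len(lagging) - lag]  # drop the last `lag` days
        y = lagging[lag:]                  # drop the first `lag` days
        r = pearson(x, y)
        if best is None or r > best[1]:
            best = (lag, r)
    return best


# Synthetic data: `cases` is `tweets` delayed by 3 days (illustration only).
tweets = [1, 3, 2, 5, 4, 6, 8, 7, 9, 12, 10, 11]
cases = [0, 0, 0] + tweets[:-3]
lag, r = best_lag(tweets, cases, max_lag=5)
```

    A lagged correlation of this kind only shows that one series tends to move ahead of the other; on its own it does not establish that the first causes the second.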

    AI for social good: social media mining of migration discourse

    The number of international migrants has steadily increased over the years, and it has become one of the pressing issues in today’s globalized world. Our bibliometric review of around 400 articles on Scopus platform indicates an increased interest in migration-related research in recent times but the extant research is scattered at best. AI-based opinion mining research has predominantly noted negative sentiments across various social media platforms. Additionally, we note that prior studies have mostly considered social media data in the context of a particular event or a specific context. These studies offered a nuanced view of the societal opinions regarding that specific event, but this approach might miss the forest for the trees. Hence, this dissertation makes an attempt to go beyond simplistic opinion mining to identify various latent themes of migrant-related social media discourse. The first essay draws insights from the social psychology literature to investigate two facets of Twitter discourse, i.e., perceptions about migrants and behaviors toward migrants. We identified two prevailing perceptions (i.e., sympathy and antipathy) and two dominant behaviors (i.e., solidarity and animosity) of social media users toward migrants. Additionally, this essay has also fine-tuned the binary hate speech detection task, specifically in the context of migrants, by highlighting the granular differences between the perceptual and behavioral aspects of hate speech. The second essay investigates the journey of migrants or refugees from their home to the host country. We draw insights from Gennep's seminal book, i.e., Les Rites de Passage, to identify four phases of their journey: Arrival of Refugees, Temporal stay at Asylums, Rehabilitation, and Integration of Refugees into the host nation. We consider multimodal tweets for this essay. We find that our proposed theoretical framework was relevant for the 2022 Ukrainian refugee crisis – as a use-case. 
    Our third essay points out that a limited sample of annotated data does not provide insights regarding the prevailing societal-level opinions. Hence, this essay employs unsupervised approaches on large-scale societal datasets to explore the prevailing societal-level sentiments on the YouTube platform. Specifically, it probes whether negative comments about migrants get endorsed by other users. If yes, does it depend on who the migrants are, especially if they are cultural others? To address these questions, we consider two datasets: YouTube comments before the 2022 Ukrainian refugee crisis, and during the crisis. The second dataset confirms the Cultural Us hypothesis, and our findings are inconclusive for the first dataset. Our final or fourth essay probes social integration of migrants. The first part of this essay probed the unheard and faint voices of migrants to understand their struggle to settle down in the host economy. The second part of this chapter explored social media platforms as a viable alternative to expensive commercial job portals for vulnerable migrants. Finally, in our concluding chapter, we elucidated the potential of explainable AI, and briefly pointed out the inherent biases of transformer-based models in the context of migrant-related discourse. To sum up, the importance of migration was recognized as one of the essential topics in the United Nations’ Sustainable Development Goals (SDGs). Thus, this dissertation has attempted to make an incremental contribution to the AI for Social Good discourse.

    An Analysis of the Allergy Comments on Twitter Using Data Mining Approach

    Allergies are one of the most common chronic illnesses in the world. The prevalence of social media allows people to express their opinions and exchange information including symptoms of personal health. Mining those publicly accessible health-related data on social media, such as Twitter, offers a unique approach to get valuable healthcare insights. In this paper, a multi-component data mining framework was developed to collect Twitter data, detect time series patterns, discover topics of interest about allergies, and analyze the contents of tweets. From the extracted 2.2 million tweets in 2019, my experimental results show that allergy-related tweet volume is strongly correlated to the pollen data (r = .699, p < .01). Also, 152 unique topics are identified with a -28.36 perplexity score and a .67 coherence score. Furthermore, many linguistic dimensions such as the sentiment are analyzed to learn about the tweet contents. I consider this to be one of the many studies examining a large-scale social media stream to deeply analyze allergy activities. And with the growing social media, publicly available data such as Twitter posts can be used to support healthcare practitioners and social scientists in better understanding common public opinions, not just allergies. Master of Science