    VaxInsight: an artificial intelligence system to access large-scale public perceptions of vaccination from social media

    Vaccination is considered one of the greatest public health achievements of the 20th century. A high vaccination rate is required to reduce the prevalence and incidence of vaccine-preventable diseases. However, in the last two decades, there has been a significant and increasing number of people who refuse or delay getting vaccinated and who prohibit their children from receiving vaccinations. Importantly, under-vaccination is associated with infectious disease outbreaks. A good understanding of public perceptions regarding vaccinations is important if we are to develop effective vaccination promotion strategies. Traditional methods of research, such as surveys, suffer limitations that impede our understanding of public perceptions, including resource costs and delays in data collection and analysis, especially with large samples. The popularity of social media (e.g. Twitter), combined with advances in artificial intelligence algorithms (e.g. natural language processing, deep learning), opens up new avenues for accessing large-scale data on public perceptions related to vaccinations. This dissertation reports on an original and systematic effort to develop artificial intelligence algorithms that will increase our ability to use Twitter discussions to understand vaccine-related perceptions and intentions. The research is framed within the perspectives offered by grounded behavior change theories. Tweets concerning the human papillomavirus (HPV) vaccine were used to accomplish three major aims: 1) Develop a deep learning-based system to better understand public perceptions of the HPV vaccine, using Twitter data and behavior change theories; 2) Develop a deep learning-based system to infer Twitter users’ demographic characteristics (e.g. gender and home location) and investigate demographic differences in public perceptions of the HPV vaccine; 3) Develop a web-based interactive visualization system to monitor real-time Twitter discussions of the HPV vaccine.
For Aim 1, the bi-directional long short-term memory (LSTM) network with attention mechanism outperformed traditional machine learning and competitive deep learning algorithms in mapping Twitter discussions to the theoretical constructs of behavior change theories. Domain-specific embeddings trained on an HPV vaccine-related Twitter corpus with the fastText algorithm further improved performance on some tasks. Time series analyses revealed evolving trends in public perceptions regarding the HPV vaccine. For Aim 2, the character-based convolutional neural network model achieved favorable state-of-the-art performance in Twitter gender inference on a Public Author Profiling challenge. The trained models were then applied to the Twitter corpus and identified gender differences in public perceptions of the HPV vaccine. The findings on gender differences were largely consistent with previous survey-based studies. For the Twitter users’ home location inference, geo-tagging was framed as a text classification task, which led to a character-based recurrent neural network model. The model outperformed machine learning and deep learning baselines on home location tagging. Interstate variations in public perceptions of the HPV vaccine were also identified. For Aim 3, a prototype web-based interactive dashboard, VaxInsight, was built to synthesize HPV vaccine-related Twitter discussions in a comprehensible format. The usability test of VaxInsight showed high usability of the system. Notably, this may be the first study to use deep learning algorithms to understand Twitter discussions of the HPV vaccine within the perspective of grounded behavior change theories. VaxInsight is also the first system that allows users to explore public health beliefs about vaccine-related topics on Twitter. Thus, the present research makes original and systematic contributions to medical informatics by combining cutting-edge artificial intelligence algorithms and grounded behavior change theories. This work also builds a foundation for the next generation of real-time public health surveillance and research.

    Trivalent Influenza Vaccine Adverse Event Analysis Based On MedDRA System Organ Classes Using VAERS Data

    We studied serious reports following influenza vaccination from the VAERS database for the year 2011. Our statistical analyses revealed differences in reactions among age groups and between genders. The results may lead to additional studies to uncover factors contributing to individual differences in susceptibility to influenza infection.

    Large language models in biomedical natural language processing: benchmarks, baselines, and recommendations

    Biomedical literature is growing rapidly, making it challenging to curate and extract knowledge manually. Biomedical natural language processing (BioNLP) techniques that can automatically extract information from biomedical literature help alleviate this burden. Recently, large language models (LLMs), such as GPT-3 and GPT-4, have gained significant attention for their impressive performance. However, their effectiveness in BioNLP tasks and their impact on method development and downstream users remain understudied. This pilot study (1) establishes the baseline performance of GPT-3 and GPT-4 in both zero-shot and one-shot settings on eight BioNLP datasets across four applications: named entity recognition, relation extraction, multi-label document classification, and semantic similarity and reasoning; (2) examines the errors produced by the LLMs and categorizes them into three types: missingness, inconsistencies, and unwanted artificial content; and (3) provides suggestions for using LLMs in BioNLP applications. We make the datasets, baselines, and results publicly available to the community via https://github.com/qingyu-qc/gpt_bionlp_benchmark

    Deep learning in clinical natural language processing: a methodical review.

    OBJECTIVE: This article methodically reviews the literature on deep learning (DL) for natural language processing (NLP) in the clinical domain, providing quantitative analysis to answer 3 research questions concerning methods, scope, and context of current research. MATERIALS AND METHODS: We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery Digital Library, and the Association for Computational Linguistics Anthology for articles using DL-based approaches to NLP problems in electronic health records. After screening 1,737 articles, we collected data on 25 variables across 212 papers. RESULTS: DL in clinical NLP publications more than doubled each year, through 2018. Recurrent neural networks (60.8%) and word2vec embeddings (74.1%) were the most popular methods; the information extraction tasks of text classification, named entity recognition, and relation extraction were dominant (89.2%). However, there was a long tail of other methods and specific tasks. Most contributions were methodological variants or applications, but 20.8% were new methods of some kind. The earliest adopters were in the NLP community, but the medical informatics community was the most prolific. DISCUSSION: Our analysis shows growing acceptance of deep learning as a baseline for NLP research, and of DL-based NLP in the medical community. A number of common associations were substantiated (e.g., the preference for recurrent neural networks in sequence-labeling named entity recognition), while others were surprisingly nuanced (e.g., the scarcity of French-language clinical NLP with deep learning). CONCLUSION: Deep learning has not yet fully penetrated clinical NLP and is growing rapidly. This review highlighted both the popular and unique trends in this active field.