492 research outputs found

    Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts

    Objective: The abundance of text available in social media and health related forums along with the rich expression of public opinion have recently attracted the interest of the public health community to use these sources for pharmacovigilance. Based on the intuition that patients post about Adverse Drug Reactions (ADRs) expressing negative sentiments, we investigate the effect of sentiment analysis features in locating ADR mentions. Methods: We enrich the feature space of a state-of-the-art ADR identification method with sentiment analysis features. Using a corpus of posts from the DailyStrength forum and tweets annotated for ADR and indication mentions, we evaluate the extent to which sentiment analysis features help in locating ADR mentions and distinguishing them from indication mentions. Results: Evaluation results show that sentiment analysis features marginally improve ADR identification in tweets and health related forum posts. Adding sentiment analysis features achieved a statistically significant F-measure increase from 72.14% to 73.22% in the Twitter part of an existing corpus using its original train/test split. Using stratified 10 × 10-fold cross-validation, statistically significant F-measure increases were shown in the DailyStrength part of the corpus, from 79.57% to 80.14%, and in the Twitter part of the corpus, from 66.91% to 69.16%. Moreover, sentiment analysis features are shown to reduce the number of ADRs being recognized as indications. Conclusion: This study shows that adding sentiment analysis features can marginally improve the performance of even a state-of-the-art ADR identification method. This improvement can be of use to pharmacovigilance practice, due to the rapidly increasing popularity of social media and health forums.
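The evaluation protocol named in the abstract above, stratified repeated k-fold cross-validation, can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function name `stratified_repeated_kfold`, the toy labels, and the round-robin dealing strategy are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_repeated_kfold(labels, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeats x k-fold CV.

    Hypothetical helper: each class is shuffled and dealt round-robin
    into k folds, so every fold keeps roughly the overall class ratio.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for r in range(repeats):
        folds = [[] for _ in range(k)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            for pos, idx in enumerate(shuffled):
                folds[pos % k].append(idx)
        for f in range(k):
            test_idx = sorted(folds[f])
            train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
            yield r, f, train_idx, test_idx


# Toy labels (made-up data): 1 = post mentions an ADR, 0 = it does not.
labels = [1] * 30 + [0] * 70
splits = list(stratified_repeated_kfold(labels))
print(len(splits), "train/test splits")
```

Averaging a metric such as F-measure over all the splits gives a more stable estimate than a single train/test split, which is presumably why the study reports both.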

    Knowledge Modelling and Learning through Cognitive Networks

    One of the most promising developments in modelling knowledge is cognitive network science, which aims to investigate cognitive phenomena driven by the networked, associative organization of knowledge. For example, investigating the structure of semantic memory via semantic networks has illuminated how memory recall patterns influence phenomena such as creativity, memory search, learning, and more generally, knowledge acquisition, exploration, and exploitation. In parallel, neural network models for artificial intelligence (AI) are also becoming more widespread as inferential models for understanding which features drive language-related phenomena such as meaning reconstruction, stance detection, and emotional profiling. Whereas cognitive networks map explicitly which entities engage in associative relationships, neural networks perform an implicit mapping of correlations in cognitive data as weights, obtained after training over labelled data and whose interpretation is not immediately evident to the experimenter. This book aims to bring together quantitative, innovative research that focuses on modelling knowledge through cognitive and neural networks to gain insight into mechanisms driving cognitive processes related to knowledge structuring, exploration, and learning. The book comprises a variety of publication types, including reviews and theoretical papers, empirical research, computational modelling, and big data analysis. All papers here share a commonality: they demonstrate how the application of network science and AI can extend and broaden cognitive science in ways that traditional approaches cannot

    Cannabidiol tweet miner: a framework for identifying misinformation in CBD tweets.

    As regulations surrounding cannabis continue to develop, the demand for cannabis-based products is on the rise. Despite not producing the psychoactive effects commonly associated with THC, products containing cannabidiol (CBD) have gained immense popularity in recent years as a potential treatment option for a range of conditions, particularly those associated with pain or sleep disorders. However, due to current federal policies, these products have yet to undergo comprehensive safety and efficacy testing. Fortunately, utilizing advanced natural language processing (NLP) techniques, data harvested from social networks have been employed to investigate various social trends within healthcare, such as disease tracking and drug surveillance. By leveraging Twitter data, NLP can offer invaluable insights into public perceptions around CBD, as well as the marketing tactics employed by those marketing such loosely-regulated substances to the general public. Given the lack of comprehensive clinical CBD testing, the various health claims made by CBD sellers regarding their products are highly dubious and potentially perilous, as is evident from the ongoing COVID-19 misinformation. It is therefore critically important to efficiently identify unsupportable claims to guide public health policy and action. To this end, we present our proposed framework, the Cannabidiol Tweet Miner (CBD-TM), which utilizes advanced natural language processing (NLP) techniques, including text mining and sentiment analysis, to analyze the similarities and differences between commercial and personal tweets that mention CBD. CBD-TM enables us to identify conditions typically associated with commercial CBD advertising, or conditions not associated with positive sentiment, that are also absent from personal conversations. Through our technical contributions, including NLP, text mining, and sentiment analysis, we can effectively uncover areas where the public may be misled by CBD sellers. 
Since the rise in popularity of CBD, advertisements making bold claims about its benefits have become increasingly prevalent. The COVID-19 pandemic created a new opportunity for sellers to promote and sell products that purportedly treat and/or prevent the virus, with CBD being one of them. Although the U.S. Food and Drug Administration issued multiple warnings to CBD sellers, this type of misinformation still persists. In response, we have extended the CBD-TM framework with an additional layer of tweet classification designed to identify tweets that make potentially misleading claims about CBD's efficacy in treating and/or preventing COVID-19. Our approach harnesses modern NLP algorithms, utilizing a transformer-based language model to establish the semantic relationship between statements extracted from the FDA's website that contain false information and tweets conveying similar false claims. Our technical contributions build upon the impressive performance of deep language models in various natural language processing and understanding tasks. Specifically, we employ transfer learning via pre-trained deep language models, enabling us to achieve improved misinformation identification in tweets, even with relatively small training sets. Furthermore, this extension of CBD-TM can be easily adapted to detect other forms of misinformation. Through our innovative use of NLP techniques and algorithms, we can more effectively identify and combat false and potentially harmful claims related to CBD and COVID-19, as well as other forms of misinformation. As the conversations surrounding CBD on Twitter evolve over time, concept drift can occur, leading to changes in the topics being discussed. We observed significant changes within the CBD Twitter data stream with the emergence of COVID-19, introducing a new medical condition associated with CBD that would not have been discussed in conversations prior to the pandemic. 
These shifts in conversation introduce concept drift into CBD-TM, which has the potential to negatively impact our tweet classification models. Therefore, it is crucial to identify when such concept drift occurs to maintain the accuracy of our models. To this end, we propose an innovative approach for identifying potential changes within social network streams, allowing us to determine how and when these conversations evolve over time. Our approach leverages a BERT-based topic model, which can effectively capture how conversations related to CBD change over time. By incorporating advanced NLP techniques and algorithms, we are able to better understand the changes in topic that occur within the CBD Twitter data stream, allowing us to more effectively manage concept drift in CBD-TM. Our technical contributions enable us to maintain the accuracy and effectiveness of our tweet classification models, ensuring that we can continue to identify and address potentially harmful misinformation related to CBD
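
One simple way to picture the drift check described in the abstract above is to compare the topic distribution of tweets in two time windows. The sketch below is an assumption-laden illustration, not the CBD-TM method: it presumes topic labels have already been assigned by some topic model, uses Jensen-Shannon divergence as the comparison measure (a swapped-in technique the abstract does not name), and uses made-up counts and a hypothetical `DRIFT_THRESHOLD`.

```python
import math
from collections import Counter


def topic_distribution(topic_ids, topics):
    """Relative frequency of each topic in one time window."""
    counts = Counter(topic_ids)
    total = len(topic_ids)
    return [counts[t] / total for t in topics]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical distributions)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Made-up topic assignments for tweets before and after a new topic emerges.
topics = ["pain", "sleep", "anxiety", "covid"]
before = ["pain"] * 50 + ["sleep"] * 30 + ["anxiety"] * 20
after = ["pain"] * 30 + ["sleep"] * 20 + ["anxiety"] * 10 + ["covid"] * 40

drift = js_divergence(
    topic_distribution(before, topics),
    topic_distribution(after, topics),
)
DRIFT_THRESHOLD = 0.1  # hypothetical cut-off, would need tuning on real data
drift_detected = drift > DRIFT_THRESHOLD
print(f"JS divergence = {drift:.3f}, drift detected: {drift_detected}")
```

A flagged window would then be a cue to inspect the new topics and, if needed, retrain the downstream tweet classifiers.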

    A Survey on Semantic Processing Techniques

    Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyzed five semantic processing tasks, e.g., word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions. Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missed in the published version due to the publication policies. Please contact Prof. Erik Cambria for details.

    Harvesting and summarizing user-generated content for advanced speech-based human-computer interaction

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 155-164). There have been many assistant applications on mobile devices, which could help people obtain rich Web content such as user-generated data (e.g., reviews, posts, blogs, and tweets). However, online communities and social networks are expanding rapidly and it is impossible for people to browse and digest all the information via simple search interface. To help users obtain information more efficiently, both the interface for data access and the information representation need to be improved. An intuitive and personalized interface, such as a dialogue system, could be an ideal assistant, which engages a user in a continuous dialogue to garner the user's interest and capture the user's intent, and assists the user via speech-navigated interactions. In addition, there is a great need for a type of application that can harvest data from the Web, summarize the information in a concise manner, and present it in an aggregated yet natural way such as direct human dialogue. This thesis, therefore, aims to conduct research on a universal framework for developing speech-based interface that can aggregate user-generated Web content and present the summarized information via speech-based human-computer interaction. To accomplish this goal, several challenges must be met. Firstly, how to interpret users' intention from their spoken input correctly? Secondly, how to interpret the semantics and sentiment of user-generated data and aggregate them into structured yet concise summaries? Lastly, how to develop a dialogue modeling mechanism to handle discourse and present the highlighted information via natural language? This thesis explores plausible approaches to tackle these challenges. 
We will explore a lexicon modeling approach for semantic tagging to improve spoken language understanding and query interpretation. We will investigate a parse-and-paraphrase paradigm and a sentiment scoring mechanism for information extraction from unstructured user-generated data. We will also explore sentiment-involved dialogue modeling and corpus-based language generation approaches for dialogue and discourse. Multilingual prototype systems in multiple domains have been implemented for demonstration. by Jingjing Liu. Ph.D.

    A review on Natural Language Processing Models for COVID-19 research

    This survey paper reviews Natural Language Processing Models and their use in COVID-19 research in two main areas. Firstly, a range of transformer-based biomedical pretrained language models are evaluated using the BLURB benchmark. Secondly, models used in sentiment analysis surrounding COVID-19 vaccination are evaluated. We filtered literature curated from various repositories such as PubMed and Scopus and reviewed 27 papers. When evaluated using the BLURB benchmark, the novel T-BPLM BioLinkBERT gives groundbreaking results by incorporating document link knowledge and hyperlinking into its pretraining. Sentiment analysis of COVID-19 vaccination through various Twitter API tools has shown the public’s sentiment towards vaccination to be mostly positive. Finally, we outline some limitations and potential solutions to drive the research community to improve the models used for NLP tasks

    Extracting health information from social media

    Social media platforms with large user bases such as Twitter, Reddit, and online health forums contain a rich amount of health-related information. Despite the advances achieved in natural language processing (NLP), extracting actionable health information from social media still remains challenging. This thesis proposes a set of methodologies that can be used to extract medical concepts and health information from social media that is related to drugs, symptoms, and side-effects. We first develop a rule-based relationship extraction system that utilises a set of dictionaries and linguistic rules in order to extract structured information from patients’ posts on online health forums. We then automate the concept extraction process via: i) a supervised algorithm that has been trained with a small labelled dataset, and ii) an iterative semi-supervised algorithm capable of learning new sentences and concepts. We test our machine-learning pipeline on a COVID-19 case study that involves patient authored social media posts. We develop a novel triage and diagnostic approach to extract symptoms, severity, and prevalence of the disease rather than to provide any actionable decisions at the individual level. Finally, we extend our approach by investigating the potential benefit of incorporating dictionary information into a neural network architecture for natural language processing.
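
To make the idea of dictionary-plus-rule relationship extraction concrete, here is a minimal sketch. It is not the thesis system: the dictionaries `DRUGS` and `SIDE_EFFECTS`, the cue phrases in `CAUSAL_CUES`, and the example post are all hypothetical, and a real system would need far richer vocabularies and linguistic rules.

```python
import re

# Hypothetical dictionaries and cue phrases, for illustration only.
DRUGS = {"ibuprofen", "sertraline"}
SIDE_EFFECTS = {"nausea", "headache", "insomnia"}
CAUSAL_CUES = ("gave me", "caused", "causes")


def extract_relations(post):
    """Return (drug, side_effect) pairs from sentences containing a causal cue."""
    relations = []
    for sentence in re.split(r"[.!?]+", post.lower()):
        # Rule: only link a drug to an effect when a causal cue is present.
        if not any(cue in sentence for cue in CAUSAL_CUES):
            continue
        tokens = re.findall(r"[a-z]+", sentence)
        drugs = [t for t in tokens if t in DRUGS]
        effects = [t for t in tokens if t in SIDE_EFFECTS]
        relations.extend((d, e) for d in drugs for e in effects)
    return relations


post = "Sertraline gave me insomnia and nausea. I took ibuprofen for a headache."
relations = extract_relations(post)
print(relations)
```

The second sentence is skipped because it has no causal cue, which keeps a condition being treated (headache) from being recorded as a side effect of ibuprofen.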

    Deep learning approach to sentiment analysis in health and well-being

    Sentiment analysis, also known as opinion mining, is an area of natural language processing which focuses on the classification of the sentiment that is expressed in a written document. Sentiment analysis has found applications in various domains including finance, politics, and health. This thesis is focused on sentiment analysis in the domain of health and well-being. An extensive systematic literature review was carried out to establish the state of the art in sentiment analysis in this domain. This systematic review provides evidence that the state-of-the-art results in sentiment analysis in the domain of health and well-being lag behind those in other domains. Additionally, it revealed that deep learning has not been used to classify the sentiment within the aforementioned domain. Furthermore, we performed a study and showed that the language that is used within the health and well-being domain is biased towards the negative sentiment. Aspect-based sentiment analysis refines the focus of sentiment analysis by classifying the sentiment associated with a specific aspect. Subsequently, we focus specifically on aspect-based sentiment analysis. To support it within the domain of health and well-being we created a dataset consisting of drug reviews, where the aspects were automatically annotated by matching concepts from the Unified Medical Language System. We have successfully shown that graph convolution can effectively utilise the context, represented with syntactic dependencies, to determine the intended sentiment of inherently negative aspects and consequently close the performance gap regardless of the domain. The advent of transformer-based architectures initiated a breakthrough in various tasks in natural language processing, including sentiment analysis. Therefore, we presented an approach to fine-tuning a transformer-based language model for the specific task of aspect-based sentiment analysis. 
The findings provide evidence that transformer-based models account for syntactic dependencies when classifying the sentiment of the given aspect.