8,889 research outputs found

    Multimodal Graph Learning for Modeling Emerging Pandemics with Big Data

    Accurate forecasting and analysis of emerging pandemics play a crucial role in effective public health management and decision-making. Traditional approaches primarily rely on epidemiological data, overlooking other valuable sources of information that could act as sensors or indicators of pandemic patterns. In this paper, we propose a novel framework called MGL4MEP that integrates temporal graph neural networks and multi-modal data for learning and forecasting. We incorporate big data sources, including social media content, by utilizing specific pre-trained language models and discovering the underlying graph structure among users. This integration provides rich indicators of pandemic dynamics through learning with temporal graph neural networks. Extensive experiments demonstrate the effectiveness of our framework in pandemic forecasting and analysis, outperforming baseline methods across different areas, pandemic situations, and prediction horizons. The fusion of temporal graph learning and multi-modal data enables a comprehensive understanding of the pandemic landscape with less time lag, lower cost, and more potential information indicators.

    Cannabidiol tweet miner: a framework for identifying misinformation in CBD tweets

    As regulations surrounding cannabis continue to develop, the demand for cannabis-based products is on the rise. Despite not producing the psychoactive effects commonly associated with THC, products containing cannabidiol (CBD) have gained immense popularity in recent years as a potential treatment option for a range of conditions, particularly those associated with pain or sleep disorders. However, due to current federal policies, these products have yet to undergo comprehensive safety and efficacy testing. Fortunately, utilizing advanced natural language processing (NLP) techniques, data harvested from social networks have been employed to investigate various social trends within healthcare, such as disease tracking and drug surveillance. By leveraging Twitter data, NLP can offer invaluable insights into public perceptions around CBD, as well as the marketing tactics employed by those marketing such loosely-regulated substances to the general public. Given the lack of comprehensive clinical CBD testing, the various health claims made by CBD sellers regarding their products are highly dubious and potentially perilous, as is evident from the ongoing COVID-19 misinformation. It is therefore critically important to efficiently identify unsupportable claims to guide public health policy and action. To this end, we present our proposed framework, the Cannabidiol Tweet Miner (CBD-TM), which utilizes advanced natural language processing (NLP) techniques, including text mining and sentiment analysis, to analyze the similarities and differences between commercial and personal tweets that mention CBD. CBD-TM enables us to identify conditions typically associated with commercial CBD advertising, or conditions not associated with positive sentiment, that are also absent from personal conversations. Through our technical contributions, including NLP, text mining, and sentiment analysis, we can effectively uncover areas where the public may be misled by CBD sellers. 
Since the rise in popularity of CBD, advertisements making bold claims about its benefits have become increasingly prevalent. The COVID-19 pandemic created a new opportunity for sellers to promote and sell products that purportedly treat and/or prevent the virus, with CBD being one of them. Although the U.S. Food and Drug Administration issued multiple warnings to CBD sellers, this type of misinformation still persists. In response, we have extended the CBD-TM framework with an additional layer of tweet classification designed to identify tweets that make potentially misleading claims about CBD's efficacy in treating and/or preventing COVID-19. Our approach harnesses modern NLP algorithms, utilizing a transformer-based language model to establish the semantic relationship between statements extracted from the FDA's website that contain false information and tweets conveying similar false claims. Our technical contributions build upon the impressive performance of deep language models in various natural language processing and understanding tasks. Specifically, we employ transfer learning via pre-trained deep language models, enabling us to achieve improved misinformation identification in tweets, even with relatively small training sets. Furthermore, this extension of CBD-TM can be easily adapted to detect other forms of misinformation. Through our innovative use of NLP techniques and algorithms, we can more effectively identify and combat false and potentially harmful claims related to CBD and COVID-19, as well as other forms of misinformation. As the conversations surrounding CBD on Twitter evolve over time, concept drift can occur, leading to changes in the topics being discussed. We observed significant changes within the CBD Twitter data stream with the emergence of COVID-19, introducing a new medical condition associated with CBD that would not have been discussed in conversations prior to the pandemic.
These shifts in conversation introduce concept drift into CBD-TM, which has the potential to negatively impact our tweet classification models. Therefore, it is crucial to identify when such concept drift occurs to maintain the accuracy of our models. To this end, we propose an innovative approach for identifying potential changes within social network streams, allowing us to determine how and when these conversations evolve over time. Our approach leverages a BERT-based topic model, which can effectively capture how conversations related to CBD change over time. By incorporating advanced NLP techniques and algorithms, we are able to better understand the changes in topic that occur within the CBD Twitter data stream, allowing us to more effectively manage concept drift in CBD-TM. Our technical contributions enable us to maintain the accuracy and effectiveness of our tweet classification models, ensuring that we can continue to identify and address potentially harmful misinformation related to CBD.

    Toxicity in Evolving Twitter Topics - Employing a novel Dynamic Topic Evolution Model (DyTEM) on Twitter data

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. This thesis presents an extensive investigation into the evolution of topics and their association with speech toxicity on Twitter, based on a large corpus of tweets, providing crucial insights for monitoring online discourse and potentially informing interventions to combat toxic behavior in digital communities. A Dynamic Topic Evolution Model (DyTEM) is introduced, constructed by combining static Topic Modelling techniques and sentence embeddings through the state-of-the-art sentence transformer, sBERT. The DyTEM, tested and validated on a substantial sample of tweets, is represented as a directed graph, encapsulating the inherent dynamism of Twitter discussions. For validating the consistency of DyTEM and providing guidance for hyperparameter selection, a novel, hashtag-based validation method is proposed. The analysis identifies and scrutinizes five distinct Topic Transition Types: Topic Stagnation, Topic Merge, Topic Split, Topic Disappearance, and Topic Emergence. A speech toxicity classification model is employed to delve into the toxicity dynamics within topic evolution. A standout finding of this study is the positive correlation between topic popularity and its toxicity, implying that trending or viral topics tend to contain more inflammatory speech. This insight, along with the methodologies introduced in this study, contributes significantly to the broader understanding of digital discourse dynamics and could guide future strategies aimed at fostering healthier and more constructive online spaces.

    VIBE: Topic-Driven Temporal Adaptation for Twitter Classification

    Language features evolve in real-world social media, so text classification performance deteriorates over time. To address this challenge, we study temporal adaptation, where models trained on past data are tested in the future. Most prior work focused on continued pretraining or knowledge updating, which may compromise their performance on noisy social media data. To tackle this issue, we reflect feature change via modeling latent topic evolution and propose a novel model, VIBE: Variational Information Bottleneck for Evolutions. Concretely, we first employ two Information Bottleneck (IB) regularizers to distinguish past and future topics. Then, the distinguished topics work as adaptive features via multi-task training with timestamp and class label prediction. In adaptive learning, VIBE utilizes unlabeled data retrieved from online streams created after the training data. Substantial Twitter experiments on three classification tasks show that our model, with only 3% of data, significantly outperforms previous state-of-the-art continued-pretraining methods. Comment: accepted by EMNLP 202

    Evolution and Stylistic Characteristics of Ancient Chinese Stone Carving Decoration LSTM-DL Approach with Image Visualization

    In recent years, advancements in data analysis techniques and deep learning algorithms have revolutionized the field of art and cultural studies. Ancient Chinese stone carving decoration holds significant historical and cultural value, reflecting the artistic and stylistic evolution of different periods. This paper explores the evolution and stylistic characteristics of ancient Chinese stone carving decoration using a Weighted Long Short-Term Memory Deep Learning (WLSTM-DL) model, which combines image visualization techniques with a Long Short-Term Memory (LSTM) time-series deep learning architecture. The WLSTM-DL model uses grasshopper optimization for feature extraction and selection. By analyzing a comprehensive dataset of stone carving images from different periods, the WLSTM-DL model captures the temporal relationships and patterns in the evolution of stone carving decoration. The model utilizes LSTM, a specialized deep-learning architecture for time-series data, to uncover stylistic characteristics and identify significant changes over time. The findings of this study provide valuable insights into the evolution and stylistic development of ancient Chinese stone carving decoration. The application of image visualization techniques and the WLSTM-DL model showcases the potential of data analysis and deep learning in uncovering hidden narratives and understanding the intricate details of ancient artworks.