Multimodal Graph Learning for Modeling Emerging Pandemics with Big Data
Accurate forecasting and analysis of emerging pandemics play a crucial role
in effective public health management and decision-making. Traditional
approaches primarily rely on epidemiological data, overlooking other valuable
sources of information that could act as sensors or indicators of pandemic
patterns. In this paper, we propose a novel framework called MGL4MEP that
integrates temporal graph neural networks and multi-modal data for learning and
forecasting. We incorporate big data sources, including social media content,
by utilizing specific pre-trained language models and discovering the
underlying graph structure among users. This integration provides rich
indicators of pandemic dynamics through learning with temporal graph neural
networks. Extensive experiments demonstrate the effectiveness of our framework
in pandemic forecasting and analysis, outperforming baseline methods across
different areas, pandemic situations, and prediction horizons. The fusion of
temporal graph learning and multi-modal data enables a comprehensive
understanding of the pandemic landscape with less time lag, lower cost, and
more potential information indicators.
Cannabidiol tweet miner: a framework for identifying misinformation in CBD tweets.
As regulations surrounding cannabis continue to develop, the demand for cannabis-based products is on the rise. Despite not producing the psychoactive effects commonly associated with THC, products containing cannabidiol (CBD) have gained immense popularity in recent years as a potential treatment option for a range of conditions, particularly those associated with pain or sleep disorders. However, due to current federal policies, these products have yet to undergo comprehensive safety and efficacy testing. Fortunately, utilizing advanced natural language processing (NLP) techniques, data harvested from social networks have been employed to investigate various social trends within healthcare, such as disease tracking and drug surveillance. By leveraging Twitter data, NLP can offer invaluable insights into public perceptions around CBD, as well as the marketing tactics employed by those marketing such loosely-regulated substances to the general public. Given the lack of comprehensive clinical CBD testing, the various health claims made by CBD sellers regarding their products are highly dubious and potentially perilous, as is evident from the ongoing COVID-19 misinformation. It is therefore critically important to efficiently identify unsupportable claims to guide public health policy and action. To this end, we present our proposed framework, the Cannabidiol Tweet Miner (CBD-TM), which utilizes advanced natural language processing (NLP) techniques, including text mining and sentiment analysis, to analyze the similarities and differences between commercial and personal tweets that mention CBD. CBD-TM enables us to identify conditions typically associated with commercial CBD advertising, or conditions not associated with positive sentiment, that are also absent from personal conversations. Through our technical contributions, including NLP, text mining, and sentiment analysis, we can effectively uncover areas where the public may be misled by CBD sellers. 
Since the rise in popularity of CBD, advertisements making bold claims about its benefits have become increasingly prevalent. The COVID-19 pandemic created a new opportunity for sellers to promote and sell products that purportedly treat and/or prevent the virus, with CBD being one of them. Although the U.S. Food and Drug Administration issued multiple warnings to CBD sellers, this type of misinformation still persists. In response, we have extended the CBD-TM framework with an additional layer of tweet classification designed to identify tweets that make potentially misleading claims about CBD's efficacy in treating and/or preventing COVID-19. Our approach harnesses modern NLP algorithms, utilizing a transformer-based language model to establish the semantic relationship between statements extracted from the FDA's website that contain false information and tweets conveying similar false claims. Our technical contributions build upon the impressive performance of deep language models in various natural language processing and understanding tasks. Specifically, we employ transfer learning via pre-trained deep language models, enabling us to achieve improved misinformation identification in tweets, even with relatively small training sets. Furthermore, this extension of CBD-TM can be easily adapted to detect other forms of misinformation. Through our innovative use of NLP techniques and algorithms, we can more effectively identify and combat false and potentially harmful claims related to CBD and COVID-19, as well as other forms of misinformation. As the conversations surrounding CBD on Twitter evolve over time, concept drift can occur, leading to changes in the topics being discussed. We observed significant changes within the CBD Twitter data stream with the emergence of COVID-19, introducing a new medical condition associated with CBD that would not have been discussed in conversations prior to the pandemic. 
These shifts in conversation introduce concept drift into CBD-TM, which has the potential to negatively impact our tweet classification models. Therefore, it is crucial to identify when such concept drift occurs to maintain the accuracy of our models. To this end, we propose an innovative approach for identifying potential changes within social network streams, allowing us to determine how and when these conversations evolve over time. Our approach leverages a BERT-based topic model, which can effectively capture how conversations related to CBD change over time. By incorporating advanced NLP techniques and algorithms, we are able to better understand the changes in topic that occur within the CBD Twitter data stream, allowing us to more effectively manage concept drift in CBD-TM. Our technical contributions enable us to maintain the accuracy and effectiveness of our tweet classification models, ensuring that we can continue to identify and address potentially harmful misinformation related to CBD.
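The concept-drift idea described in this abstract can be illustrated with a minimal sketch. It is not the CBD-TM implementation: it assumes each tweet has already been assigned a topic id by some topic model (the abstract mentions a BERT-based one), and it swaps in Jensen-Shannon divergence between the topic distributions of two time windows as the drift signal. The function names, the toy topic ids, and the 0.1 threshold are all hypothetical.

```python
import math
from collections import Counter


def topic_distribution(topic_ids, num_topics):
    """Turn a list of per-tweet topic ids into a probability vector."""
    if not topic_ids:
        raise ValueError("window must contain at least one tweet")
    counts = Counter(topic_ids)
    total = len(topic_ids)
    return [counts.get(t, 0) / total for t in range(num_topics)]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), bounded between 0 and 1."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x == 0 contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def drift_detected(window_a, window_b, num_topics, threshold=0.1):
    """Flag drift when the two windows' topic mixes differ by more than
    the (assumed, untuned) threshold."""
    p = topic_distribution(window_a, num_topics)
    q = topic_distribution(window_b, num_topics)
    return js_divergence(p, q) > threshold


if __name__ == "__main__":
    # Toy data: topic 2 (say, a newly emerging condition) dominates later.
    earlier = [0, 0, 1, 1, 0, 1]
    later = [0, 1, 2, 2, 2, 2]
    print(drift_detected(earlier, later, num_topics=3))
```

In practice the threshold would need to be calibrated against the normal week-to-week variation of the stream, since any two finite samples differ slightly even when no drift has occurred.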
Toxicity in Evolving Twitter Topics - Employing a novel Dynamic Topic Evolution Model (DyTEM) on Twitter data
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. This thesis presents an extensive investigation into the evolution of topics and their association with
speech toxicity on Twitter, based on a large corpus of tweets, providing crucial insights for monitoring
online discourse and potentially informing interventions to combat toxic behavior in digital
communities. A Dynamic Topic Evolution Model (DyTEM) is introduced, constructed by combining
static Topic Modelling techniques and sentence embeddings through the state-of-the-art sentence
transformer, sBERT. The DyTEM, tested and validated on a substantial sample of tweets, is represented
as a directed graph, encapsulating the inherent dynamism of Twitter discussions. For validating the
consistency of DyTEM and providing guidance for hyperparameter selection, a novel, hashtag-based
validation method is proposed. The analysis identifies and scrutinizes five distinct Topic Transition
Types: Topic Stagnation, Topic Merge, Topic Split, Topic Disappearance, and Topic Emergence. A
speech toxicity classification model is employed to delve into the toxicity dynamics within topic
evolution. A standout finding of this study is the positive correlation between topic popularity and its
toxicity, implying that trending or viral topics tend to contain more inflammatory speech. This insight,
along with the methodologies introduced in this study, contributes significantly to the broader
understanding of digital discourse dynamics and could guide future strategies aimed at fostering
healthier and more constructive online spaces.
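The five Topic Transition Types named in this abstract can be illustrated on a directed graph between two consecutive time slices. The following is a minimal sketch, not the DyTEM method itself: it assumes a precomputed similarity matrix between old and new topics (for example, cosine similarity of topic embeddings) and a hypothetical threshold of 0.5 for drawing an edge. The function name and the labeling rules are illustrative assumptions.

```python
def classify_transitions(similarity, threshold=0.5):
    """Label topic transitions between two time slices.

    similarity[i][j] is the similarity of old topic i to new topic j.
    An edge i -> j exists when similarity[i][j] >= threshold (assumed rule).
    """
    n_old = len(similarity)
    n_new = len(similarity[0]) if similarity else 0
    edges = [
        (i, j)
        for i in range(n_old)
        for j in range(n_new)
        if similarity[i][j] >= threshold
    ]
    out_deg = [0] * n_old  # successors per old topic
    in_deg = [0] * n_new   # predecessors per new topic
    for i, j in edges:
        out_deg[i] += 1
        in_deg[j] += 1

    return {
        # One-to-one continuation of a topic.
        "stagnation": [(i, j) for i, j in edges
                       if out_deg[i] == 1 and in_deg[j] == 1],
        # New topics fed by several old topics.
        "merge": [j for j in range(n_new) if in_deg[j] > 1],
        # Old topics feeding several new topics.
        "split": [i for i in range(n_old) if out_deg[i] > 1],
        # Old topics with no successor.
        "disappearance": [i for i in range(n_old) if out_deg[i] == 0],
        # New topics with no predecessor.
        "emergence": [j for j in range(n_new) if in_deg[j] == 0],
    }


if __name__ == "__main__":
    # Toy matrix: rows are old topics 0-4, columns are new topics 0-4.
    sim = [
        [0.9, 0.1, 0.0, 0.0, 0.0],  # old 0 continues as new 0
        [0.1, 0.7, 0.0, 0.0, 0.0],  # old 1 and old 2 both map to new 1
        [0.0, 0.8, 0.1, 0.0, 0.0],
        [0.0, 0.0, 0.6, 0.6, 0.0],  # old 3 maps to new 2 and new 3
        [0.1, 0.1, 0.1, 0.1, 0.1],  # old 4 has no successor
    ]
    print(classify_transitions(sim))
```

The result depends strongly on the threshold, which is presumably one reason the thesis proposes a validation method to guide hyperparameter selection.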
VIBE: Topic-Driven Temporal Adaptation for Twitter Classification
Language features are evolving in real-world social media, resulting in the
deteriorating performance of text classification in dynamics. To address this
challenge, we study temporal adaptation, where models trained on past data are
tested in the future. Most prior work focused on continued pretraining or
knowledge updating, which may compromise their performance on noisy social
media data. To tackle this issue, we reflect feature change via modeling latent
topic evolution and propose a novel model, VIBE: Variational Information
Bottleneck for Evolutions. Concretely, we first employ two Information
Bottleneck (IB) regularizers to distinguish past and future topics. Then, the
distinguished topics work as adaptive features via multi-task training with
timestamp and class label prediction. In adaptive learning, VIBE utilizes
retrieved unlabeled data from online streams created posterior to training data
time. Substantial Twitter experiments on three classification tasks show that
our model, with only 3% of data, significantly outperforms previous
state-of-the-art continued-pretraining methods.
Comment: accepted by EMNLP 202
Evolution and Stylistic Characteristics of Ancient Chinese Stone Carving Decoration: LSTM-DL Approach with Image Visualization
In recent years, advancements in data analysis techniques and deep learning algorithms have revolutionized the field of art and cultural studies. Ancient Chinese stone carving decoration holds significant historical and cultural value, reflecting the artistic and stylistic evolution of different periods. This paper explores the evolution and stylistic characteristics of ancient Chinese stone carving decoration with a Weighted Long Short-Term Memory Deep Learning (WLSTM-DL) model, which combines image visualization techniques with a Long Short-Term Memory (LSTM) time-series deep learning architecture. The WLSTM-DL model uses grasshopper optimization for feature extraction and selection. By analyzing a comprehensive dataset of stone carving images from different periods, the WLSTM-DL model captures the temporal relationships and patterns in the evolution of stone carving decoration. The model utilizes LSTM, a specialized deep-learning architecture for time-series data, to uncover stylistic characteristics and identify significant changes over time. The findings of this study provide valuable insights into the evolution and stylistic development of ancient Chinese stone carving decoration. The application of image visualization techniques and the WLSTM-DL model showcases the potential of data analysis and deep learning in uncovering hidden narratives and understanding the intricate details of ancient artworks.