
    BERT-based Financial Sentiment Index and LSTM-based Stock Return Predictability

    Traditional sentiment construction in finance relies heavily on dictionary-based approaches, with a few exceptions using simple machine learning techniques such as the Naive Bayes classifier. While the existing literature has not yet taken advantage of rapid advances in natural language processing, in this research we construct a text-based sentiment index using BERT, a model recently developed by Google, for three actively traded individual stocks in the Hong Kong market that are widely discussed on Weibo.com. On the one hand, we demonstrate a significant improvement from applying BERT to sentiment analysis compared with existing models. On the other hand, by combining it with two other methods commonly used to build sentiment indices in the financial literature, i.e., option-implied and market-implied approaches, we propose a more general and comprehensive framework for financial sentiment analysis, and further provide convincing evidence for the predictability of individual stock returns for the above three stocks using an LSTM (which offers a nonlinear mapping), in contrast to the econometric methods that dominate sentiment-influence analysis, all of which are linear regressions. Comment: 10 pages, 1 figure, 5 tables, submitted to NeurIPS 2019, under review
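    The framework described above combines three sentiment sources into one index. A minimal sketch of that combination step, with purely illustrative weights and scores (the paper's actual weighting scheme is not specified in the abstract):

    ```python
    # Hypothetical sketch: merging a textual (BERT-based), an option-implied,
    # and a market-implied sentiment score into one composite index.
    # All names, weights, and values are illustrative, not from the paper.

    def composite_sentiment(text_s, option_s, market_s, weights=(0.5, 0.25, 0.25)):
        """Weighted average of three sentiment scores, each assumed in [-1, 1]."""
        w_t, w_o, w_m = weights
        assert abs(w_t + w_o + w_m - 1.0) < 1e-9, "weights must sum to 1"
        return w_t * text_s + w_o * option_s + w_m * market_s

    # Example: bullish text sentiment, mildly bearish options, neutral market.
    index = composite_sentiment(0.8, -0.2, 0.0)
    print(index)  # 0.35
    ```

    In the paper's setting, a time series of such composite scores would then serve as an input feature to the LSTM return-prediction model.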

    Implementing BERT and fine-tuned RobertA to detect AI generated news by ChatGPT

    The abundance of information on social media has increased the need for accurate real-time rumour detection. Manually identifying and verifying fake news generated by AI tools is impracticable and time-consuming given the enormous volume of information produced every day. This has sparked growing interest in creating automated systems to find fake news on the Internet. The experiments in this research demonstrate that the BERT and RoBERTa models with fine-tuning were the most successful at detecting AI-generated news. Fine-tuned RoBERTa in particular showed excellent precision, with a score of 98%. In conclusion, this study has shown that neural networks can be used to identify fake news created by ChatGPT. The excellent performance of the RoBERTa and BERT models indicates that they can play a critical role in the fight against misinformation.
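    The 98% figure above is a precision score: of all posts the model flags as AI-generated, the share that truly are. A small self-contained sketch of that metric, with illustrative labels rather than the paper's data:

    ```python
    # Precision for a binary AI-vs-human news classifier.
    # The labels below are made up for illustration only.

    def precision(y_true, y_pred, positive="ai"):
        """Fraction of positive predictions that are correct."""
        tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
        return tp / (tp + fp) if (tp + fp) else 0.0

    y_true = ["ai", "ai", "human", "ai", "human"]
    y_pred = ["ai", "ai", "ai", "ai", "human"]
    print(precision(y_true, y_pred))  # 3 of 4 flagged posts are truly AI: 0.75
    ```

    High precision matters here because a false positive means wrongly accusing a human-written article of being machine-generated.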

    Inconsistent Matters: A Knowledge-guided Dual-consistency Network for Multi-modal Rumor Detection

    Rumor spreaders are increasingly utilizing multimedia content to attract the attention and trust of news consumers. Though quite a few rumor detection models have exploited multi-modal data, they seldom consider the inconsistent semantics between images and texts, and rarely spot the inconsistency between post contents and background knowledge. In addition, they commonly assume the completeness of multiple modalities and are thus incapable of handling missing modalities in real-life scenarios. Motivated by the intuition that rumors in social media are more likely to have inconsistent semantics, a novel Knowledge-guided Dual-consistency Network is proposed to detect rumors with multimedia content. It uses two consistency detection subnetworks to capture inconsistency at the cross-modal level and the content-knowledge level simultaneously. It also enables robust multi-modal representation learning under different missing-visual-modality conditions, using a special token to discriminate between posts with and without the visual modality. Extensive experiments on three public real-world multimedia datasets demonstrate that our framework can outperform the state-of-the-art baselines under both complete and incomplete modality conditions. Our codes are available at https://github.com/MengzSun/KDCN
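    The special-token idea above can be sketched very simply: posts without an image receive a dedicated placeholder embedding instead of being dropped. The vectors, dimensionality, and field names here are illustrative stand-ins; in the actual model the placeholder is a learned parameter trained end-to-end:

    ```python
    # Hypothetical sketch of missing-modality handling with a special token.
    # In the real network this placeholder vector is learned, not fixed.

    NO_IMAGE_TOKEN = [0.0, 0.0, 0.0, 1.0]  # stand-in for the learned token

    def visual_features(post):
        """Return the post's image embedding, or the special token if absent."""
        if post.get("image_embedding") is None:
            return NO_IMAGE_TOKEN
        return post["image_embedding"]

    with_img = {"text": "breaking news", "image_embedding": [0.2, 0.1, 0.4, 0.0]}
    text_only = {"text": "breaking news", "image_embedding": None}
    ```

    Downstream layers then always receive a fixed-size visual input, which is what lets one model serve both complete and incomplete modality conditions.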

    Mapping (Dis-)Information Flow about the MH17 Plane Crash

    Digital media enables not only fast sharing of information, but also of disinformation. One prominent case of an event leading to the circulation of disinformation on social media is the MH17 plane crash. Studies analysing the spread of information about this event on Twitter have focused on small, manually annotated datasets, or have used proxies for data annotation. In this work, we examine to what extent text classifiers can be used to label data for subsequent content analysis; in particular, we focus on predicting pro-Russian and pro-Ukrainian Twitter content related to the MH17 plane crash. Even though we find that a neural classifier improves over a hashtag-based baseline, labeling pro-Russian and pro-Ukrainian content with high precision remains a challenging problem. We provide an error analysis underlining the difficulty of the task and identify factors that might help improve classification in future work. Finally, we show how the classifier can facilitate the annotation task for human annotators.

    Detecting Mental Distresses Using Social Behavior Analysis in the Context of COVID-19: A Survey

    Online social media provides a channel for monitoring people's social behaviors, from which to infer and detect their mental distresses. During the COVID-19 pandemic, online social networks were increasingly used to express opinions, views, and moods due to the restrictions on physical activities and in-person meetings, leading to a significant amount of diverse user-generated social media content. This offers a unique opportunity to examine how COVID-19 changed global behaviors with regard to its ramifications for mental well-being. In this article, we survey the literature on social media analysis for the detection of mental distress, with a special emphasis on studies published since the COVID-19 outbreak. We analyze relevant research and its characteristics and propose new approaches to organizing the large number of studies arising from this emerging research area, thus drawing new views, insights, and knowledge for interested communities. Specifically, we first classify the studies in terms of feature extraction types, language usage patterns, aesthetic preferences, and online behaviors. We then explore various methods (including machine learning and deep learning techniques) for detecting mental health problems. Building upon this in-depth review, we present our findings and discuss future research directions and niche areas in detecting mental health problems using social media data. We also elaborate on the challenges of this fast-growing research area, such as technical issues in deploying such systems at scale as well as privacy and ethical concerns.

    Combating Misinformation on Social Media by Exploiting Post and User-level Information

    Misinformation on social media has a far-reaching negative impact on the public and society. Given the large number of real-time posts on social media, traditional manual methods of misinformation detection are not viable. Therefore, computational (i.e., data-driven) approaches have been proposed to combat online misinformation. Previous work on computational misinformation analysis has mainly focused on employing natural language processing (NLP) techniques to develop misinformation detection systems at the post level (e.g., using text and propagation networks). However, it is also important to exploit information at the user level in social media, as users play a significant role (e.g., posting, diffusing, refuting, etc.) in spreading misinformation. The main aims of this thesis are to: (i) develop novel methods for analysing the behaviour of users who are likely to share or refute misinformation in social media; and (ii) predict and characterise unreliable stories with high popularity in social media. To this end, we first highlight the limitations of the evaluation protocol in popular post-level rumour detection benchmarks and propose to evaluate such systems using chronological splits (i.e., accounting for temporal concept drift). On the user level, we introduce two novel tasks: (i) early detection of Twitter users who are likely to share misinformation before they actually do so; and (ii) identifying and characterising active citizens who refute misinformation in social media. Finally, we develop a new dataset to enable the study of predicting the future popularity (e.g., number of likes, replies, retweets) of false rumours on Weibo.
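    The chronological-split protocol advocated above can be sketched in a few lines: sort posts by timestamp and cut once, so the model is trained only on posts that precede every test post in time, rather than on a random shuffle that leaks future events. Field names and data are illustrative:

    ```python
    # Sketch of a chronological train/test split for rumour detection,
    # as opposed to a random split that ignores temporal concept drift.
    # The post records below are made up for illustration.

    def chronological_split(posts, train_frac=0.8):
        """Sort posts by timestamp and split once at the cut point."""
        ordered = sorted(posts, key=lambda p: p["timestamp"])
        cut = int(len(ordered) * train_frac)
        return ordered[:cut], ordered[cut:]

    posts = [
        {"id": 1, "timestamp": "2020-03-01", "label": "rumour"},
        {"id": 2, "timestamp": "2020-01-15", "label": "non-rumour"},
        {"id": 3, "timestamp": "2020-06-09", "label": "rumour"},
        {"id": 4, "timestamp": "2020-02-02", "label": "non-rumour"},
        {"id": 5, "timestamp": "2020-08-21", "label": "rumour"},
    ]
    train, test = chronological_split(posts, train_frac=0.8)
    # Every training post now precedes every test post in time.
    ```

    The point of the protocol is that random splits let a model memorise event-specific vocabulary from posts that postdate the test items, inflating reported performance relative to real deployment.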