Instruct-FinGPT: Financial Sentiment Analysis by Instruction Tuning of General-Purpose Large Language Models
Sentiment analysis is a vital tool for uncovering insights from financial
articles, news, and social media, shaping our understanding of market
movements. Despite the impressive capabilities of large language models (LLMs)
in financial natural language processing (NLP), they still struggle with
accurately interpreting numerical values and grasping financial context,
limiting their effectiveness in predicting financial sentiment. In this paper,
we introduce a simple yet effective instruction tuning approach to address
these issues. By transforming a small portion of supervised financial sentiment
analysis data into instruction data and fine-tuning a general-purpose LLM with
this method, we achieve remarkable advancements in financial sentiment
analysis. In the experiment, our approach outperforms state-of-the-art
supervised sentiment analysis models, as well as widely used LLMs like ChatGPT
and LLaMAs, particularly in scenarios where numerical understanding and
contextual comprehension are vital.
Comment: FinLLM Symposium at IJCAI 202
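The conversion step described in this abstract, turning labeled sentiment samples into instruction data, might look roughly like the sketch below. The template wording, the record keys, and the two sample sentences are illustrative assumptions, not taken from the paper.

```python
# Hypothetical instruction template; the paper's exact prompt is not given in the abstract.
INSTRUCTION = (
    "What is the sentiment of this news? "
    "Please choose an answer from {negative/neutral/positive}."
)

ALLOWED_LABELS = {"negative", "neutral", "positive"}


def to_instruction_record(text, label):
    """Wrap one labeled sample as an instruction/input/output record."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label}")
    return {"instruction": INSTRUCTION, "input": text, "output": label}


# Made-up samples, used only to show the record shape.
samples = [
    ("Operating profit rose compared with the previous year.", "positive"),
    ("The company will publish its annual report in March.", "neutral"),
]
records = [to_instruction_record(text, label) for text, label in samples]
```

Records of this shape are a common input format for supervised fine-tuning scripts. Whether the authors used these exact fields is not stated here.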
Transformer-based Subject Entity Detection in Wikipedia Listings
In tasks like question answering or text summarisation, it is essential to
have background knowledge about the relevant entities. The information about
entities - in particular, about long-tail or emerging entities - in publicly
available knowledge graphs like DBpedia or CaLiGraph is far from complete. In
this paper, we present an approach that exploits the semi-structured nature of
listings (like enumerations and tables) to identify the main entities of the
listing items (i.e., of entries and rows). These entities, which we call
subject entities, can be used to increase the coverage of knowledge graphs. Our
approach uses a transformer network to identify subject entities at the
token-level and surpasses an existing approach in terms of performance while
being bound by fewer limitations. Due to a flexible input format, it is
applicable to any kind of listing and is, unlike prior work, not dependent on
entity boundaries as input. We demonstrate our approach by applying it to the
complete Wikipedia corpus and extracting 40 million mentions of subject
entities with an estimated precision of 71% and recall of 77%. The results are
incorporated in the most recent version of CaLiGraph.
Comment: Published at the Deep Learning for Knowledge Graphs workshop (DL4KG) at the
International Semantic Web Conference 2022 (ISWC 2022)
A Deep Multi-Level Attentive network for Multimodal Sentiment Analysis
Multimodal sentiment analysis has attracted increasing attention with broad
application prospects. Existing methods focus on a single modality, which
fails to capture social media content that spans multiple modalities. Moreover, in
multi-modal learning, most of the works have focused on simply combining the
two modalities, without exploring the complicated correlations between them.
This has resulted in unsatisfactory performance for multimodal sentiment
classification. Motivated by the status quo, we propose a Deep Multi-Level
Attentive network, which exploits the correlation between image and text
modalities to improve multimodal learning. Specifically, we generate the
bi-attentive visual map along the spatial and channel dimensions to magnify
the CNN's representational power. Then we model the correlation between the image
regions and semantics of the word by extracting the textual features related to
the bi-attentive visual features by applying semantic attention. Finally,
self-attention is employed to automatically fetch the sentiment-rich multimodal
features for the classification. We conduct extensive evaluations on four
real-world datasets, namely, MVSA-Single, MVSA-Multiple, Flickr, and Getty
Images, which verify the superiority of our method.
Comment: 11 pages, 7 figures
Role of sentiment classification in sentiment analysis: a survey
Through a survey of the literature, the role of sentiment classification in sentiment analysis has been reviewed. The review identifies the research challenges involved in tackling sentiment classification. A total of 68 articles published during 2015–2017 were reviewed along six dimensions, viz. sentiment classification, feature extraction, cross-lingual sentiment classification, cross-domain sentiment classification, lexica and corpora creation, and multi-label sentiment classification. This study discusses the prominence and effects of sentiment classification in sentiment evaluation and notes that considerable further research is needed to achieve productive results.
Assessing causality among topics and sentiments: The case of the G20 discussion on Twitter
Although the identification of topics and sentiments from social media content has attracted substantial research, little work has been carried out on the extraction of causal relationships among those topics and sentiments. This article proposes a methodology aimed at building a causal graph where nodes represent topics and emotions extracted from social media users' posts. To illustrate the proposed methodology, we collected a large multi-year dataset of tweets related to different editions of the G20 summit, which was locally indexed for further analysis. Topic-relevant queries are crafted from phrases extracted by experts from G20 output documents on four main recurring topics, namely government, society, environment and health, and economics. Subsequently, sentiments are identified on the retrieved tweets using a lexicon based on Plutchik's wheel of emotions. Finally, a causality test that uses stochastic dominance is applied to build a causal graph among topics and emotions by exploiting the asymmetries of explaining a variable from other variables. The applied causality discovery process relies on observational data only and does not require any assumptions of linearity, parametric definitions or temporal precedence. In our analysis, we observe that although the time series of topics and emotions always show high correlation coefficients, stochastic causality provides a means to tell apart causal relationships from other forms of associations. The proposed methodology can be applied to better understand social behaviour on social media, offering support to decision and policy making and their communication by government leaders.
Affiliation: Fonseca, Mauro. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación; Argentina
Affiliation: Delbianco, Fernando Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina. Universidad Nacional del Sur. Departamento de Economía; Argentina
Affiliation: Maguitman, Ana Gabriela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
Affiliation: Soto, Axel Juan. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
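The lexicon-based emotion tagging mentioned in this abstract can be illustrated with a minimal sketch. The tiny word list below is a made-up stand-in for a real Plutchik-based lexicon, and the function name and tokenization rule are assumptions for illustration only. The emotion names are drawn from Plutchik's eight basic emotions.

```python
import re
from collections import Counter

# Toy lexicon (hypothetical): maps a word to one of Plutchik's basic emotions.
TOY_LEXICON = {
    "hope": "anticipation",
    "agreement": "trust",
    "afraid": "fear",
    "crisis": "fear",
    "happy": "joy",
    "angry": "anger",
}


def emotion_counts(tweet):
    """Count lexicon emotions triggered by the words of one tweet."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return Counter(TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON)


counts = emotion_counts("Leaders hope for agreement, but markets are afraid of a crisis")
```

Aggregating such counts per day would yield one time series per emotion, which is the kind of input a causality test over topics and emotions would need. The stochastic-dominance test itself is not sketched here.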
Thematic analysis of big data in financial institutions using NLP techniques with a cloud computing perspective : a systematic literature review
This literature review explores the existing work and practices in applying natural language processing techniques for thematic analysis to financial data in cloud environments. This work aims to improve two of the five Vs of the big data system. We used the PRISMA approach (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) for the review. We analyzed the research papers published over the last 10 years about the topic in question using a keyword-based search and bibliometric analysis. The systematic literature review was conducted in multiple phases, and filters were applied to exclude papers based on the title and abstract initially, then based on the methodology/conclusion, and, finally, after reading the full text. The remaining papers were then considered and are discussed here. We found that automated data discovery methods can be augmented by applying an NLP-based thematic analysis on the financial data in cloud environments. This can help identify the correct classification/categorization and measure data quality for a sentiment analysis.
HOW DO LARGE STAKES INFLUENCE BITCOIN PERFORMANCE? EVIDENCE FROM THE MT.GOX LIQUIDATION CASE
Bitcoin, as the first and still most important decentralized cryptocurrency, has gained wide popularity due to the steep rise of its price during the second half of 2017. Because of its digital nature, Bitcoin cannot be valued exclusively with fundamental approaches, which is why factors such as investor sentiment have become a common alternative to capture its performance. In this work, we studied whether and how the sale of Bitcoins from the insolvency assets of Mt.Gox, which represent about 1.1% of the current global total, relates to Bitcoin price movements. We used social media sentiment analysis of Twitter data to examine how investors are influenced in their decision to buy or sell Bitcoin when confronted with the trade actions of Nobuaki Kobayashi, the trustee in charge of the Mt.Gox case. We built a vector error correction model to analyze the long-run relationship between cointegrated variables. Our analysis confirms the positive association of Bitcoin performance with positive Twitter sentiment and tweet volume and the negative association with negative sentiment. We further found empirical evidence that Mt.Gox selloff events have a lasting negative impact on the Bitcoin price and that we can measure this effect by Twitter sentiment and tweet volume.