
307 research outputs found

Text Mining of Stocktwits Data for Predicting Stock Prices

Authors: Jaggi Mukul, Khushi Matloob, Mandal Priyanka, Narang Shreya, Naseem Usman
Publication venue: MDPI AG
Publication date: 17/02/2021

Stock price prediction can be made more efficient by considering price fluctuations together with public sentiment. Few models understand financial jargon, and few labelled datasets relate text to stock price changes. To overcome this challenge, we introduce FinALBERT, an ALBERT-based model trained to handle financial-domain text classification tasks by labelling Stocktwits text data based on stock price change. We collected more than ten years of Stocktwits data for 25 companies, including the five FAANG companies (Facebook, Amazon, Apple, Netflix, Google). These datasets were labelled using three labelling techniques based on stock price changes. Our proposed model, FinALBERT, is fine-tuned with these labels to achieve optimal results. We also trained traditional machine learning, BERT, and FinBERT models on the labelled dataset, which helped us understand how these labels behave with different model architectures. The competitive advantage of our labelling method is that it helps analyse historical data effectively, and the mathematical function can be easily customised to predict stock movement.
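The abstract does not spell out its three labelling techniques, so the following is only a minimal sketch of the general idea of labelling a post by the price change of the stock it mentions. The function name `label_by_price_change`, the three label names, and the 0.5% threshold are all assumptions made for illustration, not the authors' method.

```python
def label_by_price_change(open_price: float, close_price: float,
                          threshold: float = 0.005) -> str:
    """Return a hypothetical label derived from the relative price change.

    The threshold (0.5% by default) and the label names are illustrative
    assumptions; the paper's own labelling functions are not reproduced here.
    """
    if open_price <= 0:
        raise ValueError("open_price must be positive")
    change = (close_price - open_price) / open_price
    if change > threshold:
        return "up"
    if change < -threshold:
        return "down"
    return "neutral"


def label_posts(posts, prices, threshold=0.005):
    """Attach a label to each (ticker, date, text) post.

    `prices` maps (ticker, date) to an (open, close) pair. Posts without
    a matching price record are skipped rather than guessed.
    """
    labelled = []
    for ticker, date, text in posts:
        if (ticker, date) not in prices:
            continue
        open_price, close_price = prices[(ticker, date)]
        label = label_by_price_change(open_price, close_price, threshold)
        labelled.append((text, label))
    return labelled
```

Changing `threshold`, or swapping the relative change for another function of the price series, is the kind of customisation the abstract alludes to.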

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Brunel University Research Archive

Document-level sentiment analysis of email data

Author: Liu Sisi
Publication date: 01/01/2020

Sisi Liu investigated machine learning methods for sentiment analysis of email documents. She developed a systematic framework that has been shown, both qualitatively and quantitatively, to be effective and efficient in identifying sentiment in large volumes of email data. The analytical results obtained from this document-level email sentiment analysis framework support better decision making in various business settings.

ResearchOnline at James Cook University

Working in Detail: How LSTM Hyperparameter Selection Influences Sentiment Analysis Results

Authors: Baier Daniel, Derra Nicholas Daniel
Publication date: 16/07/2020

Sentiment analysis of written customer reviews is a powerful way to generate knowledge about customer attitudes for future marketing activities. Deep learning, as the most powerful machine learning method, is of particular importance for sentiment analysis tasks. Given this relevance, an LSTM network based on a literature review is created to solve the challenging classification task of the IMDB Large Movie Dataset. Hyperparameters are varied separately from each other to better understand their individual influence on overall model accuracy. Furthermore, we combined the variants with positive impacts into a final model in order to investigate whether the impacts accumulate. While preparing the amount of training data and the number of iteration steps resulted in higher accuracy, pre-trained word vectors and higher network capacity did not work well separately. Even though implementing the variants with positive influences together raised the model's performance, the improvement was lower than that of some single variants.
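Varying hyperparameters separately from each other can be sketched as a one-at-a-time experiment plan: each run changes exactly one setting relative to a fixed baseline. The hyperparameter names and values below are hypothetical placeholders, not the configuration used in the paper, and no model is trained here.

```python
def one_at_a_time(baseline: dict, alternatives: dict) -> list:
    """Build configs that each differ from `baseline` in exactly one key.

    `alternatives` maps a hyperparameter name to the values to try.
    Values equal to the baseline setting are skipped, since they would
    only repeat the baseline run.
    """
    variants = []
    for name, values in alternatives.items():
        if name not in baseline:
            raise KeyError(f"unknown hyperparameter: {name}")
        for value in values:
            if value == baseline[name]:
                continue
            config = dict(baseline)  # copy so the baseline stays untouched
            config[name] = value
            variants.append((name, config))
    return variants


if __name__ == "__main__":
    # Hypothetical baseline and search values, for illustration only.
    baseline = {"lstm_units": 64, "epochs": 5, "dropout": 0.2}
    alternatives = {"lstm_units": [32, 64, 128], "epochs": [5, 10], "dropout": [0.5]}
    for changed, config in one_at_a_time(baseline, alternatives):
        print(changed, config)
```

A combined final model, as described in the abstract, would then merge only those single changes that improved accuracy over the baseline; as the abstract notes, the combined gain need not equal the sum of the individual gains.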

KITopen

Detection of Hate-Speech Tweets Based on Deep Learning: A Review

Authors: Abdulazeez Adnan Mohsin, Miran Ara Zozan
Publication venue: Universitas Trilogi
Publication date: 27/12/2023

Cybercrime, cyberbullying, and hate speech have all increased alongside the use of the internet and social media. Hate speech knows no organizational or individual boundaries, and it affects many people in diverse ways. It can be harsh, offensive, or discriminatory, targeting a person's gender, race, political opinions, religion, nationality, skin color, disability, ethnicity, sexual orientation, or immigration status. Authorities and academics are investigating new methods for identifying hate speech on social media platforms like Facebook and Twitter. This study adds to the ongoing discussion about creating safer digital spaces while balancing the limitation of hate speech against the protection of freedom of speech. Partnerships between researchers, platform developers, and communities are crucial in creating efficient and ethical content moderation systems on Twitter and other social media sites. For this reason, multiple methodologies, models, and algorithms are employed. This study presents a thorough analysis of numerous research publications on hate speech. Each article has been thoroughly examined, including the algorithms or methodologies used, the databases, the classification techniques, and the findings achieved. In addition, all the examined papers are discussed comprehensively, with an explicit focus on the use of deep learning techniques to detect hate speech.

E-Journal Portal System Universitas Trilogi

A novel Auto-ML Framework for Sarcasm Detection

Author: Imtiaz S.
Publication venue: University of East London
Publication date: 01/01/2022

Sarcasm, or verbal irony, appears in many domains, in the text of reviews, tweets, comments, and dialog discussions. The purpose of this research is to classify sarcasm across multiple domains using a deep learning based AutoML framework. The proposed AutoML framework has five models in its model search pipeline; these are combinations of convolutional neural network (CNN), Long Short-Term Memory (LSTM), deep neural network (DNN), and Bidirectional Long Short-Term Memory (BiLSTM) components. The hybrid combinations of CNN, LSTM, and DNN models are presented as CNN-LSTM-DNN, LSTM-DNN, BiLSTM-DNN, and CNN-BiLSTM-DNN. This work proposes algorithms that contrast polarities between terms and phrases, which are categorized into implicit and explicit incongruity categories. The incongruity features and pragmatic features, such as punctuation and exclamation marks, are integrated into the AutoML DeepConcat framework models. This integration takes place when the DeepConcat AutoML framework initiates a model search pipeline over the five models to achieve better performance. Conceptually, DeepConcat means that the model is integrated with generalized features. The pretrained BiLSTM model achieved a better performance of 0.98 F1 when compared with the other five models. Similarly, the AutoML based BiLSTM-DNN model achieved the best performance of 0.98 F1, which is better than core approaches and the existing state of the art on the Twitter tweet dataset, Amazon reviews, and dialog discussion comments. The proposed AutoML framework compared the performance metrics F1 and AUC and found that F1 is better than AUC. Integrating all feature categories achieved better performance than the individual pragmatic and incongruity feature categories.
This research also evaluated the dropout layer hyperparameter, which achieved better performance than a fixed dropout percentage, such as 10%, in the AutoML based Bayesian optimization. The proposed AutoML framework DeepConcat evaluated the best pretrained models, BiLSTM-DNN and CNN-CNN-DNN, for transferring knowledge across domains such as Amazon reviews and dialog discussion comments (text) using the last strategy, the full layer strategy, and our fade-out freezing strategy. In transfer learning, the fade-out strategy outperformed the existing state-of-the-art model BiLSTM-DNN, with a performance of 0.98 F1 on tweets, 0.85 F1 on Amazon reviews, and 0.87 F1 on the dialog discussion SCV2-Gen dataset. Further, all strategies can be compared across the various domains for the best model selection.

UEL Research Repository at University of East London