613 research outputs found

Stock market prediction using machine learning classifiers and social media, news

Author: Alfakeeh A. S.
Alyoubi K. H.
Azam M. A.
Ghazanfar M.
Karami A.
Khan W.
Publication venue: Springer
Publication date: 01/01/2020

Accurate stock market prediction is of great interest to investors; however, stock markets are driven by volatile factors such as microblogs and news, which make it hard to predict a stock market index based merely on historical data. The enormous stock market volatility emphasizes the need to effectively assess the role of external factors in stock prediction. Stock markets can be predicted using machine learning algorithms on information contained in social media and financial news, as this data can change investors’ behavior. In this paper, we use algorithms on social media and financial news data to discover the impact of this data on stock market prediction accuracy for ten subsequent days. To improve the performance and quality of predictions, feature selection and spam tweet reduction are performed on the data sets. Moreover, we perform experiments to find stock markets that are difficult to predict and those that are more influenced by social media and financial news. We compare the results of different algorithms to find a consistent classifier. Finally, to achieve maximum prediction accuracy, deep learning is used and some classifiers are ensembled. Our experimental results show that the highest prediction accuracies of 80.53% and 75.16% are achieved using social media and financial news, respectively. We also show that New York and Red Hat stock markets are hard to predict, New York and IBM stocks are more influenced by social media, while London and Microsoft stocks are more influenced by financial news. The random forest classifier is found to be consistent, and the highest accuracy of 83.22% is achieved by its ensemble.

UEL Research Repository at University of East London

Domain knowledge, uncertainty, and parameter constraints

Author: Mao Yi
Publication venue: Georgia Institute of Technology
Publication date: 24/08/2010

Ph.D. Committee Chair: Guy Lebanon; Committee Member: Alex Shapiro; Committee Member: Alexander Gray; Committee Member: Chin-Hui Lee; Committee Member: Hongyuan Zh

Scholarly Materials And Research @ Georgia Tech

Sentiment Analysis for Social Media

Author: Iglesias Carlos A.
Moreno Antonio
Publication venue: MDPI AG
Publication date: 09/06/2020

Sentiment analysis is a branch of natural language processing concerned with the study of the intensity of the emotions expressed in a piece of text. The automated analysis of the multitude of messages delivered through social media is one of the hottest research fields, both in academia and in industry, due to its extremely high potential applicability in many different domains. This Special Issue describes both technological contributions to the field, mostly based on deep learning techniques, and specific applications in areas like health insurance, gender classification, recommender systems, and cyber aggression detection.

Directory of Open Access Books (DOAB)

Semi-Supervised Learning For Identifying Opinions In Web Content

Author: Yu Ning
Publication venue: [Bloomington, Ind.] : Indiana University
Publication date: 01/01/2011

Thesis (Ph.D.) - Indiana University, Information Science, 2011
Opinions published on the World Wide Web (Web) offer opportunities for detecting personal attitudes regarding topics, products, and services. The opinion detection literature indicates that both a large body of opinions and a wide variety of opinion features are essential for capturing subtle opinion information. Although a large amount of opinion-labeled data is preferable for opinion detection systems, opinion-labeled data is often limited, especially at sub-document levels, and manual annotation is tedious, expensive, and error-prone. This shortage of opinion-labeled data is less challenging in some domains (e.g., movie reviews) than in others (e.g., blog posts). While a simple method for improving accuracy in challenging domains is to borrow opinion-labeled data from a non-target data domain, this approach often fails because of the domain transfer problem: opinion detection strategies designed for one data domain generally do not perform well in another domain. However, while it is difficult to obtain opinion-labeled data, unlabeled user-generated opinion data are readily available. Semi-supervised learning (SSL) requires only limited labeled data to automatically label unlabeled data and has achieved promising results in various natural language processing (NLP) tasks, including traditional topic classification; but SSL has been applied in only a few opinion detection studies. This study investigates the application of four different SSL algorithms to three types of Web content: edited news articles, semi-structured movie reviews, and the informal and unstructured content of the blogosphere. SSL algorithms are also evaluated for their effectiveness in sparse data situations and domain adaptation. Research findings suggest that, when there is limited labeled data, SSL is a promising approach for opinion detection in Web content.
Although the contributions of SSL varied across data domains, significant improvement was demonstrated for the most challenging data domain, the blogosphere, when a domain transfer-based SSL strategy was implemented.

IUScholarWorks (University of Indiana)

Sensing Human Sentiment via Social Media Images: Methodologies and Applications

Publication date: 01/01/2018

Social media refers to computer-based technology that allows the sharing of information and the building of virtual networks and communities. With the development of internet-based services and applications, users can engage with social media via computers and smart mobile devices. In recent years, social media has taken the form of different activities such as social networks, business networks, text sharing, photo sharing, blogging, etc. With its increasing popularity, social media has accumulated a large amount of data, which makes understanding human behavior possible. Compared with traditional survey-based methods, the analysis of social media provides a golden opportunity to understand individuals at scale, which in turn allows us to design better services tailored to individuals’ needs. From this perspective, we can view social media as sensors that provide online signals, from a virtual world with no geographical boundaries, about real-world individuals' activity. One of the key features of social media is that it is social: users actively interact with each other by generating content and expressing opinions, such as posts and comments on Facebook. As a result, sentiment analysis, which refers to computational models that identify, extract, or characterize subjective information expressed in a given piece of text, has successfully employed user signals and brought many real-world applications in different domains such as e-commerce, politics, marketing, etc. The goal of sentiment analysis is to classify a user’s attitude towards various topics into positive, negative, or neutral categories based on textual data in social media. Recently, however, an increasing number of people have started to use photos to express their daily lives on social media platforms like Flickr and Instagram. Therefore, analyzing the sentiment of visual data is poised to greatly improve user understanding.
In this dissertation, I study the problem of understanding human sentiments from large-scale collections of social images based on both image features and contextual social network features. We show that neither visual features nor textual features are by themselves sufficient for accurate sentiment prediction. Therefore, we provide a way of using both of them, and formulate the sentiment prediction problem in two scenarios: supervised and unsupervised. We first show that the proposed framework has the flexibility to incorporate multiple modalities of information and the capability to learn from heterogeneous features jointly, given sufficient training data. Secondly, we observe that negative sentiment may be related to human mental health issues. Based on this observation, we aim to understand negative social media posts, especially posts related to depression, e.g., self-harm content. Our analysis, the first of its kind, reveals a number of important findings. Thirdly, we extend the proposed sentiment prediction task to a general multi-label visual recognition task to demonstrate the methodological flexibility of our sentiment analysis model.
Dissertation/Thesis Doctoral Dissertation Computer Science 201

ASU Digital Repository

Moment-to-moment mood change modelling in mobile mental health network

Author: Alharbi A
Publication date: 01/01/2022

Human interests and behaviour change over time and are often affected by multiple factors. In particular, human emotions, mood and its constituent processes change and interact over time. Therefore, modelling human behaviour should take into account changes over time in order to customize and adapt systems to users’ specific needs. Understanding and assessing the temporal dynamics of mood is critical for modelling human behaviour for both individuals and groups of people who share similar habits, lifestyles and personal circumstances. Thus, in order to construct a personalized recommendation for a given user, it is first necessary to have some knowledge about previous user interests and behaviours. However, the challenge of obtaining large-scale data on human emotions has left the most fundamental questions on emotions less explored: How do emotions vary across individuals, evolve over time, and connect to social ties? We address these questions using a large-scale dataset of users that contains both users’ interactions with momentary emotions and topical labels. Using this dataset, we identify patterns of human emotions on different levels, starting from the network level, through the group (cluster) level, and moving towards the user level. At the user level, we identify how human emotions are distributed and vary over time. In particular, we model changes in mood using multi-level multimodal features including users’ sentimental status, engagement and linguistic queries. We also utilise language models to model and understand patterns of mood change. We model the changes in users’ mental states based on replies and responses to posts over time and predict future states. We find that future mental states can be predicted with reasonable accuracy given users’ historical posts and current participation features.
Our findings form a step towards better understanding the interplay between user behaviour and mood change exhibited while interacting on a mental health network, and provide some interpretable summaries that can be used in the future by health experts and individuals, and to work on possible medical interventions together with clinical experts.

Nottingham Trent Institutional Repository (IRep)

Quantum Cognitively Motivated Context-Aware Multimodal Representation Learning for Human Language Analysis

Author: Gkoumas Dimitrios
Publication date: 06/05/2021

A long-standing goal in the field of Artificial Intelligence (AI) is to develop systems that can perceive and understand human multimodal language. This requires consideration of both context in the form of surrounding utterances in a conversation, i.e., context modelling, and the impact of different modalities (e.g., linguistic, visual, acoustic), i.e., multimodal fusion. In the last few years, significant strides have been made towards the interpretation of human language due to simultaneous advancement in deep learning, data gathering and computing infrastructure. AI models have been investigated to either model interactions across distinct modalities, i.e., linguistic, visual and acoustic, or model interactions across parties in a conversation, achieving unprecedented levels of performance. However, AI models are often designed with only performance as their design target, leaving aside other essential factors such as transparency, interpretability, and how humans understand and reason about cognitive states. In line with this observation, in this dissertation, we develop quantum probabilistic neural models and techniques that allow us to capture rational and irrational cognitive biases, without requiring a priori understanding and identification of them. First, we present a comprehensive empirical comparison of state-of-the-art (SOTA) modality fusion strategies for video sentiment analysis. The findings provide us with helpful insights into the development of more effective modality fusion models incorporating quantum-inspired components. Second, we introduce an end-to-end complex-valued neural model for video sentiment analysis, simulating quantum procedural steps, outside of physics, into the neural network modelling paradigm. Third, we investigate non-classical correlations across different modalities. In particular, we describe a methodology to model interactions between image and text for an information retrieval scenario.
The results provide us with theoretical and empirical insights to develop a transparent end-to-end probabilistic neural model for video emotion detection in conversations, capturing non-classical correlations across distinct modalities. Fourth, we introduce a theoretical framework to model users' cognitive states underlying their multimodal decision perspectives, and propose a methodology to capture interference of modalities in decision making. Overall, we show that our models advance the SOTA on various affective analysis tasks, achieve high transparency due to the mapping to quantum physics meanings, and improve post-hoc interpretability, unearthing useful and explainable knowledge about cross-modal interactions.

Open Research Online (The Open University)

Probabilistic latent variable models for knowledge discovery and optimization

Author: Wang Xiaolong
Publication date: 01/05/2017

I conduct a systematic study of probabilistic latent variable models (PLVMs) with applications to knowledge discovery and optimization. Probabilistic modeling is a principled means of gaining insight into data. By assuming that the observed data are generated from a distribution, we can estimate its density, or the statistics of interest, by either Maximum Likelihood Estimation or Bayesian inference, depending on whether there is a prior distribution for the parameters of the assumed data distribution. One of the primary goals of various machine learning/data mining models is to reveal the underlying knowledge of observed data. A common practice is to introduce latent variables, which are modeled together with the observations. Such latent variables compute, for example, the class assignments (labels), the cluster membership, as well as other unobserved measurements of the data. Besides, proper exploitation of latent variables facilitates the optimization itself, which leads to computationally efficient inference algorithms. In this thesis, I describe a range of applications where latent variables can be leveraged for knowledge discovery and efficient optimization. Works in this thesis demonstrate that PLVMs are a powerful tool for modeling incomplete observations. By incorporating latent variables and assuming that observations such as citations, pairwise preferences and text are generated following tractable distributions parametrized by the latent variables, PLVMs are flexible and effective at discovering knowledge in data mining problems, where the knowledge is mathematically modelled as continuous or discrete values, distributions or uncertainty. In addition, I explore PLVMs for deriving efficient algorithms. It has been shown that latent variables can be employed as a means for model reduction and can facilitate the computation/sampling of intractable distributions.
Our results lead to algorithms which take advantage of latent variables in probabilistic models. We conduct experiments against state-of-the-art models, and empirical evaluation shows that our proposed approaches improve both learning performance and computational efficiency.

Illinois Digital Environment for Access to Learning and Scholarship Repository

On Automatic Music Genre Recognition by Sparse Representation Classification using Auditory Temporal Modulations

Author: Noorzad Pardis
Sturm Bob L.
Publication date: 01/01/2012

VBN