9,829 research outputs found
Enhanced news sentiment analysis using deep learning methods
We explore the predictive power of historical news sentiments based on financial market performance to forecast financial news sentiments. We define news sentiments based on stock price returns averaged over one minute right after a news article has been released. If the stock price exhibits positive (negative) return, we classify the news article released just prior to the observed stock return as positive (negative). We use Wikipedia and Gigaword five corpus articles from 2014 and we apply the global vectors for word representation method to this corpus to create word vectors to use as inputs into the deep learning TensorFlow network. We analyze high-frequency (intraday) Thompson Reuters News Archive as well as the high-frequency price tick history of the Dow Jones Industrial Average (DJIA 30) Index individual stocks for the period between 1/1/2003 and 12/30/2013. We apply a combination of deep learning methodologies of recurrent neural network with long short-term memory units to train the Thompson Reuters News Archive Data from 2003 to 2012, and we test the forecasting power of our method on 2013 News Archive data. We find that the forecasting accuracy of our methodology improves when we switch from random selection of positive and negative news to selecting the news with highest positive scores as positive news and news with highest negative scores as negative news to create our training data set.Published versio
Econometrics meets sentiment : an overview of methodology and applications
The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, which is a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software
Basic tasks of sentiment analysis
Subjectivity detection is the task of identifying objective and subjective
sentences. Objective sentences are those which do not exhibit any sentiment.
So, it is desired for a sentiment analysis engine to find and separate the
objective sentences for further analysis, e.g., polarity detection. In
subjective sentences, opinions can often be expressed on one or multiple
topics. Aspect extraction is a subtask of sentiment analysis that consists in
identifying opinion targets in opinionated text, i.e., in detecting the
specific aspects of a product or service the opinion holder is either praising
or complaining about
Active learning in annotating micro-blogs dealing with e-reputation
Elections unleash strong political views on Twitter, but what do people
really think about politics? Opinion and trend mining on micro blogs dealing
with politics has recently attracted researchers in several fields including
Information Retrieval and Machine Learning (ML). Since the performance of ML
and Natural Language Processing (NLP) approaches are limited by the amount and
quality of data available, one promising alternative for some tasks is the
automatic propagation of expert annotations. This paper intends to develop a
so-called active learning process for automatically annotating French language
tweets that deal with the image (i.e., representation, web reputation) of
politicians. Our main focus is on the methodology followed to build an original
annotated dataset expressing opinion from two French politicians over time. We
therefore review state of the art NLP-based ML algorithms to automatically
annotate tweets using a manual initiation step as bootstrap. This paper focuses
on key issues about active learning while building a large annotated data set
from noise. This will be introduced by human annotators, abundance of data and
the label distribution across data and entities. In turn, we show that Twitter
characteristics such as the author's name or hashtags can be considered as the
bearing point to not only improve automatic systems for Opinion Mining (OM) and
Topic Classification but also to reduce noise in human annotations. However, a
later thorough analysis shows that reducing noise might induce the loss of
crucial information.Comment: Journal of Interdisciplinary Methodologies and Issues in Science -
Vol 3 - Contextualisation digitale - 201
- …