2,681 research outputs found
Improving Polarity Classification for Financial News Using Semantic Similarity Techniques
This article discusses polarity classification for financial news articles. The proposed Semantic Sentiment Analyser makes use of semantic similarity techniques, sentiment composition rules, and the Positivity/Negativity (P/N) ratio in performing polarity classification. An experiment was conducted to compare the performance of three semantic similarity metrics namely HSO, LESK, and LIN to find the semantically similar pair of word as the input word. The best similarity technique (HSO) is incorporated into the sentiment analyser to find the possible polarity carrier from the analysed text before performing polarity classification. The performance of the proposed Semantic Sentiment Analyser was evaluated using a set of manually annotated financial news articles. The results obtained from the experiment showed that the proposed SSA was able to achieve an F-Score of 90.89% for all cases classification
Topic-dependent sentiment analysis of financial blogs
While most work in sentiment analysis in the financial domain has focused on the use of content from traditional finance news, in this work we concentrate on more subjective sources of information, blogs. We aim to automatically determine the sentiment of financial bloggers towards companies and their stocks. To do this we develop a corpus of financial blogs, annotated with polarity of sentiment with respect to a number of companies. We conduct an analysis of the annotated corpus, from which we show there is a significant level of topic shift within this collection, and also illustrate the difficulty that human annotators have when annotating certain sentiment categories. To deal with the problem of topic shift within blog articles, we propose text extraction techniques to create topic-specific sub-documents, which we use to train a sentiment classifier. We show that such approaches provide a substantial improvement over full documentclassification and that word-based approaches perform better than sentence-based or paragraph-based approaches
Combination of Domain Knowledge and Deep Learning for Sentiment Analysis of Short and Informal Messages on Social Media
Sentiment analysis has been emerging recently as one of the major natural
language processing (NLP) tasks in many applications. Especially, as social
media channels (e.g. social networks or forums) have become significant sources
for brands to observe user opinions about their products, this task is thus
increasingly crucial. However, when applied with real data obtained from social
media, we notice that there is a high volume of short and informal messages
posted by users on those channels. This kind of data makes the existing works
suffer from many difficulties to handle, especially ones using deep learning
approaches. In this paper, we propose an approach to handle this problem. This
work is extended from our previous work, in which we proposed to combine the
typical deep learning technique of Convolutional Neural Networks with domain
knowledge. The combination is used for acquiring additional training data
augmentation and a more reasonable loss function. In this work, we further
improve our architecture by various substantial enhancements, including
negation-based data augmentation, transfer learning for word embeddings, the
combination of word-level embeddings and character-level embeddings, and using
multitask learning technique for attaching domain knowledge rules in the
learning process. Those enhancements, specifically aiming to handle short and
informal messages, help us to enjoy significant improvement in performance once
experimenting on real datasets.Comment: A Preprint of an article accepted for publication by Inderscience in
IJCVR on September 201
Basic tasks of sentiment analysis
Subjectivity detection is the task of identifying objective and subjective
sentences. Objective sentences are those which do not exhibit any sentiment.
So, it is desired for a sentiment analysis engine to find and separate the
objective sentences for further analysis, e.g., polarity detection. In
subjective sentences, opinions can often be expressed on one or multiple
topics. Aspect extraction is a subtask of sentiment analysis that consists in
identifying opinion targets in opinionated text, i.e., in detecting the
specific aspects of a product or service the opinion holder is either praising
or complaining about
Active learning in annotating micro-blogs dealing with e-reputation
Elections unleash strong political views on Twitter, but what do people
really think about politics? Opinion and trend mining on micro blogs dealing
with politics has recently attracted researchers in several fields including
Information Retrieval and Machine Learning (ML). Since the performance of ML
and Natural Language Processing (NLP) approaches are limited by the amount and
quality of data available, one promising alternative for some tasks is the
automatic propagation of expert annotations. This paper intends to develop a
so-called active learning process for automatically annotating French language
tweets that deal with the image (i.e., representation, web reputation) of
politicians. Our main focus is on the methodology followed to build an original
annotated dataset expressing opinion from two French politicians over time. We
therefore review state of the art NLP-based ML algorithms to automatically
annotate tweets using a manual initiation step as bootstrap. This paper focuses
on key issues about active learning while building a large annotated data set
from noise. This will be introduced by human annotators, abundance of data and
the label distribution across data and entities. In turn, we show that Twitter
characteristics such as the author's name or hashtags can be considered as the
bearing point to not only improve automatic systems for Opinion Mining (OM) and
Topic Classification but also to reduce noise in human annotations. However, a
later thorough analysis shows that reducing noise might induce the loss of
crucial information.Comment: Journal of Interdisciplinary Methodologies and Issues in Science -
Vol 3 - Contextualisation digitale - 201
- …