8,924 research outputs found
Topic-dependent sentiment analysis of financial blogs
While most work in sentiment analysis in the financial domain has focused on the use of content from traditional finance news, in this work we concentrate on more subjective sources of information, blogs. We aim to automatically determine the sentiment of financial bloggers towards companies and their stocks. To do this we develop a corpus of financial blogs, annotated with polarity of sentiment with respect to a number of companies. We conduct an analysis of the annotated corpus, from which we show there is a significant level of topic shift within this collection, and also illustrate the difficulty that human annotators have when annotating certain sentiment categories. To deal with the problem of topic shift within blog articles, we propose text extraction techniques to create topic-specific sub-documents, which we use to train a sentiment classifier. We show that such approaches provide a substantial improvement over full documentclassification and that word-based approaches perform better than sentence-based or paragraph-based approaches
Multilingual Twitter Sentiment Classification: The Role of Human Annotators
What are the limits of automated Twitter sentiment classification? We analyze
a large set of manually labeled tweets in different languages, use them as
training data, and construct automated classification models. It turns out that
the quality of classification models depends much more on the quality and size
of training data than on the type of the model trained. Experimental results
indicate that there is no statistically significant difference between the
performance of the top classification models. We quantify the quality of
training data by applying various annotator agreement measures, and identify
the weakest points of different datasets. We show that the model performance
approaches the inter-annotator agreement when the size of the training set is
sufficiently large. However, it is crucial to regularly monitor the self- and
inter-annotator agreements since this improves the training datasets and
consequently the model performance. Finally, we show that there is strong
evidence that humans perceive the sentiment classes (negative, neutral, and
positive) as ordered
Exploring the use of paragraph-level annotations for sentiment analysis of financial blogs
In this paper we describe our work in the area of topic-based sentiment analysis in the domain of financial blogs. We explore the use of paragraph-level and document-level annotations, examining how additional information from paragraph-level annotations can be used to increase the accuracy of document-level sentiment classification. We acknowledge the additional effort required to provide these paragraph-level annotations, and so we compare these findings against an automatic means of generating topic-specific sub-documents
- ā¦