Topic-dependent sentiment analysis of financial blogs
While most work in sentiment analysis in the financial domain has focused on the use of content from traditional finance news, in this work we concentrate on a more subjective source of information: blogs. We aim to automatically determine the sentiment of financial bloggers towards companies and their stocks. To do this we develop a corpus of financial blogs, annotated with polarity of sentiment with respect to a number of companies. We conduct an analysis of the annotated corpus, from which we show there is a significant level of topic shift within this collection, and also illustrate the difficulty that human annotators have when annotating certain sentiment categories. To deal with the problem of topic shift within blog articles, we propose text extraction techniques to create topic-specific sub-documents, which we use to train a sentiment classifier. We show that such approaches provide a substantial improvement over full document classification and that word-based approaches perform better than sentence-based or paragraph-based approaches.
Connotation Frames: A Data-Driven Investigation
Through a particular choice of a predicate (e.g., "x violated y"), a writer
can subtly connote a range of implied sentiments and presupposed facts about
the entities x and y: (1) writer's perspective: projecting x as an
"antagonist" and y as a "victim", (2) entities' perspective: y probably dislikes
x, (3) effect: something bad happened to y, (4) value: y is something valuable,
and (5) mental state: y is distressed by the event. We introduce connotation
frames as a representation formalism to organize these rich dimensions of
connotation using typed relations. First, we investigate the feasibility of
obtaining connotative labels through crowdsourcing experiments. We then present
models for predicting the connotation frames of verb predicates based on their
distributional word representations and the interplay between different types
of connotative relations. Empirical results confirm that connotation frames can
be induced from various data sources that reflect how people use language and
give rise to the connotative meanings. We conclude with analytical results that
show the potential use of connotation frames for analyzing subtle biases in
online news media. Comment: 11 pages, published in Proceedings of ACL 201
Econometrics meets sentiment : an overview of methodology and applications
The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software.
Exploring the use of paragraph-level annotations for sentiment analysis of financial blogs
In this paper we describe our work in the area of topic-based sentiment analysis in the domain of financial blogs. We explore the use of paragraph-level and document-level annotations, examining how additional information from paragraph-level annotations can be used to increase the accuracy of document-level sentiment classification. We acknowledge the additional effort required to provide these paragraph-level annotations, and so we compare these findings against an automatic means of generating topic-specific sub-documents.
YouTube AV 50K: An Annotated Corpus for Comments in Autonomous Vehicles
With one billion monthly viewers, and millions of users discussing and
sharing opinions, comments below YouTube videos are rich sources of data for
opinion mining and sentiment analysis. We introduce the YouTube AV 50K dataset,
a freely available collection of more than 50,000 YouTube comments and
metadata below autonomous vehicle (AV)-related videos. We describe its creation
process, its content and data format, and discuss its possible usages.
In particular, we conduct a case study of the first self-driving car fatality to
evaluate the dataset, and show how we can use this dataset to better understand
public attitudes toward self-driving cars and public reactions to the accident.
Future developments of the dataset are also discussed. Comment: in Proceedings of the Thirteenth International Joint Symposium on
Artificial Intelligence and Natural Language Processing (iSAI-NLP 2018)