Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks
Sentiment analysis of online user-generated content is important for many social media analytics tasks. Researchers have largely relied on textual sentiment analysis to develop systems that predict political elections, measure economic indicators, and so on. Recently, however, social media users have increasingly turned to images and videos to express their opinions and share their experiences. Sentiment analysis of such large-scale visual content can better capture user sentiment toward events or topics, such as those in image tweets, so prediction of sentiment from visual content is complementary to textual sentiment analysis. Motivated by the need to leverage large-scale yet noisy training data for the extremely challenging problem of image sentiment analysis, we employ Convolutional Neural Networks (CNN). We first design a suitable CNN architecture for image sentiment analysis. We obtain half a million training samples by using a baseline sentiment algorithm to label Flickr images. To make use of this noisy machine-labeled data, we employ a progressive strategy to fine-tune the deep network. Furthermore, we improve performance on Twitter images by inducing domain transfer with a small number of manually labeled Twitter images. We have conducted extensive experiments on manually labeled Twitter images; the results show that the proposed CNN achieves better performance in image sentiment analysis than competing algorithms.
Comment: 9 pages, 5 figures, AAAI 201
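The progressive fine-tuning idea described above can be sketched roughly as follows: train on all machine-labeled images first, then keep only the samples whose noisy label the model itself predicts with high confidence, and fine-tune on that cleaner subset. The following is a minimal illustrative sketch in PyTorch, not the authors' implementation; the dataset class, the confidence threshold, and the tiny CNN are assumptions.

# Minimal sketch (not the paper's code) of progressive fine-tuning on noisy,
# machine-labeled images: stage 1 trains on everything, stage 2 keeps only
# samples where the model agrees confidently with the machine label.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset

class SmallSentimentCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 56 * 56, num_classes)  # assumes 224x224 inputs

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def train_epoch(model, loader, optimizer):
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = F.cross_entropy(model(images), labels)
        loss.backward()
        optimizer.step()

def select_confident_indices(model, dataset, threshold=0.9):
    # Keep samples where the probability assigned to the machine label is high,
    # i.e. where the noisy label and the current model agree.
    model.eval()
    keep = []
    with torch.no_grad():
        for i in range(len(dataset)):
            image, noisy_label = dataset[i]
            probs = F.softmax(model(image.unsqueeze(0)), dim=1)[0]
            if probs[noisy_label].item() >= threshold:
                keep.append(i)
    return keep

# dataset = NoisyImageDataset(...)   # Flickr images with machine labels (hypothetical class)
# model = SmallSentimentCNN()
# opt = torch.optim.Adam(model.parameters(), lr=1e-4)
# train_epoch(model, DataLoader(dataset, batch_size=32, shuffle=True), opt)    # stage 1: all noisy data
# keep = select_confident_indices(model, dataset)
# train_epoch(model, DataLoader(Subset(dataset, keep), batch_size=32, shuffle=True), opt)  # stage 2: cleaner subset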
Bilingual Sentiment Embeddings: Joint Projection of Sentiment Across Languages
Sentiment analysis in low-resource languages suffers from a lack of annotated
corpora to estimate high-performing models. Machine translation and bilingual
word embeddings provide some relief through cross-lingual sentiment approaches.
However, they either require large amounts of parallel data or do not
sufficiently capture sentiment information. We introduce Bilingual Sentiment
Embeddings (BLSE), which jointly represent sentiment information in a source
and target language. This model only requires a small bilingual lexicon, a
source-language corpus annotated for sentiment, and monolingual word embeddings
for each language. We perform experiments on three language combinations
(Spanish, Catalan, Basque) for sentence-level cross-lingual sentiment
classification and find that our model significantly outperforms state-of-the-art methods on four of the six experimental setups and captures information complementary to machine translation. Our analysis of the resulting embedding space provides evidence that it represents sentiment information in the resource-poor target language without any annotated data in that language.
Comment: Accepted to ACL 2018 (Long Papers)
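A rough sketch of the joint-projection idea, under the assumption that training combines two objectives: a lexicon-alignment loss that pulls projected translation pairs together and a sentiment loss on projected source-language text. The class and loss names below are illustrative, not the paper's implementation.

# Minimal sketch (assumed, not the released BLSE code): one linear projection
# per language into a shared space, aligned on a small bilingual lexicon and
# trained jointly with a sentiment classifier on the source language.
import torch.nn as nn
import torch.nn.functional as F

class JointSentimentProjection(nn.Module):
    def __init__(self, dim=300, proj_dim=100, num_classes=2):
        super().__init__()
        self.src_proj = nn.Linear(dim, proj_dim, bias=False)
        self.trg_proj = nn.Linear(dim, proj_dim, bias=False)
        self.classifier = nn.Linear(proj_dim, num_classes)

    def alignment_loss(self, src_lex_vecs, trg_lex_vecs):
        # Pull projected translation pairs from the bilingual lexicon together.
        return F.mse_loss(self.src_proj(src_lex_vecs), self.trg_proj(trg_lex_vecs))

    def sentiment_loss(self, src_doc_vecs, labels):
        # Classify source-language document vectors in the projected space.
        logits = self.classifier(self.src_proj(src_doc_vecs))
        return F.cross_entropy(logits, labels)

# total_loss = model.alignment_loss(src_lex, trg_lex) + model.sentiment_loss(src_docs, y)
# At test time, target-language documents are projected with trg_proj and fed to
# the same classifier, so no target-language sentiment annotations are needed.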
Automatic Dream Sentiment Analysis
In this position paper, we propose a first step toward the automatic analysis of sentiment in dreams. 100 dreams were sampled from a dream bank created for a normative study of dreams, and two human judges assigned each dream a score describing its sentiment. We ran four baseline algorithms in an attempt to automate this rating: the General Inquirer (GI) tool, the Linguistic Inquiry and Word Count (LIWC), weighted versions of the GI and HM lexicons, and a standard bag-of-words approach. We show that machine learning can automate the human judgment with accuracy superior to the majority-class choice.
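A minimal sketch of how a bag-of-words baseline could be compared against a majority-class predictor, assuming scikit-learn; the function name and cross-validation setup are illustrative, not the study's actual protocol.

# Minimal sketch (assumed setup): bag-of-words + logistic regression versus
# always predicting the most frequent sentiment class.
from sklearn.dummy import DummyClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

def compare_baselines(dream_texts, judge_scores, folds=5):
    # dream_texts: list of dream reports; judge_scores: human sentiment ratings (assumed inputs)
    bow = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    majority = DummyClassifier(strategy="most_frequent")
    bow_acc = cross_val_score(bow, dream_texts, judge_scores, cv=folds).mean()
    maj_acc = cross_val_score(majority, dream_texts, judge_scores, cv=folds).mean()
    return bow_acc, maj_acc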
Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold
Sentiment analysis over Twitter offers organisations and individuals a fast and effective way to monitor the public's feelings towards them and their competitors. To assess the performance of sentiment analysis methods over Twitter, a small set of evaluation datasets has been released in the last few years. In this paper we present an overview of eight publicly available and manually annotated evaluation datasets for Twitter sentiment analysis. Based on this review, we show that a common limitation of most of these datasets, when assessing sentiment analysis at the target (entity) level, is the lack of distinct sentiment annotations for the tweets and the entities they contain. For example, the tweet "I love iPhone, but I hate iPad" can be annotated with a mixed sentiment label, but the entity iPhone within this tweet should be annotated with a positive sentiment label. Aiming to overcome this limitation, and to complement current evaluation datasets, we present STS-Gold, a new evaluation dataset in which tweets and targets (entities) are annotated individually and may therefore carry different sentiment labels. This paper also provides a comparative study of the various datasets along several dimensions, including total number of tweets, vocabulary size and sparsity. We also investigate the pair-wise correlations among these dimensions, as well as their correlations with sentiment classification performance on the different datasets.
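To illustrate the annotation scheme the abstract describes, here is a minimal sketch of a record in which the tweet-level label and the entity-level labels are stored separately; the field names are assumptions, not the released STS-Gold schema.

# Minimal sketch (illustrative, not the actual STS-Gold format): a tweet can
# carry a mixed label while each entity it mentions carries its own label.
from dataclasses import dataclass, field

@dataclass
class EntityAnnotation:
    entity: str
    sentiment: str          # e.g. "positive", "negative", "neutral"

@dataclass
class TweetAnnotation:
    text: str
    tweet_sentiment: str    # e.g. "mixed"
    entities: list = field(default_factory=list)

example = TweetAnnotation(
    text="I love iPhone, but I hate iPad",
    tweet_sentiment="mixed",
    entities=[EntityAnnotation("iPhone", "positive"),
              EntityAnnotation("iPad", "negative")],
)

# Entity-level evaluation then scores a classifier per (tweet, entity) pair
# rather than per tweet, which a single tweet-level label cannot support.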
