Comparing and Combining Sentiment Analysis Methods
Several messages express opinions about events, products, and services,
political views or even their author's emotional state and mood. Sentiment
analysis has been used in several applications including analysis of the
repercussions of events in social networks, analysis of opinions about products
and services, and simply to better understand aspects of social communication
in Online Social Networks (OSNs). There are multiple methods for measuring
sentiments, including lexical-based approaches and supervised machine learning
methods. Despite the wide use and popularity of some methods, it is unclear
which method is better for identifying the polarity (i.e., positive or
negative) of a message as the current literature does not provide a method of
comparison among existing methods. Such a comparison is crucial for
understanding the potential limitations, advantages, and disadvantages of
popular methods in analyzing the content of OSNs messages. Our study aims at
filling this gap by presenting comparisons of eight popular sentiment analysis
methods in terms of coverage (i.e., the fraction of messages whose sentiment is
identified) and agreement (i.e., the fraction of identified sentiments that are
in tune with ground truth). We develop a new method that combines existing
approaches, providing the best coverage results and competitive agreement. We
also present a free Web service called iFeel, which provides an open API for
accessing and comparing results across different sentiment methods for a given
text. Comment: Proceedings of the first ACM conference on Online social networks
(2013) 27-3
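The two evaluation measures defined in the abstract above, coverage and agreement, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `coverage_and_agreement`, the use of `None` to mark a message whose sentiment a method could not identify, and the sample labels are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def coverage_and_agreement(
    predicted: Sequence[Optional[str]],
    truth: Sequence[str],
) -> Tuple[float, float]:
    """Return (coverage, agreement) for one sentiment method.

    Assumption: `predicted[i]` is "positive", "negative", or None when the
    method could not identify a sentiment for message i.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")

    # Keep only the messages for which the method produced a label.
    identified = [(p, t) for p, t in zip(predicted, truth) if p is not None]

    # Coverage: fraction of all messages whose sentiment was identified.
    coverage = len(identified) / len(predicted) if predicted else 0.0

    # Agreement: fraction of identified sentiments that match ground truth.
    matches = sum(1 for p, t in identified if p == t)
    agreement = matches / len(identified) if identified else 0.0

    return coverage, agreement


# Hypothetical labels for four messages (illustrative data only).
example_predicted = ["positive", None, "negative", "positive"]
example_truth = ["positive", "negative", "negative", "negative"]
example_coverage, example_agreement = coverage_and_agreement(
    example_predicted, example_truth
)
```

In this made-up example the method labels three of the four messages, and two of those three labels match the ground truth, so a method can score high on one measure and low on the other.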
Twitter analysis for depression on social networks based on sentiment and stress
Detecting words that express negativity in a social media message is one step towards detecting depressive moods. To understand whether a Twitter user could exhibit depression over a period of time, we applied techniques in stages to discover words that are negative in expression. Existing methods either use a single step or a data subset, whereas we applied a multi-step approach that allowed us to first identify potential users and then discover the words with which these users expressed negativity. We address two Twitter-specific characteristics in our research. The first is that Twitter data can be very large, hence our desire to process the data efficiently. The second is that, owing to the enforced character limit, the style of writing makes interpreting the words and obtaining their semantic meaning more challenging. Results show that the sentiment of these words can be obtained and scored efficiently, as the computation on the dataset was narrowed to only the selected users. We also obtained stress scores, which correlated well with the negative sentiment expressed in the content. This work shows that first identifying users and then applying methods to discover words can be a very effective technique.
Emoticon-based Ambivalent Expression: A Hidden Indicator for Unusual Behaviors in Weibo
Recent decades have seen online social media become a big-data window for
quantitatively testing conventional social theories and exploring more
detailed human behavioral patterns. In this paper, by tracing emoticon use
in Weibo, we reveal a group of hidden "ambivalent users" who frequently
post ambivalent tweets containing both positive and negative emotions.
Further investigation reveals that this ambivalent expression could be a novel
indicator of many unusual social behaviors. For instance, ambivalent users, the
majority of whom are female, tend to post at midnight or on weekends.
They mention their close friends frequently in ambivalent tweets, which attract
more replies and thus serve as a more private way of communicating. Ambivalent
users also respond differently to public affairs from others and show
more interest in entertainment and sports events. Moreover, the sentiment
shift of words adopted in ambivalent tweets is more evident than usual and
exhibits a clear "negative to positive" pattern. The above observations, though
seemingly miscellaneous, actually point to the self-regulation of negative
mood in Weibo, which has a basis in the emotion management theories
of sociology but makes an interesting extension to the online environment.
Finally, as an interesting corollary, ambivalent users are found to be connected with
compulsive buyers and turn out to be perfect targets for online marketing. Comment: Data sets can be downloaded freely from www.datatang.com/data/47207
or http://pan.baidu.com/s/1mg67cbm. For any issues, feel free to contact
[email protected]
Tweeting your Destiny: Profiling Users in the Twitter Landscape around an Online Game
Social media has become a major communication channel for communities
centered around video games. Consequently, social media offers a rich data
source to study online communities and the discussions evolving around games.
Towards this end, we explore a large-scale dataset consisting of over 1 million
tweets related to the online multiplayer shooter Destiny and spanning a time
period of about 14 months using unsupervised clustering and topic modelling.
Furthermore, we correlate Twitter activity of over 3,000 players with their
playtime. Our results contribute to the understanding of online player
communities by identifying distinct player groups with respect to their Twitter
characteristics, describing subgroups within the Destiny community, and
uncovering broad topics of community interest. Comment: Accepted at IEEE Conference on Games 201
Assessing candidate preference through web browsing history
Predicting election outcomes is of considerable interest to candidates, political scientists, and the public at large. We propose the use of Web browsing history as a new indicator of candidate preference among the electorate, one that has the potential to overcome a number of the drawbacks of election polls. However, there are a number of challenges that must be overcome to effectively use Web browsing for assessing candidate preference, including the lack of suitable ground truth data and the heterogeneity of user populations in time and space. We address these challenges and show that the resulting methods can shed considerable light on the dynamics of voters' candidate preferences in ways that are difficult to achieve using polls. Accepted manuscript
An Ambivalent Embrace: Service Needs and Gaps for Asian Immigrants in New Destinations
Asian immigrants to the U.S. are settling in new destinations, but there has been little research on their health care and social service needs. Our analysis of Census data to identify cities with the fastest Asian immigrant population growth (1990-2000) yielded 33 smaller cities in 13 states. The cities ranged in population from 7,677 to 86,660; were spread across 13 states in the Northeast, South, and Midwest regions of the U.S.; and varied widely demographically. Pilot surveys conducted in 2009 indicated that, although many residents had positive attitudes towards immigrants, many were also concerned about job competition and dilution of American culture. Respondents reported a number of immigrant-targeted services but also service gaps and intergroup violence. We characterize smaller new destination cities' mixed response to their fast-growing immigrant populations as an ambivalent embrace. Service gaps may be related to small city size and relatively small Asian immigrant population size, despite rapid population growth. Funding shortages were also cited as obstacles to cities' responsiveness, suggesting the importance of state and federal government aid.
QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns
Given the extremely large pool of events and stories available, media outlets
need to focus on a subset of issues and aspects to convey to their audience.
Outlets are often accused of exhibiting a systematic bias in this selection
process, with different outlets portraying different versions of reality.
However, in the absence of objective measures and empirical evidence, the
direction and extent of systematicity remains widely disputed.
In this paper we propose a framework based on quoting patterns for
quantifying and characterizing the degree to which media outlets exhibit
systematic bias. We apply this framework to a massive dataset of news articles
spanning the six years of Obama's presidency and all of his speeches, and
reveal that a systematic pattern does indeed emerge from the outlet's quoting
behavior. Moreover, we show that this pattern can be successfully exploited in
an unsupervised prediction setting, to determine which new quotes an outlet
will select to broadcast. By encoding bias patterns in a low-rank space we
provide an analysis of the structure of political media coverage. This reveals
a latent media bias space that aligns surprisingly well with political ideology
and outlet type. A linguistic analysis exposes striking differences across
these latent dimensions, showing how the different types of media outlets
portray different realities even when reporting on the same events. For
example, outlets mapped to the mainstream conservative side of the latent space
focus on quotes that portray a presidential persona disproportionately
characterized by negativity. Comment: To appear in the Proceedings of WWW 2015. 11pp, 10 fig. Interactive
visualization, data, and other info available at
http://snap.stanford.edu/quotus