    Comparing and Combining Sentiment Analysis Methods

    Several messages express opinions about events, products, and services, political views or even their author's emotional state and mood. Sentiment analysis has been used in several applications including analysis of the repercussions of events in social networks, analysis of opinions about products and services, and simply to better understand aspects of social communication in Online Social Networks (OSNs). There are multiple methods for measuring sentiments, including lexical-based approaches and supervised machine learning methods. Despite the wide use and popularity of some methods, it is unclear which method is better for identifying the polarity (i.e., positive or negative) of a message as the current literature does not provide a method of comparison among existing methods. Such a comparison is crucial for understanding the potential limitations, advantages, and disadvantages of popular methods in analyzing the content of OSNs messages. Our study aims at filling this gap by presenting comparisons of eight popular sentiment analysis methods in terms of coverage (i.e., the fraction of messages whose sentiment is identified) and agreement (i.e., the fraction of identified sentiments that are in tune with ground truth). We develop a new method that combines existing approaches, providing the best coverage results and competitive agreement. We also present a free Web service called iFeel, which provides an open API for accessing and comparing results across different sentiment methods for a given text.
    Comment: Proceedings of the first ACM conference on Online social networks (2013) 27-3
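
The two evaluation measures defined in this abstract, coverage and agreement, can be illustrated with a minimal sketch. It is not the authors' code: the function name `coverage_and_agreement`, the use of `None` to mark a message the method could not classify, and the sample labels are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def coverage_and_agreement(
    predicted: Sequence[Optional[str]],
    truth: Sequence[str],
) -> Tuple[float, float]:
    """Return (coverage, agreement) for one sentiment method.

    Assumption: `None` in `predicted` means the method identified no
    sentiment for that message; any other value is a polarity label.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")

    # Keep only the messages for which the method produced a label.
    identified = [(p, t) for p, t in zip(predicted, truth) if p is not None]

    # Coverage: fraction of all messages whose sentiment is identified.
    coverage = len(identified) / len(predicted) if predicted else 0.0

    # Agreement: fraction of identified sentiments that match ground truth.
    matches = sum(1 for p, t in identified if p == t)
    agreement = matches / len(identified) if identified else 0.0

    return coverage, agreement


# Hypothetical labels for five messages (illustrative data only).
predicted = ["positive", None, "negative", "positive", None]
truth = ["positive", "negative", "negative", "negative", "positive"]

coverage, agreement = coverage_and_agreement(predicted, truth)
print(f"coverage={coverage:.2f} agreement={agreement:.2f}")
```

Agreement is computed only over the messages a method actually labels, so a method can score high on agreement while covering few messages. That is why the abstract reports the two measures separately.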

    Twitter analysis for depression on social networks based on sentiment and stress

    Detecting words that express negativity in a social media message is one step towards detecting depressive moods. To understand if a Twitter user could exhibit depression over a period of time, we applied techniques in stages to discover words that are negative in expression. Existing methods either use a single step or a data subset, whereas we applied a multi-step approach which allowed us to identify potential users and then discover the words these users used to express negativity. We address some Twitter-specific characteristics in our research. One is that Twitter data can be very large, hence our desire to process the data efficiently. The other is that, due to its enforced character limit, the style of writing makes interpreting and obtaining the semantic meaning of the words more challenging. Results show that the sentiment of these words can be obtained and scored efficiently because the computation on the dataset was narrowed to only these selected users. We also obtained stress scores, which correlated well with the negative sentiment expressed in the content. This work shows that first identifying users and then applying methods to discover words can be a very effective technique.

    Emoticon-based Ambivalent Expression: A Hidden Indicator for Unusual Behaviors in Weibo

    Recent decades have witnessed online social media becoming a big-data window for quantitatively testing conventional social theories and exploring more detailed human behavioral patterns. In this paper, by tracing emoticon use in Weibo, a group of hidden "ambivalent users" is disclosed who frequently post ambivalent tweets containing both positive and negative emotions. Further investigation reveals that this ambivalent expression could be a novel indicator of many unusual social behaviors. For instance, ambivalent users, the majority of whom are female, tend to post at midnight or on weekends. They mention their close friends frequently in ambivalent tweets, which attract more replies and thus serve as a more private means of communication. Ambivalent users also respond differently to public affairs from others and demonstrate more interest in entertainment and sports events. Moreover, the sentiment shift of words adopted in ambivalent tweets is more evident than usual and exhibits a clear "negative to positive" pattern. The above observations, though seemingly miscellaneous, actually point to the self-regulation of negative mood in Weibo, which could find its basis in the emotion management theories of sociology but makes an interesting extension to the online environment. Finally, as an interesting corollary, ambivalent users are found to be connected with compulsive buyers and turn out to be perfect targets for online marketing.
    Comment: Data sets can be downloaded freely from www.datatang.com/data/47207 or http://pan.baidu.com/s/1mg67cbm. Any issues feel free to contact [email protected]

    Tweeting your Destiny: Profiling Users in the Twitter Landscape around an Online Game

    Social media has become a major communication channel for communities centered around video games. Consequently, social media offers a rich data source to study online communities and the discussions evolving around games. Towards this end, we explore a large-scale dataset consisting of over 1 million tweets related to the online multiplayer shooter Destiny and spanning a time period of about 14 months using unsupervised clustering and topic modelling. Furthermore, we correlate Twitter activity of over 3,000 players with their playtime. Our results contribute to the understanding of online player communities by identifying distinct player groups with respect to their Twitter characteristics, describing subgroups within the Destiny community, and uncovering broad topics of community interest.
    Comment: Accepted at IEEE Conference on Games 201

    Assessing candidate preference through web browsing history

    Predicting election outcomes is of considerable interest to candidates, political scientists, and the public at large. We propose the use of Web browsing history as a new indicator of candidate preference among the electorate, one that has potential to overcome a number of the drawbacks of election polls. However, there are a number of challenges that must be overcome to effectively use Web browsing for assessing candidate preference, including the lack of suitable ground truth data and the heterogeneity of user populations in time and space. We address these challenges, and show that the resulting methods can shed considerable light on the dynamics of voters’ candidate preferences in ways that are difficult to achieve using polls.
    Accepted manuscript

    An Ambivalent Embrace: Service Needs and Gaps for Asian Immigrants in New Destinations

    Asian immigrants to the U.S. are settling in new destinations, but there has been little research on their health care and social service needs. Our analysis of Census data to identify cities with the fastest Asian immigrant population growth (1990-2000) yielded 33 smaller cities in 13 states. The cities ranged in population from 7,677 to 86,660; were spread across 13 states in the Northeast, South, and Midwest regions of the US; and varied widely demographically. Pilot surveys conducted in 2009 indicated that, although many residents had positive attitudes towards immigrants, many were also concerned about job competition and dilution of American culture. Respondents reported a number of immigrant-targeted services but also service gaps and intergroup violence. We characterize smaller new destination cities' mixed response to their fast-growing immigrant populations as an ambivalent embrace. Service gaps may be related to small city size and relatively small Asian immigrant population size, despite rapid population growth. Funding shortages were also cited as obstacles to cities' responsiveness, suggesting the importance of state and federal government aid.

    QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns

    Given the extremely large pool of events and stories available, media outlets need to focus on a subset of issues and aspects to convey to their audience. Outlets are often accused of exhibiting a systematic bias in this selection process, with different outlets portraying different versions of reality. However, in the absence of objective measures and empirical evidence, the direction and extent of systematicity remains widely disputed. In this paper we propose a framework based on quoting patterns for quantifying and characterizing the degree to which media outlets exhibit systematic bias. We apply this framework to a massive dataset of news articles spanning the six years of Obama's presidency and all of his speeches, and reveal that a systematic pattern does indeed emerge from the outlet's quoting behavior. Moreover, we show that this pattern can be successfully exploited in an unsupervised prediction setting, to determine which new quotes an outlet will select to broadcast. By encoding bias patterns in a low-rank space we provide an analysis of the structure of political media coverage. This reveals a latent media bias space that aligns surprisingly well with political ideology and outlet type. A linguistic analysis exposes striking differences across these latent dimensions, showing how the different types of media outlets portray different realities even when reporting on the same events. For example, outlets mapped to the mainstream conservative side of the latent space focus on quotes that portray a presidential persona disproportionately characterized by negativity.
    Comment: To appear in the Proceedings of WWW 2015. 11pp, 10 fig. Interactive visualization, data, and other info available at http://snap.stanford.edu/quotus