43,536 research outputs found

    Identifying Prominent Life Events on Twitter

    Social media is a common place for people to post and share digital reflections of their life events, including major events such as getting married, having children, graduating, etc. Although the creation of such posts is straightforward, the identification of events on online media remains a challenge. Much research in recent years has focused on extracting major events from Twitter, such as earthquakes, storms, and floods. This paper, however, targets the automatic detection of personal life events, focusing on five events that psychologists found to be the most prominent in people's lives. We define a variety of features (user, content, semantic and interaction) to capture the characteristics of those life events and present the results of several classification methods to automatically identify these events in Twitter. Our proposed classification methods obtain results between 0.84 and 0.92 F1-measure for the different types of life events. A novel contribution of this work also lies in a new corpus of tweets, which has been annotated using crowdsourcing and which constitutes, to the best of our knowledge, the first publicly available dataset for the automatic identification of personal life events from Twitter.
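For readers unfamiliar with the F1-measure reported in the abstract above, the following is a minimal sketch of how it is computed for a single class: the harmonic mean of precision and recall. The function name `f1_score`, the label strings, and the toy lists are illustrative assumptions and are not taken from the paper or its corpus.

```python
def f1_score(true_labels, predicted_labels, positive):
    """F1 for one class: harmonic mean of precision and recall.

    Hypothetical helper for illustration; not the paper's code.
    """
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, made up for this example only.
truth = ["wedding", "wedding", "other", "other", "wedding"]
guess = ["wedding", "other", "other", "wedding", "wedding"]
score = f1_score(truth, guess, positive="wedding")
```

With these toy labels, precision and recall are both 2/3, so the F1 value is also 2/3.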

    Computational Content Analysis of Negative Tweets for Obesity, Diet, Diabetes, and Exercise

    Social media based digital epidemiology has the potential to support faster response to, and deeper understanding of, public health related threats. This study proposes a new framework to analyze unstructured health related textual data from Twitter users' posts (tweets) to characterize the negative health sentiments and non-health related concerns in relation to the corpus of negative sentiments regarding Diet, Diabetes, Exercise, and Obesity (DDEO). Through the collection of 6 million tweets over one month, this study identified the prominent topics of users as they relate to the negative sentiments. Our proposed framework uses two text mining methods, sentiment analysis and topic modeling, to discover negative topics. The negative sentiments of Twitter users support the literature narratives and the many morbidity issues that are associated with DDEO and the linkage between obesity and diabetes. The framework offers a potential method to understand the public's opinions and sentiments regarding DDEO. More importantly, this research provides new opportunities for computational social scientists, medical experts, and public health professionals to collectively address DDEO-related issues. Comment: The 2017 Annual Meeting of the Association for Information Science and Technology (ASIST).

    $1.00 per RT #BostonMarathon #PrayForBoston: analyzing fake content on Twitter

    This study found that 29% of the most viral content on Twitter during the Boston bombing crisis was rumors and fake content. Abstract: Online social media has emerged as one of the prominent channels for dissemination of information during real world events. Malicious content is posted online during events, which can result in damage, chaos and monetary losses in the real world. We analyzed one such medium, i.e., Twitter, for content generated during the Boston Marathon blasts, which occurred on April 15, 2013. A lot of fake content and malicious profiles originated on the Twitter network during this event. The aim of this work is to perform an in-depth characterization of which factors influenced malicious content and profiles becoming viral. Our results showed that 29% of the most viral content on Twitter during the Boston crisis was rumors and fake content, while 51% was generic opinions and comments, and the rest was true information. We found that a large number of users with high social reputation and verified accounts were responsible for spreading the fake content. Next, we used a regression prediction model to verify that the overall impact of all users who propagate the fake content at a given time can be used to estimate the growth of that content in the future. Many malicious accounts were created on Twitter during the Boston event that were later suspended by Twitter. We identified over six thousand such user profiles and observed that the creation of such profiles surged considerably right after the blasts occurred. We identified closed community structure and star formation in the interaction network of these suspended profiles amongst themselves.

    Crisis Communication Patterns in Social Media during Hurricane Sandy

    Hurricane Sandy was one of the deadliest and costliest hurricanes of the past few decades. Many states experienced significant power outages; however, many people used social media to communicate while having limited or no access to traditional information sources. In this study, we explored the evolution of various communication patterns using machine learning techniques and determined user concerns that emerged over the course of Hurricane Sandy. The original data included ~52M tweets coming from ~13M users between October 14, 2012 and November 12, 2012. We ran a topic model on ~763K tweets from the 4,029 most frequent users, who tweeted about Sandy at least 100 times. We identified 250 well-defined communication patterns based on perplexity. Conversations of the most frequent and relevant users indicate the evolution of numerous storm-phase (warning, response, and recovery) specific topics. People were also concerned about storm location and time, media coverage, and activities of political leaders and celebrities. We also present each relevant keyword that contributed to one particular pattern of user concerns. Such keywords would be particularly meaningful in targeted information spreading and effective crisis communication in similar major disasters. Each of these words can also be helpful for efficient hash-tagging to reach a target audience as needed via social media. The pattern recognition approach of this study can be used in identifying real-time user needs in future crises.
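The abstract above mentions choosing the number of communication patterns based on perplexity. As a minimal sketch of that general idea (not the authors' procedure), perplexity can be computed as the exponential of the negative mean per-token log-likelihood on held-out text, and the candidate topic count with the lowest value is preferred. The function names and the log-likelihood values below are hypothetical and made up for illustration.

```python
import math


def perplexity(token_log_likelihoods):
    """exp(-mean log-likelihood per token), using natural logs."""
    n = len(token_log_likelihoods)
    if n == 0:
        raise ValueError("need at least one token log-likelihood")
    return math.exp(-sum(token_log_likelihoods) / n)


def pick_topic_count(held_out_by_k):
    """Return the topic count whose held-out perplexity is lowest.

    held_out_by_k maps a candidate number of topics to the per-token
    log-likelihoods that a model with that many topics assigned to
    held-out text (assumed to be computed elsewhere).
    """
    return min(held_out_by_k, key=lambda k: perplexity(held_out_by_k[k]))


# Made-up log-likelihoods for two hypothetical candidate models.
candidates = {
    50: [math.log(0.10)] * 3,
    250: [math.log(0.20)] * 3,
}
best_k = pick_topic_count(candidates)
```

Lower perplexity means the model assigns higher probability to unseen text, which is why the minimum is selected here.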

    Exploring Russian Cyberspace: Digitally-Mediated Collective Action and the Networked Public Sphere

    This paper summarizes the major findings of a three-year research project to investigate the Internet's impact on Russian politics, media and society. We employed multiple methods to study online activity: the mapping and study of the structure, communities and content of the blogosphere; an analogous mapping and study of Twitter; content analysis of different media sources using automated and human-based evaluation approaches; and a survey of bloggers; augmented by infrastructure mapping, interviews and background research. We find the emergence of a vibrant and diverse networked public sphere that constitutes an independent alternative to the more tightly controlled offline media and political space, as well as the growing use of digital platforms in social mobilization and civic action. Despite various indirect efforts to shape cyberspace into an environment that is friendlier towards the government, we find that the Russian Internet remains generally open and free, although the current degree of Internet freedom is in no way a prediction of the future of this contested space.

    #Bieber + #Blast = #BieberBlast: Early Prediction of Popular Hashtag Compounds

    Compounding of natural language units is a very common phenomenon. In this paper, we show, for the first time, that Twitter hashtags, which could be considered as correlates of such linguistic units, undergo compounding. We identify reasons for this compounding and propose a prediction model that can identify with 77.07% accuracy whether a compounding pair of hashtags will become popular in the near future (i.e., 2 months after compounding). At longer horizons, T = 6 and 10 months, the accuracies are 77.52% and 79.13%, respectively. This technique has strong implications for trending hashtag recommendation, since newly formed hashtag compounds can be recommended early, even before the compounding has taken place. Further, humans can predict compounds with an overall accuracy of only 48.7% (treated as baseline). Notably, while humans can discriminate the relatively easier cases, the automatic framework is successful in classifying the relatively harder cases. Comment: 14 pages, 4 figures, 9 tables, published in CSCW (Computer-Supported Cooperative Work and Social Computing) 2016, in Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2016).

    Scraping the Social? Issues in live social research

    What makes scraping methodologically interesting for social and cultural research? This paper seeks to contribute to debates about digital social research by exploring how a ‘medium-specific’ technique for online data capture may be rendered analytically productive for social research. As a device that is currently being imported into social research, scraping has the capacity to re-structure social research, and it does so in at least two ways. Firstly, as a technique that is not native to social research, scraping risks introducing ‘alien’ methodological assumptions into social research (such as a pre-occupation with freshness). Secondly, to scrape is to risk importing into our inquiry categories that are prevalent in the social practices enabled by the media: scraping makes available already formatted data for social research. Scraped data, and online social data more generally, tend to come with ‘external’ analytics already built-in. This circumstance is often approached as a ‘problem’ with online data capture, but we propose it may be turned into a virtue, insofar as data formats that have currency in the areas under scrutiny may serve as a source of social data themselves. Scraping, we propose, makes it possible to render traffic between the object and process of social research analytically productive. It enables a form of ‘real-time’ social research, in which the formats and life cycles of online data may lend structure to the analytic objects and findings of social research. By way of a conclusion, we demonstrate this point in an exercise of online issue profiling, and more particularly, by relying on Twitter to profile the issue of ‘austerity’. Here we distinguish between two forms of real-time research, those dedicated to monitoring live content (which terms are current?) and those concerned with analysing the liveliness of issues (which topics are happening?).