Identifying Prominent Life Events on Twitter
Social media is a common place for people to post and share digital reflections of their life events, including major events such as getting married, having children, graduating, etc. Although the creation of such posts is straightforward, the identification of events on online media remains a challenge. Much research in recent years has focused on extracting major events from Twitter, such as earthquakes, storms, and floods. This paper, however, targets the automatic detection of personal life events, focusing on five events that psychologists found to be the most prominent in people's lives. We define a variety of features (user, content, semantic and interaction) to capture the characteristics of those life events and present the results of several classification methods to automatically identify these events in Twitter. Our proposed classification methods obtain results between 0.84 and 0.92 F1-measure for the different types of life events. A novel contribution of this work also lies in a new corpus of tweets, which has been annotated using crowdsourcing and which constitutes, to the best of our knowledge, the first publicly available dataset for the automatic identification of personal life events from Twitter.
Detecting Important Life Events on Twitter Using Frequent Semantic and Syntactic Subgraphs
Identifying global events from social media has been the focus of much research in recent years. However, the identification of personal life events poses new requirements and challenges that have received relatively little research attention. In this paper we explore a new approach for life event identification, where we expand social media posts into both semantic and syntactic networks of content. Frequent graph patterns are mined from these networks and used as features to enrich life-event classifiers. Results show that our approach significantly outperforms the best performing baseline in accuracy (by 4.48 percentage points) and F-measure (by 4.54 percentage points) when used to identify five major life events identified from the psychology literature: Getting Married, Having Children, Death of a Parent, Starting School, and Falling in Love. In addition, our results show that, while semantic graphs are effective at discriminating the theme of the post (e.g. the topic of marriage), syntactic graphs help identify whether the post describes a personal event (e.g. someone getting married).
Computational Content Analysis of Negative Tweets for Obesity, Diet, Diabetes, and Exercise
Social media based digital epidemiology has the potential to support faster
response and deeper understanding of public health related threats. This study
proposes a new framework to analyze unstructured health related textual data
via Twitter users' posts (tweets) to characterize the negative health sentiments
and non-health related concerns in relation to the corpus of negative
sentiments regarding Diet, Diabetes, Exercise, and Obesity (DDEO). Through the
collection of 6 million Tweets for one month, this study identified the
prominent topics of users as they relate to the negative sentiments. Our
proposed framework uses two text mining methods, sentiment analysis and topic
modeling, to discover negative topics. The negative sentiments of Twitter users
support the literature narratives and the many morbidity issues that are
associated with DDEO and the linkage between obesity and diabetes. The
framework offers a potential method to understand the public's opinions and
sentiments regarding DDEO. More importantly, this research provides new
opportunities for computational social scientists, medical experts, and public
health professionals to collectively address DDEO-related issues.
Comment: The 2017 Annual Meeting of the Association for Information Science
and Technology (ASIST)
$1.00 per RT #BostonMarathon #PrayForBoston: analyzing fake content on Twitter
This study found that 29% of the most viral content on Twitter during the Boston bombing crisis was rumors and fake content. Online social media has emerged as one of the prominent channels for dissemination of information during real world events. Malicious content is posted online during events, which can result in damage, chaos and monetary losses in the real world. We analyzed one such medium, Twitter, for content generated during the Boston Marathon blasts that occurred on April 15, 2013. A lot of fake content and malicious profiles originated on the Twitter network during this event. The aim of this work is to perform an in-depth characterization of which factors influenced malicious content and profiles becoming viral. Our results showed that 29% of the most viral content on Twitter during the Boston crisis was rumors and fake content, while 51% was generic opinions and comments, and the rest was true information. We found that a large number of users with high social reputation and verified accounts were responsible for spreading the fake content. Next, we used a regression prediction model to verify that the overall impact of all users who propagate the fake content at a given time can be used to estimate the growth of that content in the future. Many malicious accounts were created on Twitter during the Boston event that were later suspended by Twitter. We identified over six thousand such user profiles and observed that the creation of such profiles surged considerably right after the blasts occurred. We identified closed community structure and star formation in the interaction network of these suspended profiles amongst themselves.
Crisis Communication Patterns in Social Media during Hurricane Sandy
Hurricane Sandy was one of the deadliest and costliest hurricanes of the
past few decades. Many states experienced significant power outages; however,
many people used social media to communicate while having limited or no access
to traditional information sources. In this study, we explored the evolution of
various communication patterns using machine learning techniques and determined
user concerns that emerged over the course of Hurricane Sandy. The original
data included ~52M tweets coming from ~13M users between October 14, 2012 and
November 12, 2012. We ran a topic model on ~763K tweets from the top 4,029 most
frequent users, who tweeted about Sandy at least 100 times. We identified 250
well-defined communication patterns based on perplexity. Conversations of the most
frequent and relevant users indicate the evolution of numerous storm-phase
(warning, response, and recovery) specific topics. People were also concerned
about storm location and time, media coverage, and activities of political
leaders and celebrities. We also present each relevant keyword that contributed
to one particular pattern of user concerns. Such keywords would be particularly
meaningful in targeted information spreading and effective crisis communication
in similar major disasters. Each of these words can also be helpful for
efficient hash-tagging to reach the target audience as needed via social media. The
pattern recognition approach of this study can be used in identifying real-time
user needs in future crises.
Exploring Russian Cyberspace: Digitally-Mediated Collective Action and the Networked Public Sphere
This paper summarizes the major findings of a three-year research project to investigate the Internet's impact on Russian politics, media and society. We employed multiple methods to study online activity: the mapping and study of the structure, communities and content of the blogosphere; an analogous mapping and study of Twitter; content analysis of different media sources using automated and human-based evaluation approaches; and a survey of bloggers; augmented by infrastructure mapping, interviews and background research. We find the emergence of a vibrant and diverse networked public sphere that constitutes an independent alternative to the more tightly controlled offline media and political space, as well as the growing use of digital platforms in social mobilization and civic action. Despite various indirect efforts to shape cyberspace into an environment that is friendlier towards the government, we find that the Russian Internet remains generally open and free, although the current degree of Internet freedom is in no way a prediction of the future of this contested space
#Bieber + #Blast = #BieberBlast: Early Prediction of Popular Hashtag Compounds
Compounding of natural language units is a very common phenomenon. In this
paper, we show, for the first time, that Twitter hashtags, which could be
considered as correlates of such linguistic units, undergo compounding. We
identify reasons for this compounding and propose a prediction model that can
identify with 77.07% accuracy whether a pair of hashtags compounding in the near
future (i.e., 2 months after compounding) will become popular. At longer times
T = 6, 10 months the accuracies are 77.52% and 79.13% respectively. This
technique has strong implications for trending hashtag recommendation since
newly formed hashtag compounds can be recommended early, even before the
compounding has taken place. Further, humans can predict compounds with an
overall accuracy of only 48.7% (treated as baseline). Notably, while humans can
discriminate the relatively easier cases, the automatic framework is successful
in classifying the relatively harder cases.
Comment: 14 pages, 4 figures, 9 tables, published in CSCW (Computer-Supported
Cooperative Work and Social Computing) 2016. In Proceedings of the 19th ACM
Conference on Computer-Supported Cooperative Work and Social Computing (CSCW
2016)
Scraping the Social? Issues in live social research
What makes scraping methodologically interesting for social and cultural research? This paper seeks to contribute to debates about digital social research by exploring how a 'medium-specific' technique for online data capture may be rendered analytically productive for social research. As a device that is currently being imported into social research, scraping has the capacity to re-structure social research, and this in at least two ways. Firstly, as a technique that is not native to social research, scraping risks introducing 'alien' methodological assumptions into social research (such as a pre-occupation with freshness). Secondly, to scrape is to risk importing into our inquiry categories that are prevalent in the social practices enabled by the media: scraping makes available already formatted data for social research. Scraped data, and online social data more generally, tend to come with 'external' analytics already built-in. This circumstance is often approached as a 'problem' with online data capture, but we propose it may be turned into a virtue, insofar as data formats that have currency in the areas under scrutiny may serve as a source of social data themselves. Scraping, we propose, makes it possible to render traffic between the object and process of social research analytically productive. It enables a form of 'real-time' social research, in which the formats and life cycles of online data may lend structure to the analytic objects and findings of social research. By way of a conclusion, we demonstrate this point in an exercise of online issue profiling, and more particularly, by relying on Twitter to profile the issue of 'austerity'. Here we distinguish between two forms of real-time research, those dedicated to monitoring live content (which terms are current?) and those concerned with analysing the liveliness of issues (which topics are happening?).