Timeline Generation: Tracking individuals on Twitter
In this paper, we propose an unsupervised framework to reconstruct a person's
life history by creating a chronological list of personal important
events (PIE) for individuals based on the tweets they published. By analyzing
individual tweet collections, we find that the tweets suitable for inclusion in
the personal timeline are those about personal (as opposed to
public) and time-specific (as opposed to time-general) topics. To further
extract these types of topics, we introduce a non-parametric multi-level
Dirichlet Process model to recognize four types of tweets: personal
time-specific (PersonTS), personal time-general (PersonTG), public
time-specific (PublicTS) and public time-general (PublicTG) topics, which, in
turn, are used for further personal event extraction and timeline generation.
To the best of our knowledge, this is the first work focused on the generation
of timelines for individuals from Twitter data. For evaluation, we have built a
new gold-standard set of timelines based on Twitter and Wikipedia that contains
PIE-related events from 20 ordinary Twitter users and 20 celebrities.
Experiments on real Twitter data quantitatively demonstrate the effectiveness
of our approach.
Clustering Memes in Social Media
The increasing pervasiveness of social media creates new opportunities to
study human social behavior, while challenging our capability to analyze their
massive data streams. One of the emerging tasks is to distinguish between
different kinds of activities, for example engineered misinformation campaigns
versus spontaneous communication. Such detection problems require a formal
definition of meme, or unit of information that can spread from person to
person through the social network. Once a meme is identified, supervised
learning methods can be applied to classify different types of communication.
The appropriate granularity of a meme, however, is hardly captured from
existing entities such as tags and keywords. Here we present a framework for
the novel task of detecting memes by clustering messages from large streams of
social data. We evaluate various similarity measures that leverage content,
metadata, network features, and their combinations. We also explore the idea of
pre-clustering on the basis of existing entities. A systematic evaluation is
carried out using a manually curated dataset as ground truth. Our analysis
shows that pre-clustering and a combination of heterogeneous features yield the
best trade-off between number of clusters and their quality, demonstrating that
a simple combination based on pairwise maximization of similarity is as
effective as a non-trivial optimization of parameters. Our approach is fully
automatic, unsupervised, and scalable for real-time detection of memes in
streaming data.
Comment: Proceedings of the 2013 IEEE/ACM International Conference on Advances
in Social Networks Analysis and Mining (ASONAM'13), 201
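The "pairwise maximization of similarity" mentioned in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the message fields (`text`, `hashtags`, `mentions`), the use of Jaccard similarity for each feature family, and the function names are all hypothetical. The idea shown is only that each pair of messages receives the largest of its per-feature similarity scores.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(m1, m2):
    """Combine per-feature similarities by taking their pairwise maximum.

    The three feature families (content, metadata, network) follow the
    abstract; the concrete fields used for each are hypothetical.
    """
    content = jaccard(m1["text"].lower().split(), m2["text"].lower().split())
    metadata = jaccard(m1["hashtags"], m2["hashtags"])
    network = jaccard(m1["mentions"], m2["mentions"])
    return max(content, metadata, network)


# Two messages with no words in common but the same hashtag.
m1 = {"text": "vote today", "hashtags": ["election"], "mentions": []}
m2 = {"text": "polls are open", "hashtags": ["election"], "mentions": []}
score = combined_similarity(m1, m2)
```

In this sketch the two messages are treated as highly similar because of the shared hashtag, even though their texts do not overlap. A clustering step would then operate on these combined pairwise scores.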
Identifying communicator roles in Twitter
Twitter has redefined the way social activities can be coordinated: it has been used for mobilizing people during natural disasters, studying health epidemics, and, recently, as a communication platform during social and political change. As a large-scale system, the volume of data transmitted per day presents Twitter users with a problem: how can valuable content be distilled from the back chatter, how can the providers of valuable information be promoted, and ultimately how can influential individuals be identified? To tackle this, we have developed a model based upon the Twitter message exchange which enables us to analyze conversations around specific topics and identify key players in a conversation. A working implementation of the model helps categorize Twitter users by specific roles based on their dynamic communication behavior rather than an analysis of their static friendship network. This provides a method of identifying users who are potentially producers or distributors of valuable knowledge.