Toward Order-of-Magnitude Cascade Prediction
When a piece of information (microblog, photograph, video, link, etc.) starts
to spread in a social network, an important question arises: will it spread to
"viral" proportions, where "viral" is defined as an order-of-magnitude
increase? However, several previous studies have established that cascade size
and frequency are related through a power law, which leads to a severe
imbalance in this classification problem. In this paper, we devise a suite of
measurements based on "structural diversity" -- the variety of social contexts
(communities) in which individuals partaking in a given cascade engage. We
demonstrate these measures are able to distinguish viral from non-viral
cascades, despite the severe imbalance of the data for this problem. Further,
we leverage these measurements as features in a classification approach,
successfully predicting microblogs that grow from 50 to 500 reposts with
precision of 0.69 and recall of 0.52 for the viral class, despite this class
comprising under 2% of samples. This significantly outperforms our baseline
approach as well as the current state-of-the-art. Our work also demonstrates
how we can trade off between precision and recall.
Comment: 4 pages, 15 figures, ASONAM 2015 poster paper
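As a rough illustration of what a "structural diversity" measurement could look like, the sketch below counts the distinct communities represented among a cascade's participants and computes the Shannon entropy of their community memberships. The function name `community_diversity`, the `community_of` mapping, and the choice of these two statistics are assumptions made for illustration; the abstract does not specify the paper's actual measures.

```python
import math
from collections import Counter

def community_diversity(participants, community_of):
    """Return (distinct community count, Shannon entropy in bits).

    Hypothetical helper: `community_of` maps a user id to a community label.
    Participants without a known community are ignored.
    """
    counts = Counter(community_of[u] for u in participants if u in community_of)
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0
    # Entropy is higher when participants are spread evenly over communities.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return len(counts), entropy

# Toy data (made up): four early adopters spread over three communities.
community_of = {"a": 0, "b": 0, "c": 1, "d": 2}
n_comms, entropy = community_diversity(["a", "b", "c", "d"], community_of)
```

Values like these, computed on the early participants of a cascade, could then be fed to a classifier as features.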
An Empirical Evaluation Of Social Influence Metrics
Predicting when an individual will adopt a new behavior is an important
problem in application domains such as marketing and public health. This paper
examines the performance of a wide variety of social network based
measurements proposed in the literature that have not been previously
compared directly. We study the probability of an individual becoming
influenced based on measurements derived from neighborhood (i.e. number of
influencers, personal network exposure), structural diversity, locality,
temporal measures, cascade measures, and metadata. We also examine the
ability to predict influence based on choice of classifier and how the ratio of
positive to negative samples in both training and testing affects prediction
results, further enabling practical use of these concepts for social influence
applications.
Comment: 8 pages, 5 figures
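The two neighborhood measurements named above can be sketched as follows. This is a minimal example assuming the common definitions: the number of influencers is the count of a node's neighbors that have already adopted, and personal network exposure is that count divided by the node's total number of neighbors. The function name and data layout are hypothetical, and the paper may define these measures differently.

```python
def neighborhood_measures(node, neighbors_of, adopted):
    """Return (number of influencers, personal network exposure) for `node`.

    Assumed definitions, not taken from the paper:
      influencers = neighbors that have already adopted
      exposure    = influencers / total neighbors (0.0 if no neighbors)
    """
    nbrs = neighbors_of.get(node, set())
    influencers = sum(1 for v in nbrs if v in adopted)
    exposure = influencers / len(nbrs) if nbrs else 0.0
    return influencers, exposure

# Toy data (made up): user "u" follows four accounts, two of which adopted.
neighbors_of = {"u": {"a", "b", "c", "d"}}
adopted = {"a", "c", "x"}
influencers, exposure = neighborhood_measures("u", neighbors_of, adopted)
```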
Toward automatic censorship detection in microblogs
Social media is an area where users often experience censorship through a
variety of means such as the restriction of search terms or active and
retroactive deletion of messages. In this paper we examine the feasibility of
automatically detecting censorship of microblogs. We use a network growing
model to simulate discussion over a microblog follow network and compare two
censorship strategies to simulate varying levels of message deletion. Using
topological features extracted from the resulting graphs, a classifier is
trained to detect whether or not a given communication graph has been censored.
The results show that censorship detection is feasible under empirically
measured levels of message deletion. The proposed framework can enable
automated censorship measurement and tracking, which, when combined with
aggregated citizen reports of censorship, can allow users to make informed
decisions about online communication habits.
Comment: 13 pages. Updated with example cascades figure and typo fixes. To
appear at the International Workshop on Data Mining in Social Networks
(PAKDD-SocNet) 201
Unsupervised keyword extraction from microblog posts via hashtags
© River Publishers. Huge amounts of text are now generated for social networking purposes on the Web. Keyword extraction from texts such as microblog posts benefits many applications, including advertising, search, and content filtering. Unlike traditional web pages, a microblog post usually has special social features such as hashtags, which are topical in nature and generated by users. Extracting keywords related to hashtags can reflect the intents of users and thus provides a better understanding of post content. In this paper, we propose a novel unsupervised keyword extraction approach for microblog posts that treats hashtags as topical indicators. Our approach consists of two hashtag-enhanced algorithms. The first is a topic model algorithm that infers topic distributions biased toward hashtags on a collection of microblog posts; words are ranked by their average topic probabilities. This topic model algorithm can not only find the topics of a collection, but also extract hashtag-related keywords. The second is a random-walk-based algorithm. It first builds a weighted word-post graph by taking into account the posts themselves. Then, a hashtag-biased random walk is applied on this graph, which guides the algorithm to extract keywords according to hashtag topics. Finally, the ranking score of a word is determined by its stationary probability after a number of iterations. We evaluate our proposed approach on a collection of real Chinese microblog posts. Experiments show that our approach is more effective in terms of precision than traditional approaches that do not consider hashtags. Combining the two algorithms performs even better than either algorithm alone.
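The random-walk idea described above can be sketched as a random walk with restart on a bipartite word-post graph. This is only an illustration under stated assumptions, not the authors' algorithm: edge weights are taken to be term counts, the walk restarts uniformly on posts carrying the target hashtag, and the `damping` and `iterations` values are arbitrary. The function name and input layout are hypothetical.

```python
from collections import Counter, defaultdict

def hashtag_biased_walk(posts, hashtag, damping=0.85, iterations=50):
    """Rank words by a random walk that restarts on posts tagged `hashtag`.

    `posts` is a list of (tokens, hashtags) pairs. Assumed design: a
    bipartite word-post graph weighted by term counts.
    """
    graph = defaultdict(dict)
    for i, (tokens, _) in enumerate(posts):
        for word, count in Counter(tokens).items():
            graph[("post", i)][("word", word)] = count
            graph[("word", word)][("post", i)] = count

    nodes = list(graph)
    tagged = [("post", i) for i, (_, tags) in enumerate(posts)
              if hashtag in tags and ("post", i) in graph]
    if not tagged:
        raise ValueError("no non-empty post carries the given hashtag")

    # Restart distribution: uniform over posts that carry the hashtag.
    restart = {n: 0.0 for n in nodes}
    for n in tagged:
        restart[n] = 1.0 / len(tagged)

    score = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out_weight = sum(graph[n].values())
            for m, w in graph[n].items():
                # Move probability mass along edges in proportion to weight.
                nxt[m] += damping * score[n] * w / out_weight
        score = nxt

    words = {n[1]: s for n, s in score.items() if n[0] == "word"}
    return sorted(words, key=words.get, reverse=True)

# Toy posts (made up): two about an earthquake, one unrelated.
posts = [
    (["quake", "rescue", "quake"], {"#quake"}),
    (["rescue", "team"], {"#quake"}),
    (["movie", "night"], {"#film"}),
]
ranking = hashtag_biased_walk(posts, "#quake")
```

Words that are only reachable from posts without the hashtag receive no probability mass in this toy setup, so they fall to the bottom of the ranking.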