Bayesian Inference of Online Social Network Statistics via Lightweight Random Walk Crawls
Online social networks (OSNs) contain an extensive amount of information about
the underlying society that is yet to be explored. One of the most feasible
techniques for fetching information from OSNs, crawling through Application
Programming Interface (API) requests, poses serious concerns over the
guarantees of the estimates. In this work, we focus on making reliable
statistical inference with limited API crawls. Based on regenerative properties
of the random walks, we propose an unbiased estimator for the aggregated sum of
functions over edges and prove the connection between the variance of the
estimator and spectral gap. In order to facilitate Bayesian inference on the
true value of the estimator, we derive the approximate posterior distribution
of the estimate. The proposed ideas are then validated with numerical
experiments on inference problems in real-world networks.
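The regenerative idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's exact estimator: it assumes a connected, undirected graph without self-loops, a symmetric edge function `f`, and a simple random walk that regenerates each time it returns to a fixed start node. Under those assumptions, the expected sum of `f` over one tour is (2 / deg(start)) times the sum of `f` over all edges, so rescaling the average tour sum by deg(start) / 2 gives an unbiased estimate. The names `tour_estimate`, `adj`, and `f` are hypothetical.

```python
import random


def tour_estimate(adj, start, f, tours=1000, seed=0):
    """Estimate the sum of f(u, v) over all undirected edges of a graph.

    adj   : dict mapping each node to a list of its neighbours
            (assumed undirected, connected, no self-loops)
    start : node at which every tour begins and ends
    f     : symmetric edge function, f(u, v) == f(v, u)
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(tours):
        node = start
        while True:
            nxt = rng.choice(adj[node])   # one step of the simple random walk
            total += f(node, nxt)         # accumulate f along the traversed edge
            node = nxt
            if node == start:             # the tour ends on return to start
                break
    mean_tour_sum = total / tours
    # Rescale by deg(start) / 2 to turn the mean tour sum into an edge-sum estimate.
    return len(adj[start]) / 2.0 * mean_tour_sum


if __name__ == "__main__":
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    # With f = 1 on every edge, the target quantity is the number of edges (3).
    print(tour_estimate(triangle, 0, lambda u, v: 1.0, tours=20000))
```

Each tour is independent of the others, which is what makes variance estimates and posterior approximations tractable; tours from a low-degree start node can be long, so the choice of start node matters in practice.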
Conditional Reliability in Uncertain Graphs
Network reliability is a well-studied problem that requires measuring the
probability that a target node is reachable from a source node in a
probabilistic (or uncertain) graph, i.e., a graph where every edge is assigned
a probability of existence. Many approaches and problem variants have been
considered in the literature, all assuming that edge-existence probabilities
are fixed. Nevertheless, in real-world graphs, edge probabilities typically
depend on external conditions. In metabolic networks a protein can be converted
into another protein with some probability depending on the presence of certain
enzymes. In social influence networks the probability that a tweet of some user
will be re-tweeted by her followers depends on whether the tweet contains
specific hashtags. In transportation networks the probability that a network
segment will work properly or not might depend on external conditions such as
weather or time of the day. In this paper we overcome this limitation and focus
on conditional reliability, that is, assessing reliability when edge-existence
probabilities depend on a set of conditions. In particular, we study the
problem of determining the k conditions that maximize the reliability between
two nodes. We deeply characterize our problem and show that, even employing
polynomial-time reliability-estimation methods, it is NP-hard, does not admit
any PTAS, and the underlying objective function is non-submodular. We then
devise a practical method that targets both accuracy and efficiency. We also
study natural generalizations of the problem with multiple source and target
nodes. An extensive empirical evaluation on several large, real-life graphs
demonstrates the effectiveness and scalability of the proposed methods.
Comment: 14 pages, 13 figures
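The problem statement in the abstract above can be made concrete with a small sketch. It is not the paper's method: it assumes directed edges, assumes that the chosen conditions combine independently per edge (a noisy-OR rule introduced here purely for illustration), estimates reliability by plain Monte Carlo sampling, and searches all k-subsets by brute force, which is only viable for tiny inputs. All function names and the input layout are hypothetical.

```python
import random
from collections import deque
from itertools import combinations


def edge_probability(cond_probs, chosen):
    """Existence probability of one edge under the chosen conditions.

    Assumption (not from the source): conditions act independently (noisy-OR).
    """
    miss = 1.0
    for c in chosen:
        miss *= 1.0 - cond_probs.get(c, 0.0)
    return 1.0 - miss


def estimate_reliability(edges, chosen, source, target, samples=5000, seed=0):
    """Monte Carlo estimate of the probability that target is reachable from source."""
    rng = random.Random(seed)
    probs = [(u, v, edge_probability(cp, chosen)) for u, v, cp in edges]
    hits = 0
    for _ in range(samples):
        # Sample one possible world: keep each edge with its probability.
        adj = {}
        for u, v, p in probs:
            if rng.random() < p:
                adj.setdefault(u, []).append(v)
        # Breadth-first search from source in the sampled world.
        seen = {source}
        queue = deque([source])
        while queue and target not in seen:
            node = queue.popleft()
            for nxt in adj.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        hits += target in seen
    return hits / samples


def best_conditions(edges, conditions, k, source, target):
    """Exhaustively pick the k conditions with the highest estimated reliability."""
    return max(
        combinations(sorted(conditions), k),
        key=lambda chosen: estimate_reliability(edges, chosen, source, target),
    )


if __name__ == "__main__":
    # Each edge is (tail, head, {condition: probability under that condition}).
    edges = [
        ("s", "a", {"c1": 1.0}),
        ("a", "t", {"c2": 1.0}),
        ("s", "t", {"c3": 0.5}),
    ]
    print(best_conditions(edges, {"c1", "c2", "c3"}, 2, "s", "t"))
```

The exhaustive search grows combinatorially with the number of conditions, in line with the hardness results the abstract reports, so a practical approach needs a heuristic in its place.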
Toward automatic censorship detection in microblogs
Social media is an area where users often experience censorship through a
variety of means such as the restriction of search terms or active and
retroactive deletion of messages. In this paper we examine the feasibility of
automatically detecting censorship of microblogs. We use a network growing
model to simulate discussion over a microblog follow network and compare two
censorship strategies to simulate varying levels of message deletion. Using
topological features extracted from the resulting graphs, a classifier is
trained to detect whether or not a given communication graph has been censored.
The results show that censorship detection is feasible under empirically
measured levels of message deletion. The proposed framework can enable
automated censorship measurement and tracking, which, when combined with
aggregated citizen reports of censorship, can allow users to make informed
decisions about online communication habits.
Comment: 13 pages. Updated with example cascades figure and typo fixes. To
appear at the International Workshop on Data Mining in Social Networks
(PAKDD-SocNet) 201