Structure and Dynamics of Information Pathways in Online Media
Diffusion of information, spread of rumors and infectious diseases are all
instances of stochastic processes that occur over the edges of an underlying
network. In many cases the networks over which contagions spread are unobserved, and
such networks are often dynamic and change over time. In this paper, we
investigate the problem of inferring dynamic networks based on information
diffusion data. We assume there is an unobserved dynamic network that changes
over time, while we observe the results of a dynamic process spreading over the
edges of the network. The task then is to infer the edges and the dynamics of
the underlying network.
We develop an on-line algorithm that relies on stochastic convex optimization
to efficiently solve the dynamic network inference problem. We apply our
algorithm to information diffusion among 3.3 million mainstream media and blog
sites and experiment with more than 179 million different pieces of information
spreading over the network in a one year period. We study the evolution of
information pathways in the online media space and find interesting insights.
Information pathways for general recurrent topics are more stable across time
than for ongoing news events. Clusters of news media sites and blogs often
emerge and vanish in a matter of days for ongoing news events. Major social
movements and events involving the civilian population, such as the Libyan civil war
or the Syrian uprising, lead to an increased number of information pathways among
blogs as well as an overall increase in the network centrality of blogs and
social media sites.
Comment: To appear at the 6th International Conference on Web Search and Data
Mining (WSDM '13
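The online inference step described in this abstract can be pictured with a minimal sketch. It assumes an exponential transmission-time likelihood on each edge, which the abstract does not state, and it takes one projected stochastic gradient step per observed cascade. The names `sgd_step`, `alpha`, `cascade`, `nodes`, and `horizon` are hypothetical and are not taken from the paper.

```python
def sgd_step(alpha, cascade, nodes, horizon, lr=0.01, init=1.0, floor=1e-6):
    """One projected stochastic gradient step on the negative log-likelihood
    of a single cascade, under an ASSUMED exponential transmission model.

    alpha   : dict mapping (source, target) -> nonnegative transmission rate
    cascade : dict mapping node -> infection time (infected nodes only)
    nodes   : iterable of all node ids
    horizon : end of the observation window
    """
    for i in nodes:
        t_i = cascade.get(i)
        # Possible sources: nodes infected before i, or before the horizon
        # when i was never infected.
        cutoff = horizon if t_i is None else t_i
        parents = [j for j, t_j in cascade.items() if t_j < cutoff and j != i]
        if not parents:
            continue
        if t_i is not None:
            # Total incoming rate, evaluated before any update in this step.
            total = sum(alpha.get((j, i), init) for j in parents)
        for j in parents:
            rate = alpha.get((j, i), init)
            if t_i is None:
                # Survival term: i stayed uninfected until the horizon.
                grad = horizon - cascade[j]
            else:
                # Survival term plus the gradient of -log(sum of rates).
                grad = (t_i - cascade[j]) - 1.0 / total
            # Projection keeps every rate strictly positive.
            alpha[(j, i)] = max(rate - lr * grad, floor)
    return alpha


# Hypothetical usage: feed the same toy cascade repeatedly.
alpha = {}
cascade = {"a": 0.0, "b": 2.0}
for _ in range(2000):
    sgd_step(alpha, cascade, nodes=["a", "b", "c"], horizon=10.0)
```

In this toy run the rate on the edge from `a` to `b` approaches 1 / (t_b - t_a), the maximum-likelihood value for a single exponential delay, while the rates into the never-infected node `c` are pushed down to the floor. A time-varying network would additionally need some way to discount old cascades, which this sketch leaves out.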
Topology Discovery of Sparse Random Graphs With Few Participants
We consider the task of topology discovery of sparse random graphs using
end-to-end random measurements (e.g., delay) between a subset of nodes,
referred to as the participants. The rest of the nodes are hidden, and do not
provide any information for topology discovery. We consider topology discovery
under two routing models: (a) the participants exchange messages along the
shortest paths and obtain end-to-end measurements, and (b) additionally, the
participants exchange messages along the second shortest path. For scenario
(a), our proposed algorithm results in a sub-linear edit-distance guarantee
using a sub-linear number of uniformly selected participants. For scenario (b),
we obtain a much stronger result, and show that we can achieve consistent
reconstruction when a sub-linear number of uniformly selected nodes
participate. This implies that accurate discovery of sparse random graphs is
tractable using an extremely small number of participants. We finally obtain a
lower bound on the number of participants required by any algorithm to
reconstruct the original random graph up to a given edit distance. We also
demonstrate that while consistent discovery is tractable for sparse random
graphs using a small number of participants, in general, there are graphs which
cannot be discovered by any algorithm even with a significant number of
participants, and with the availability of end-to-end information along all the
paths between the participants.
Comment: A shorter version appears in ACM SIGMETRICS 2011. This version is
scheduled to appear in J. on Random Structures and Algorithm
The power of indirect social ties
While direct social ties have been intensely studied in the context of
computer-mediated social networks, indirect ties (e.g., friends of friends)
have seen little attention. Yet in real life, we often rely on friends of our
friends for recommendations (of good doctors, good schools, or good
babysitters), for introduction to a new job opportunity, and for many other
occasional needs. In this work we attempt to 1) quantify the strength of
indirect social ties, 2) validate it, and 3) empirically demonstrate its
usefulness for distributed applications on two examples. We quantify social
strength of indirect ties using a(ny) measure of the strength of the direct
ties that connect two people and the intuition provided by the sociology
literature. We validate the proposed metric experimentally by comparing
correlations with other direct social tie evaluators. We show via data-driven
experiments that the proposed metric for social strength can be used
successfully for social applications. Specifically, we show that it alleviates
known problems in friend-to-friend storage systems by addressing two previously
documented shortcomings: reduced set of storage candidates and data
availability correlations. We also show that it can be used for predicting the
effects of a social diffusion with an accuracy of up to 93.5%.
Comment: Technical Repor
Scalable Inference of Customer Similarities from Interactions Data using Dirichlet Processes
Under the sociological theory of homophily, people who are similar to one
another are more likely to interact with one another. Marketers often have
access to data on interactions among customers from which, with homophily as a
guiding principle, inferences could be made about the underlying similarities.
However, larger networks face a quadratic explosion in the number of potential
interactions that need to be modeled. This scalability problem renders
probability models of social interactions computationally infeasible for all
but the smallest networks. In this paper we develop a probabilistic framework
for modeling customer interactions that is both grounded in the theory of
homophily, and is flexible enough to account for random variation in who
interacts with whom. In particular, we present a novel Bayesian nonparametric
approach, using Dirichlet processes, to moderate the scalability problems that
marketing researchers encounter when working with networked data. We find that
this framework is a powerful way to draw insights into latent similarities of
customers, and we discuss how marketers can apply these insights to
segmentation and targeting activities.
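As an illustration of how a Dirichlet process groups items without fixing the number of groups in advance, here is a minimal sketch of the Chinese restaurant process, a standard sequential representation of the Dirichlet process. It is not the model from this paper; the function name and the reading of clusters as customer segments are assumptions made for illustration.

```python
import random


def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Sample a random partition of n_customers from a Chinese restaurant
    process with concentration parameter alpha > 0.

    Returns (assignments, cluster_sizes), where assignments[i] is the
    cluster index of customer i.
    """
    rng = random.Random(seed)
    assignments = []
    cluster_sizes = []
    for _ in range(n_customers):
        # An existing cluster is chosen with probability proportional to its
        # size, a new cluster with probability proportional to alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)  # open a new cluster
        else:
            cluster_sizes[k] += 1    # join an existing cluster
        assignments.append(k)
    return assignments, cluster_sizes


# Hypothetical usage: partition 1000 customers into latent segments.
assignments, cluster_sizes = chinese_restaurant_process(1000, alpha=2.0)
```

The number of clusters grows slowly with the number of customers, so the partition has far fewer groups than the quadratic number of customer pairs, which is the kind of scalability concern the abstract raises.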
A Tutorial on Time-Evolving Dynamical Bayesian Inference
In view of the current availability and variety of measured data, there is an
increasing demand for powerful signal processing tools that can cope
successfully with the associated problems that often arise when data are being
analysed. In practice many of the data-generating systems are not only
time-variable, but also influenced by neighbouring systems and subject to
random fluctuations (noise) from their environments. To encompass problems of
this kind, we present a tutorial about the dynamical Bayesian inference of
time-evolving coupled systems in the presence of noise. It includes the
necessary theoretical description and the algorithms for its implementation.
For general programming purposes, a pseudocode description is also given.
Examples based on coupled phase and limit-cycle oscillators illustrate the
salient features of phase dynamics inference. State domain inference is
illustrated with an example of coupled chaotic oscillators. The applicability
of the latter example to secure communications based on the modulation of
coupling functions is outlined. MATLAB codes for implementation of the method,
as well as for the explicit examples, accompany the tutorial.
Comment: MATLAB codes can be found at http://py-biomedical.lancaster.ac.uk
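To make concrete the kind of data such an inference method takes as input, the following minimal sketch simulates two noisy coupled phase oscillators with the Euler-Maruyama scheme. It only generates phase time series and does not perform the Bayesian inference itself. The sinusoidal coupling and all names and parameters are assumptions chosen for illustration, not taken from the tutorial or its accompanying code.

```python
import math
import random


def simulate_coupled_phases(omega1, omega2, eps1, eps2, noise, dt, n_steps, seed=0):
    """Euler-Maruyama simulation of two phase oscillators with ASSUMED
    sinusoidal coupling:

        dphi1 = (omega1 + eps1 * sin(phi2 - phi1)) dt + noise dW1
        dphi2 = (omega2 + eps2 * sin(phi1 - phi2)) dt + noise dW2

    Returns a list of (phi1, phi2) tuples of length n_steps + 1.
    """
    rng = random.Random(seed)
    phi1, phi2 = 0.0, 0.0
    trajectory = [(phi1, phi2)]
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        drift1 = omega1 + eps1 * math.sin(phi2 - phi1)
        drift2 = omega2 + eps2 * math.sin(phi1 - phi2)
        # Both phases are updated from the same previous state.
        phi1, phi2 = (
            phi1 + drift1 * dt + noise * sqrt_dt * rng.gauss(0.0, 1.0),
            phi2 + drift2 * dt + noise * sqrt_dt * rng.gauss(0.0, 1.0),
        )
        trajectory.append((phi1, phi2))
    return trajectory


# Hypothetical usage: slightly detuned oscillators with weak noise.
trajectory = simulate_coupled_phases(1.0, 1.2, 0.5, 0.5, noise=0.05,
                                     dt=0.01, n_steps=5000)
```

Without noise and with these parameters, the phase difference psi = phi2 - phi1 obeys dpsi/dt = 0.2 - sin(psi), so it settles where sin(psi) = 0.2. Making `eps1` and `eps2` functions of time would give the time-evolving coupling that the abstract refers to.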