Importance Sketching of Influence Dynamics in Billion-scale Networks
The blooming availability of traces for social, biological, and communication
networks opens up unprecedented opportunities in analyzing diffusion processes
in networks. However, the sheer size of today's networks raises serious
challenges in computational efficiency and scalability.
In this paper, we propose a new hyper-graph sketching framework for influence
dynamics in networks. The core of our sketching framework, called SKIS, is
an efficient importance sampling algorithm that returns only non-singular
reverse cascades in the network. Compared to previously developed sketches
like RIS and SKIM, our sketch significantly enhances estimation quality while
substantially reducing processing time and memory footprint. Further, we
present general strategies of using SKIS to enhance existing algorithms for
influence estimation and influence maximization which are motivated by
practical applications like viral marketing. Using SKIS, we design a
high-quality influence oracle for seed sets with average estimation error up
to 10x smaller than those using RIS and 6x smaller than SKIM. In addition, our
influence maximization using SKIS substantially improves the quality of
solutions for greedy algorithms. It achieves up to 10x speed-up and 4x
memory reduction for the fastest RIS-based DSSA algorithm, while maintaining
the same theoretical guarantees.
Comment: 12 pages, to appear in ICDM 2017 as a regular paper
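To make the notion of a reverse cascade concrete, here is a minimal sketch of plain RIS-style estimation under the independent cascade model. It is not the SKIS importance sampler. It assumes that a "singular" reverse cascade is one containing only its root node, and the names `reverse_cascade`, `estimate_influence`, and the `in_neighbors` layout are hypothetical, chosen for illustration only.

```python
import random


def reverse_cascade(in_neighbors, root, rng):
    """Sample one reverse cascade: nodes that would reach `root`
    when every incoming edge (u -> v) is kept with probability p."""
    visited = {root}
    frontier = [root]
    while frontier:
        v = frontier.pop()
        for u, p in in_neighbors.get(v, []):
            # Each edge is examined at most once, when v is expanded.
            if u not in visited and rng.random() < p:
                visited.add(u)
                frontier.append(u)
    return visited


def estimate_influence(in_neighbors, nodes, seeds, num_samples, seed=0):
    """Return (influence estimate, fraction of singular samples).

    Standard RIS estimator: n * (fraction of reverse cascades hit by seeds).
    'Singular' here is an assumption: a sample containing only its root.
    """
    rng = random.Random(seed)
    seeds = set(seeds)
    hits = 0
    singular = 0
    for _ in range(num_samples):
        root = rng.choice(nodes)
        sample = reverse_cascade(in_neighbors, root, rng)
        if len(sample) == 1:
            singular += 1
        if sample & seeds:
            hits += 1
    return len(nodes) * hits / num_samples, singular / num_samples


# Toy chain a -> b -> c, stored as incoming edges with probabilities.
nodes = ["a", "b", "c"]
in_neighbors = {"b": [("a", 0.1)], "c": [("b", 0.1)]}
influence, singular_fraction = estimate_influence(in_neighbors, nodes, ["a"], 10000)
```

With low edge probabilities most samples in this toy example are singular. Such samples inform the estimate only when the root itself is a seed, which is the waste the abstract says SKIS avoids by returning non-singular cascades only.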
Predicting Diffusion Reach Probabilities via Representation Learning on Social Networks
Diffusion reach probability between two nodes on a network is defined as the
probability of a cascade originating from one node reaching another node. An
infinite number of cascades would enable calculation of true diffusion reach
probabilities between any two nodes. However, there exists only a finite number
of cascades and one usually has access only to a small portion of all available
cascades. In this work, we address the problem of estimating diffusion reach
probabilities given only a limited number of cascades and partial information
about underlying network structure. Our proposed strategy employs node
representation learning to generate and feed node embeddings into machine
learning algorithms to create models that predict diffusion reach
probabilities. We provide experimental analysis using synthetically generated
cascades on two real-world social networks. Results show that the proposed
method is superior to using values calculated from available cascades when the
portion of cascades is small.
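The baseline mentioned above, values calculated directly from available cascades, can be sketched as a simple frequency count. This is an illustrative assumption about how such a baseline might look, not the authors' code. The function name and the `(source, activated_nodes)` cascade format are hypothetical.

```python
from collections import defaultdict


def empirical_reach_probabilities(cascades):
    """Estimate P(source reaches target) as the fraction of observed
    cascades starting at `source` that activated `target`.

    `cascades` is an iterable of (source, activated_nodes) pairs.
    Pairs never observed together are absent from the result.
    """
    started = defaultdict(int)   # cascades observed per source
    reached = defaultdict(int)   # cascades from source that hit target
    for source, activated in cascades:
        started[source] += 1
        for node in set(activated):
            if node != source:
                reached[(source, node)] += 1
    return {
        (source, target): count / started[source]
        for (source, target), count in reached.items()
    }


observed = [
    ("a", ["a", "b"]),
    ("a", ["a", "b", "c"]),
    ("b", ["b", "c"]),
]
probabilities = empirical_reach_probabilities(observed)
```

With few cascades per source these ratios are coarse, and unobserved pairs get no estimate at all, which is the regime where the abstract reports that the embedding-based predictor does better.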
Probing Limits of Information Spread with Sequential Seeding
We consider here information spread which propagates with a certain probability
from nodes just activated to their not yet activated neighbors. Diffusion
cascades can be triggered by activation of even a small set of nodes. Such
activation is commonly performed in a single stage. A novel approach based on
sequential seeding is analyzed here, resulting in three fundamental
contributions. First, we propose a coordinated execution of randomized choices
to enable precise comparison of different algorithms in general. We apply it
here when the newly activated nodes at each stage of spreading attempt to
activate their neighbors. Then, we present a formal proof that sequential
seeding delivers at least as large coverage as the single stage seeding does.
Moreover, we also show that, under modest assumptions, sequential seeding
achieves coverage provably better than the single stage based approach using
the same number of seeds and node ranking. Finally, we present experimental
results showing how single stage and sequential approaches on directed and
undirected graphs compare to the well-known greedy approach to provide an
objective measure of the sequential seeding benefits. Surprisingly, applying
sequential seeding to a simple degree-based selection leads to higher coverage
than achieved by the computationally expensive greedy approach currently
considered to be the best heuristic.
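The comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: coordinated execution is modeled by drawing every edge's random outcome once and reusing it for both strategies, and sequential seeding is modeled as activating one seed at a time, waiting for spreading to stop, and skipping ranked nodes that are already active. All function names are hypothetical.

```python
import random


def live_edge_graph(edges, p, rng):
    """Draw each directed edge's outcome once, so that every strategy
    is evaluated on the same random choices (coordinated execution)."""
    live = {}
    for u, v in edges:
        if rng.random() < p:
            live.setdefault(u, []).append(v)
    return live


def spread(live, active, new_seeds):
    """Activate `new_seeds` and propagate along live edges until no
    further node can be activated. Mutates and returns `active`."""
    frontier = [s for s in new_seeds if s not in active]
    active.update(frontier)
    while frontier:
        u = frontier.pop()
        for v in live.get(u, []):
            if v not in active:
                active.add(v)
                frontier.append(v)
    return active


def single_stage(live, ranking, k):
    """Activate the top-k ranked nodes at once."""
    return spread(live, set(), ranking[:k])


def sequential(live, ranking, k):
    """Activate up to k seeds one at a time, skipping nodes that were
    already activated by earlier spreading."""
    active = set()
    used = 0
    for node in ranking:
        if used == k:
            break
        if node in active:
            continue
        spread(live, active, [node])
        used += 1
    return active


edges = [("a", "b"), ("b", "c"), ("d", "e")]
live = live_edge_graph(edges, 1.0, random.Random(0))
ranking = ["a", "b", "d"]
one_shot = single_stage(live, ranking, 2)   # seeds a and b
staged = sequential(live, ranking, 2)       # seeds a, then d (b is already active)
```

In this toy case the single stage spends its second seed on `b`, which `a` would have activated anyway, while the sequential run saves that seed for `d`. On a shared live-edge graph the sequential coverage in this sketch can never be smaller, since every top-k node is either seeded or already active.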