Diffusion and propagation of information, influence and diseases take place
over increasingly larger networks. We observe when a node copies information,
makes a decision or becomes infected but networks are often hidden or
unobserved. Since networks are highly dynamic, changing and growing rapidly, we
only observe a relatively small set of cascades before a network changes
significantly. Scalable network inference based on a small cascade set is then
necessary for understanding the rapidly evolving dynamics that govern
diffusion. In this article, we develop a scalable approximation algorithm with
provable near-optimal performance based on submodular maximization which
achieves a high accuracy in such scenario, solving an open problem first
introduced by Gomez-Rodriguez et al (2010). Experiments on synthetic and real
diffusion data show that our algorithm in practice achieves an optimal
trade-off between accuracy and running time.Comment: To appear in the 29th International Conference on Machine Learning
(ICML), 2012. Website:
http://www.stanford.edu/~manuelgr/network-inference-multitree