How can we localize the source of diffusion in a complex network? Due to the
tremendous size of many real networks--such as the Internet or the human social
graph--it is usually infeasible to observe the state of all nodes in a network.
We show that it is fundamentally possible to estimate the location of the
source from measurements collected by sparsely placed observers. We present a
strategy that is optimal for arbitrary trees, achieving maximum probability of
correct localization. We describe efficient implementations with complexity
O(N^{\alpha}), where \alpha=1 for arbitrary trees, and \alpha=3 for arbitrary
graphs. In the context of several case studies, we determine how localization
accuracy is affected by various system parameters, including the structure of
the network, the density of observers, and the number of observed cascades.

Comment: To appear in Physical Review Letters. Includes pre-print of main
paper, and supplementary material.