The history of infections and epidemics holds famous examples where
understanding, containing and ultimately treating an outbreak began with
understanding its mode of spread. Influenza, HIV, and most computer viruses
spread person to person, or device to device, through contact networks; cholera,
cancer, and seasonal allergies, on the other hand, do not. In this paper we
study two fundamental questions of detection: first, given a snapshot view of a
(perhaps vanishingly small) fraction of those infected, under what conditions
is an epidemic spreading via contact (e.g., influenza) distinguishable from a
"random illness" operating independently of any contact network (e.g., seasonal
allergies); second, if we do have an epidemic, under what conditions is it
possible to determine which network of interactions is the main cause of the
spread -- the causative network -- without any knowledge of the epidemic, other
than the identity of a minuscule subsample of infected nodes?
The core of this paper, therefore, is to obtain an understanding of the
diagnostic power of network information. We derive sufficient conditions
networks must satisfy for these problems to be identifiable, and produce
efficient, highly scalable algorithms that solve these problems. We show that
the identifiability condition we give is fairly mild, and in particular, is
satisfied by two common graph topologies: the grid and the Erdős-Rényi graphs.