We present an algorithm for recovering planted solutions in two well-known
models, the stochastic block model and planted constraint satisfaction
problems, via a common generalization in terms of random bipartite graphs. Our
algorithm matches up to a constant factor the best-known bounds for the number
of edges (or constraints) needed for perfect recovery and its running time is
linear in the number of edges used. The time complexity is significantly better
than both spectral and SDP-based approaches.
The main contribution of the algorithm is in the case of unequal sizes in the
bipartition (corresponding to odd uniformity in the CSP). Here our algorithm
succeeds at a significantly lower density than the spectral approaches,
surpassing a barrier based on the spectral norm of a random matrix.
Other significant features of the algorithm and analysis include (i) the
critical use of power iteration with subsampling, which might be of independent
interest; its analysis requires keeping track of multiple norms of an evolving
solution (ii) it can be implemented statistically, i.e., with very limited
access to the input distribution (iii) the algorithm is extremely simple to
implement and runs in linear time, and thus is practical even for very large
instances