Graph neural networks (GNNs) have become the default machine learning model
for relational datasets, including protein interaction networks, biological
neural networks, and scientific collaboration graphs. We use tools from
statistical physics and random matrix theory to precisely characterize
generalization in simple graph convolution networks on the contextual
stochastic block model. The derived curves are phenomenologically rich: they
explain the distinction between learning on homophilic and heterophilic graphs
and they predict double descent, whose existence in GNNs has been questioned by
recent work. Our results are the first to accurately explain the behavior not
only of a stylized graph learning model but also of complex GNNs on messy
real-world datasets. To wit, we use our analytic insights about homophily and
heterophily to improve the performance of state-of-the-art graph neural networks
on several heterophilic benchmarks through the simple addition of negative
self-loop filters.
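
To make the negative self-loop idea concrete, the following is a minimal sketch in PyTorch. The layer name NegSelfLoopConv, the dense-adjacency interface, and the way the two filters are combined are assumptions made for exposition, not the exact architecture used in the experiments; degree normalization is omitted for brevity.

    import torch
    import torch.nn as nn

    class NegSelfLoopConv(nn.Module):
        """Illustrative graph convolution combining a standard filter (A + I)
        with a negative self-loop filter (A - I). The second filter contrasts
        a node's own features with its neighborhood aggregate, which helps
        when adjacent nodes tend to belong to different classes (heterophily)."""

        def __init__(self, in_dim: int, out_dim: int):
            super().__init__()
            self.w_pos = nn.Linear(in_dim, out_dim)  # weights for the A + I filter
            self.w_neg = nn.Linear(in_dim, out_dim)  # weights for the A - I filter

        def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
            # x: (N, in_dim) node features; adj: (N, N) dense adjacency matrix
            eye = torch.eye(adj.size(0), device=adj.device)
            pos = (adj + eye) @ self.w_pos(x)  # homophilic aggregation
            neg = (adj - eye) @ self.w_neg(x)  # heterophilic (contrastive) aggregation
            return pos + neg

Because the (A - I) term subtracts a node's own representation from its neighborhood sum, the layer can learn to emphasize disagreement with neighbors on heterophilic graphs while the (A + I) branch handles the homophilic signal.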