We have measured the angular correlation function of radio sources in the 1.4
GHz NVSS and FIRST surveys. Below ~6 arcminutes w(theta) is dominated by the
size distribution of radio galaxies. A model of the size distribution of radio
galaxies can account for this excess signal in w(theta). The amplitude of the
cosmological clustering of radio sources is roughly constant at A~0.001 from 3
to 40 mJy, but increases to A~0.007 at 200 mJy. This can be explained if
powerful (FRII) radio galaxies probe more massive structures at z~1 than
average-power radio galaxies, consistent with powerful high-z radio galaxies
generally having massive (forming) elliptical hosts in rich cluster
environments. For FRIIs we derive a spatial (comoving) correlation length of
r_0=14\pm3 h^{-1} Mpc. This is close to that measured for extremely red objects
(EROs) associated with a population of old elliptical galaxies at z~1 by Daddi
et al. (2001). Based on their similar clustering properties, we propose that
EROs and powerful radio galaxies may be the same systems seen at different
evolutionary stages. Their r_0 is ~2 times higher than that of QSOs at a
similar redshift, and comparable to that of bright ellipticals locally. This
suggests that r_0 (comoving) of these galaxies has changed little from z~1 to
z=0, in agreement with current LCDM hierarchical models for clustering
evolution of massive early-type galaxies. Alternatively, the clustering of
radio galaxies can be explained by the galaxy conservation model. This then
implies that radio galaxies of average power are the progenitors of the local
early-type field population, while the most powerful radio galaxies will evolve
into a present-day population with r_0 similar to that of local rich clusters.

Comment: 21 pages, 17 figures. Accepted for publication in A&