Here we test neutral models against the evolution of English word frequency
and vocabulary at the population scale, as recorded in annual word frequencies
from three centuries of English language books. Against these data, we test
both static and dynamic predictions of two neutral models, including the
relation between corpus size and vocabulary size, frequency distributions, and
turnover within those frequency distributions. Although a commonly used neutral
model fails to replicate all these emergent properties at once, we find that a
modified two-stage neutral model does replicate the static and dynamic
properties of the corpus data. This two-stage model is meant to represent a
relatively small corpus (population) of English books, analogous to a `canon',
sampled by an exponentially increasing corpus of books in the wider population
of authors. More broadly, this model -- a smaller neutral model within a larger
neutral model -- could represent those situations where mass
attention is focused on a small subset of the cultural variants.

Comment: 12 pages, 5 figures, 1 table