Prevailing methods for assessing and comparing generative AIs incentivize
responses that serve a hypothetical representative individual. Evaluating
models in these terms presumes homogeneous preferences across the population
and engenders selection of agglomerative AIs, which fail to represent the
diverse range of interests across individuals. We propose an alternative
evaluation method that instead prioritizes inclusive AIs, which provably retain
the requisite knowledge not only for subsequent response customization to
particular segments of the population but also for utility-maximizing
decisions.