Improving Diversity of Demographic Representation in Large Language Models via Collective-Critiques and Self-Voting
A crucial challenge for generative large language models (LLMs) is diversity:
when a user's prompt is under-specified, models may follow implicit assumptions
while generating a response, which may result in homogenization of the
responses, as well as certain demographic groups being under-represented or
even erased from the generated responses. In this paper, we formalize diversity
of representation in generative LLMs. We present evaluation datasets and
propose metrics to measure diversity in generated responses along people and
culture axes. We find that LLMs understand the notion of diversity, and that
they can reason and critique their own responses for that goal. This finding
motivated a new prompting technique called collective-critique and self-voting
(CCSV) to self-improve the people diversity of LLMs by tapping into their diversity
reasoning capabilities, without relying on handcrafted examples or prompt
tuning. Extensive empirical experiments with both human and automated
evaluations show that our proposed approach is effective at improving people
and culture diversity, and outperforms all baseline methods by a large margin.
Comment: To appear at EMNLP 2023 main conference