Human groups are able to converge on more accurate beliefs through
deliberation, even in the presence of polarization and partisan bias -- a
phenomenon known as the "wisdom of partisan crowds." Generative agents powered
by Large Language Models (LLMs) are increasingly used to simulate human
collective behavior, yet few benchmarks exist for evaluating their dynamics
against the behavior of human groups. In this paper, we examine the extent to
which the wisdom of partisan crowds emerges in groups of LLM-based agents that
are prompted to role-play as partisan personas (e.g., Democrat or Republican).
We find that they not only display human-like partisan biases, but also
converge to more accurate beliefs through deliberation, as humans do. We then
identify several factors that interfere with convergence, including the use of
chain-of-thought prompting and a lack of detail in personas. Conversely,
fine-tuning on human data appears to enhance convergence. These findings show
the potential and limitations of LLM-based agents as a model of human
collective intelligence.