Effective conversation requires common ground: a shared understanding between
the participants. Common ground, however, does not emerge spontaneously in
conversation. Speakers and listeners work together to both identify and
construct a shared basis while avoiding misunderstanding. To accomplish
grounding, humans rely on a range of dialogue acts, such as clarification ("What do
you mean?") and acknowledgment ("I understand."). In domains like teaching and
emotional support, carefully constructed grounding prevents misunderstanding.
However, it is unclear whether large language models (LLMs) leverage these
dialogue acts in constructing common ground. To this end, we curate a set of
grounding acts and propose corresponding metrics that quantify attempted
grounding. We study whether LLMs use these grounding acts by simulating
turn-taking on several dialogue datasets and comparing the generated turns to
human ones. We find that current LLMs are presumptive grounders, biased towards
assuming common ground without using grounding acts. To understand the roots of
this behavior, we examine the role of instruction tuning and reinforcement
learning from human feedback (RLHF), finding that RLHF leads to less grounding.
Altogether, our work highlights the need for more research investigating
grounding in human-AI interaction.
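The abstract does not spell out the form of the proposed metrics, but a minimal
sketch of one plausible instantiation, a turn-level grounding-act rate, is given
below. The act taxonomy, the toy keyword classifier, and all function names are
illustrative assumptions, not the paper's actual method.

```python
# Hypothetical taxonomy of grounding acts; the paper's exact set may differ.
GROUNDING_ACTS = {"clarification", "acknowledgment"}

def classify_turn(turn: str) -> set[str]:
    """Toy keyword-based stand-in for a real grounding-act classifier."""
    acts = set()
    lowered = turn.lower()
    if "?" in turn and ("mean" in lowered or "clarify" in lowered):
        acts.add("clarification")
    if any(cue in lowered for cue in ("i see", "i understand", "got it")):
        acts.add("acknowledgment")
    return acts

def grounding_rate(turns: list[str]) -> float:
    """Fraction of turns containing at least one grounding act."""
    if not turns:
        return 0.0
    grounded = sum(1 for t in turns if classify_turn(t) & GROUNDING_ACTS)
    return grounded / len(turns)

# Usage: compare an LLM's simulated turns against the human turns they replace.
human_turns = ["What do you mean by that?", "I see, thanks."]
llm_turns = ["Here is a complete answer to your question."]
print(f"human grounding rate: {grounding_rate(human_turns):.2f}")  # 1.00
print(f"LLM grounding rate:   {grounding_rate(llm_turns):.2f}")    # 0.00
```

Under this sketch, a "presumptive grounder" would show a markedly lower
grounding rate than humans on the same dialogue contexts.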