In the context-dependent Text-to-SQL task, the generated SQL statements are
refined iteratively based on the user input utterance from each interaction.
The utterance in each interaction can be viewed as a set of component
modifications to the previous SQL statement, which can be extracted as
modification patterns. Since these modification patterns can also be combined
with other SQL statements, models are expected to generalize compositionally
to such novel combinations. This work is the first exploration
of compositional generalization in context-dependent Text-to-SQL scenarios. To
facilitate related studies, we construct two challenging benchmarks,
\textsc{CoSQL-CG} and \textsc{SParC-CG}, by recombining the modification
patterns with existing SQL statements. Experiments show that all
current models struggle on the proposed benchmarks. Furthermore, we find that
better aligning the previous SQL statements with the input utterance can
improve models' compositional generalization. Based on these
observations, we propose a method named \texttt{p-align} to improve the
compositional generalization of Text-to-SQL models. Further experiments
validate the effectiveness of our method. Source code and data are available.

Comment: Accepted to ACL 2023 (Findings), Long Paper, 11 pages