5 research outputs found
ACT-SQL: In-Context Learning for Text-to-SQL with Automatically-Generated Chain-of-Thought
Recently, Large Language Models (LLMs) have proven to have strong abilities across various domains and tasks. We study the problem of prompt design in the text-to-SQL task and attempt to improve the LLMs' reasoning ability when generating SQL queries. Beyond the standard few-shot in-context learning setting, we design our chain-of-thought (CoT) prompt with a method similar to schema linking. We propose a method named ACT-SQL that automatically generates auto-CoT exemplars, so the whole process requires no manual labeling. Our approach is also cost-efficient, since generating one SQL query takes only a single LLM API call. Furthermore, we extend our in-context learning method to the multi-turn text-to-SQL task. Experimental results show that LLM performance benefits from our ACT-SQL approach, which achieves SOTA performance on the Spider dev set among existing in-context learning approaches.
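The abstract's idea of auto-generated, schema-linking-style CoT exemplars can be illustrated with a short sketch. All function names and the exemplar format below are hypothetical; the paper's actual prompt wording and linking procedure may differ.

```python
# Hypothetical sketch of assembling auto-CoT exemplars for text-to-SQL.
# The "chain of thought" here is a schema-linking trace: question spans
# paired with the schema columns they mention, found automatically
# (e.g. by string matching), so no manual labeling is required.

def make_cot_exemplar(question, schema, sql, linked_columns):
    """Render one auto-generated chain-of-thought exemplar.

    linked_columns: (question span, schema column) pairs produced
    automatically, playing the role of the reasoning steps.
    """
    steps = "; ".join(
        f'"{span}" refers to column {col}' for span, col in linked_columns
    )
    return (
        f"Schema: {schema}\n"
        f"Question: {question}\n"
        f"Let's think step by step: {steps}.\n"
        f"SQL: {sql}\n\n"
    )

def build_prompt(exemplars, schema, question):
    # Few-shot prompt: auto-CoT exemplars followed by the test question.
    # A single LLM API call on this prompt then yields the SQL query.
    return "".join(exemplars) + (
        f"Schema: {schema}\n"
        f"Question: {question}\n"
        f"Let's think step by step:"
    )
```

Since the exemplars are rendered from existing (question, SQL) pairs plus an automatic linker, the one-call-per-query cost claim follows: all reasoning scaffolding is precomputed into the prompt.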
ASTormer: An AST Structure-aware Transformer Decoder for Text-to-SQL
Text-to-SQL aims to generate an executable SQL program given a user utterance and the corresponding database schema. To ensure the well-formedness of output SQL, one prominent approach adopts a grammar-based recurrent decoder that produces the equivalent SQL abstract syntax tree (AST). However, previous methods mainly use RNN-based decoders, which 1) are time-consuming and inefficient and 2) introduce very few structural priors. In this work, we propose an AST structure-aware Transformer decoder (ASTormer) to replace traditional RNN cells. Structural knowledge, such as node types and positions in the tree, is seamlessly incorporated into the decoder via both absolute and relative position embeddings. Moreover, the proposed framework is compatible with different traversal orders, including adaptive node selection. Extensive experiments on five text-to-SQL benchmarks demonstrate the effectiveness and efficiency of our structured decoder compared to competitive baselines.
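One way to make "absolute and relative position embeddings over the AST" concrete is to compute, for each node, its depth (an absolute index) and, for each node pair, the tree path length (a relative index); these integers would then index learned embedding tables added to the decoder's attention. This is an illustrative encoding, not necessarily ASTormer's exact scheme.

```python
# Sketch: structural position indices for an AST-aware Transformer decoder.
# Nodes are assumed to be in pre-order, so parent[i] < i for every
# non-root node; parent[root] == -1.

def tree_positions(parent):
    """Return each node's depth (absolute position) and the matrix of
    pairwise tree distances (relative positions)."""
    n = len(parent)
    depth = [0] * n
    for i in range(n):
        if parent[i] != -1:
            depth[i] = depth[parent[i]] + 1  # valid because parent[i] < i

    def dist(i, j):
        # Path length i -> j via the lowest common ancestor: walk the
        # deeper node upward until the two pointers meet.
        a, b = i, j
        while a != b:
            if depth[a] >= depth[b]:
                a = parent[a]
            else:
                b = parent[b]
        return depth[i] + depth[j] - 2 * depth[a]

    rel = [[dist(i, j) for j in range(n)] for i in range(n)]
    return depth, rel
```

Unlike an RNN decoder, which sees the tree only through its generation order, these indices expose the tree geometry directly to every attention head, which is the kind of structural prior the abstract argues RNN decoders lack.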
On the Structural Generalization in Text-to-SQL
Exploring the generalization of a text-to-SQL parser is essential for a system to adapt automatically to real-world databases. Previous works investigated lexical diversity, including the influence of synonyms and perturbations in both natural language questions and databases. However, research on the structural variety of database schemas (DS) is lacking. Specifically, given the same input question, the target SQL may be represented in different ways when the DS takes a different structure. In this work, we provide an in-depth discussion of the structural generalization of text-to-SQL tasks. We observe that current datasets are too templated to study structural generalization. To collect eligible test data, we propose a framework that generates novel text-to-SQL data via automatic and synchronous (DS, SQL) pair alteration. In the experiments, the significant performance drop when evaluating well-trained text-to-SQL models on the synthetic samples demonstrates the limitations of current research regarding structural generalization. Based on comprehensive analysis, we suggest that the practical reason is overfitting to (NL, SQL) patterns.
Comment: The experimental results of T5 and T5-Picard in Tables 5 and 6 are not correct because of mistakes in our evaluation code.
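A minimal sketch of one synchronous (DS, SQL) alteration: extracting a column into its own table, which leaves the question unchanged but forces the equivalent SQL to use a join. The transformation, its naming scheme, and the naive textual SQL rewrite below are hypothetical illustrations of the framework's idea, not its actual implementation.

```python
# Sketch: one structural (DS, SQL) alteration. The schema is a dict
# mapping table name -> list of column names; the SQL rewrite assumes a
# simple single-table query and unique column/table names.

def extract_column(schema, sql, table, column):
    """Move `column` out of `table` into a new table keyed by
    `{table}_id`, and rewrite the SQL with the required join."""
    new_table = f"{table}_{column}"
    schema = dict(schema)
    schema[table] = [c for c in schema[table] if c != column]
    schema[new_table] = [f"{table}_id", column]
    # Naive textual rewrite: qualify the moved column, then add the join.
    sql = sql.replace(column, f"{new_table}.{column}")
    sql = sql.replace(
        f"FROM {table}",
        f"FROM {table} JOIN {new_table} "
        f"ON {table}.id = {new_table}.{table}_id",
    )
    return schema, sql
```

Because the question text never changes while the gold SQL gains a join, a parser that merely memorized (NL, SQL) surface patterns fails on the altered pair, which is exactly the overfitting the abstract diagnoses.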