5 research outputs found

    ACT-SQL: In-Context Learning for Text-to-SQL with Automatically-Generated Chain-of-Thought

    Full text link
    Recently Large Language Models (LLMs) have been proven to have strong abilities in various domains and tasks. We study the problem of prompt designing in the text-to-SQL task and attempt to improve the LLMs' reasoning ability when generating SQL queries. Besides the trivial few-shot in-context learning setting, we design our chain-of-thought (CoT) prompt with a similar method to schema linking. We provide a method named ACT-SQL to automatically generate auto-CoT exemplars and thus the whole process doesn't need manual labeling. Our approach is cost-saving since we only use the LLMs' API call once when generating one SQL query. Furthermore, we extend our in-context learning method to the multi-turn text-to-SQL task. The experiment results show that the LLMs' performance can benefit from our ACT-SQL approach. Our approach achieves SOTA performance on the Spider dev set among existing in-context learning approaches

    ASTormer: An AST Structure-aware Transformer Decoder for Text-to-SQL

    Full text link
    Text-to-SQL aims to generate an executable SQL program given the user utterance and the corresponding database schema. To ensure the well-formedness of output SQLs, one prominent approach adopts a grammar-based recurrent decoder to produce the equivalent SQL abstract syntax tree (AST). However, previous methods mainly utilize an RNN-series decoder, which 1) is time-consuming and inefficient and 2) introduces very few structure priors. In this work, we propose an AST structure-aware Transformer decoder (ASTormer) to replace traditional RNN cells. The structural knowledge, such as node types and positions in the tree, is seamlessly incorporated into the decoder via both absolute and relative position embeddings. Besides, the proposed framework is compatible with different traversing orders even considering adaptive node selection. Extensive experiments on five text-to-SQL benchmarks demonstrate the effectiveness and efficiency of our structured decoder compared to competitive baselines

    On the Structural Generalization in Text-to-SQL

    Full text link
    Exploring the generalization of a text-to-SQL parser is essential for a system to automatically adapt the real-world databases. Previous works provided investigations focusing on lexical diversity, including the influence of the synonym and perturbations in both natural language questions and databases. However, research on the structure variety of database schema~(DS) is deficient. Specifically, confronted with the same input question, the target SQL is probably represented in different ways when the DS comes to a different structure. In this work, we provide in-deep discussions about the structural generalization of text-to-SQL tasks. We observe that current datasets are too templated to study structural generalization. To collect eligible test data, we propose a framework to generate novel text-to-SQL data via automatic and synchronous (DS, SQL) pair altering. In the experiments, significant performance reduction when evaluating well-trained text-to-SQL models on the synthetic samples demonstrates the limitation of current research regarding structural generalization. According to comprehensive analysis, we suggest the practical reason is the overfitting of (NL, SQL) patterns.Comment: The experiment results of T5 and T5-Picard in Table 5 and Table 6 are not correct because we made mistakes in the evaluation code
    corecore