919 research outputs found

    Bridging the Semantic Gap with SQL Query Logs in Natural Language Interfaces to Databases

    Full text link
    A critical challenge in constructing a natural language interface to database (NLIDB) is bridging the semantic gap between a natural language query (NLQ) and the underlying data. Two specific ways this challenge exhibits itself is through keyword mapping and join path inference. Keyword mapping is the task of mapping individual keywords in the original NLQ to database elements (such as relations, attributes or values). It is challenging due to the ambiguity in mapping the user's mental model and diction to the schema definition and contents of the underlying database. Join path inference is the process of selecting the relations and join conditions in the FROM clause of the final SQL query, and is difficult because NLIDB users lack the knowledge of the database schema or SQL and therefore cannot explicitly specify the intermediate tables and joins needed to construct a final SQL query. In this paper, we propose leveraging information from the SQL query log of a database to enhance the performance of existing NLIDBs with respect to these challenges. We present a system Templar that can be used to augment existing NLIDBs. Our extensive experimental evaluation demonstrates the effectiveness of our approach, leading up to 138% improvement in top-1 accuracy in existing NLIDBs by leveraging SQL query log information.Comment: Accepted to IEEE International Conference on Data Engineering (ICDE) 201

    A Detailed Study on Aggregation Methods used in Natural Language Interface to Databases (NLIDB)

    Get PDF
    Historically, databases have been the most crucial issue in the study of information systems, and they constitute an essential part of all information management systems. Since, it complicated due to restricting the number of potential users, particularly non-expert database users who must comprehend the database structure to submit such queries. Natural language interface (NLI), the simplest method to retrieve information, is one possibility for interacting with the database. The transformation of a natural language query into a Structured Query (SQL) in a database is known as a "Natural Language Interface to Database" (NLIDB). This study uses NLIDB to handle the works performed under various aggregations with aggregation functions, a grouping phrase, and a possessing clause. This study carefully examines the numerous systematic aggregation approaches utilized in the NLIDB. This review provides extensive information about the many methods, including query-based, pattern-based, general, keyword-based NLIDB, and grammar-based systems, to extract data for a dissertation from a generic module for use in such systems that support query execution utilizing aggregations

    Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey

    Full text link
    The emergence of natural language processing has revolutionized the way users interact with tabular data, enabling a shift from traditional query languages and manual plotting to more intuitive, language-based interfaces. The rise of large language models (LLMs) such as ChatGPT and its successors has further advanced this field, opening new avenues for natural language processing techniques. This survey presents a comprehensive overview of natural language interfaces for tabular data querying and visualization, which allow users to interact with data using natural language queries. We introduce the fundamental concepts and techniques underlying these interfaces with a particular emphasis on semantic parsing, the key technology facilitating the translation from natural language to SQL queries or data visualization commands. We then delve into the recent advancements in Text-to-SQL and Text-to-Vis problems from the perspectives of datasets, methodologies, metrics, and system designs. This includes a deep dive into the influence of LLMs, highlighting their strengths, limitations, and potential for future improvements. Through this survey, we aim to provide a roadmap for researchers and practitioners interested in developing and applying natural language interfaces for data interaction in the era of large language models.Comment: 20 pages, 4 figures, 5 tables. Submitted to IEEE TKD

    Natural language interfaces to relational databases

    Get PDF
    Máster Universitario en Lógica, Computación e Inteligencia Artificia

    xDBTagger: Explainable Natural Language Interface to Databases Using Keyword Mappings and Schema Graph

    Full text link
    Translating natural language queries (NLQ) into structured query language (SQL) in interfaces to relational databases is a challenging task that has been widely studied by researchers from both the database and natural language processing communities. Numerous works have been proposed to attack the natural language interfaces to databases (NLIDB) problem either as a conventional pipeline-based or an end-to-end deep-learning-based solution. Nevertheless, regardless of the approach preferred, such solutions exhibit black-box nature, which makes it difficult for potential users targeted by these systems to comprehend the decisions made to produce the translated SQL. To this end, we propose xDBTagger, an explainable hybrid translation pipeline that explains the decisions made along the way to the user both textually and visually. We also evaluate xDBTagger quantitatively in three real-world relational databases. The evaluation results indicate that in addition to being fully interpretable, xDBTagger is effective in terms of accuracy and translates the queries more efficiently compared to other state-of-the-art pipeline-based systems up to 10000 times.Comment: 20 pages, 6 figures. This work is the extended version of arXiv:2101.04226 that appeared in PVLDB'2

    Maximizing User Domain Expertise to Clarify Oblique Specifications of Relational Queries

    Full text link
    While there is abundant access to data management technology today, working with data is still challenging for the average user. One common means of manipulating data is with SQL on relational databases, but this requires knowledge of SQL as well as the database's schema and contents. Consequently, previous work has proposed oblique query specification (OQS) methods such as natural language or programming-by-example to allow users to imprecisely specify their query intent. These methods, however, suffer from either low precision or low expressivity and, in addition, produce a list of candidate SQL queries that make it difficult for users to select their final target query. My thesis is that OQS systems should maximize user domain expertise to triangulate the user's desired query. First, I demonstrate how to leverage previously-issued SQL queries to improve the accuracy of natural language interfaces. Second, I propose a system allowing users to specify a query with both natural language and programming-by-example. Finally, I develop a system where users provide feedback on system-suggested tuples to select a SQL query from a set of candidate queries generated by an OQS system.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/155114/1/cjbaik_1.pd
    corecore