1,377 research outputs found

    Knowledge Graph and Deep Learning-based Text-to-GQL Model for Intelligent Medical Consultation Chatbot

    Get PDF
    Text-to-GQL (Text2GQL) is a task that converts the user's questions into GQL (Graph Query Language) when a graph database is given. That is a task of semantic parsing that transforms natural language problems into logical expressions, which will bring more efficient direct communication between humans and machines. The existing related work mainly focuses on Text-to-SQL tasks, and there is no available semantic parsing method and data set for the graph database. In order to fill the gaps in this field to serve the medical Human–Robot Interactions (HRI) better, we propose this task and a pipeline solution for the Text2GQL task. This solution uses the Adapter pre-trained by “the linking of GQL schemas and the corresponding utterances" as an external knowledge introduction plug-in. By inserting the Adapter into the language model, the mapping between logical language and natural language can be introduced faster and more directly to better realize the end-to-end human–machine language translation task. In the study, the proposed Text2GQL task model is mainly constructed based on an improved pipeline composed of a Language Model, Pre-trained Adapter plug-in, and Pointer Network. This enables the model to copy objects' tokens from utterances, generate corresponding GQL statements for graph database retrieval, and builds an adjustment mechanism to improve the final output. And the experiments have proved that our proposed method has certain competitiveness on the counterpart datasets (Spider, ATIS, GeoQuery, and 39.net) converted from the Text2SQL task, and the proposed method is also practical in medical scenarios

    Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey

    Full text link
    The emergence of natural language processing has revolutionized the way users interact with tabular data, enabling a shift from traditional query languages and manual plotting to more intuitive, language-based interfaces. The rise of large language models (LLMs) such as ChatGPT and its successors has further advanced this field, opening new avenues for natural language processing techniques. This survey presents a comprehensive overview of natural language interfaces for tabular data querying and visualization, which allow users to interact with data using natural language queries. We introduce the fundamental concepts and techniques underlying these interfaces with a particular emphasis on semantic parsing, the key technology facilitating the translation from natural language to SQL queries or data visualization commands. We then delve into the recent advancements in Text-to-SQL and Text-to-Vis problems from the perspectives of datasets, methodologies, metrics, and system designs. This includes a deep dive into the influence of LLMs, highlighting their strengths, limitations, and potential for future improvements. Through this survey, we aim to provide a roadmap for researchers and practitioners interested in developing and applying natural language interfaces for data interaction in the era of large language models.Comment: 20 pages, 4 figures, 5 tables. Submitted to IEEE TKD

    Selecting and Generating Computational Meaning Representations for Short Texts

    Full text link
    Language conveys meaning, so natural language processing (NLP) requires representations of meaning. This work addresses two broad questions: (1) What meaning representation should we use? and (2) How can we transform text to our chosen meaning representation? In the first part, we explore different meaning representations (MRs) of short texts, ranging from surface forms to deep-learning-based models. We show the advantages and disadvantages of a variety of MRs for summarization, paraphrase detection, and clustering. In the second part, we use SQL as a running example for an in-depth look at how we can parse text into our chosen MR. We examine the text-to-SQL problem from three perspectives—methodology, systems, and applications—and show how each contributes to a fuller understanding of the task.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/143967/1/cfdollak_1.pd

    SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers

    Full text link
    This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in the neural network based approaches (called SUN). From the data uncertainty perspective, it is indisputable that a single SQL can be learned from multiple semantically-equivalent questions.Different from previous methods that are limited to one-to-one mapping, we propose a data uncertainty constraint to explore the underlying complementary semantic information among multiple semantically-equivalent questions (many-to-one) and learn the robust feature representations with reduced spurious associations. In this way, we can reduce the sensitivity of the learned representations and improve the robustness of the parser. From the model uncertainty perspective, there is often structural information (dependence) among the weights of neural networks. To improve the generalizability and stability of neural text-to-SQL parsers, we propose a model uncertainty constraint to refine the query representations by enforcing the output representations of different perturbed encoding networks to be consistent with each other. Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms strong competitors and achieves new state-of-the-art results. For reproducibility, we release our code and data at https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/sunsql.Comment: Accepted at COLING 202

    Natural Language Interfaces to Data

    Full text link
    Recent advances in NLU and NLP have resulted in renewed interest in natural language interfaces to data, which provide an easy mechanism for non-technical users to access and query the data. While early systems evolved from keyword search and focused on simple factual queries, the complexity of both the input sentences as well as the generated SQL queries has evolved over time. More recently, there has also been a lot of focus on using conversational interfaces for data analytics, empowering a line of non-technical users with quick insights into the data. There are three main challenges in natural language querying (NLQ): (1) identifying the entities involved in the user utterance, (2) connecting the different entities in a meaningful way over the underlying data source to interpret user intents, and (3) generating a structured query in the form of SQL or SPARQL. There are two main approaches for interpreting a user's NLQ. Rule-based systems make use of semantic indices, ontologies, and KGs to identify the entities in the query, understand the intended relationships between those entities, and utilize grammars to generate the target queries. With the advances in deep learning (DL)-based language models, there have been many text-to-SQL approaches that try to interpret the query holistically using DL models. Hybrid approaches that utilize both rule-based techniques as well as DL models are also emerging by combining the strengths of both approaches. Conversational interfaces are the next natural step to one-shot NLQ by exploiting query context between multiple turns of conversation for disambiguation. In this article, we review the background technologies that are used in natural language interfaces, and survey the different approaches to NLQ. We also describe conversational interfaces for data analytics and discuss several benchmarks used for NLQ research and evaluation.Comment: The full version of this manuscript, as published by Foundations and Trends in Databases, is available at http://dx.doi.org/10.1561/190000007

    Translating Natural Language Queries to SQL Using the T5 Model

    Full text link
    This paper presents the development process of a natural language to SQL model using the T5 model as the basis. The models, developed in August 2022 for an online transaction processing system and a data warehouse, have a 73\% and 84\% exact match accuracy respectively. These models, in conjunction with other work completed in the research project, were implemented for several companies and used successfully on a daily basis. The approach used in the model development could be implemented in a similar fashion for other database environments and with a more powerful pre-trained language model
    • …
    corecore