Retrieval-augmented GPT-3.5-based Text-to-SQL Framework with Sample-aware Prompting and Dynamic Revision Chain
Text-to-SQL aims to generate SQL queries for given natural language
questions, thus helping users query databases. Prompt learning with large
language models (LLMs) has emerged as a recent approach, which designs prompts
to lead LLMs to understand the input question and generate the corresponding
SQL. However, it faces challenges with strict SQL syntax requirements. Existing
work prompts LLMs with a list of demonstration examples (i.e., question-SQL
pairs) to generate SQL, but the fixed prompts can hardly handle the scenario
where the semantic gap between the retrieved demonstration and the input
question is large. In this paper, we propose a retrieval-augmented prompting
method for an LLM-based Text-to-SQL framework, involving sample-aware prompting
and a dynamic revision chain. Our approach incorporates sample-aware
demonstrations, which include the composition of SQL operators and fine-grained
information related to the given question. To retrieve questions sharing
similar intents with input questions, we propose two strategies for assisting
retrieval. Firstly, we leverage LLMs to simplify the original questions,
unifying the syntax and thereby clarifying the users' intentions. To generate
executable and accurate SQLs without human intervention, we design a dynamic
revision chain which iteratively adapts fine-grained feedback from the
previously generated SQL. Experimental results on three Text-to-SQL benchmarks
demonstrate the superiority of our method over strong baseline models.
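To make the revision loop concrete, here is a minimal sketch of a dynamic revision chain, assuming a hypothetical `llm_complete(prompt)` helper in place of the GPT-3.5 API and a SQLite database for execution feedback; the paper's actual prompt layout and feedback signals may differ.

```python
import sqlite3

def llm_complete(prompt: str) -> str:
    """Hypothetical LLM call; swap in a real GPT-3.5-style completion API."""
    raise NotImplementedError

def generate_with_revision(question: str, demonstrations: list[str],
                           db_path: str, max_rounds: int = 3) -> str:
    """Generate SQL, then iteratively feed execution errors back into the prompt."""
    prompt = "\n".join(demonstrations) + f"\nQuestion: {question}\nSQL:"
    sql = llm_complete(prompt)
    conn = sqlite3.connect(db_path)
    for _ in range(max_rounds):
        try:
            conn.execute(sql)          # executable without error: keep this candidate
            break
        except sqlite3.Error as err:   # fine-grained feedback from the failed query
            prompt += f"\nThe SQL `{sql}` failed with: {err}. Revised SQL:"
            sql = llm_complete(prompt)
    conn.close()
    return sql
```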
Evaluating Mixed-initiative Conversational Search Systems via User Simulation
Clarifying the underlying user information need by asking clarifying
questions is an important feature of modern conversational search systems.
However, evaluation of such systems through answering prompted clarifying
questions requires significant human effort, which can be time-consuming and
expensive. In this paper, we propose a conversational User Simulator, called
USi, for automatic evaluation of such conversational search systems. Given a
description of an information need, USi is capable of automatically answering
clarifying questions about the topic throughout the search session. Through a
set of experiments, including automated natural language generation metrics and
crowdsourcing studies, we show that responses generated by USi are both in line
with the underlying information need and comparable to human-generated answers.
Moreover, we take the first steps towards multi-turn interactions, where
conversational search systems ask multiple questions to the (simulated) user
with a goal of clarifying the user need. To this end, we expand on currently
available datasets for studying clarifying questions, i.e., Qulac and ClariQ,
by performing a crowdsourcing-based multi-turn data acquisition. We show that
our generative, GPT-2-based model is capable of providing accurate and natural
answers to unseen clarifying questions in the single-turn setting and discuss
capabilities of our model in the multi-turn setting. We provide the code, data,
and the pre-trained model to be used for further research on the topic.
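As a rough illustration of how such a generative simulator can be run, the sketch below conditions a GPT-2 model (via Hugging Face transformers) on an information need and a clarifying question; the flat prompt format and decoding settings are assumptions, not USi's actual configuration.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def answer_clarifying_question(information_need: str, question: str) -> str:
    """Generate a simulated user answer conditioned on the information need."""
    prompt = f"Information need: {information_need}\nQuestion: {question}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=40, do_sample=True,
                            top_p=0.9, pad_token_id=tokenizer.eos_token_id)
    text = tokenizer.decode(output[0], skip_special_tokens=True)
    return text[len(prompt):].strip()  # keep only the newly generated answer
```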
Augmenting Ad-Hoc IR Dataset for Interactive Conversational Search
A peculiarity of conversational search systems is that they involve
mixed-initiative interactions, such as system-generated query clarifying questions.
Evaluating those systems at a large scale on the end task of IR is very
challenging, requiring adequate datasets containing such interactions. However,
current datasets only focus on either traditional ad-hoc IR tasks or query
clarification tasks, the latter being usually seen as a reformulation task from
the initial query. The only two datasets known to us that contain both document
relevance judgments and the associated clarification interactions are Qulac and
ClariQ. Both are based on the TREC Web Track 2009-12 collection, but cover a
very limited number of topics (237), far from enough for training
and testing conversational IR models. To fill this gap, we propose a methodology
to automatically build large-scale conversational IR datasets from ad-hoc IR
datasets in order to facilitate explorations on conversational IR. Our
methodology is based on two processes: 1) generating query clarification
interactions through query clarification and answer generators, and 2)
augmenting ad-hoc IR datasets with simulated interactions. In this paper, we
focus on MsMarco and augment it with query clarification and answer
simulations. We perform a thorough evaluation showing the quality and the
relevance of the generated interactions for each initial query. This paper
shows the feasibility and utility of augmenting ad-hoc IR datasets for
conversational IR.
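The two processes can be read as a simple pipeline; in the sketch below, `generate_clarification` and `simulate_answer` are hypothetical callables standing in for the paper's query clarification and answer generators.

```python
def augment_adhoc_dataset(queries, generate_clarification, simulate_answer):
    """Expand each ad-hoc query into a (query, clarification, answer) interaction."""
    augmented = []
    for query in queries:
        clarification = generate_clarification(query)    # process 1: mixed-initiative turn
        answer = simulate_answer(query, clarification)   # process 2: simulated user reply
        augmented.append({"query": query,
                          "clarifying_question": clarification,
                          "answer": answer})
    return augmented
```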
ConvAI3: Generating Clarifying Questions for Open-Domain Dialogue Systems (ClariQ)
This document presents a detailed description of the challenge on clarifying
questions for dialogue systems (ClariQ). The challenge is organized as part of
the Conversational AI challenge series (ConvAI3) at Search Oriented
Conversational AI (SCAI) EMNLP workshop in 2020. The main aim of
conversational systems is to return an appropriate answer in response to
user requests. However, some user requests might be ambiguous. In IR settings,
such a situation is handled mainly through diversification of the search
results page. It is, however, much more challenging in dialogue settings with
limited bandwidth. Therefore, in this challenge, we provide a common evaluation
framework to evaluate mixed-initiative conversations. Participants are asked to
rank clarifying questions in information-seeking conversations. The
challenge is organized in two stages: in Stage 1, we evaluate
submissions in an offline, single-turn setting; top
participants of Stage 1 then have their models tested by human
annotators.
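For concreteness, the Stage 1 task amounts to scoring a bank of candidate clarifying questions against the conversation context and sorting them; the TF-IDF cosine scorer below is only an illustrative baseline, not the challenge's own model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_clarifying_questions(context: str, candidates: list[str]) -> list[str]:
    """Rank candidate clarifying questions by similarity to the conversation context."""
    matrix = TfidfVectorizer().fit_transform([context] + candidates)
    scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [question for question, _ in ranked]
```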
An In-depth Investigation of User Response Simulation for Conversational Search
Conversational search has seen increased recent attention in both the IR and
NLP communities. It seeks to clarify and solve a user's search need through
multi-turn natural language interactions. However, most existing systems are
trained and demonstrated with recorded or artificial conversation logs.
Eventually, conversational search systems should be trained, evaluated, and
deployed in an open-ended setting with unseen conversation trajectories. A key
challenge is that training and evaluating such systems both require a
human-in-the-loop, which is expensive and does not scale. One strategy to
address this is to simulate users, thereby reducing the scaling costs. However,
current user simulators are either limited to responding only to yes-no
questions from the conversational search system, or unable to produce
high-quality responses in general.
In this paper, we show that the current state-of-the-art user simulation system
can be significantly improved by replacing it with a smaller but more advanced
natural language generation model. Rather than merely reporting this new
state-of-the-art, we present an in-depth investigation of the task of
simulating user response for conversational search. Our goal is to supplement
existing works with an insightful hand-analysis of what challenges are still
unsolved by the advanced model, as well as to propose our solutions for them.
The challenges we identified include (1) dataset noise, (2) a blind spot that
is difficult for existing models to learn, and (3) a specific type of
misevaluation in the standard empirical setup. Aside from the dataset noise
issue, we propose solutions to cover the training blind spot and to avoid the
misevaluation. Our proposed solutions lead to further improvements, and our
best system significantly improves on the previous state of the art.
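To illustrate what swapping in a smaller but more advanced generator might look like, the sketch below uses a T5-style sequence-to-sequence model; the checkpoint name and input format are illustrative assumptions, since the abstract does not specify them.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def simulate_response(information_need: str, clarifying_question: str) -> str:
    """Generate a simulated user response with a compact seq2seq model."""
    source = f"need: {information_need} question: {clarifying_question}"
    inputs = tokenizer(source, return_tensors="pt", truncation=True)
    output = model.generate(**inputs, max_new_tokens=40)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```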
Generating Synthetic Data for Neural Keyword-to-Question Models
Search typically relies on keyword queries, but these are often semantically
ambiguous. We propose to overcome this by offering users natural language
questions, based on their keyword queries, to disambiguate their intent. This
keyword-to-question task may be addressed using neural machine translation
techniques. Neural translation models, however, require massive amounts of
training data (keyword-question pairs), which is unavailable for this task. The
main idea of this paper is to generate large amounts of synthetic training data
from a small seed set of hand-labeled keyword-question pairs. Since natural
language questions are available in large quantities, we develop models to
automatically generate the corresponding keyword queries. Further, we introduce
various filtering mechanisms to ensure that synthetic training data is of high
quality. We demonstrate the feasibility of our approach using both automatic
and manual evaluation. This is an extended version of the article published
with the same title in the Proceedings of ICTIR'18.
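A minimal sketch of the generate-then-filter idea follows, with a stopword-based keyword extractor and a token-overlap filter standing in for the paper's learned generation models and filtering mechanisms.

```python
# Illustrative stopword list; the paper's keyword-query generators are learned models.
STOPWORDS = {"what", "is", "the", "a", "an", "of", "in", "how", "do", "does",
             "to", "for", "are", "and", "or", "why", "who", "when", "where"}

def question_to_keywords(question: str) -> str:
    """Heuristically derive a keyword query from a natural language question."""
    tokens = question.lower().rstrip("?").split()
    return " ".join(t for t in tokens if t not in STOPWORDS)

def keep_pair(question: str, keywords: str, min_overlap: float = 0.3) -> bool:
    """Simple quality filter: discard pairs whose keywords drift from the question."""
    q_tokens = set(question.lower().rstrip("?").split())
    k_tokens = set(keywords.split())
    return bool(k_tokens) and len(q_tokens & k_tokens) / len(k_tokens) >= min_overlap

def build_synthetic_pairs(questions: list[str]) -> list[tuple[str, str]]:
    """Generate candidate keyword-question pairs, then keep only those that pass the filter."""
    pairs = ((q, question_to_keywords(q)) for q in questions)
    return [(q, k) for q, k in pairs if keep_pair(q, k)]
```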