
    Usefulness, localizability, humanness, and language-benefit: additional evaluation criteria for natural language dialogue systems

    Human–computer dialogue systems interact with human users using natural language. We used the ALICE/AIML chatbot architecture as a platform to develop a range of chatbots covering different languages, genres, text-types, and user-groups, to illustrate qualitative aspects of natural language dialogue system evaluation. We present some of the different evaluation techniques used for natural language dialogue systems, including black-box and glass-box, comparative, quantitative, and qualitative evaluation. Four aspects of NLP dialogue system evaluation are often overlooked: "usefulness" in terms of a user's qualitative needs, "localizability" to new genres and languages, "humanness" or "naturalness" compared to human–human dialogues, and "language benefit" compared to alternative interfaces. We illustrate these aspects with respect to our work on machine-learnt chatbot dialogue systems; we believe these aspects are worthwhile for impressing potential new users and customers.

    User Simulation in Dialogue Systems using Inverse Reinforcement Learning

    Spoken Dialogue Systems (SDS) are man-machine interfaces which use natural language as the medium of interaction. Collecting dialogue corpora for the purpose of training and evaluating dialogue systems is an expensive process. User simulators aim at simulating human users in order to generate synthetic data. Existing methods for user simulation mainly focus on generating data with the same statistical consistency as in some reference dialogue corpus. This paper outlines a novel approach for user simulation based on Inverse Reinforcement Learning (IRL). The task of building the user simulator is perceived as a task of imitation learning.
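    To make the imitation-learning framing concrete, here is a minimal Python sketch, not the approach from the paper: a linear reward over hand-crafted dialogue features is fitted so that a softmax user policy reproduces observed user acts from a reference corpus, and the fitted policy is then sampled to produce synthetic turns. The dialogue acts, feature map, and tiny corpus below are invented for illustration.

        # Minimal sketch (not the paper's algorithm): learn a linear reward over
        # hand-crafted dialogue features so that a softmax policy under that reward
        # imitates user turns from a reference corpus, then sample synthetic users.
        import numpy as np

        ACTIONS = ["inform", "confirm", "negate", "bye"]   # toy user dialogue acts

        def features(state, action):
            """Toy feature map phi(s, a): action indicator crossed with a state flag."""
            f = np.zeros(2 * len(ACTIONS))
            i = ACTIONS.index(action)
            f[i] = 1.0
            f[len(ACTIONS) + i] = float(state["system_asked_confirmation"])
            return f

        def policy_probs(state, w):
            """Softmax user policy induced by the current reward weights w."""
            scores = np.array([w @ features(state, a) for a in ACTIONS])
            scores -= scores.max()
            p = np.exp(scores)
            return p / p.sum()

        def fit_reward(demonstrations, lr=0.1, epochs=50):
            """Push the reward toward demonstrated acts and away from the
            current policy's expected behaviour (a max-entropy-style update)."""
            w = np.zeros(2 * len(ACTIONS))
            for _ in range(epochs):
                for state, user_action in demonstrations:
                    p = policy_probs(state, w)
                    expected = sum(pi * features(state, a) for pi, a in zip(p, ACTIONS))
                    w += lr * (features(state, user_action) - expected)
            return w

        def simulate_user_turn(state, w, rng=np.random.default_rng(0)):
            return rng.choice(ACTIONS, p=policy_probs(state, w))

        # Tiny fabricated corpus of (dialogue state, observed user act) pairs.
        demos = [({"system_asked_confirmation": 1}, "confirm"),
                 ({"system_asked_confirmation": 0}, "inform")]
        w = fit_reward(demos)
        print(simulate_user_turn({"system_asked_confirmation": 1}, w))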

    Statistical natural language generation for dialogue systems based on hierarchical models

    Due to the increasing presence of natural-language interfaces in our lives, natural language processing (NLP) is gaining more popularity every year. Until recently, however, most research activity in this area was aimed at Natural Language Understanding (NLU), which is responsible for extracting meaning from natural language input. This is explained by the wider range of practical applications of NLU, such as machine translation, whereas Natural Language Generation (NLG) was mainly used for providing output interfaces and was considered more a user-interface problem than a functionality issue. Generally speaking, NLG is the process of generating text from a semantic representation, which can be expressed in many different forms.

    A common application of NLG is in so-called Spoken Dialogue Systems (SDS), where the user interacts directly by voice with a computer-based system to receive information or perform certain actions, for example buying a plane ticket or booking a table in a restaurant. Dialogue systems represent one of the most interesting applications within the field of speech technologies. Usually the NLG part of such systems was provided by templates, only filling canned gaps with the requested information. Nowadays, since SDS are increasing in complexity, more advanced and user-friendly interfaces should be provided, creating a need for a more refined and adaptive approach. One of the solutions to be considered is NLG models based on statistical frameworks, where the system's response is generated in real time and adjusted to the user's performance, instead of just choosing a pertinent template. Owing to the corpus-based approach, these systems are easy to adapt to different tasks across a range of informational domains.

    The aim of this work is to present a statistical approach to utterance generation which uses the cooperation of two different language models (LMs) to enhance the efficiency of the NLG module. At the higher level, a class-based language model builds the syntactic structure of the sentence; in the second layer, a specific language model acts inside each class, dealing with the words. In the dialogue system described in this work, a user asks for information regarding bus schedules, route schemes, fares, and special information. In each dialogue the user therefore has a specific goal, which needs to be met by the system. This can be used as one of the methods to measure system performance, together with the appropriateness of the generated utterances and the average dialogue length, which is important for an interactive information system.

    The work is organized as follows. In Section 2 the basic approaches to the NLG task are described and their advantages and disadvantages are considered. Section 3 presents the objective of this work. In Section 4 the basic model and its novelty are explained. In Section 5 the details of the task and the corpora employed are presented. Section 6 contains the experimental results and their explanation, as well as the evaluation of the obtained results. Section 7 summarizes the conclusions and proposals for future work.
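    The two-layer idea can be illustrated with a toy Python sketch (not the models or data from this work): a class-level bigram model proposes a sequence of word classes that acts as the syntactic skeleton, and a per-class word model then realizes each class as concrete words. All class names, vocabularies, and probabilities below are invented.

        # Minimal sketch of the two-layer generation idea (illustrative only):
        # layer 1 samples a class sequence, layer 2 fills each class with words.
        import random

        # Layer 1: class-based language model (bigram transitions over word classes).
        CLASS_BIGRAMS = {
            "<s>":      {"GREETING": 0.3, "LINE": 0.7},
            "GREETING": {"LINE": 1.0},
            "LINE":     {"DEPARTS": 1.0},
            "DEPARTS":  {"TIME": 1.0},
            "TIME":     {"</s>": 1.0},
        }

        # Layer 2: per-class word models (unigram distributions inside each class).
        CLASS_WORDS = {
            "GREETING": {"hello,": 0.5, "hi,": 0.5},
            "LINE":     {"bus 12": 0.6, "line 12": 0.4},
            "DEPARTS":  {"departs at": 0.7, "leaves at": 0.3},
            "TIME":     {"10:30": 0.5, "14:45": 0.5},
        }

        def sample(dist, rng):
            return rng.choices(list(dist), weights=dist.values(), k=1)[0]

        def generate(rng=random.Random(0)):
            classes, current = [], "<s>"
            while True:
                current = sample(CLASS_BIGRAMS[current], rng)
                if current == "</s>":
                    break
                classes.append(current)
            return " ".join(sample(CLASS_WORDS[c], rng) for c in classes)

        print(generate())   # e.g. "hello, bus 12 departs at 10:30"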

    Learning to Map Natural Language to Executable Programs Over Databases

    Natural language is a fundamental form of information and communication and is becoming the next frontier in computer interfaces. As the amount of data available online has increased exponentially, so has the need for Natural Language Interfaces (NLIs; in this thesis the acronym does not refer to natural language inference) that connect data and users through natural language, greatly increasing the accessibility and efficiency of information access for many users beyond data experts. All consumer-facing software will one day have a dialogue interface, and this is the next vital leap in the evolution of search engines. Such intelligent dialogue systems should understand the meaning of language grounded in various contexts and generate effective language responses in different forms for information requests and human-computer communication. Developing these intelligent systems is challenging due to (1) limited benchmarks to drive advancements, (2) alignment mismatches between natural language and formal programs, (3) lack of trustworthiness and interpretability, (4) context dependencies in both human conversational interactions and the target programs, and (5) joint language understanding between dialogue questions and NLI environments (e.g., databases and knowledge graphs). This dissertation presents several datasets, neural algorithms, and language models to address these challenges and to develop deep learning technologies for conversational natural language interfaces (more specifically, NLIs to Databases, or NLIDB). First, to drive advancements towards neural-based conversational NLIs, we design and propose several complex and cross-domain NLI benchmarks and introduce several datasets. These datasets enable the training of large deep learning models, with evaluation on unseen databases (e.g., about course arrangements). Systems must generalize not only to new SQL queries but also to unseen database schemas to perform well on these tasks. Furthermore, in real-world applications, users often access information in a multi-turn interaction with the system by asking a sequence of related questions. Users may explicitly refer to or omit previously mentioned entities and constraints, and may introduce refinements, additions, or substitutions to what has already been said. Some of these benchmarks therefore require systems to model dialogue dynamics and generate natural language explanations for user verification. The full dialogue interaction, including the system's responses, is also important as it supports clarifying ambiguous questions, verifying returned results, and notifying users of unanswerable or unrelated questions. A robust dialogue-based NLI system that can engage with users by forming its own responses has thus become an increasingly necessary component of the query process. Moreover, this thesis presents scalable algorithms designed to parse complex and sequential questions into formal programs (e.g., mapping questions to SQL queries that can be executed against databases). We propose a novel neural model that utilizes type information from knowledge graphs to better understand rare entities and numbers in natural language questions. We also introduce a neural model based on syntax tree neural networks, which was the first methodology proposed for generating complex programs from language.

    Finally, language modeling creates contextualized vector representations of words by training a model to predict the next word given the context words; these representations are the basis of deep learning for NLP. Recently, pre-trained language models such as BERT and RoBERTa have achieved tremendous success in many natural language processing tasks such as text understanding and reading comprehension. However, most language models are pre-trained only on free text such as Wikipedia articles and books. Given that language in semantic parsing is usually related to formal representations such as logic forms and SQL queries and has to be grounded in structural environments (e.g., databases), we propose better language models for NLIs by enforcing such compositional interpolation in them. To demonstrate that they can better jointly understand dialogue questions and NLI environments (e.g., databases and knowledge graphs), we show that these language models achieve new state-of-the-art results on seven representative tasks in semantic parsing, dialogue state tracking, and question answering. Our proposed pre-training method is also much more effective than prior work.
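    As a concrete illustration of the "executable programs over databases" setting, the Python sketch below shows only the execution half of an NLIDB pipeline: a semantic parser (not implemented here) would map the question to SQL, so the predicted query is hard-coded to keep the example self-contained, and it is run against an in-memory SQLite database standing in for the target schema. The table, question, and query are invented.

        # Minimal sketch of executing a (hypothetically model-predicted) SQL query
        # against a database, the final step of a text-to-SQL NLI pipeline.
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
            CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT, credits INTEGER);
            INSERT INTO course VALUES (1, 'Databases', 4), (2, 'NLP', 3), (3, 'Vision', 3);
        """)

        question = "How many courses are worth 3 credits?"
        # In a real system: predicted_sql = semantic_parser(question, schema)
        predicted_sql = "SELECT COUNT(*) FROM course WHERE credits = 3;"

        try:
            result = conn.execute(predicted_sql).fetchall()
            print(question, "->", result[0][0])   # -> 2
        except sqlite3.Error as err:
            # Execution errors are one signal a dialogue NLI can surface back to
            # the user ("I couldn't answer that") instead of failing silently.
            print("Could not execute predicted SQL:", err)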

    Conversational OLAP in Action

    The democratization of data access and the adoption of OLAP in scenarios requiring hands-free interfaces push towards the creation of smart OLAP interfaces. In this demonstration we present COOL, a tool supporting natural language COnversational OLap sessions. COOL interprets and translates a natural language dialogue into an OLAP session that starts with a GPSJ (Generalized Projection, Selection and Join) query. The interpretation relies on a formal grammar and a knowledge base storing metadata from a multidimensional cube. COOL is portable, robust, and requires minimal user intervention. It adopts an n-gram-based model and a string-similarity function to match known entities in the natural language description. In case of an incomplete text description, COOL can obtain the correct query either through automatic inference or through interactions with the user to disambiguate the text. The goal of the demonstration is to let the audience evaluate the usability of COOL and its capabilities in assisting query formulation and ambiguity/error resolution.
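    The n-gram plus string-similarity matching step can be sketched in Python as follows (an illustration, not COOL's actual implementation): candidate n-grams from the user's utterance are compared against known cube entities with a similarity ratio, and the best match above a threshold is kept per entity. The metadata entities and the utterance below are invented.

        # Minimal sketch: match n-grams of the utterance to known cube metadata
        # entities using a string-similarity function, keeping the best match.
        from difflib import SequenceMatcher

        KNOWN_ENTITIES = ["product category", "store city", "unit sales", "quarter"]

        def ngrams(tokens, max_n=3):
            for n in range(1, max_n + 1):
                for i in range(len(tokens) - n + 1):
                    yield " ".join(tokens[i:i + n])

        def match_entities(utterance, threshold=0.8):
            matches = {}
            for gram in ngrams(utterance.lower().split()):
                for entity in KNOWN_ENTITIES:
                    score = SequenceMatcher(None, gram, entity).ratio()
                    if score >= threshold and score > matches.get(entity, (None, 0))[1]:
                        matches[entity] = (gram, score)
            return matches

        print(match_entities("show unit sales by product categories for each quartr"))
        # prints roughly: {'quarter': ('quartr', 0.92), 'unit sales': ('unit sales', 1.0),
        #                  'product category': ('product categories', 0.88)}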

    Programming by Example and Text-to-Code Translation for Conversational Code Generation

    Dialogue systems are an increasingly popular task in natural language processing. However, their dialogue paths tend to be deterministic and restricted to the system's rails, regardless of the given request or input text. Recent advances in program synthesis have led to systems that can synthesize programs from very general search spaces, e.g. Programming by Example, and to systems with very accessible interfaces for writing programs, e.g. text-to-code translation, but no system has achieved both of these qualities at once. We propose Modular Programs for Text-guided Hierarchical Synthesis (MPaTHS), a method for integrating Programming-by-Example and text-to-code systems which offers an accessible natural language interface for synthesizing general programs. We present a program representation that allows our method to be applied to the problem of task-oriented dialogue. Finally, we demo MPaTHS using our program representation.
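    A toy Python sketch of the Programming-by-Example half of this idea (not MPaTHS itself): enumerate short compositions of a tiny string-transformation DSL and return the first program consistent with all given input/output examples. The DSL operations and the examples are invented for illustration.

        # Minimal Programming-by-Example sketch: enumerative synthesis over a
        # tiny DSL of unary string operations, checked against I/O examples.
        from itertools import product

        OPS = {
            "lower": str.lower,
            "strip": str.strip,
            "first_word": lambda s: s.split()[0] if s.split() else s,
        }

        def synthesize(examples, max_len=3):
            """Return the shortest op sequence whose composition matches all examples."""
            for length in range(1, max_len + 1):
                for names in product(OPS, repeat=length):
                    def run(s, names=names):
                        for name in names:
                            s = OPS[name](s)
                        return s
                    if all(run(inp) == out for inp, out in examples):
                        return names
            return None

        examples = [("  Hello World ", "hello"), (" FOO bar", "foo")]
        print(synthesize(examples))   # ('lower', 'first_word')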