
    ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers

    Automatic speech recognition (ASR) via call is essential for various applications, including AI for contact center (AICC) services. Despite advances in ASR, however, most publicly available call-based speech corpora, such as Switchboard, are dated. Also, most existing call corpora are in English and mainly focus on open-domain dialog or general scenarios such as audiobooks. Here we introduce the ClovaCall corpus, a new large-scale Korean call-based speech corpus under a goal-oriented dialog scenario, collected from more than 11,000 people. ClovaCall includes approximately 60,000 pairs of a short sentence and its corresponding spoken utterance in a restaurant reservation domain. We validate the effectiveness of our dataset with intensive experiments using two standard ASR models. Furthermore, we release the ClovaCall dataset and baseline source code at https://github.com/ClovaAI/ClovaCall. Comment: 5 pages, 2 figures, 4 tables, The first two authors equally contributed to this wor

    Evaluating Competing Agent Strategies for a Voice Email Agent

    This paper reports experimental results comparing a mixed-initiative to a system-initiative dialog strategy in the context of a personal voice email agent. To independently test the effects of dialog strategy and user expertise, users interact with either the system-initiative or the mixed-initiative agent to perform three successive tasks which are identical for both agents. We report performance comparisons across agent strategies as well as over tasks. This evaluation utilizes and tests the PARADISE evaluation framework, and discusses the performance function derivable from the experimental data. Comment: 6 pages latex, uses icassp91.sty, psfi

    An improved multi-agent simulation methodology for modelling and evaluating wireless communication systems resource allocation algorithms

    Multi-Agent Systems (MAS) constitute a well-known approach to modelling dynamical real-world systems. Recently, this technology has been applied to Wireless Communication Systems (WCS), where efficient resource allocation is a primary goal, for modelling the physical entities involved, such as Base Stations (BS), service providers and network operators. This paper presents a novel approach to applying MAS methodology to WCS resource allocation by modelling more abstract entities involved in WCS operation, especially the concurrent network procedures (services). Due to the concurrent nature of a WCS, MAS technology is a suitable modelling solution. Services such as new call admission, handoff, user movement and call termination are independent of one another and may occur at the same time for many different users in the network. Thus, the network procedures required to support these services act autonomously, interact with the network environment (e.g. gather information such as interference conditions) and take decisions (e.g. call establishment), and can therefore be modelled as agents. Under this novel simulation approach, agent cooperation in terms of negotiation and agreement becomes a critical issue. To this end, two negotiation strategies are presented and evaluated in this research effort; of the two, the distributed negotiation and communication scheme between network agents is shown to be highly efficient in terms of network performance. The multi-agent concept adapted to the concurrent nature of large-scale WCS is also discussed in this paper.

    Learning End-to-End Goal-Oriented Dialog with Multiple Answers

    In a dialog, there can be multiple valid next utterances at any point. Present end-to-end neural methods for dialog do not take this into account. They learn with the assumption that at any time there is only one correct next utterance. In this work, we focus on this problem in the goal-oriented dialog setting, where there are different paths to reach a goal. We propose a new method that uses a combination of supervised learning and reinforcement learning approaches to address this issue. We also propose a new and more effective testbed, permuted-bAbI dialog tasks, by introducing multiple valid next utterances to the original-bAbI dialog tasks, which allows evaluation of goal-oriented dialog systems in a more realistic setting. We show that there is a significant drop in the performance of existing end-to-end neural methods, from 81.5% per-dialog accuracy on original-bAbI dialog tasks to 30.3% on permuted-bAbI dialog tasks. We also show that our proposed method improves the performance and achieves 47.3% per-dialog accuracy on permuted-bAbI dialog tasks. Comment: EMNLP 2018. permuted-bAbI dialog tasks are available at - https://github.com/IBM/permuted-bAbI-dialog-task

    IMAGINE Final Report


    Contextual Out-of-Domain Utterance Handling With Counterfeit Data Augmentation

    Neural dialog models often lack robustness to anomalous user input and produce inappropriate responses, which leads to a frustrating user experience. Although there is a set of prior approaches to out-of-domain (OOD) utterance detection, they share a few restrictions: they rely on OOD data or multiple sub-domains, and their OOD detection is context-independent, which leads to suboptimal performance in a dialog. The goal of this paper is to propose a novel OOD detection method that does not require OOD data by utilizing counterfeit OOD turns in the context of a dialog. To foster further research, we also release new dialog datasets, which are 3 publicly available dialog corpora augmented with OOD turns in a controllable way. Our method outperforms state-of-the-art dialog models equipped with a conventional OOD detection mechanism by a large margin in the presence of OOD utterances. Comment: ICASSP 201