
    Adaptive Natural Language Generation for Task-oriented Dialogue via Reinforcement Learning

    When a natural language generation (NLG) component is implemented in a real-world task-oriented dialogue system, it is necessary to generate not only natural utterances as learned on training data but also utterances adapted to the dialogue environment (e.g., noise from environmental sounds) and the user (e.g., users with low levels of understanding ability). Inspired by recent advances in reinforcement learning (RL) for language generation tasks, we propose ANTOR, a method for Adaptive Natural language generation for Task-Oriented dialogue via Reinforcement learning. In ANTOR, a natural language understanding (NLU) module, which corresponds to the user's understanding of system utterances, is incorporated into the objective function of RL. If the NLG's intentions are correctly conveyed to the NLU, which understands a system's utterances, the NLG is given a positive reward. We conducted experiments on the MultiWOZ dataset, and we confirmed that ANTOR could generate utterances adapted to speech recognition errors and to the different vocabulary levels of users. Comment: Accepted by COLING 202
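The reward described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `(intent, slot, value)` dialogue-act format, the keyword-based `toy_nlu` stand-in, and the binary reward are hypothetical and are not taken from the ANTOR paper, which may define the reward differently.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# Hypothetical dialogue-act format: (intent, slot, value).
DialogueAct = Tuple[str, str, str]


def toy_nlu(utterance: str) -> FrozenSet[DialogueAct]:
    """Hypothetical keyword matcher standing in for a trained NLU model."""
    text = utterance.lower()
    acts = set()
    if "cheap" in text:
        acts.add(("inform", "price", "cheap"))
    if "italian" in text:
        acts.add(("inform", "food", "italian"))
    return frozenset(acts)


def conveyed_reward(
    intended: Iterable[DialogueAct],
    utterance: str,
    nlu: Callable[[str], FrozenSet[DialogueAct]] = toy_nlu,
) -> float:
    """Return 1.0 if the NLU recovers exactly the intended acts, else 0.0."""
    understood = nlu(utterance)
    return 1.0 if understood == frozenset(intended) else 0.0


if __name__ == "__main__":
    intended = [("inform", "price", "cheap")]
    # The NLU recovers the intended act from this wording.
    print(conveyed_reward(intended, "It is a cheap restaurant."))
    # This wording uses a word the toy NLU does not know, so nothing is recovered.
    print(conveyed_reward(intended, "It is an inexpensive restaurant."))
```

In an RL setting, a signal of this kind would be computed for each generated utterance and used to update the generator; swapping in an NLU that simulates noisy input or a limited vocabulary would change which wordings earn the reward.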

    Cognitive Architecture Toward Common Ground Sharing Among Humans and Generative AIs: Trial on Model-Model Interactions in Tangram Naming Task

    For generative AIs to be trustworthy, establishing transparent common grounding with humans is essential. As a preparation toward human-model common grounding, this study examines the process of model-model common grounding. In this context, common ground is defined as a cognitive framework shared among agents in communication, enabling the connection of symbols exchanged between agents to the meanings inherent in each agent. This connection is facilitated by a shared cognitive framework among the agents involved. In this research, we focus on the tangram naming task (TNT) as a testbed to examine the common-ground-building process. Unlike previous models designed for this task, our approach employs generative AIs to visualize the internal processes of the model. In this task, the sender constructs a metaphorical image of an abstract figure within the model and generates a detailed description based on this image. The receiver interprets the generated description from the partner by constructing another image and reconstructing the original abstract figure. Preliminary results from the study show an improvement in task performance beyond the chance level, indicating the effect of the common cognitive framework implemented in the models. Additionally, we observed that incremental backpropagations leveraging successful communication cases for a component of the model led to a statistically significant increase in performance. These results provide valuable insights into the mechanisms of common grounding made by generative AIs, improving human communication with the evolving intelligent machines in our future society. Comment: Proceedings of the 2023 AAAI Fall Symposium on Integrating Cognitive Architectures and Generative Model

    Speech and language resources for the development of dialogue systems and problems arising from their deployment

    NTT Corporation; NTT Corporation; NTT Corporation; NTT Corporation; NTT Data Corporation; NTT Data Corporation; NTT Corporation
    LREC 2018 Special Speech Sessions "Speech Resources Collection in Real-World Situations"; Phoenix Seagaia Conference Center, Miyazaki; 2018-05-09
    This paper introduces the dialogue systems (chat-oriented and argumentative dialogue systems) we have been developing at NTT together with the speech and language resources we used for building them. We also describe our field trials for deploying dialogue systems on actual premises, i.e., shops and banks. We found that the primary problem with dialogue systems is timing, which led to our current focus on multi-modal processing. We describe our multi-modal corpus as well as our recent research on multi-modal processing.

    A discourse-based approach for Arabic question answering

    The treatment of complex questions with explanatory answers involves searching for arguments in texts. Because discourse relations play a prominent role in reflecting text producers' intentions, capturing the underlying structure of a text is a good guide for this task. Our extensive review found no currently available system for automatic discourse analysis that builds full rhetorical structures for large-scale Arabic texts. This is due to the high computational complexity involved in processing the large number of hypothesized relations associated with large texts. Therefore, more practical approaches should be investigated. This paper presents a new Arabic Text Parser oriented toward question answering systems dealing with لماذا “why” and كيف “how to” questions. The Text Parser presented here considers the sentence as the basic unit of text and incorporates a set of heuristics to avoid computational explosion. With this approach, the developed question answering system achieved a significant improvement over the baseline, with a Recall of 68% and an MRR of 0.62.
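For readers unfamiliar with the two metrics reported in this abstract, here is a minimal sketch of how Recall and mean reciprocal rank (MRR) are commonly computed. These are the standard textbook definitions, assumed here for illustration; the paper may define Recall differently for its question answering setting, and the function names and sample numbers are hypothetical.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_correct_ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all questions.

    Each entry is the 1-based rank of the first correct answer for one
    question, or None when no correct answer was returned (contributes 0).
    """
    if not first_correct_ranks:
        raise ValueError("at least one question is required")
    total = sum(1.0 / rank for rank in first_correct_ranks if rank is not None)
    return total / len(first_correct_ranks)


def recall(num_correct_returned: int, num_relevant: int) -> float:
    """Fraction of the relevant answers that the system actually returned."""
    if num_relevant <= 0:
        raise ValueError("num_relevant must be positive")
    return num_correct_returned / num_relevant


if __name__ == "__main__":
    # Hypothetical ranks for four questions; the third was never answered correctly.
    print(mean_reciprocal_rank([1, 2, None, 4]))
    # Hypothetical counts: 3 of 4 relevant answers were returned.
    print(recall(3, 4))
```

Under these definitions, an MRR of 0.62 means the reciprocal rank of the first correct answer averages 0.62 across the test questions.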

    What Was Gained from the Dialogue System Live Competition

    NTT Media Intelligence Laboratories, NTT Corporation; Kyoto University; The University of Electro-Communications; NTT DOCOMO INC.; Fujitsu Laboratories, LTD.; Tohoku University / RIKEN AIP; National Institute for Japanese Language and Linguistics; National Institute for Japanese Language and Linguistics; NTT Communication Science Laboratories, NTT Corporation

    Learning to generate naturalistic utterances using reviews in spoken dialogue systems

    Spoken language generation for dialogue systems requires a dictionary of mappings between semantic representations of concepts the system wants to express and realizations of those concepts. Dictionary creation is a costly process; it is currently done by hand for each dialogue domain. We propose a novel unsupervised method for learning such mappings from user reviews in the target domain, and test it on restaurant reviews. We test the hypothesis that user reviews that provide individual ratings for distinguished attributes of the domain entity make it possible to map review sentences to their semantic representation with high precision. Experimental analyses show that the mappings learned cover most of the domain ontology, and provide good linguistic variation. A subjective user evaluation shows that the consistency between the semantic representations and the learned realizations is high and that the naturalness of the realizations is higher than that of a hand-crafted baseline.