68 research outputs found
Adaptive Natural Language Generation for Task-oriented Dialogue via Reinforcement Learning
When a natural language generation (NLG) component is implemented in a
real-world task-oriented dialogue system, it is necessary to generate not only
natural utterances as learned on training data but also utterances adapted to
the dialogue environment (e.g., noise from environmental sounds) and the user
(e.g., users with low levels of understanding ability). Inspired by recent
advances in reinforcement learning (RL) for language generation tasks, we
propose ANTOR, a method for Adaptive Natural language generation for
Task-Oriented dialogue via Reinforcement learning. In ANTOR, a natural language
understanding (NLU) module, which corresponds to the user's understanding of
system utterances, is incorporated into the objective function of RL. If the
NLG's intentions are correctly conveyed to the NLU, which understands a
system's utterances, the NLG is given a positive reward. We conducted
experiments on the MultiWOZ dataset, and we confirmed that ANTOR could generate
adaptive utterances against speech recognition errors and the different
vocabulary levels of users. Comment: Accepted by COLING 202
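The reward idea in the abstract above (a positive reward when the NLU recovers what the NLG meant to convey) can be sketched as follows. This is only an illustration under stated assumptions: the `(intent, slot, value)` tuple representation, the function name `conveyance_reward`, and the use of an F1 overlap score are hypothetical choices, not the reward definition given by the ANTOR authors.

```python
from typing import Iterable, Tuple

# Hypothetical representation of one dialogue act: (intent, slot, value).
DialogueAct = Tuple[str, str, str]


def conveyance_reward(intended: Iterable[DialogueAct],
                      recognized: Iterable[DialogueAct]) -> float:
    """Return an F1 overlap in [0, 1] between the acts the NLG meant to
    convey and the acts an NLU recovered from the generated utterance.

    Assumption: F1 is an illustrative stand-in for "correctly conveyed";
    the abstract does not specify the exact reward.
    """
    intended_set, recognized_set = set(intended), set(recognized)
    if not intended_set and not recognized_set:
        return 1.0  # nothing to convey and nothing recognized
    overlap = len(intended_set & recognized_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(recognized_set)
    recall = overlap / len(intended_set)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    meant = [("inform", "area", "north"), ("inform", "price", "cheap")]
    understood = [("inform", "area", "north")]
    print(conveyance_reward(meant, understood))
```

In an RL setup, a score like this would be computed per generated utterance and used as the scalar reward for the policy update; a noisy channel (for example, simulated speech recognition errors) would sit between the NLG output and the NLU input.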
Cognitive Architecture Toward Common Ground Sharing Among Humans and Generative AIs: Trial on Model-Model Interactions in Tangram Naming Task
For generative AIs to be trustworthy, establishing transparent common
grounding with humans is essential. As a preparation toward human-model common
grounding, this study examines the process of model-model common grounding. In
this context, common ground is defined as a cognitive framework shared among
agents in communication, enabling the connection of symbols exchanged between
agents to the meanings inherent in each agent. This connection is facilitated
by a shared cognitive framework among the agents involved. In this research, we
focus on the tangram naming task (TNT) as a testbed to examine the
common-ground-building process. Unlike previous models designed for this task,
our approach employs generative AIs to visualize the internal processes of the
model. In this task, the sender constructs a metaphorical image of an abstract
figure within the model and generates a detailed description based on this
image. The receiver interprets the generated description from the partner by
constructing another image and reconstructing the original abstract figure.
Preliminary results from the study show an improvement in task performance
beyond the chance level, indicating the effect of the common cognitive
framework implemented in the models. Additionally, we observed that incremental
backpropagations leveraging successful communication cases for a component of
the model led to a statistically significant increase in performance. These
results provide valuable insights into the mechanisms of common grounding made
by generative AIs, improving human communication with the evolving intelligent
machines in our future society. Comment: Proceedings of the 2023 AAAI Fall Symposium on Integrating Cognitive
Architectures and Generative Model
Effects of Demonstrating Consensus Between Robots to Change User’s Opinion
The version of record of this article, first published in International Journal of Social Robotics, is available online at the publisher’s website: https://doi.org/10.1007/s12369-024-01151-z. In recent years, humanoid robots that can change users’ opinions have been studied extensively. In particular, two robots have been found to be more persuasive when they cooperate with each other in a sophisticated manner. Previous studies have evaluated changes in opinion when robots demonstrated consensus building. However, users did not participate in those conversations, and the optimal strategy may depend on their prior opinions. Therefore, in this study, we developed a system that adaptively changes conversations between robots based on user opinions. We investigate the effect on opinion change when the discussion converges to the same position as the user and when it converges to a different position. We conducted two subject experiments in which a user and virtual robotic agents talked to each other using buttons in a crowded setting. The results showed that users who were confident in their opinions became more confident when the robot agents’ opinions converged to the same position and less confident when the robot agents’ opinions converged to a different position. These results will contribute to persuasion research using multiple robots and to the development of advanced dialogue coordination between robots.
Speech and language resources for the development of dialogue systems and problems arising from their deployment
NTT Corporation; NTT Corporation; NTT Corporation; NTT Corporation; NTT Data Corporation; NTT Data Corporation; NTT Corporation. LREC 2018 Special Speech Sessions "Speech Resources Collection in Real-World Situations"; Phoenix Seagaia Conference Center, Miyazaki; 2018-05-09. This paper introduces the dialogue systems (chat-oriented and argumentative dialogue systems) we have been developing at NTT together with the speech and language resources we used for building them. We also describe our field trials for deploying dialogue systems on actual premises, i.e., shops and banks. We found that the primary problem with dialogue systems is timing, which led to our current focus on multi-modal processing. We describe our multi-modal corpus as well as our recent research on multi-modal processing.
A discourse-based approach for Arabic question answering
The treatment of complex questions with explanatory answers involves searching for arguments in texts. Because discourse relations play a prominent role in reflecting text producers’ intentions, capturing the underlying structure of a text provides a good guide for this task. According to our extensive review, no system for automatic discourse analysis that creates full rhetorical structures for large-scale Arabic texts is currently available. This is due to the high computational complexity involved in processing the large number of hypothesized relations associated with large texts. Therefore, more practical approaches should be investigated. This paper presents a new Arabic Text Parser oriented toward question answering systems dealing with لماذا “why” and كيف “how to” questions. The Text Parser presented here considers the sentence as the basic unit of text and incorporates a set of heuristics to avoid computational explosion. With this approach, the developed question answering system achieved a significant improvement over the baseline, with a Recall of 68% and an MRR of 0.62.
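For readers unfamiliar with the MRR figure quoted above, the following minimal sketch computes mean reciprocal rank under its standard definition: the average, over questions, of 1/rank of the first correct answer, counting 0 when no correct answer is returned. The function name and the input convention (`None` for a question with no correct answer) are assumptions made for illustration; the abstract does not describe the paper's exact evaluation protocol.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_correct_ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all questions.

    Each entry is the 1-based rank of the first correct answer for one
    question, or None if no correct answer was returned (contributes 0).
    """
    if not first_correct_ranks:
        return 0.0
    total = 0.0
    for rank in first_correct_ranks:
        if rank is not None:
            if rank < 1:
                raise ValueError("ranks are 1-based and must be >= 1")
            total += 1.0 / rank
    return total / len(first_correct_ranks)


if __name__ == "__main__":
    # Three questions: correct answer at rank 1, at rank 2, and not found.
    print(mean_reciprocal_rank([1, 2, None]))
```

With the three hypothetical questions in the usage example, the reciprocal ranks are 1, 1/2, and 0, so the mean is 0.5.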
What Did We Gain from the Dialogue System Live Competition?
NTT Media Intelligence Laboratories, NTT Corporation; Kyoto University; The University of Electro-Communications; NTT DOCOMO INC.; Fujitsu Laboratories, LTD.; Tohoku University / RIKEN AIP; National Institute for Japanese Language and Linguistics; National Institute for Japanese Language and Linguistics; NTT Communication Science Laboratorie
Learning to generate naturalistic utterances using reviews in spoken dialogue systems
Spoken language generation for dialogue systems requires a dictionary of mappings between semantic representations of concepts the system wants to express and realizations of those concepts. Dictionary creation is a costly process; it is currently done by hand for each dialogue domain. We propose a novel unsupervised method for learning such mappings from user reviews in the target domain, and test it on restaurant reviews. We test the hypothesis that user reviews that provide individual ratings for distinguished attributes of the domain entity make it possible to map review sentences to their semantic representation with high precision. Experimental analyses show that the mappings learned cover most of the domain ontology, and provide good linguistic variation. A subjective user evaluation shows that the consistency between the semantic representations and the learned realizations is high and that the naturalness of the realizations is higher than a hand-crafted baseline.