ConceptNet infused DialoGPT for Underlying Commonsense Understanding and Reasoning in Dialogue Response Generation
Pre-trained conversational models still fail to capture the implicit
commonsense (CS) knowledge hidden in dialogue interaction, even though they
were pre-trained on an enormous dataset. To build a dialogue agent
with CS capability, we first inject external knowledge into a pre-trained
conversational model to establish basic commonsense through efficient Adapter
tuning (Section 4). Second, we propose the "two-way learning" method to
establish the bidirectional relationship between CS knowledge and sentence pairs,
so that the model can generate a sentence given CS triplets and also generate
the underlying CS knowledge given a sentence (Section 5). Finally, we leverage
this integrated CS capability to improve open-domain dialogue response
generation, so that the dialogue agent can understand the CS
knowledge hidden in the dialogue history and infer other related
knowledge to further guide response generation (Section 6). The experimental
results demonstrate that CS_Adapter fusion enables DialoGPT to
generate a series of CS knowledge, and that the DialoGPT+CS_Adapter response model
adapted from CommonGen training can generate underlying CS triplets that fit
the dialogue context better.
Comment: this is a long paper, the short version was accepted by SemDial 202
Unified Conversational Models with System-Initiated Transitions between Chit-Chat and Task-Oriented Dialogues
Spoken dialogue systems (SDSs) have been developed separately under two
different categories, task-oriented and chit-chat. The former focuses on
achieving functional goals and the latter aims at creating engaging social
conversations without special goals. Creating a unified conversational model
that can engage in both chit-chat and task-oriented dialogue has become a promising
research topic in recent years. However, the potential "initiative" that
occurs when there is a change between dialogue modes in one dialogue has rarely
been explored. In this work, we investigate two kinds of dialogue scenarios:
one starts from chit-chat that implicitly involves task-related topics and finally
switches to task-oriented requests; the other starts from task-oriented
interaction and eventually changes to casual chat after all requested
information is provided. We contribute two efficient prompt models which can
proactively generate a transition sentence to trigger system-initiated
transitions in a unified dialogue model. One is a discrete prompt model trained
with two discrete tokens; the other is a continuous prompt model using
continuous prompt embeddings automatically generated by a classifier. We
furthermore show that the continuous prompt model can also be used to guide
proactive transitions between particular domains in a multi-domain
task-oriented setting.
Comment: accepted by CUI 202
System-Initiated Transitions from Chit-Chat to Task-Oriented Dialogues with Transition Info Extractor and Transition Sentence Generator
In this work, we study dialogue scenarios that start from chit-chat but eventually switch to task-related services, and investigate how a unified dialogue model, which can engage in both chit-chat and task-oriented dialogues, takes the initiative during the dialogue mode transition from chit-chat to task-oriented in a coherent and cooperative manner. We first build a transition info extractor (TIE) that keeps track of the preceding chit-chat interaction and detects the potential user intention to switch to a task-oriented service. Meanwhile, in the unified model, a transition sentence generator (TSG) is extended through efficient Adapter tuning and transition prompt learning. When the TIE finds task-related information in the preceding chit-chat, such as a transition domain (“train” in the figure showing a system-initiated transition from chit-chat to task-oriented dialogue), the TSG is activated automatically in the unified model to initiate this transition by generating a transition sentence under the guidance of the transition information extracted by the TIE. The experimental results show promising performance regarding the proactive transitions. We achieve a large additional improvement in the TIE model by utilizing Conditional Random Fields (CRF). The TSG can flexibly generate transition sentences while maintaining the unified capabilities of normal chit-chat and task-oriented response generation.
Development of a Trust-Aware User Simulator for Statistical Proactive Dialog Modeling in Human-AI Teams
The concept of a Human-AI team has gained increasing attention in recent
years. For effective collaboration between humans and AI teammates, proactivity
is crucial for close coordination and effective communication. However, the
design of adequate proactivity for AI-based systems to support humans is still
an open question and a challenging topic. In this paper, we present the
development of a corpus-based user simulator for training and testing proactive
dialog policies. The simulator incorporates informed knowledge about proactive
dialog and its effect on user trust and simulates user behavior and personal
information, including socio-demographic features and personality traits. Two
different simulation approaches were compared, and a task-step-based approach
yielded better overall results due to enhanced modeling of sequential
dependencies. This research presents a promising avenue for exploring and
evaluating appropriate proactive strategies in a dialog game setting for
improving Human-AI teams.
Comment: Preprint Version submitted to ACM UMA
When to Say What and How: Adapting the Elaborateness and Indirectness of Spoken Dialogue Systems
With the aim of designing a spoken dialogue system that can adapt to the user's communication idiosyncrasies, we investigate whether insights from the usage of communication styles in human-human interaction can be carried over to human-computer interaction. An extensive literature review demonstrates that communication styles play an important role in human communication. Using a multi-lingual data set, we show that there is a significant correlation between the communication style of the system and the preceding communication style of the user. We therefore present two components that extend the standard architecture of spoken dialogue systems: 1) a communication style classifier that automatically identifies the user communication style and 2) a communication style selection module that selects an appropriate system communication style. We consider the communication styles elaborateness and indirectness because they have been shown to influence the user's satisfaction and the user's perception of a dialogue. We present a neural classification approach based on supervised learning for each task. Neural networks are trained and evaluated with features that can be automatically derived during an ongoing interaction in every spoken dialogue system. Both components yield solid results and outperform the baseline, a majority-class classifier.
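The majority-class baseline mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy style labels ("direct", "indirect") are assumptions made for the example.

```python
from collections import Counter


def fit_majority_class(train_labels):
    """Return the most frequent label in the training set."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predictions, gold):
    """Fraction of positions where prediction and gold label agree."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold) if gold else 0.0


# Toy labels, invented for illustration only.
train_labels = ["direct", "direct", "indirect"]
test_labels = ["direct", "indirect", "direct", "direct"]

majority = fit_majority_class(train_labels)
# The baseline ignores all input features and always predicts the same label.
baseline_predictions = [majority] * len(test_labels)
baseline_accuracy = accuracy(baseline_predictions, test_labels)
```

Any trained classifier has to beat this score to show that it uses the input features at all, which is why it serves as a floor on skewed label distributions.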
ProDial – an annotated proactive dialogue act corpus for conversational assistants using crowdsourcing
Proactive behaviour is an integral interaction concept of both human-human and human-computer cooperation. However, modelling proactive systems and appropriate interaction strategies remains an open question. In this work, a parameterised and annotated dialogue corpus has been created. The corpus is based on human interactions with an autonomous agent embedded in a serious game setting. For modelling proactive dialogue behaviour, the agent was capable of selecting from four different proactive actions (None, Notification, Suggestion, Intervention) in order to serve as the user’s personal advisor in a sequential planning task. Data was collected online using crowdsourcing (308 participants), resulting in a total of 3696 system-user exchanges. Data was annotated with objective features as well as subjective self-reported features to capture the interplay between proactive behaviour and situational as well as user-dependent characteristics. The corpus is intended for building a user model for developing trustworthy proactive interaction strategies.
Text categorization methods for automatic estimation of verbal intelligence
In this paper we investigate whether conventional text categorization methods suffice to infer different verbal intelligence levels. This research goal relies on the hypothesis that the vocabulary that speakers use reflects their verbal intelligence levels. Automatic estimation of the verbal intelligence of users of a spoken language dialog system may be useful for defining an optimal dialog strategy, because it improves the system's adaptation capabilities. The work is based on a corpus containing descriptions (i.e. monologs) of a short film by test persons with different educational backgrounds, together with the verbal intelligence scores of the speakers. First, a one-way analysis of variance was performed to compare the monologs with the film transcription and to demonstrate that there are differences in the vocabulary used by test persons with different verbal intelligence levels. Then, for the classification task, the monologs were represented as feature vectors using the classical TF–IDF weighting scheme. The Naive Bayes, k-nearest neighbors and Rocchio classifiers were tested. In this paper we describe and compare these classification approaches, define the optimal classification parameters and discuss the classification results obtained.
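To make the TF-IDF representation and the Rocchio classifier named above concrete, here is a minimal pure-Python sketch. It assumes whitespace tokenization, a smoothed IDF variant, and cosine similarity to class centroids; the paper's exact preprocessing and parameters are not given in the abstract, and the toy documents and the labels "group_a" and "group_b" are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simple)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for each training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are dropped."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def fit_rocchio(docs, labels):
    """Rocchio training: one centroid (mean TF-IDF vector) per class."""
    idf = fit_idf(docs)
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter(labels)
    for doc, label in zip(docs, labels):
        for term, weight in tfidf(doc, idf).items():
            sums[label][term] += weight
    centroids = {label: {t: w / counts[label] for t, w in vec.items()}
                 for label, vec in sums.items()}
    return idf, centroids


def predict(doc, idf, centroids):
    """Assign the class whose centroid has the highest cosine similarity."""
    vec = tfidf(doc, idf)

    def score(centroid):
        dot = sum(w * centroid.get(t, 0.0) for t, w in vec.items())
        norm = math.sqrt(sum(w * w for w in centroid.values()))
        return dot / norm if norm else 0.0

    return max(centroids, key=lambda label: score(centroids[label]))


# Toy documents and labels, invented for illustration only.
docs = [
    "the man walks slowly to the old house",
    "a man goes to a house",
    "the protagonist ambles toward the dilapidated mansion",
    "the protagonist approaches a decrepit mansion",
]
labels = ["group_a", "group_a", "group_b", "group_b"]
idf, centroids = fit_rocchio(docs, labels)
```

A k-nearest-neighbors variant would reuse `tfidf` unchanged and compare the query vector against individual training vectors instead of class centroids.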
Estimating Adaptation of Dialogue Partners with Different Verbal Intelligence
This work investigates to what degree speakers with different verbal intelligence may adapt to each other. The work is based on a corpus consisting of 100 descriptions of a short film (monologues), 56 discussions about the same topic (dialogues), and verbal intelligence scores of the test participants. Adaptation between two dialogue partners was measured using cross-referencing, proportion of "I", "You" and "We" words, between-subject correlation and similarity of texts. It was shown that lower verbal intelligence speakers repeated more nouns and adjectives from the other and used the same linguistic categories more often than higher verbal intelligence speakers. In dialogues between strangers, participants with higher verbal intelligence showed a greater level of adaptation
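One of the adaptation measures listed above, the proportion of "I", "You" and "We" words, can be sketched as below. The word lists are illustrative English lists chosen for this example and are an assumption; the abstract does not specify which word forms were counted or how the text was tokenized.

```python
import re

# Hypothetical word lists for each pronoun group (assumed, not from the source).
PRONOUN_GROUPS = {
    "I": {"i", "me", "my", "mine", "myself"},
    "You": {"you", "your", "yours", "yourself"},
    "We": {"we", "us", "our", "ours", "ourselves"},
}


def pronoun_proportions(text):
    """Share of tokens in each pronoun group, relative to all tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    return {
        group: (sum(tok in words for tok in tokens) / total if total else 0.0)
        for group, words in PRONOUN_GROUPS.items()
    }


# Toy utterance, invented for illustration only.
proportions = pronoun_proportions("I think we should go")
```

Comparing these proportions for the two partners of a dialogue gives one simple signal of how closely their language use converges.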
Investigating verbal intelligence using the TF-IDF approach
In this paper we investigated differences in the language use of speakers with different verbal intelligence when they describe the same event. The work is based on a corpus containing descriptions of a short film and verbal intelligence scores of the speakers. To analyze the monologues and the film transcript, the number of reused words, lemmas, n-grams, cosine similarity and other features were calculated and compared across verbal intelligence groups. The results showed that the similarity of monologues of higher verbal intelligence speakers was greater than that of lower and average verbal intelligence participants. A possible explanation of this phenomenon is that candidates with higher verbal intelligence have a better short-term memory. We also tested the hypothesis that differences in the vocabulary of speakers with different verbal intelligence are sufficient for good classification results. To test this hypothesis, the Nearest Neighbor classifier was trained using TF-IDF vocabulary measures. The maximum achieved accuracy was 92.86%.
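Two of the features mentioned above, cosine similarity and the number of reused words, can be sketched as follows. This is an illustrative version under simple assumptions (whitespace tokenization, raw word counts, no lemmatization); the function names and the toy sentences are not from the source.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Word counts after lowercasing and whitespace splitting."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reused_word_share(monologue, transcript):
    """Fraction of distinct monologue words that also occur in the transcript."""
    mono = set(bag_of_words(monologue))
    trans = set(bag_of_words(transcript))
    return len(mono & trans) / len(mono) if mono else 0.0


# Toy texts, invented for illustration only.
transcript = "the dog sleeps"
monologue = "the dog runs"
similarity = cosine_similarity(bag_of_words(monologue), bag_of_words(transcript))
share = reused_word_share(monologue, transcript)
```

Replacing the raw counts in `bag_of_words` with TF-IDF weights would give the weighted variant of the same similarity.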