
    Revealing User Familiarity Bias in Task-Oriented Dialogue via Interactive Evaluation

    Most task-oriented dialogue (TOD) benchmarks assume users who know exactly how to use the system, constraining user behaviors within the system's capabilities via strict user goals, namely "user familiarity" bias. This data bias deepens when combined with data-driven TOD systems, and its effects are impossible to gauge with existing static evaluations. Hence, we conduct an interactive user study to unveil how vulnerable TOD systems are to realistic scenarios. In particular, we compare users with 1) detailed goal instructions that conform to the system's boundaries (closed-goal) and 2) vague goal instructions that are often unsupported but realistic (open-goal). Our study reveals that conversations in open-goal settings lead to catastrophic failures of the system, in which 92% of the dialogues had significant issues. Moreover, we conduct a thorough analysis to identify distinctive features between the two settings through error annotation. From this, we discover a novel "pretending" behavior, in which the system pretends to handle user requests even though they are beyond its capabilities. We discuss its characteristics and toxicity while emphasizing transparency and a fallback strategy for robust TOD systems.

    End-to-End Goal-Oriented Conversational Agent for Risk Awareness

    Traditional development of goal-oriented conversational agents typically requires a lot of domain-specific handcrafting, which precludes scaling up to different domains; end-to-end systems would escape this limitation because they can be trained directly from dialogues. The very promising success recently obtained in end-to-end chatbot development could carry over to goal-oriented settings: applying deep learning models to build robust and scalable goal-oriented dialogue systems directly from corpora of conversations is a challenging task and an open research area. For this reason, I decided it would be more relevant in the context of a master's thesis to experiment and get acquainted with these promising new methodologies - although not yet ready for production - rather than investing time in hand-crafting dialogue rules for a domain-specific solution. My thesis work had the following macro objectives: (i) investigate the latest research on goal-oriented conversational agent development; (ii) choose a reference study, understand it, and implement it with an appropriate technology; (iii) apply what I learned to a particular domain of interest. As a reference framework I chose end-to-end memory networks (MemN2N) (Sukhbaatar et al., 2015) because they have proven particularly promising and have been used as a baseline in many recent works. Not having real dialogues available for training, I synthetically generated a corpus of conversations, taking a cue from the Dialog bAbI dataset for restaurant reservations (Bordes et al., 2016) and adapting it to the new domain of interest, risk awareness. Finally, I built a simple prototype that exploits the pre-trained dialogue model to advise users about risk through an anthropomorphic talking avatar interface.
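
    For readers unfamiliar with the reference architecture, the sketch below illustrates a single memory hop of an end-to-end memory network (MemN2N), the core attention-and-readout step described by Sukhbaatar et al. (2015): dialogue history sentences are embedded as memories, a query attends over them via softmax, and a weighted readout is projected to candidate-answer scores. The toy vocabulary size, embedding dimension, random weights, and function names are illustrative assumptions, not the thesis's actual implementation or data.

    # Minimal single-hop MemN2N attention sketch (assumed toy setup, not the
    # thesis implementation): bag-of-words sentence embeddings, softmax
    # attention over memories, and a linear answer projection.
    import numpy as np

    rng = np.random.default_rng(0)

    vocab_size, embed_dim = 50, 20                            # assumed toy sizes
    A = rng.normal(scale=0.1, size=(vocab_size, embed_dim))   # memory embedding
    B = rng.normal(scale=0.1, size=(vocab_size, embed_dim))   # query embedding
    C = rng.normal(scale=0.1, size=(vocab_size, embed_dim))   # output embedding
    W = rng.normal(scale=0.1, size=(embed_dim, vocab_size))   # answer projection

    def bag_of_words(token_ids, E):
        """Embed a sentence as the sum of its word embeddings."""
        return E[token_ids].sum(axis=0)

    def memn2n_hop(history, query):
        """One memory hop: attend over dialogue history, return answer scores."""
        m = np.stack([bag_of_words(s, A) for s in history])   # memory vectors
        c = np.stack([bag_of_words(s, C) for s in history])   # output vectors
        u = bag_of_words(query, B)                            # query vector
        p = np.exp(m @ u); p /= p.sum()                       # softmax attention
        o = p @ c                                             # weighted readout
        return (u + o) @ W                                    # candidate scores

    # Toy usage: three remembered utterances (as word-id lists) and a query.
    history = [[1, 4, 7], [2, 9], [3, 8, 5]]
    query = [2, 9]
    print(memn2n_hop(history, query).argmax())                # best-scoring answer id

    In the full model this hop is stacked several times and the embeddings are learned end-to-end from (history, query, answer) triples such as those in the Dialog bAbI format.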

    A Survey of Available Corpora For Building Data-Driven Dialogue Systems: The Journal Version

    During the past decade, several areas of speech and language understanding have witnessed substantial breakthroughs from the use of data-driven models. In the area of dialogue systems, the trend is less obvious, and most practical systems are still built through significant engineering and expert knowledge. Nevertheless, several recent results suggest that data-driven approaches are feasible and quite promising. To facilitate research in this area, we have carried out a wide survey of publicly available datasets suitable for data-driven learning of dialogue systems. We discuss important characteristics of these datasets, how they can be used to learn diverse dialogue strategies, and their other potential uses. We also examine methods for transfer learning between datasets and the use of external knowledge. Finally, we discuss the appropriate choice of evaluation metrics for the learning objective.