1,358 research outputs found
Learning how to learn: an adaptive dialogue agent for incrementally learning visually grounded word meanings
We present an optimised multi-modal dialogue agent for interactive learning
of visually grounded word meanings from a human tutor, trained on real
human-human tutoring data. Within a life-long interactive learning period, the
agent, trained using Reinforcement Learning (RL), must be able to handle
natural conversations with human users and achieve good learning performance
(accuracy) while minimising human effort in the learning process. We train and
evaluate this system in interaction with a simulated human tutor, which is
built on the BURCHAK corpus -- a Human-Human Dialogue dataset for the visual
learning task. The results show that: 1) The learned policy can coherently
interact with the simulated user to achieve the goal of the task (i.e. learning
visual attributes of objects, e.g. colour and shape); and 2) it finds a better
trade-off between classifier accuracy and tutoring costs than hand-crafted
rule-based policies, including ones with dynamic policies.Comment: 10 pages, RoboNLP Workshop from ACL Conferenc
Training an adaptive dialogue policy for interactive learning of visually grounded word meanings
We present a multi-modal dialogue system for interactive learning of
perceptually grounded word meanings from a human tutor. The system integrates
an incremental, semantic parsing/generation framework - Dynamic Syntax and Type
Theory with Records (DS-TTR) - with a set of visual classifiers that are
learned throughout the interaction and which ground the meaning representations
that it produces. We use this system in interaction with a simulated human
tutor to study the effects of different dialogue policies and capabilities on
the accuracy of learned meanings, learning rates, and efforts/costs to the
tutor. We show that the overall performance of the learning agent is affected
by (1) who takes initiative in the dialogues; (2) the ability to express/use
their confidence level about visual attributes; and (3) the ability to process
elliptical and incrementally constructed dialogue turns. Ultimately, we train
an adaptive dialogue policy which optimises the trade-off between classifier
accuracy and tutoring costs.Comment: 11 pages, SIGDIAL 2016 Conferenc
- …