Visual Imitation Learning with Recurrent Siamese Networks
It would be desirable for a reinforcement learning (RL) based agent to learn
behaviour by merely watching a demonstration. However, defining rewards that
facilitate this goal within the RL paradigm remains a challenge. Here we
address this problem with Siamese networks, trained to compute distances
between observed behaviours and the agent's behaviours. Given a desired motion,
such Siamese networks can be used to provide a reward signal to an RL agent via
the distance between the desired motion and the agent's motion. We experiment
with an RNN-based comparator model that can compute distances in space and time
between motion clips while training an RL policy to minimize this distance.
Through experimentation, we have also found that the inclusion of
multi-task data and an additional image encoding loss helps enforce
temporal consistency. These two components appear to balance reward for
matching a specific instance of behaviour versus that behaviour in general.
Furthermore, we focus here on a particularly challenging form of this problem
where only a single demonstration is provided for a given task -- the one-shot
learning setting. We demonstrate our approach on humanoid agents in both 2D
with degrees of freedom (DoF) and 3D with DoF.
Comment: Preprint
Gated Convolutional Bidirectional Attention-based Model for Off-topic Spoken Response Detection
Off-topic spoken response detection, the task aiming at predicting whether a
response is off-topic for the corresponding prompt, is important for an
automated speaking assessment system. In many real-world educational
applications, off-topic spoken response detectors are required to achieve high
recall for off-topic responses not only on seen prompts but also on prompts
that are unseen during training. In this paper, we propose a novel approach for
off-topic spoken response detection with high off-topic recall on both seen and
unseen prompts. We introduce a new model, Gated Convolutional Bidirectional
Attention-based Model (GCBiA), which applies a bi-attention mechanism and
convolutions to extract topic words of prompts and key phrases of responses,
and introduces gated units and residual connections between major layers to
better represent the relevance of responses and prompts. Moreover, a new
negative sampling method is proposed to augment training data. Experimental
results demonstrate that our novel approach can achieve significant
improvements in detecting off-topic responses with extremely high on-topic
recall, for both seen and unseen prompts.
Comment: ACL 2020 long paper