Behavioural cloning is an imitation learning technique that teaches an agent
how to behave via expert demonstrations. Recent approaches use self-supervision
of fully-observable unlabelled snapshots of the states to decode state pairs
into actions. However, the iterative learning scheme employed by these
techniques is prone to getting trapped in bad local minima. Previous work uses
goal-aware strategies to solve this issue, but this requires manual
intervention to verify whether an agent has reached its goal. We address this
limitation by incorporating a discriminator into the original framework,
offering two key advantages and directly solving a learning problem present in
previous work. First, it removes the need for manual intervention. Second, it
helps learning by guiding function approximation based on the state
transitions in the expert's trajectories. Third, the discriminator solves a
learning issue commonly present in the policy model, which sometimes
performs a 'no action' within the environment until the agent finally halts.

Comment: This paper has been accepted at the International Joint Conference on
Neural Networks (IJCNN) 202