    One-Shot Observation Learning

    Observation learning is the process of learning a task by observing an expert demonstrator. We present a robust observation learning method for robotic systems. Our principal contributions are a one-shot learning method in which only a single demonstration is needed for learning, and a novel feature extraction method for extracting unique activity features from the demonstration. Reward values are then generated from the demonstration. We use a learning algorithm with these rewards to learn the controls for a robotic manipulator to perform the demonstrated task. With simulation and real robot experiments, we show that the proposed method can learn tasks from a single demonstration under varying conditions of viewpoint, object properties, manipulator morphology, and scene background.
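    Read as a pipeline, the abstract describes three steps: extract an activity descriptor from the single demonstration, score each robot trial by its similarity to that descriptor, and feed the score to a learning algorithm that improves the controls. The sketch below illustrates that loop only; `extract_activity_features`, `run_trial`, and the toy random-search update are hypothetical placeholders, not the authors' implementation.

    ```python
    import numpy as np

    def extract_activity_features(video):
        """Hypothetical stand-in for the paper's feature extractor:
        collapse a clip of shape (T, H, W) into a single descriptor."""
        video = np.asarray(video, dtype=np.float32)
        return video.reshape(len(video), -1).mean(axis=0)

    def reward(demo_descriptor, trial_video):
        """Higher reward when the trial execution looks more like the demonstration."""
        return -float(np.linalg.norm(demo_descriptor - extract_activity_features(trial_video)))

    def learn_controls(demo_video, run_trial, n_iters=100, noise=0.05, dim=7):
        """Toy random-search learner: keep the control parameters whose
        trial execution (rendered by run_trial) scores the highest reward."""
        demo_descriptor = extract_activity_features(demo_video)
        best_params = np.zeros(dim)
        best_reward = reward(demo_descriptor, run_trial(best_params))
        for _ in range(n_iters):
            candidate = best_params + noise * np.random.randn(dim)
            r = reward(demo_descriptor, run_trial(candidate))
            if r > best_reward:
                best_params, best_reward = candidate, r
        return best_params
    ```

    Any policy-search or reinforcement learning method could take the place of the random search; the point is only that the demonstration enters the learning loop through the reward.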

    Seeing to learn: Observational learning of robotic manipulation tasks

    Learning new tasks has always been a challenging problem in robotics. Even though several approaches have been proposed, from manual programming to learning from demonstrations, the field still has directions that require further research and development. This thesis focuses on one of these relatively unexplored areas: observational learning. We present O2A, a novel method for learning to perform robotic manipulation tasks from a single (one-shot) third-person demonstration video. The key novelty lies in pre-training a feature extractor that creates an abstract feature representation for actions, which we call ‘action vectors’. The action vectors are extracted using a 3D CNN pre-trained for action recognition on a generic action dataset. The distance between the action vectors of the observed third-person demonstration and of trial robot executions is used as a reward/cost for learning the demonstrated task. We report on experiments in simulation and on a real robot, with changes in viewpoint of observation, properties of the objects involved, scene background, and morphology of the manipulator between the demonstration and the learning domains. O2A outperforms baseline approaches under different domain shifts and has comparable performance with an oracle that uses an ideal reward function. We also visualise trajectories and show that our method assigns high reward to desired trajectories. Finally, we present a framework for extending observational learning with multi-modal observations, and report our initial experiments and results as future work.
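    The central mechanism, an action-vector distance used as a reward, can be sketched with any off-the-shelf action-recognition 3D CNN. The abstract does not name the specific network, so torchvision's r3d_18 (pre-trained on Kinetics-400) stands in here as an assumption, and the function names are illustrative rather than the thesis code.

    ```python
    import torch
    from torchvision.models.video import r3d_18, R3D_18_Weights

    # Pre-trained action-recognition backbone; the final classifier is replaced
    # with an identity so the penultimate features serve as the "action vector".
    weights = R3D_18_Weights.DEFAULT
    model = r3d_18(weights=weights)
    model.fc = torch.nn.Identity()
    model.eval()

    @torch.no_grad()
    def action_vector(clip):
        """clip: float tensor of shape (3, T, H, W), already preprocessed
        (e.g. with weights.transforms()); returns a 512-d feature vector."""
        return model(clip.unsqueeze(0)).squeeze(0)

    def reward(demo_clip, trial_clip):
        """Negative distance between action vectors: higher when the robot's
        trial execution looks more like the third-person demonstration."""
        d = torch.norm(action_vector(demo_clip) - action_vector(trial_clip))
        return -d.item()
    ```

    Because the action vectors are computed from appearance-independent activity features, a reward of this form can in principle tolerate the viewpoint, object, background, and morphology shifts the abstract reports on.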