MQA: Answering the Question via Robotic Manipulation

Abstract

In this paper, we propose a novel task -- Manipulation Question Answering (MQA), where the robot is required to find the answer to a question by actively exploring the environment via manipulation. A framework consisting of a QA model and a manipulation model is proposed to solve this problem. For the QA model, we adopt the method of Visual Question Answering (VQA). For the manipulation model, a Deep Q Network (DQN) model is proposed to generate manipulations. By manipulating objects, the robot can continue exploring the bin until the answer to the question is found. In addition, a novel simulation dataset is established that contains a variety of object models, complicated scenarios, and corresponding question-answer pairs. Extensive experiments have been conducted to validate the effectiveness of the proposed framework.
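
The interaction between the two models can be illustrated with a minimal sketch of an explore-until-answered loop. Everything in it is an assumption made for illustration rather than the paper's implementation: `ToyBin`, `toy_q_function`, and `toy_answer_fn` are hypothetical stand-ins for the simulated bin, the trained DQN, and the VQA model, and the epsilon-greedy action selection is the standard DQN-style rule, not one confirmed by the abstract.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Observation = List[Optional[str]]


def epsilon_greedy(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """With probability epsilon pick a random action, otherwise the highest-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def explore_until_answered(
    observe: Callable[[], Observation],
    q_function: Callable[[Observation], Sequence[float]],
    answer_fn: Callable[[Observation], Optional[str]],
    apply_action: Callable[[int], None],
    max_steps: int = 10,
    epsilon: float = 0.0,
    seed: int = 0,
) -> Tuple[Optional[str], int]:
    """Alternate QA attempts and manipulations until the QA model commits to an answer."""
    rng = random.Random(seed)
    for step in range(max_steps):
        obs = observe()
        answer = answer_fn(obs)
        if answer is not None:
            return answer, step  # answered after `step` manipulations
        apply_action(epsilon_greedy(q_function(obs), epsilon, rng))
    return answer_fn(observe()), max_steps


class ToyBin:
    """Hypothetical bin: slot i hides one object until the robot pushes that slot."""

    def __init__(self, hidden: Sequence[str]) -> None:
        self.hidden = list(hidden)
        self.revealed = [False] * len(self.hidden)

    def observe(self) -> Observation:
        return [obj if seen else None for obj, seen in zip(self.hidden, self.revealed)]

    def push(self, slot: int) -> None:
        self.revealed[slot] = True


def toy_q_function(obs: Observation) -> List[float]:
    """Hand-written stand-in for a trained DQN: prefer slots not yet revealed."""
    return [1.0 if item is None else 0.0 for item in obs]


def toy_answer_fn(obs: Observation) -> Optional[str]:
    """Stand-in QA model for the question 'Is there a key in the bin?'."""
    if "key" in obs:
        return "yes"
    if all(item is not None for item in obs):
        return "no"
    return None  # not enough evidence yet, keep exploring


bin_env = ToyBin(["mug", "key", "book"])
result = explore_until_answered(bin_env.observe, toy_q_function, toy_answer_fn, bin_env.push)
print(result)
```

In this toy setting the loop stops as soon as the stand-in QA model is willing to answer, which mirrors the idea that manipulation continues only while the visible scene is insufficient to answer the question.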
