Learning by Asking Questions

Authors: Rob Fergus, Ross Girshick, Abhinav Gupta, Martial Hebert, Ishan Misra, Laurens van der Maaten
Publication date: 04/12/2017

We introduce an interactive learning framework for the development and testing of intelligent visual systems, called learning-by-asking (LBA). We explore LBA in the context of the Visual Question Answering (VQA) task. LBA differs from standard VQA training in that most questions are not observed at training time, and the learner must ask the questions it wants answered. Thus, LBA more closely mimics natural learning and has the potential to be more data-efficient than the traditional VQA setting. We present a model that performs LBA on the CLEVR dataset, and show that it automatically discovers an easy-to-hard curriculum when learning interactively from an oracle. Our LBA-generated data consistently matches or outperforms the CLEVR train data and is more sample-efficient. We also show that our model asks questions that generalize to state-of-the-art VQA models and to novel test-time distributions.

arXiv.org e-Print Archive

Crossref

Automatic Generation of Grounded Visual Questions

Authors: Lizhen Qu, Zhenglu Yang, Shaodi You, Jiawan Zhang, Shijie Zhang
Publication date: 29/05/2017

In this paper, we propose the first model able to generate visually grounded questions of diverse types for a single image. Visual question generation is an emerging topic that aims to ask questions in natural language based on visual input. To the best of our knowledge, there are no automatic methods for generating meaningful questions of various types for the same visual input. To address this problem, we propose a model that automatically generates visually grounded questions of varying types. Our model takes as input both images and the captions generated by a dense caption model, samples the most probable question types, and then generates the questions in sequence. The experimental results on two real-world datasets show that our model outperforms the strongest baseline in both correctness and diversity by a wide margin.

arXiv.org e-Print Archive

Crossref