Learning by Asking Questions
We introduce an interactive learning framework for the development and
testing of intelligent visual systems, called learning-by-asking (LBA). We
explore LBA in context of the Visual Question Answering (VQA) task. LBA differs
from standard VQA training in that most questions are not observed during
training time, and the learner must ask questions it wants answers to. Thus,
LBA more closely mimics natural learning and has the potential to be more
data-efficient than the traditional VQA setting. We present a model that
performs LBA on the CLEVR dataset, and show that it automatically discovers an
easy-to-hard curriculum when learning interactively from an oracle. Our LBA
generated data consistently matches or outperforms the CLEVR train data and is
more sample efficient. We also show that our model asks questions that
generalize to state-of-the-art VQA models and to novel test-time distributions.
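The interactive loop described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `Learner` and `Oracle` classes, their method names, and the toy scene representation are all assumptions made for clarity.

```python
# Sketch of one learning-by-asking (LBA) round: the learner chooses its own
# training question, an oracle answers it, and the (question, answer) pair
# becomes training data. All names here are illustrative.

class Oracle:
    """Answers questions from ground-truth scene annotations."""
    def __init__(self, scene):
        self.scene = scene  # toy scene, e.g. {"red_cubes": 2}

    def answer(self, question):
        # In a CLEVR-style setup the oracle would execute the question's
        # functional program against the scene graph; here we look it up.
        return self.scene.get(question, "unknown")

class Learner:
    def __init__(self):
        self.memory = []  # (question, answer) pairs collected so far

    def ask(self):
        # A real LBA learner proposes the question it expects to learn
        # the most from; this stub always asks the same thing.
        return "red_cubes"

    def update(self, question, answer):
        self.memory.append((question, answer))

def lba_round(learner, oracle):
    q = learner.ask()       # learner picks its own question
    a = oracle.answer(q)    # oracle replies from annotations
    learner.update(q, a)    # pair is added to the training set
    return q, a

learner, oracle = Learner(), Oracle({"red_cubes": 2})
print(lba_round(learner, oracle))  # ('red_cubes', 2)
```

The key design point is that the question distribution is produced by the learner itself rather than fixed in advance, which is what lets an easy-to-hard curriculum emerge as the learner's uncertainty shifts.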
Automatic Generation of Grounded Visual Questions
In this paper, we propose the first model able to generate visually
grounded questions of diverse types for a single image. Visual question
generation is an emerging topic which aims to ask questions in natural language
based on visual input. To the best of our knowledge, automatic methods for
generating meaningful questions of various types from the same visual input
have been lacking. To address this problem, we propose a model that automatically generates
visually grounded questions of varying types. Our model takes as input both
images and captions generated by a dense captioning model, samples the most
probable question types, and generates the questions in sequence. The
experimental results on two real-world datasets show that our model outperforms
the strongest baseline by a wide margin in terms of both correctness and
diversity.
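The pipeline described in the abstract has three stages: dense captions are extracted from the image, a question type is sampled, and a question is decoded conditioned on both. The sketch below illustrates that control flow only; the fixed type list, the hand-set sampling weights, and the string templates are stand-ins for the learned components and are not from the paper.

```python
import random

# Illustrative pipeline: caption -> sampled question type -> question.
# The caption source, type distribution, and templates are assumptions.

QUESTION_TYPES = ["what", "where", "how_many", "is_there"]

def sample_question_type(weights=None):
    # The paper samples probable types from a learned distribution;
    # a fixed weighting stands in for that model here.
    weights = weights or [0.4, 0.2, 0.2, 0.2]
    return random.choices(QUESTION_TYPES, weights=weights, k=1)[0]

def generate_question(caption, qtype):
    # Stand-in for the sequence decoder: one template per type.
    templates = {
        "what": f"What is the {caption}?",
        "where": f"Where is the {caption}?",
        "how_many": f"How many {caption}s are there?",
        "is_there": f"Is there a {caption}?",
    }
    return templates[qtype]

# Captions as a dense captioning model might produce them.
captions = ["dog on the grass", "red frisbee"]
for cap in captions:
    print(generate_question(cap, sample_question_type()))
```

Sampling the type before decoding is what gives the model diversity across questions for the same image: different draws condition the decoder on different targets instead of always producing the single most likely question.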