60,197 research outputs found
Revisiting Visual Question Answering Baselines
Visual question answering (VQA) is an interesting learning setting for
evaluating the abilities and shortcomings of current systems for image
understanding. Many of the recently proposed VQA systems include attention or
memory mechanisms designed to support "reasoning". For multiple-choice VQA,
nearly all of these systems train a multi-class classifier on image and
question features to predict an answer. This paper questions the value of these
common practices and develops a simple alternative model based on binary
classification. Instead of treating answers as competing choices, our model
receives the answer as input and predicts whether or not an
image-question-answer triplet is correct. We evaluate our model on the Visual7W
Telling and the VQA Real Multiple Choice tasks, and find that even simple
versions of our model perform competitively. Our best model achieves
state-of-the-art performance on the Visual7W Telling task and compares
surprisingly well with the most complex systems proposed for the VQA Real
Multiple Choice task. We explore variants of the model and study its
transferability between both datasets. We also present an error analysis of our
model that suggests a key problem of current VQA systems lies in the lack of
visual grounding of concepts that occur in the questions and answers. Overall,
our results suggest that the performance of current VQA systems is not
significantly better than that of systems designed to exploit dataset biases.Comment: European Conference on Computer Visio
Revisiting loss-specific training of filter-based MRFs for image restoration
It is now well known that Markov random fields (MRFs) are particularly
effective for modeling image priors in low-level vision. Recent years have seen
the emergence of two main approaches for learning the parameters in MRFs: (1)
probabilistic learning using sampling-based algorithms and (2) loss-specific
training based on MAP estimate. After investigating existing training
approaches, it turns out that the performance of the loss-specific training has
been significantly underestimated in existing work. In this paper, we revisit
this approach and use techniques from bi-level optimization to solve it. We
show that we can get a substantial gain in the final performance by solving the
lower-level problem in the bi-level framework with high accuracy using our
newly proposed algorithm. As a result, our trained model is on par with highly
specialized image denoising algorithms and clearly outperforms
probabilistically trained MRF models. Our findings suggest that for the
loss-specific training scheme, solving the lower-level problem with higher
accuracy is beneficial. Our trained model comes along with the additional
advantage, that inference is extremely efficient. Our GPU-based implementation
takes less than 1s to produce state-of-the-art performance.Comment: 10 pages, 2 figures, appear at 35th German Conference, GCPR 2013,
Saarbr\"ucken, Germany, September 3-6, 2013. Proceeding
Revisiting Precision and Recall Definition for Generative Model Evaluation
In this article we revisit the definition of Precision-Recall (PR) curves for
generative models proposed by Sajjadi et al. (arXiv:1806.00035). Rather than
providing a scalar for generative quality, PR curves distinguish mode-collapse
(poor recall) and bad quality (poor precision). We first generalize their
formulation to arbitrary measures, hence removing any restriction to finite
support. We also expose a bridge between PR curves and type I and type II error
rates of likelihood ratio classifiers on the task of discriminating between
samples of the two distributions. Building upon this new perspective, we
propose a novel algorithm to approximate precision-recall curves, that shares
some interesting methodological properties with the hypothesis testing
technique from Lopez-Paz et al (arXiv:1610.06545). We demonstrate the interest
of the proposed formulation over the original approach on controlled
multi-modal datasets.Comment: ICML 201
- …