Do We Train on Test Data? Purging CIFAR of Near-Duplicates
The CIFAR-10 and CIFAR-100 datasets are two of the most heavily benchmarked
datasets in computer vision and are often used to evaluate novel methods and
model architectures in the field of deep learning. However, we find that 3.3%
and 10% of the images from the test sets of these datasets have duplicates in
the training set. These duplicates can easily be memorized by a model and may
hence bias the comparison of image recognition techniques regarding their
generalization capability. To eliminate this bias, we provide the "fair CIFAR"
(ciFAIR) dataset, where we replaced all duplicates in the test sets with new
images sampled from the same domain. We then re-evaluate the classification
performance of various popular state-of-the-art CNN architectures on these new
test sets to investigate whether recent research has overfitted to memorizing
data instead of learning abstract concepts. We find a significant drop in
classification accuracy of between 9% and 14% relative to the original
performance on the duplicate-free test set. The ciFAIR dataset and pre-trained
models are available at https://cvjena.github.io/cifair/, where we also
maintain a leaderboard.
Comment: Journal of Imaging
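The duplicate-purging step described above can be approximated with a nearest-neighbour search in some image feature space: a test image is flagged when its closest training image lies within a distance threshold. The sketch below is a minimal illustration under that assumption; the feature extractor and the `threshold` value are placeholders, not the paper's actual procedure (which also involved manual verification).

```python
import numpy as np

def find_near_duplicates(train_feats, test_feats, threshold):
    """Flag test items whose nearest training neighbour lies within
    `threshold` squared Euclidean distance in feature space."""
    # Pairwise squared distances via |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
    d2 = (np.sum(test_feats ** 2, axis=1)[:, None]
          + np.sum(train_feats ** 2, axis=1)[None, :]
          - 2.0 * test_feats @ train_feats.T)
    nearest = d2.min(axis=1)          # distance to closest training item
    return nearest <= threshold       # True = likely near-duplicate

# Toy example: the first test vector exactly duplicates a training vector.
train = np.array([[0.0, 0.0], [1.0, 1.0]])
test = np.array([[1.0, 1.0], [5.0, 5.0]])
flags = find_near_duplicates(train, test, threshold=1e-6)
```

In practice the features would come from a CNN embedding rather than raw pixels, since visually near-identical images can differ slightly pixel-wise.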
MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
In this work, we tackle the problem of instance segmentation, the task of
simultaneously solving object detection and semantic segmentation. Towards this
goal, we present a model, called MaskLab, which produces three outputs: box
detection, semantic segmentation, and direction prediction. Building on top of
the Faster-RCNN object detector, the predicted boxes provide accurate
localization of object instances. Within each region of interest, MaskLab
performs foreground/background segmentation by combining semantic and direction
prediction. Semantic segmentation assists the model in distinguishing between
objects of different semantic classes including background, while the direction
prediction, estimating each pixel's direction towards its corresponding center,
allows separating instances of the same semantic class. Moreover, we explore
the effect of incorporating recent successful methods from both segmentation
and detection (i.e., atrous convolution and hypercolumn). Our proposed model is
evaluated on the COCO instance segmentation benchmark and shows comparable
performance with other state-of-the-art models.
Comment: 10 pages including references
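The direction prediction described above estimates, for each pixel, the direction towards its instance's center. A minimal sketch of how such a per-pixel direction target could be derived from a ground-truth mask is shown below, assuming the instance center is the mask centroid; MaskLab itself quantizes directions into bins rather than regressing unit vectors, so this is only an illustration.

```python
import numpy as np

def direction_to_center(mask):
    """For each foreground pixel of a binary mask, return the unit vector
    (dy, dx) pointing towards the mask centroid; zeros on the background."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()        # instance centre = mask centroid
    h, w = mask.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    vy, vx = cy - yy, cx - xx            # vector from pixel to centre
    norm = np.hypot(vy, vx)
    norm[norm == 0] = 1.0                # avoid divide-by-zero at the centre
    field = np.stack([vy / norm, vx / norm], axis=-1)
    field[mask == 0] = 0.0               # background carries no direction
    return field

# Toy example: a 3x3 instance whose centre is the middle pixel.
field = direction_to_center(np.ones((3, 3), dtype=int))
```

Because all pixels of one instance point to the same center while pixels of an adjacent instance point elsewhere, this signal separates touching objects of the same class, which plain semantic segmentation cannot.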