Few-Shot Image Recognition by Predicting Parameters from Activations
In this paper, we are interested in the few-shot learning problem. In
particular, we focus on a challenging scenario where the number of categories
is large and the number of examples per novel category is very limited, e.g. 1,
2, or 3. Motivated by the close relationship between the parameters and the
activations in a neural network associated with the same category, we propose a
novel method that can adapt a pre-trained neural network to novel categories by
directly predicting the parameters from the activations. Zero training is
required in adaptation to novel categories, and fast inference is realized by a
single forward pass. We evaluate our method on few-shot image recognition
on the ImageNet dataset, where it achieves state-of-the-art classification
accuracy on novel categories by a significant margin while keeping comparable
performance on the large-scale categories. We also test our method on the
MiniImageNet dataset, where it strongly outperforms the previous state-of-the-art
methods.
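The idea of deriving a category's classifier parameters directly from its activations can be illustrated with a minimal sketch. It does not reproduce the paper's method: the abstract describes a learned mapping from activations to parameters, whereas here that mapping is replaced by a stand-in (the L2-normalized mean of the support activations). All function names and the toy activation vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def predict_parameters(activations):
    """Stand-in for the learned activation-to-parameter mapping.

    Assumption: the classifier weights for a category are taken to be the
    normalized mean of its normalized support activations. No gradient
    steps are involved, so adding a category needs no training.
    """
    normalized = [l2_normalize(a) for a in activations]
    mean = [sum(col) / len(normalized) for col in zip(*normalized)]
    return l2_normalize(mean)


def classify(activation, class_params):
    """Score a query activation against every category with a dot product."""
    query = l2_normalize(activation)
    scores = {
        label: sum(w * x for w, x in zip(params, query))
        for label, params in class_params.items()
    }
    return max(scores, key=scores.get)


# Toy activations standing in for the output of a pre-trained network.
support = {
    "cat": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],  # 2 examples
    "dog": [[0.1, 0.9, 0.2]],                   # 1 example
}
class_params = {label: predict_parameters(acts) for label, acts in support.items()}
prediction = classify([0.7, 0.3, 0.0], class_params)
```

In this sketch, adapting to a new category amounts to one call to `predict_parameters` on its few support activations, and inference is a single pass of dot products.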
ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond
Motivated by product detection in supermarkets, this paper studies the
problem of object proposal generation in supermarket images and other natural
images. We argue that estimation of object scales in images is helpful for
generating object proposals, especially for supermarket images where object
scales are usually within a small range. Therefore, we propose to estimate
object scales of images before generating object proposals. The proposed method
for predicting object scales is called ScaleNet. To validate the effectiveness
of ScaleNet, we build three supermarket datasets, two of which are real-world
datasets used for testing and the other one is a synthetic dataset used for
training. In short, we extend the previous state-of-the-art object proposal
methods by adding a scale prediction phase. The resulting method outperforms the
previous state-of-the-art on the supermarket datasets by a large margin. We
also show that the approach works for object proposal on other natural images
and it outperforms the previous state-of-the-art object proposal methods on the
MS COCO dataset. The supermarket datasets, the virtual supermarkets, and the
tools for creating more synthetic datasets will be made public.
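One plausible way to use a scale estimate before proposal generation is sketched below. The abstract does not say how the predicted scales are consumed, so this is an assumption: the image is resized so that the most likely object sizes match a size the proposal method handles well. The function name, the bin sizes, and the target size are all hypothetical.

```python
def rescale_factors(scale_probs, bin_sizes, target_size, top_k=2):
    """Return resize factors for the top_k most probable object-size bins.

    scale_probs: predicted probability for each size bin (hypothetical output
                 of a scale predictor).
    bin_sizes:   representative object size, in pixels, of each bin.
    target_size: object size, in pixels, assumed to suit the proposal method.
    """
    ranked = sorted(range(len(scale_probs)), key=lambda i: scale_probs[i], reverse=True)
    return [target_size / bin_sizes[i] for i in ranked[:top_k]]


# Toy prediction: most objects are estimated to be about 64 pixels in size.
bin_sizes = [32, 64, 128, 256]
scale_probs = [0.05, 0.70, 0.20, 0.05]
factors = rescale_factors(scale_probs, bin_sizes, target_size=128)
```

Under these assumptions the proposal method would run only on the few resized copies given by `factors`, rather than on a dense image pyramid.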
Single-Shot Object Detection with Enriched Semantics
We propose a novel single-shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector by means of a semantic segmentation branch and a global activation module. The segmentation branch is supervised by weak segmentation ground-truth, i.e., no extra annotation is required. In conjunction with that, we employ a global activation module which learns the relationship between channels and object classes in a self-supervised manner. Comprehensive experimental results on both the PASCAL VOC and MS COCO detection datasets demonstrate the effectiveness of the proposed method. In particular, with a VGG16-based DES, we achieve an mAP of 81.7 on VOC2007 test and an mAP of 32.8 on COCO test-dev with an inference speed of 31.5 milliseconds per image on a Titan Xp GPU. With a lower-resolution version, we achieve an mAP of 79.7 on VOC2007 with an inference speed of 13.0 milliseconds per image.
This material is based upon work supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.
