Multilevel Context Representation for Improving Object Recognition
In this work, we propose the combined usage of low- and high-level blocks of
convolutional neural networks (CNNs) for improving object recognition. While
recent research has focused either on propagating context from all layers (e.g.
ResNet, including the very low-level layers) or on having multiple loss layers
(e.g. GoogLeNet), the importance of features close to the higher layers has been
ignored. This paper postulates that using context closer to the high-level
layers provides scale and translation invariance and works better than
using the top layer only. In particular, we extend AlexNet and GoogLeNet by
additional connections in the top layers. In order to demonstrate the
effectiveness of the proposed approach, we evaluated it on the standard
ImageNet task. The relative reduction of the classification error is around
1-2% without affecting the computational cost. Furthermore, we show that this
approach is orthogonal to typical test data augmentation techniques, as
recently introduced by Szegedy et al. (leading to a runtime reduction of 144
during test time).
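As an illustration of this idea of reusing context from just below the top layer, here is a minimal PyTorch sketch, not the authors' exact architecture: it concatenates AlexNet's fc6 and fc7 activations before the final classifier. The choice of layers, backbone, and dimensions are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class MultilevelContextAlexNet(nn.Module):
    """Sketch: feed both fc6 and fc7 activations of AlexNet to the classifier,
    so the final prediction also sees context from just below the top layer.
    Layer choice and sizes are illustrative assumptions, not the paper's setup."""

    def __init__(self, num_classes=1000):
        super().__init__()
        backbone = models.alexnet(weights=None)
        self.features = backbone.features    # convolutional blocks
        self.avgpool = backbone.avgpool
        cls = list(backbone.classifier.children())
        self.fc6 = nn.Sequential(*cls[:3])   # Dropout, Linear(9216->4096), ReLU
        self.fc7 = nn.Sequential(*cls[3:6])  # Dropout, Linear(4096->4096), ReLU
        # the final classifier now sees fc6 and fc7 outputs concatenated
        self.classifier = nn.Linear(4096 + 4096, num_classes)

    def forward(self, x):
        x = self.avgpool(self.features(x)).flatten(1)
        h6 = self.fc6(x)   # high-level context just below the top
        h7 = self.fc7(h6)  # topmost hidden layer
        return self.classifier(torch.cat([h6, h7], dim=1))
```

In a sketch like this, the extra connection only enlarges one linear layer, which is consistent with the abstract's claim that the computational cost is essentially unaffected.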
Learning Visual Clothing Style with Heterogeneous Dyadic Co-occurrences
With the rapid proliferation of smart mobile devices, users now take millions
of photos every day. These include large numbers of clothing and accessory
images. We would like to answer questions like `What outfit goes well with this
pair of shoes?' To answer these types of questions, one has to go beyond
learning visual similarity and learn a visual notion of compatibility across
categories. In this paper, we propose a novel learning framework to help answer
these types of questions. The main idea of this framework is to learn a feature
transformation from images of items into a latent space that expresses
compatibility. For the feature transformation, we use a Siamese Convolutional
Neural Network (CNN) architecture, where training examples are pairs of items
that are either compatible or incompatible. We model compatibility based on
co-occurrence in large-scale user behavior data; in particular co-purchase data
from Amazon.com. To learn cross-category fit, we introduce a strategic method
to sample training data, where pairs of items are heterogeneous dyads, i.e.,
the two elements of a pair belong to different high-level categories. While
this approach is applicable to a wide variety of settings, we focus on the
representative problem of learning compatible clothing style. Our results
indicate that the proposed framework is capable of learning semantic
information about visual style and is able to generate outfits of clothes, with
items from different categories, that go well together.
Comment: ICCV 201
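A minimal PyTorch sketch of such a Siamese setup follows, assuming a ResNet-18 backbone, a 256-dimensional embedding, and a standard contrastive loss; the authors' actual backbone, loss margin, and dyad-sampling pipeline are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

class CompatibilityEmbedder(nn.Module):
    """Shared CNN that maps an item image into a latent 'style space'.
    Backbone choice and embedding size are illustrative assumptions."""
    def __init__(self, embed_dim=256):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Linear(backbone.fc.in_features, embed_dim)
        self.net = backbone

    def forward(self, x):
        return self.net(x)

def contrastive_loss(z1, z2, compatible, margin=1.0):
    """Pull compatible pairs together, push incompatible pairs apart.
    `compatible` is 1 for co-purchased (compatible) dyads, 0 otherwise."""
    d = F.pairwise_distance(z1, z2)
    return (compatible * d.pow(2) +
            (1 - compatible) * F.relu(margin - d).pow(2)).mean()

# Illustrative training step on a batch of heterogeneous dyads
# (item_a and item_b always come from different high-level categories).
model = CompatibilityEmbedder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
item_a = torch.randn(8, 3, 224, 224)          # e.g. shoes
item_b = torch.randn(8, 3, 224, 224)          # e.g. tops
labels = torch.randint(0, 2, (8,)).float()    # compatible / incompatible
loss = contrastive_loss(model(item_a), model(item_b), labels)
loss.backward()
optimizer.step()
```

Because the two branches share weights, items from any category land in the same latent space, and compatibility reduces to distance in that space.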
'Part'ly first among equals: Semantic part-based benchmarking for state-of-the-art object recognition systems
An examination of object recognition challenge leaderboards (ILSVRC,
PASCAL-VOC) reveals that the top-performing classifiers typically exhibit small
differences amongst themselves in terms of error rate/mAP. To better
differentiate the top performers, additional criteria are required. Moreover,
the (test) images, on which the performance scores are based, predominantly
contain fully visible objects. Therefore, `harder' test images, mimicking the
challenging conditions (e.g. occlusion) in which humans routinely recognize
objects, need to be utilized for benchmarking. To address the concerns
mentioned above, we make two contributions. First, we systematically vary the
level of local object-part content, global detail and spatial context in images
from PASCAL VOC 2010 to create a new benchmarking dataset dubbed PPSS-12.
Second, we propose an object-part based benchmarking procedure which quantifies
classifiers' robustness to a range of visibility and contextual settings. The
benchmarking procedure relies on a semantic similarity measure that naturally
addresses potential semantic granularity differences between the category
labels in training and test datasets, thus eliminating manual mapping. We use
our procedure on the PPSS-12 dataset to benchmark top-performing classifiers
trained on the ILSVRC-2012 dataset. Our results show that the proposed
benchmarking procedure enables additional differentiation among
state-of-the-art object classifiers in terms of their ability to handle missing
content and insufficient object detail. Given this capability for additional
differentiation, our approach can potentially supplement existing benchmarking
procedures used in object recognition challenge leaderboards.
Comment: Extended version of our ACCV-2016 paper. Author formatting modified.
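To make the idea of a granularity-tolerant semantic similarity concrete, here is a small sketch using WordNet's Wu-Palmer similarity as an illustrative stand-in for the paper's measure (ILSVRC labels correspond to WordNet synsets, so such a lookup is natural); the function name and scoring rule are assumptions, not the authors' exact procedure.

```python
from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

def semantic_score(predicted_label, true_label):
    """Score a prediction by semantic closeness instead of exact string match,
    so a fine-grained training label (e.g. 'tabby') can still earn credit
    against a coarser test label (e.g. 'cat'). Wu-Palmer similarity is used
    here purely as an illustrative stand-in for the paper's measure."""
    pred = wn.synsets(predicted_label, pos=wn.NOUN)
    true = wn.synsets(true_label, pos=wn.NOUN)
    if not pred or not true:
        return 0.0
    return max(p.wup_similarity(t) or 0.0 for p in pred for t in true)

# Example: fine-grained predictions against a coarse ground-truth label
print(semantic_score('tabby', 'cat'))   # high: both are felines
print(semantic_score('tabby', 'dog'))   # lower: related only via 'animal'
```

Scoring predictions this way removes the need to hand-map every training label onto the test dataset's category vocabulary.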
