Compression-aware Training of Deep Networks
In recent years, great progress has been made in a variety of application
domains thanks to the development of increasingly deeper neural networks.
Unfortunately, the huge number of units of these networks makes them expensive
both computationally and memory-wise. To overcome this, exploiting the fact
that deep networks are over-parametrized, several compression strategies have
been proposed. These methods, however, typically start from a network that has
been trained in a standard manner, without considering future
compression. In this paper, we propose to explicitly account for compression in
the training process. To this end, we introduce a regularizer that encourages
the parameter matrix of each layer to have low rank during training. We show
that accounting for compression during training allows us to learn much more
compact, yet at least as effective, models than state-of-the-art compression
techniques.
Comment: Accepted at NIPS 201
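The abstract does not say which regularizer is used. As an illustrative assumption only, a common way to encourage low-rank weight matrices is to penalize the nuclear norm of each layer's parameter matrix. The symbols below (W_l for the layer-l parameter matrix, \lambda for the regularization weight, \sigma_i for singular values) are introduced here for illustration and are not taken from the paper:

```latex
% Assumed form: task loss plus a nuclear-norm penalty on each layer.
\min_{\{W_l\}} \; \mathcal{L}\bigl(\{W_l\}\bigr) + \lambda \sum_{l} \lVert W_l \rVert_{*},
\qquad
\lVert W_l \rVert_{*} = \sum_{i} \sigma_i(W_l)
```

The nuclear norm is the convex envelope of the rank function on the unit spectral-norm ball, which is why it is often chosen as a surrogate for rank; the paper's actual regularizer may differ.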
Class-Weighted Convolutional Features for Visual Instance Search
Image retrieval in realistic scenarios targets large dynamic datasets of
unlabeled images. In these cases, training or fine-tuning a model every time
new images are added to the database is neither efficient nor scalable.
Convolutional neural networks trained for image classification over large
datasets have been proven effective feature extractors for image retrieval. The
most successful approaches are based on encoding the activations of
convolutional layers, as they convey the image spatial information. In this
paper, we go beyond this spatial information and propose a local-aware encoding
of convolutional features based on semantic information predicted in the target
image. To this end, we obtain the most discriminative regions of an image using
Class Activation Maps (CAMs). CAMs are based on the knowledge contained in the
network, so our approach has the additional advantage of not
requiring external information. In addition, we use CAMs to generate object
proposals during an unsupervised re-ranking stage after a first fast search.
Our experiments on two publicly available datasets for instance retrieval,
Oxford5k and Paris6k, show that our approach is competitive and outperforms
the current state of the art when using off-the-shelf models
trained on ImageNet. The source code and model used in this paper are publicly
available at http://imatge-upc.github.io/retrieval-2017-cam/.
Comment: To appear in the British Machine Vision Conference (BMVC), September
201
The Initial Screening Order Problem
In this paper we present the initial screening order problem, a crucial step
within candidate screening. It involves a human-like screener with an objective
to find the first k suitable candidates rather than the best k suitable
candidates in a candidate pool given an initial screening order. The initial
screening order represents the way in which the human-like screener arranges
the candidate pool prior to screening. The choice of initial screening order
has considerable effects on the selected set of k candidates. We prove that
under an unbalanced candidate pool (e.g., having more male than female
candidates), the human-like screener can suffer from uneven efforts that hinder
its decision-making over the protected, under-represented group relative to the
non-protected, over-represented group. Other fairness results are proven under
the human-like screener. This research is based on a collaboration with a large
company to better understand its hiring process for potential automation. Our
main contribution is the formalization of the initial screening order problem
which, we argue, opens the path for future extensions of current work on
ranking algorithms, fairness, and automation for screening procedures.
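The difference between selecting the first k and the best k suitable candidates can be sketched as follows. This is a minimal illustration, not the paper's formalization: the `Candidate` type, the numeric `score`, and the suitability `threshold` are hypothetical stand-ins for whatever criteria a screener applies.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    score: float  # hypothetical suitability score, assumed for illustration


def first_k_suitable(pool: List[Candidate], k: int, threshold: float) -> List[Candidate]:
    """Walk the pool in its given order and stop once k suitable candidates are found."""
    selected: List[Candidate] = []
    for candidate in pool:
        if len(selected) == k:
            break
        if candidate.score >= threshold:
            selected.append(candidate)
    return selected


def best_k_suitable(pool: List[Candidate], k: int, threshold: float) -> List[Candidate]:
    """Evaluate the whole pool and keep the k highest-scoring suitable candidates."""
    suitable = [c for c in pool if c.score >= threshold]
    return sorted(suitable, key=lambda c: c.score, reverse=True)[:k]


pool = [
    Candidate("a", 0.7),
    Candidate("b", 0.4),
    Candidate("c", 0.9),
    Candidate("d", 0.8),
]

# Same pool, two different initial screening orders.
first_forward = first_k_suitable(pool, k=2, threshold=0.6)
first_reversed = first_k_suitable(list(reversed(pool)), k=2, threshold=0.6)
best = best_k_suitable(pool, k=2, threshold=0.6)

print([c.name for c in first_forward])
print([c.name for c in first_reversed])
print([c.name for c in best])
```

In this toy pool the first-k result depends on the order in which candidates are screened, while the best-k result does not, which is the sensitivity the abstract attributes to the initial screening order.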
Bringing Background into the Foreground: Making All Classes Equal in Weakly-supervised Video Semantic Segmentation
Pixel-level annotations are expensive and time-consuming to obtain. Hence,
weak supervision using only image tags could have a significant impact in
semantic segmentation. Recent years have seen great progress in
weakly-supervised semantic segmentation, whether from a single image or from
videos. However, most existing methods are designed to handle a single
background class. In practical applications, such as autonomous navigation, it
is often crucial to reason about multiple background classes. In this paper, we
introduce an approach to doing so by making use of classifier heatmaps. We then
develop a two-stream deep architecture that jointly leverages appearance and
motion, and design a loss based on our heatmaps to train it. Our experiments
demonstrate the benefits of our classifier heatmaps and of our two-stream
architecture on challenging urban scene datasets and on the YouTube-Objects
benchmark, where we obtain state-of-the-art results.
Comment: 11 pages, 4 figures, 7 tables, Accepted in ICCV 201