Compact Bilinear Pooling
Bilinear models have been shown to achieve impressive performance on a wide
range of visual tasks, such as semantic segmentation, fine-grained recognition
and face recognition. However, bilinear features are high dimensional,
typically on the order of hundreds of thousands to a few million, which makes
them impractical for subsequent analysis. We propose two compact bilinear
representations with the same discriminative power as the full bilinear
representation but with only a few thousand dimensions. Our compact
representations allow back-propagation of classification errors enabling an
end-to-end optimization of the visual recognition system. The compact bilinear
representations are derived through a novel kernelized analysis of bilinear
pooling, which provides insights into the discriminative power of bilinear
pooling, and a platform for further research in compact pooling methods.
Experiments illustrate the utility of the proposed representations for
image classification and few-shot learning across several datasets.
Comment: Camera-ready version for CVP
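The abstract above does not spell out how the compact representations are built. As a rough illustration only, the sketch below contrasts full bilinear pooling (a sum of outer products, d*d values) with a compact variant based on Tensor Sketch, in which two count sketches of each descriptor are combined by circular convolution to give D values. This is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the function names, the toy sizes d = 8 and D = 16, and the random descriptors are all hypothetical, and a practical version would use an FFT rather than the direct convolution shown here.

```python
import random

def make_hash(d, D, rng):
    """Draw a random bucket index and a random sign for each of the d inputs."""
    buckets = [rng.randrange(D) for _ in range(d)]
    signs = [rng.choice((-1.0, 1.0)) for _ in range(d)]
    return buckets, signs

def count_sketch(x, buckets, signs, D):
    """Project x to D dimensions by adding signed entries into hashed buckets."""
    out = [0.0] * D
    for value, b, s in zip(x, buckets, signs):
        out[b] += s * value
    return out

def circular_convolution(a, b):
    """Circular convolution written directly for clarity (an FFT is the usual fast route)."""
    D = len(a)
    out = [0.0] * D
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % D] += ai * bj
    return out

def full_bilinear(descriptors):
    """Sum-pool the outer product of each local descriptor with itself (d*d values)."""
    d = len(descriptors[0])
    pooled = [0.0] * (d * d)
    for x in descriptors:
        for i in range(d):
            for j in range(d):
                pooled[i * d + j] += x[i] * x[j]
    return pooled

def compact_bilinear(descriptors, h1, h2, D):
    """Sum-pool a D-dimensional sketch of each outer product instead of the product itself."""
    pooled = [0.0] * D
    for x in descriptors:
        sketch = circular_convolution(count_sketch(x, *h1, D),
                                      count_sketch(x, *h2, D))
        pooled = [p + s for p, s in zip(pooled, sketch)]
    return pooled

# Toy sizes (assumed for illustration): d-dimensional descriptors, D-dimensional sketch.
rng = random.Random(0)
d, D = 8, 16
h1, h2 = make_hash(d, D, rng), make_hash(d, D, rng)
descriptors = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(5)]

full = full_bilinear(descriptors)                    # d * d = 64 values
compact = compact_bilinear(descriptors, h1, h2, D)   # D = 16 values
print(len(full), len(compact))
```

In this toy setting the full feature grows quadratically with the descriptor size, while the sketch length D is chosen independently of d. Every operation in the compact path is a sum or product of the inputs, so it is differentiable, which is consistent with the abstract's point about back-propagation.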
Describing Textures in the Wild
Patterns and textures are defining characteristics of many natural objects: a
shirt can be striped, the wings of a butterfly can be veined, and the skin of
an animal can be scaly. Aiming at supporting this analytical dimension in image
understanding, we address the challenging problem of describing textures with
semantic attributes. We identify a rich vocabulary of forty-seven texture terms
and use them to describe a large dataset of patterns collected in the wild. The
resulting Describable Textures Dataset (DTD) serves as the basis for seeking the best
texture representation for recognizing describable texture attributes in
images. We port from object recognition to texture recognition the Improved
Fisher Vector (IFV) and show that, surprisingly, it outperforms specialized
texture descriptors not only on our problem, but also in established material
recognition datasets. We also show that the describable attributes are
excellent texture descriptors, transferring between datasets and tasks; in
particular, combined with IFV, they significantly outperform the
state-of-the-art by more than 8 percent on both FMD and KTHTIPS-2b benchmarks.
We also demonstrate that they produce intuitive descriptions of materials and
Internet images.
Comment: 13 pages; 12 figures. Fixed misplaced affiliatio
Deep filter banks for texture recognition, description, and segmentation
Visual textures have played a key role in image understanding because they
convey important semantics of images, and because texture representations that
pool local image descriptors in an orderless manner have had a tremendous
impact in diverse applications. In this paper we make several contributions to
texture understanding. First, instead of focusing on texture instance and
material category recognition, we propose a human-interpretable vocabulary of
texture attributes to describe common texture patterns, complemented by a new
describable texture dataset for benchmarking. Second, we look at the problem of
recognizing materials and texture attributes in realistic imaging conditions,
including when textures appear in clutter, developing corresponding benchmarks
on top of the recently proposed OpenSurfaces dataset. Third, we revisit classic
texture representations, including bag-of-visual-words and Fisher vectors,
in the context of deep learning and show that these have excellent efficiency
and generalization properties if the convolutional layers of a deep model are
used as filter banks. We obtain in this manner state-of-the-art performance in
numerous datasets well beyond textures, an efficient method to apply deep
features to image regions, as well as benefit in transferring features from one
domain to another.
Comment: 29 pages; 13 figures; 8 table