2,051 research outputs found

    Revisiting the Fisher vector for fine-grained classification

    This paper describes the joint submission of Inria and Xerox to the FGCOMP'2013 challenge. Although the proposed system follows most of the standard Fisher classification pipeline, we describe a few key features and good practices that significantly improve accuracy on fine-grained classification tasks. In particular, we consider the late fusion of two systems, both based on Fisher vectors, for which we make drastically different design choices so that they are highly complementary. Moreover, we propose a simple yet effective filtering strategy that significantly boosts performance for several class domains.
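The abstract does not say how the two Fisher-vector systems are fused, so the following is only a minimal sketch of one common form of late fusion: a weighted average of per-class scores. The function names, the `weight_a` parameter, and the toy scores are all hypothetical and are not taken from the paper.

```python
def late_fusion(scores_a, scores_b, weight_a=0.5):
    """Fuse per-class scores from two classifiers by a weighted average.

    Assumption: both systems output comparable scores for the same classes.
    """
    if scores_a.keys() != scores_b.keys():
        raise ValueError("both systems must score the same classes")
    return {
        label: weight_a * scores_a[label] + (1.0 - weight_a) * scores_b[label]
        for label in scores_a
    }


def predict(scores):
    """Return the class with the highest fused score."""
    return max(scores, key=scores.get)


# Hypothetical scores from two complementary systems.
system_a = {"sparrow": 0.60, "finch": 0.30, "wren": 0.10}
system_b = {"sparrow": 0.20, "finch": 0.70, "wren": 0.10}

fused = late_fusion(system_a, system_b, weight_a=0.5)
print(predict(fused))
```

Fusion of this kind only helps when the two systems make different errors, which is why the abstract stresses complementary design choices.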

    DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-grained Image Recognition

    Being symmetric positive-definite (SPD), the covariance matrix has traditionally been used to represent a set of local descriptors in visual recognition. A recent study shows that a kernel matrix can give a considerably better representation by modelling the nonlinearity in the local descriptor set. Nevertheless, neither the descriptors nor the kernel matrix is deeply learned. Worse, they are considered separately, hindering the pursuit of an optimal SPD representation. This work proposes a deep network that jointly learns local descriptors, the kernel-matrix-based SPD representation, and the classifier via an end-to-end training process. We derive the derivatives of the mapping from a local descriptor set to the SPD representation to carry out backpropagation. Also, we exploit the Daleckii-Krein formula in operator theory to give a concise and unified result on differentiating SPD matrix functions, including the matrix logarithm, which handles the Riemannian geometry of the kernel matrix. Experiments not only show the superiority of the kernel-matrix-based SPD representation with deep local descriptors, but also verify the advantage of the proposed deep network in pursuing better SPD representations for fine-grained image recognition tasks.
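To make the contrast between the two representations concrete, here is a minimal sketch that builds both a covariance matrix and a kernel matrix from the same set of local descriptors. It assumes a Gaussian RBF kernel evaluated between feature channels; the abstract does not name the kernel, and the function names, the `theta` bandwidth, and the toy descriptors are hypothetical.

```python
import math


def channel_vectors(descriptors):
    """Transpose n descriptors of dimension d into d channel vectors of length n."""
    return [list(column) for column in zip(*descriptors)]


def covariance_matrix(descriptors):
    """d x d sample covariance between feature channels (the linear baseline)."""
    n = len(descriptors)
    centered = []
    for channel in channel_vectors(descriptors):
        mean = sum(channel) / n
        centered.append([value - mean for value in channel])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]


def rbf_kernel_matrix(descriptors, theta=1.0):
    """d x d matrix with K[i][j] = exp(-theta * ||f_i - f_j||^2).

    Assumption: f_i is the i-th feature channel across all n descriptors.
    """
    channels = channel_vectors(descriptors)

    def rbf(fi, fj):
        squared_distance = sum((a - b) ** 2 for a, b in zip(fi, fj))
        return math.exp(-theta * squared_distance)

    return [[rbf(fi, fj) for fj in channels] for fi in channels]


# Four toy local descriptors, each with three feature channels.
descriptors = [
    [0.1, 0.9, 0.3],
    [0.4, 0.7, 0.2],
    [0.3, 0.8, 0.6],
    [0.2, 0.5, 0.4],
]
K = rbf_kernel_matrix(descriptors, theta=0.5)
C = covariance_matrix(descriptors)
```

Both matrices are symmetric and have the same d x d shape, so the kernel matrix can replace the covariance matrix without changing downstream dimensions. The matrix logarithm mentioned in the abstract is not shown here; it would additionally require an eigendecomposition of `K`.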

    Cross-dimensional Weighting for Aggregated Deep Convolutional Features

    We propose a simple and straightforward way of creating powerful image representations via cross-dimensional weighting and aggregation of deep convolutional neural network layer outputs. We first present a generalized framework that encompasses a broad family of approaches and includes cross-dimensional pooling and weighting steps. We then propose specific non-parametric schemes for both spatial- and channel-wise weighting that boost the effect of highly active spatial responses and at the same time regulate burstiness effects. We experiment on different public datasets for image search and show that our approach outperforms the current state of the art among approaches based on pre-trained networks. We also provide an easy-to-use, open source implementation that reproduces our results. Comment: Accepted for publication at the 4th Workshop on Web-scale Vision and Social Media (VSM), ECCV 201
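The weighting-then-aggregation idea can be sketched as follows. This is a simplified illustration, not the authors' released implementation: the specific formulas (channel-summed, square-rooted spatial weights and log-inverse-frequency channel weights) are assumptions chosen to match the description of boosting highly active locations and damping bursty channels, and the function name and toy feature map are hypothetical.

```python
import math


def weighted_aggregate(feature_map, eps=1e-6):
    """Pool a C x H x W feature map (nested lists) into a C-dimensional vector.

    Assumption: activations are non-negative, as after a ReLU layer.
    """
    num_channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])

    # Spatial weights: locations that are active across many channels count more.
    spatial = [[sum(feature_map[c][y][x] for c in range(num_channels))
                for x in range(width)] for y in range(height)]
    norm = math.sqrt(sum(v * v for row in spatial for v in row)) + eps
    spatial = [[math.sqrt(v / norm) for v in row] for row in spatial]

    # Channel weights: channels that fire almost everywhere are down-weighted.
    fire_rate = [sum(1 for row in channel for v in row if v > 0) / (height * width)
                 for channel in feature_map]
    total_rate = sum(fire_rate)
    channel_weight = [math.log((eps + total_rate) / (eps + rate))
                      for rate in fire_rate]

    # Weighted sum pooling over spatial positions, then L2 normalisation.
    pooled = [channel_weight[c] * sum(spatial[y][x] * feature_map[c][y][x]
                                      for y in range(height) for x in range(width))
              for c in range(num_channels)]
    length = math.sqrt(sum(v * v for v in pooled)) + eps
    return [v / length for v in pooled]


# Toy 3-channel, 2 x 2 feature map; channel 1 fires at every location.
feature_map = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 2.0], [0.0, 0.0]],
]
descriptor = weighted_aggregate(feature_map)
```

In this toy input the always-on channel ends up with the smallest component of the descriptor even though its total activation is the largest, which is the burstiness regulation the abstract refers to.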

    Recurrent Pixel Embedding for Instance Grouping

    We introduce a differentiable, end-to-end trainable framework, consisting of two novel components, for solving pixel-level grouping problems such as instance segmentation. First, we regress pixels into a hyper-spherical embedding space so that pixels from the same group have high cosine similarity while those from different groups have similarity below a specified margin. We analyze the choice of embedding dimension and margin, relating them to theoretical results on the problem of distributing points uniformly on the sphere. Second, to group instances, we utilize a variant of mean-shift clustering, implemented as a recurrent neural network parameterized by kernel bandwidth. This recurrent grouping module is differentiable and enjoys convergent dynamics and probabilistic interpretability. Backpropagating the group-weighted loss through this module allows learning to focus on correcting only those embedding errors that will not be resolved during subsequent clustering. Our framework, while conceptually simple and theoretically abundant, is also practically effective and computationally efficient. We demonstrate substantial improvements over state-of-the-art instance segmentation for object proposal generation, and we demonstrate the benefits of the grouping loss for classification tasks such as boundary detection and semantic segmentation.
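The embedding objective described above can be illustrated with a small pairwise loss. This is a sketch under stated assumptions, not the paper's exact loss: same-group pairs are penalised by `1 - similarity`, different-group pairs by the amount their similarity exceeds the margin, and all pairs are weighted equally. The function names, the `margin` default, and the toy embeddings are hypothetical.

```python
import math


def normalize(vector):
    """Project a vector onto the unit hypersphere."""
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector]


def pairwise_margin_loss(embeddings, labels, margin=0.5):
    """Average pairwise loss on cosine similarity.

    Same-group pairs are pulled toward similarity 1; different-group pairs
    are penalised only while their similarity is above the margin.
    """
    unit = [normalize(e) for e in embeddings]
    total, pairs = 0.0, 0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            similarity = sum(a * b for a, b in zip(unit[i], unit[j]))
            if labels[i] == labels[j]:
                total += 1.0 - similarity
            else:
                total += max(0.0, similarity - margin)
            pairs += 1
    return total / pairs


# Four toy pixel embeddings lying along two orthogonal directions.
embeddings = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]]

good_loss = pairwise_margin_loss(embeddings, labels=[0, 0, 1, 1])
bad_loss = pairwise_margin_loss(embeddings, labels=[0, 1, 0, 1])
```

When the labels match the two directions, the loss is zero. When they cut across them, the loss is positive because same-group pixels are orthogonal and some different-group pixels coincide on the sphere.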