
    Robust Image Analysis by L1-Norm Semi-supervised Learning

    This paper presents a novel L1-norm semi-supervised learning algorithm for robust image analysis. It gives a new L1-norm formulation of Laplacian regularization, which is the key step of graph-based semi-supervised learning. Since our L1-norm Laplacian regularization is defined directly over the eigenvectors of the normalized Laplacian matrix, we formulate semi-supervised learning as an L1-norm linear reconstruction problem that can be solved effectively with sparse coding. By working with only a small subset of eigenvectors, we further develop a fast sparse coding algorithm for our L1-norm semi-supervised learning. Due to the sparsity induced by sparse coding, the proposed algorithm can deal with noise in the data to some extent and thus has important applications to robust image analysis, such as noise-robust image classification and noise reduction for visual and textual bag-of-words (BOW) models. In particular, this paper is the first attempt to obtain a robust image representation by sparse co-refinement of visual and textual BOW models. The experimental results show the promising performance of the proposed algorithm. Comment: This is an extension of our long paper in ACM MM 201
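The abstract gives no equations, so the following is only a rough sketch of the general idea: write the label function as a combination of a few low-order eigenvectors of the normalized graph Laplacian, then fit the labeled points with an L1 penalty on the coefficients. Everything specific here is an assumption for illustration and not the paper's formulation: the dense Gaussian affinity graph, the squared-error fidelity term, the eigenvalue-weighted penalty, the ISTA solver, and the names `normalized_laplacian`, `soft_threshold`, and `l1_ssl`.

```python
import numpy as np

def normalized_laplacian(X, sigma=1.0):
    """I - D^{-1/2} W D^{-1/2} for a dense Gaussian affinity graph (assumed graph)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

def soft_threshold(z, t):
    """Proximal operator of the (weighted) L1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def l1_ssl(X, y, labeled, m=2, lam=0.01, sigma=1.0, n_iter=500):
    """Hypothetical sketch: minimise 0.5*||V_l a - y_l||^2 + lam*sum_i sqrt(ev_i)*|a_i|
    over coefficients a of the m smallest Laplacian eigenvectors, using ISTA."""
    evals, evecs = np.linalg.eigh(normalized_laplacian(X, sigma))  # ascending order
    V = evecs[:, :m]                               # small subset of eigenvectors
    w = np.sqrt(np.clip(evals[:m], 0.0, None))     # rougher eigenvectors cost more
    A = V[labeled]                                 # rows for the labeled points
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)       # 1 / Lipschitz constant
    a = np.zeros(m)
    for _ in range(n_iter):
        grad = A.T @ (A @ a - y[labeled])
        a = soft_threshold(a - step * grad, step * lam * w)
    return V @ a                                   # soft labels for every point

# Usage: two well-separated clusters with one labeled point each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(5, 0.3, (10, 2))])
y = np.zeros(20)
y[0], y[10] = 1.0, -1.0
scores = l1_ssl(X, y, labeled=np.array([0, 10]), m=2)
print(np.sign(scores))
```

Restricting the fit to `m` eigenvectors keeps the optimisation small, which is the kind of speed-up the abstract attributes to working with a subset of eigenvectors.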

    Particular object retrieval with integral max-pooling of CNN activations

    Recently, image representations built upon Convolutional Neural Networks (CNNs) have been shown to provide effective descriptors for image search, outperforming pre-CNN features as short-vector representations. Yet such models are not compatible with geometry-aware re-ranking methods and are still outperformed, on some particular object retrieval benchmarks, by traditional image search systems relying on precise descriptor matching, geometric re-ranking, or query expansion. This work revisits both retrieval stages, namely initial search and re-ranking, by employing the same primitive information derived from the CNN. We build compact feature vectors that encode several image regions without the need to feed multiple inputs to the network. Furthermore, we extend integral images to handle max-pooling on convolutional layer activations, allowing us to efficiently localize matching objects. The resulting bounding box is finally used for image re-ranking. As a result, this paper significantly improves on existing CNN-based recognition pipelines: we report, for the first time, results competitive with traditional methods on the challenging Oxford5k and Paris6k datasets.
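The abstract does not say how integral images are extended to max-pooling. One plausible route, sketched below under stated assumptions, is to approximate the maximum over a region by a generalized mean with a large exponent. The sum of powered activations over any rectangle then comes from a summed-area table in constant time. The exponent value, the non-negative (post-ReLU) activations, and the names `integral_image` and `approx_region_max` are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def integral_image(channel, alpha=10.0):
    """Zero-padded summed-area table of channel**alpha (channel must be >= 0)."""
    ii = np.zeros((channel.shape[0] + 1, channel.shape[1] + 1))
    ii[1:, 1:] = (channel ** alpha).cumsum(axis=0).cumsum(axis=1)
    return ii

def approx_region_max(ii, top, left, bottom, right, alpha=10.0):
    """Approximate max over rows [top, bottom) and cols [left, right)
    as (sum of x**alpha) ** (1/alpha), using four table lookups."""
    s = ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]
    return max(s, 0.0) ** (1.0 / alpha)  # clamp guards against round-off

# Usage on one small activation map (values are made up for the example).
acts = np.array([[0.1, 0.5, 0.2, 0.0],
                 [0.3, 0.9, 0.4, 0.1],
                 [0.0, 0.2, 0.8, 0.6]])
ii = integral_image(acts)
approx = approx_region_max(ii, 0, 0, 2, 2)   # top-left 2x2 window
exact = acts[0:2, 0:2].max()
print(approx, exact)
```

Because each candidate window costs only four lookups per channel, many bounding boxes can be scored cheaply, which is what makes a localization search over regions practical. The approximation slightly overestimates the true maximum and tightens as `alpha` grows.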