2,353 research outputs found

    A comparative evaluation of interest point detectors and local descriptors for visual SLAM

    In this paper we compare the behavior of different interest point detectors and descriptors under the conditions needed to be used as landmarks in vision-based simultaneous localization and mapping (SLAM). We evaluate the repeatability of the detectors, as well as the invariance and distinctiveness of the descriptors, under different perceptual conditions using sequences of images representing planar objects as well as 3D scenes. We believe that this information will be useful when selecting an appropriat

    From handcrafted to deep local features

    This paper presents an overview of the evolution of local features from handcrafted to deep-learning-based methods, followed by a discussion of several benchmarks and papers evaluating such local features. Our investigations are motivated by 3D reconstruction problems, where the precise location of the features is important. As we describe these methods, we highlight and explain the challenges of feature extraction and potential ways to overcome them. We first present handcrafted methods, followed by methods based on classical machine learning, and finally we discuss methods based on deep learning. This largely chronologically ordered presentation will help the reader to fully understand the topic of image and region description in order to make best use of it in modern computer vision applications. In particular, understanding handcrafted methods and their motivation can help to understand modern approaches and how machine learning is used to improve the results. We also provide references to most of the relevant literature and code. Comment: Preprin

    Local Jet Pattern: A Robust Descriptor for Texture Classification

    Methods based on local image features have recently shown promise for texture classification tasks, especially in the presence of large intra-class variation due to illumination, scale, and viewpoint changes. Inspired by the theories of image structure analysis, this paper presents a simple, efficient, yet robust descriptor, namely the local jet pattern (LJP), for texture classification. In this approach, a jet space representation of a texture image is derived from a set of derivatives of Gaussian (DtGs) filter responses up to second order, the so-called local jet vectors (LJV), which also satisfy the scale-space properties. The LJP is obtained by utilizing the relationship of the center pixel with the local neighborhood information in jet space. Finally, the feature vector of a texture region is formed by concatenating the histograms of LJP for all elements of LJV. All DtGs responses up to second order together preserve the intrinsic local image structure and achieve invariance to scale, rotation, and reflection. This allows us to develop a texture classification framework which is discriminative and robust. In extensive experiments on five standard texture image databases, employing the nearest subspace classifier (NSC), the proposed descriptor achieves 100%, 99.92%, 99.75%, 99.16%, and 99.65% accuracy on Outex_TC-00010 (Outex_TC10), Outex_TC-00012 (Outex_TC12), KTH-TIPS, Brodatz, and CUReT, respectively, outperforming the state-of-the-art methods. Comment: Accepted in Multimedia Tools and Applications, Springe

    Learning Spread-out Local Feature Descriptors

    We propose a simple yet powerful regularization technique that can be used to significantly improve both the pairwise and triplet losses in learning local feature descriptors. The idea is that, in order to fully utilize the expressive power of the descriptor space, good local feature descriptors should be sufficiently "spread out" over the space. In this work, we propose a regularization term that maximizes the spread of the feature descriptors, inspired by the properties of the uniform distribution. We show that the proposed regularization with triplet loss outperforms existing Euclidean-distance-based descriptor learning techniques by a large margin. As an extension, the proposed regularization technique can also be used to improve image-level deep feature embedding. Comment: ICCV 2017. 9 pages, 7 figure
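    The "spread-out" idea in this abstract can be illustrated with a minimal sketch. It relies on a standard fact: for two independent points drawn uniformly from the unit sphere in d dimensions, the inner product has mean 0 and second moment 1/d. The function name `spread_out_penalty` and the exact form of the penalty are assumptions made for illustration; the abstract does not state the paper's formula.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def spread_out_penalty(desc_a, desc_b):
    """Penalty over NON-matching descriptor pairs (desc_a[i], desc_b[i]).

    Assumes every descriptor is unit-norm. The penalty is zero when the
    inner products have mean 0 and second moment at most 1/d, which is
    what uniformly spread descriptors would give. Hypothetical form.
    """
    d = len(desc_a[0])
    dots = [sum(x * y for x, y in zip(a, b)) for a, b in zip(desc_a, desc_b)]
    m1 = sum(dots) / len(dots)                 # first moment of inner products
    m2 = sum(t * t for t in dots) / len(dots)  # second moment of inner products
    return m1 ** 2 + max(0.0, m2 - 1.0 / d)


# Collapsed descriptors (all identical) are penalised heavily ...
collapsed = [normalize([1.0, 0.0, 0.0, 0.0])] * 3
penalty_collapsed = spread_out_penalty(collapsed, collapsed)

# ... while orthogonal non-matching pairs incur no penalty.
penalty_orthogonal = spread_out_penalty([[1.0, 0.0, 0.0, 0.0]], [[0.0, 1.0, 0.0, 0.0]])
```

    In training, a term of this kind would be added to a pairwise or triplet loss with a weighting factor, which is left out here.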

    A Review of Codebook Models in Patch-Based Visual Object Recognition

    The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performance on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step, which is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution, and it follows that the resulting codebook need not have discriminant properties. This is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook for constructing a discriminant codebook in a one-pass design procedure that slightly outperforms more traditional approaches at drastically reduced computing times. In this review we survey several approaches that have been proposed over the last decade with their use of feature detectors, descriptors, codebook construction schemes, choice of classifiers in recognising objects, and datasets that were used in evaluating the proposed methods
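    The mapping from low-level features to a fixed-length histogram described in this abstract can be sketched as follows. The codebook here is a hand-picked toy list rather than the output of clustering or of the resource-allocating procedure the abstract mentions, and the function names are hypothetical.

```python
import math


def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to `descriptor` in Euclidean distance."""
    return min(range(len(codebook)), key=lambda k: math.dist(descriptor, codebook[k]))


def bow_histogram(descriptors, codebook):
    """Map any number of descriptors to a fixed-length, L1-normalised histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    # An image with no descriptors yields an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy 2-D example: three codewords, four patch descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
patches = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]
hist = bow_histogram(patches, codebook)  # [0.25, 0.5, 0.25]
```

    The histogram length equals the codebook size regardless of how many patches an image contributes, which is what lets a standard classifier consume it directly.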

    SOSNet: Second Order Similarity Regularization for Local Descriptor Learning

    Despite the fact that Second Order Similarity (SOS) has been used with significant success in tasks such as graph matching and clustering, it has not been exploited for learning local descriptors. In this work, we explore the potential of SOS in the field of descriptor learning by building upon the intuition that a positive pair of matching points should exhibit similar distances with respect to other points in the embedding space. Thus, we propose a novel regularization term, named Second Order Similarity Regularization (SOSR), that follows this principle. By incorporating SOSR into training, our learned descriptor achieves state-of-the-art performance on several challenging benchmarks containing distinct tasks ranging from local patch retrieval to structure from motion. Furthermore, by designing a von Mises-Fisher distribution based evaluation method, we link the utilization of the descriptor space to the matching performance, thus demonstrating the effectiveness of our proposed SOSR. Extensive experimental results, empirical evidence, and in-depth analysis are provided, indicating that SOSR can significantly boost the matching performance of the learned descriptor
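    The intuition stated in this abstract, that the two members of a matching pair should have similar distances to the other points, can be sketched as below. This is an illustration under assumptions: it compares distances to all other points in a batch, and the name `second_order_penalty` is hypothetical; the abstract does not give the exact SOSR formula.

```python
import math


def second_order_penalty(xs, ys):
    """Mean second-order discrepancy over matching pairs (xs[i], ys[i]).

    For each pair i, compare the distances from xs[i] to the other xs[j]
    with the distances from ys[i] to the other ys[j]. The value is zero
    when the two point sets have identical pairwise distances.
    """
    n = len(xs)
    total = 0.0
    for i in range(n):
        sq = 0.0
        for j in range(n):
            if j != i:
                sq += (math.dist(xs[i], xs[j]) - math.dist(ys[i], ys[j])) ** 2
        total += math.sqrt(sq)
    return total / n


anchors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# A rigid shift keeps all pairwise distances, so the penalty vanishes
# even though each matched pair is far apart in first-order terms.
shifted = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
# Stretching one point changes the distance structure and is penalised.
distorted = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]

penalty_shifted = second_order_penalty(anchors, shifted)
penalty_distorted = second_order_penalty(anchors, distorted)
```

    The shifted case shows why such a term complements, rather than replaces, a first-order matching loss: it constrains the relative layout of descriptors, not their absolute agreement.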

    Appearance Descriptors for Person Re-identification: a Comprehensive Review

    In video surveillance, person re-identification is the task of recognising whether an individual has already been observed over a network of cameras. Typically, this is achieved by exploiting the clothing appearance, as classical biometric traits like the face are impractical in real-world video surveillance scenarios. Clothing appearance is represented by means of low-level local and/or global features of the image, usually extracted according to some part-based body model to treat different body parts (e.g. torso and legs) independently. This paper provides a comprehensive review of current approaches to build appearance descriptors for person re-identification. The most relevant techniques are described in detail, and categorised according to the body models and features used. The aim of this work is to provide a structured body of knowledge and a starting point for researchers willing to conduct novel investigations on this challenging topic

    From BoW to CNN: Two Decades of Texture Representation for Texture Classification

    Texture is a fundamental characteristic of many types of images, and texture representation is one of the essential and challenging problems in computer vision and pattern recognition which has attracted extensive research attention. Since 2000, texture representations based on Bag of Words (BoW) and on Convolutional Neural Networks (CNNs) have been extensively studied with impressive performance. Given this period of remarkable evolution, this paper aims to present a comprehensive survey of advances in texture representation over the last two decades. More than 200 major publications are cited in this survey, covering different aspects of the research, including (i) problem description; (ii) recent advances in the broad categories of BoW-based, CNN-based and attribute-based methods; and (iii) evaluation issues, specifically benchmark datasets and state-of-the-art results. In retrospect of what has been achieved so far, the survey discusses open challenges and directions for future research. Comment: Accepted by IJC

    Metric Learning in Codebook Generation of Bag-of-Words for Person Re-identification

    Person re-identification is generally divided into two parts: first, how to represent a pedestrian by discriminative visual descriptors, and second, how to compare them by suitable distance metrics. Conventional methods isolate these two parts, the first usually being unsupervised and the second supervised. The Bag-of-Words (BoW) model is a widely used image representation in part one. Its codebook is simply generated by clustering visual features in Euclidean space. In this paper, we propose to use the metric learning techniques of part two in the codebook generation phase of BoW. In particular, the proposed codebook is clustered under a Mahalanobis distance which is learned in a supervised manner. Extensive experiments show that our proposed method is effective. With several low-level features extracted on superpixels and fused together, our method outperforms the state of the art on person re-identification benchmarks including VIPeR, PRID450S, and Market1501
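    Clustering under a Mahalanobis distance, as this abstract proposes, changes which codeword a feature is assigned to. The sketch below shows only the assignment step and uses the identity that a Mahalanobis distance with matrix M = LᵀL equals the Euclidean distance after mapping points by L. The matrix `L_LEARNED` is a hand-picked hypothetical stand-in; the abstract does not say which metric learning method produces it.

```python
import math


def transform(L, x):
    """Apply the linear map L (list of rows) to the vector x."""
    return [sum(l * xi for l, xi in zip(row, x)) for row in L]


def mahalanobis(x, y, L):
    """sqrt((x-y)^T M (x-y)) with M = L^T L, via Euclidean distance after L."""
    return math.dist(transform(L, x), transform(L, y))


def assign(x, codebook, L):
    """Index of the nearest codeword under the metric induced by L."""
    return min(range(len(codebook)), key=lambda k: mahalanobis(x, codebook[k], L))


L_IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # plain Euclidean distance
L_LEARNED = [[1.0, 0.0], [0.0, 0.1]]   # hypothetical: second feature down-weighted

codebook = [(0.0, 0.0), (2.0, 3.0)]
feature = (1.5, 0.0)

euclidean_choice = assign(feature, codebook, L_IDENTITY)  # codeword 0
learned_choice = assign(feature, codebook, L_LEARNED)     # codeword 1
```

    Because the metric reduces to a linear map, an ordinary Euclidean k-means run on the transformed features would produce a codebook clustered under that Mahalanobis distance.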

    A Review on Near Duplicate Detection of Images using Computer Vision Techniques

    Nowadays, digital content is widespread and easily redistributable, either lawfully or unlawfully. For example, after images are posted on the internet, other web users can modify them and then repost their versions, thereby generating near-duplicate images. The presence of near-duplicates critically affects the performance of search engines. Computer vision is concerned with the automatic extraction, analysis and understanding of useful information from digital images. The main application of computer vision is image understanding. There are several tasks in image understanding, such as feature extraction, object detection, object recognition, image cleaning, image transformation, etc. There is no proper survey in the literature related to near-duplicate detection of images. In this paper, we review the state-of-the-art computer vision-based approaches and feature extraction methods for the detection of near-duplicate images. We also discuss the main challenges in this field and how other researchers addressed those challenges. This review provides research directions to fellow researchers who are interested in working in this field. Comment: 37 Pages, 7 figures, "For online first version, see https://link.springer.com/article/10.1007/s11831-020-09400-w