
    Clique descriptor of affine invariant regions for robust wide baseline image matching

    Assuming that the image distortion between corresponding regions of a wide-baseline stereo pair can be approximated by an affine transformation when the regions are reasonably small, recent image matching algorithms have focused on affine invariant region (IR) detection and description to increase matching robustness. However, the distinctiveness of an intensity-based region descriptor tends to deteriorate when an image contains homogeneous texture or repetitive patterns. To address this problem, we investigate the geometry of a local IR cluster (also called a clique) and propose a new clique-based image matching method. In the proposed method, the clique of an IR is estimated by Delaunay triangulation in a local affine frame, and the Hausdorff distance is adopted for matching sets of descriptor vectors whose sizes may not agree exactly. We also introduce two adaptively weighted clique distances, in which the neighbour distance in a clique is weighted according to characteristics of the local feature distribution. Experimental results show that the clique-based matching method produces more tentative correspondences than variants of the SIFT-based method.
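
The Hausdorff distance mentioned in this abstract compares two point sets that need not have the same size. The following is a minimal sketch of the plain (unweighted) version, assuming each clique is stored as a NumPy array with one descriptor vector per row; the function names and toy data are hypothetical, and the paper's adaptively weighted variants are not reproduced here.

```python
import numpy as np

def directed_hausdorff(A, B):
    """Largest distance from any row of A to its nearest row of B."""
    # Pairwise Euclidean distances, shape (len(A), len(B)).
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return d.min(axis=1).max()

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two sets of vectors."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

# Toy "cliques" with different numbers of descriptor vectors (illustrative data).
clique_a = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
clique_b = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 4.0]])
clique_distance = hausdorff(clique_a, clique_b)
```

Because the distance is driven by the worst-matched vector, the single extra point in `clique_b` determines the result in this toy case, which hints at why weighting the neighbour distances could be useful.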

    An Autoencoder-Based Image Descriptor for Image Matching and Retrieval

    Local image features are used in many computer vision applications. Many point detectors and descriptors have been proposed in recent years; however, the creation of effective descriptors is still a topic of research. The Scale Invariant Feature Transform (SIFT) developed by David Lowe is widely used in image matching and image retrieval. SIFT detects interest points in an image based on scale-space analysis, which is invariant to changes in image scale. A SIFT descriptor contains gradient information about an image patch centered at a point of interest. SIFT provides a high matching rate and is robust to image transformations; however, it is slow in image matching and retrieval. An autoencoder is a representation-learning method and is used in this project to construct a low-dimensional representation of high-dimensional data while preserving the structure and geometry of the data. In many computer vision tasks, the high dimensionality of the input data means a high computational cost. The main motivation of this project is to improve the speed and the distinctiveness of SIFT descriptors. To achieve this, a new descriptor based on an autoencoder is proposed. The newly generated descriptors reduce the size and complexity of SIFT descriptors, reducing the time required for image matching and image retrieval.
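
To make the encode/decode idea concrete, here is a rough sketch that compresses 128-dimensional vectors to 16 dimensions. It deliberately swaps in a linear codec fitted by PCA (via SVD) instead of a trained autoencoder, since a linear autoencoder with squared-error loss spans the same subspace as PCA. The data are synthetic stand-ins for SIFT descriptors, and all names and sizes are assumptions, not taken from the project.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for SIFT descriptors: 500 vectors of 128 dimensions
# that lie close to an 8-dimensional subspace (an assumption for illustration).
latent = rng.normal(size=(500, 8))
mixing = rng.normal(size=(8, 128))
descriptors = latent @ mixing + 0.01 * rng.normal(size=(500, 128))

def fit_linear_codec(X, code_dim):
    """Return the data mean and the top `code_dim` principal directions."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:code_dim]

def encode(X, mean, basis):
    """Project descriptors onto the low-dimensional basis."""
    return (X - mean) @ basis.T

def decode(codes, mean, basis):
    """Map low-dimensional codes back to descriptor space."""
    return codes @ basis + mean

mean, basis = fit_linear_codec(descriptors, code_dim=16)
codes = encode(descriptors, mean, basis)
reconstruction = decode(codes, mean, basis)
relative_error = (np.linalg.norm(reconstruction - descriptors)
                  / np.linalg.norm(descriptors))
```

Matching would then compare the 16-dimensional `codes` instead of the full descriptors; a nonlinear autoencoder would replace `encode` and `decode` with learned networks.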

    Novel color and local image descriptors for content-based image search

    Content-based image classification, search and retrieval is a rapidly expanding research area. With the advent of inexpensive digital cameras, cheap data storage, fast computing speeds and ever-increasing data transfer rates, millions of images are stored and shared over the Internet every day. This necessitates the development of systems that can classify these images into various categories without human intervention and, when presented with a query image, can identify its contents in order to retrieve similar images. Towards that end, this dissertation focuses on investigating novel image descriptors based on texture, shape, color, and local information for advancing content-based image search. Specifically, first, a new color multi-mask Local Binary Patterns (mLBP) descriptor is presented to improve upon the traditional Local Binary Patterns (LBP) texture descriptor for better image classification performance. Second, the mLBP descriptors from different color spaces are fused to form the Color LBP Fusion (CLF) and Color Grayscale LBP Fusion (CGLF) descriptors that further improve image classification performance. Third, a new HaarHOG descriptor, which integrates the Haar wavelet transform and the Histograms of Oriented Gradients (HOG), is presented for extracting both shape and local information for image classification. Next, a novel three Dimensional Local Binary Patterns (3D-LBP) descriptor is proposed for color images by encoding both color and texture information for image search. Furthermore, the novel 3DLH and 3DLH-fusion descriptors are proposed, which combine the HaarHOG and the 3D-LBP descriptors by means of Principal Component Analysis (PCA) and are able to improve upon the individual HaarHOG and 3D-LBP descriptors for image search. Subsequently, the innovative H-descriptor and the H-fusion descriptor are presented, which improve upon the 3DLH descriptor.
Finally, the innovative Bag of Words-LBP (BoWL) descriptor is introduced that combines the idea of LBP with a bag-of-words representation to further improve image classification performance. To assess the feasibility of the proposed new image descriptors, two classification frameworks are used. In one, the PCA and the Enhanced Fisher Model (EFM) are applied for feature extraction and the nearest neighbor classification rule for classification. In the other, a Support Vector Machine (SVM) is used for classification. The classification performance is tested on several widely used and publicly available image datasets. The experimental results show that the proposed new image descriptors achieve an image classification performance better than or comparable to other popular image descriptors, such as the Scale Invariant Feature Transform (SIFT), the Pyramid Histograms of visual Words (PHOW), the Pyramid Histograms of Oriented Gradients (PHOG), the Spatial Envelope (SE), the Color SIFT four Concentric Circles (C4CC), the Object Bank (OB), the Hierarchical Matching Pursuit (HMP), the Kernel Spatial Pyramid Matching (KSPM), the SIFT Sparse-coded Spatial Pyramid Matching (ScSPM), the Kernel Codebook (KC) and the LBP.
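
Several descriptors in this abstract build on the traditional LBP texture descriptor. As a point of reference, the sketch below computes the basic 8-neighbour LBP code and its normalized histogram for a grayscale array. It covers only the textbook operator, not the mLBP, 3D-LBP or BoWL variants from the dissertation; the neighbour ordering and function names are assumptions.

```python
import numpy as np

def lbp_codes(gray):
    """8-bit LBP code for every interior pixel of a 2D grayscale array."""
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the center.
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes

def lbp_histogram(gray):
    """Normalized 256-bin histogram of LBP codes, usable as a texture feature."""
    hist = np.bincount(lbp_codes(gray).ravel(), minlength=256).astype(float)
    return hist / hist.sum()

patch = np.array([[5, 9, 1],
                  [4, 6, 7],
                  [2, 8, 3]])
code = int(lbp_codes(patch)[0, 0])  # 42: bits 1, 3 and 5 are set
```

For the 3x3 `patch`, only the neighbours 9, 7 and 8 are at least as large as the center value 6, so bits 1, 3 and 5 are set and the code is 2 + 8 + 32 = 42.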

    Investigation of new feature descriptors for image search and classification

    Content-based image search, classification and retrieval is an active and important research area due to its broad applications as well as the complexity of the problem. Understanding the semantics and contents of images for recognition remains one of the most difficult and prevailing problems in the machine intelligence and computer vision community. With large variations in size, pose, illumination and occlusions, image classification is a very challenging task. A good classification framework should address the key issues of discriminatory feature extraction as well as efficient and accurate classification. Towards that end, this dissertation focuses on exploring new image descriptors by incorporating cues from the human visual system, and integrating local, texture, shape as well as color information to construct robust and effective feature representations for advancing content-based image search and classification. Based on the Gabor wavelet transformation, whose kernels are similar to the 2D receptive field profiles of the mammalian cortical simple cells, a series of new image descriptors is developed. Specifically, first, a new color Gabor-HOG (GHOG) descriptor is introduced by concatenating the Histograms of Oriented Gradients (HOG) of the component images produced by applying Gabor filters in multiple scales and orientations to encode shape information. Second, the GHOG descriptor is analyzed in six different color spaces and grayscale to propose different color GHOG descriptors, which are further combined to present a new Fused Color GHOG (FC-GHOG) descriptor. Third, a novel GaborPHOG (GPHOG) descriptor is proposed which improves upon the Pyramid Histograms of Oriented Gradients (PHOG) descriptor, and subsequently a new FC-GPHOG descriptor is constructed by combining the multiple color GPHOG descriptors and employing the Principal Component Analysis (PCA). 
Next, the Gabor-LBP (GLBP) is derived by accumulating the Local Binary Patterns (LBP) histograms of the local Gabor filtered images to encode texture and local information of an image. Furthermore, a novel Gabor-LBP-PHOG (GLP) image descriptor is proposed which integrates the GLBP and the GPHOG descriptors as a feature set, and an innovative Fused Color Gabor-LBP-PHOG (FC-GLP) is constructed by fusing the GLP from multiple color spaces. Subsequently, the GLBP and the GHOG descriptors are combined to produce the Gabor-LBP-HOG (GLH) feature vector, which performs well on different object and scene image categories. The six color GLH vectors are further concatenated to form the Fused Color GLH (FC-GLH) descriptor. Finally, the Wigner-based Local Binary Patterns (WLBP) descriptor is proposed that combines multi-neighborhood LBP, the Pseudo-Wigner distribution of images and the popular bag-of-words model to effectively classify scene images. To assess the feasibility of the proposed new image descriptors, two classification methods are used: one method applies the PCA and the Enhanced Fisher Model (EFM) for feature extraction and the nearest neighbor rule for classification, while the other method employs the Support Vector Machine (SVM). The classification performance of the proposed descriptors is tested on several publicly available popular image datasets.
The experimental results show that the proposed new image descriptors achieve image search and classification results better than or on par with other popular image descriptors, such as the Scale Invariant Feature Transform (SIFT), the Pyramid Histograms of visual Words (PHOW), the Pyramid Histograms of Oriented Gradients (PHOG), the Spatial Envelope (SE), the Color SIFT four Concentric Circles (C4CC), the Object Bank (OB), the Context Aware Topic Model (CA-TM), the Hierarchical Matching Pursuit (HMP), the Kernel Spatial Pyramid Matching (KSPM), the SIFT Sparse-coded Spatial Pyramid Matching (Sc-SPM), the Kernel Codebook (KC) and the LBP.
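
The descriptors in this abstract start from Gabor filters applied at multiple scales and orientations. The sketch below builds a small bank of real-valued Gabor kernels using the standard Gaussian-envelope-times-cosine form. The kernel size, wavelengths, orientations and the `sigma = 0.5 * wavelength` rule are illustrative assumptions, not the dissertation's settings.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a Gabor kernel on a (size x size) grid; size should be odd."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the cosine carrier runs along direction theta.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength + psi)
    return envelope * carrier

# Two scales (wavelengths) times four orientations gives eight kernels.
wavelengths = (4.0, 8.0)
orientations = [k * np.pi / 4 for k in range(4)]
bank = [gabor_kernel(15, wavelength=w, theta=t, sigma=0.5 * w)
        for w in wavelengths for t in orientations]
```

Each kernel would be convolved with the image (per color channel, in the color variants) to produce the component images from which HOG, PHOG or LBP histograms are then computed.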

    Multimodal image registration technique based on improved local feature descriptors

    Multimodal image registration has received significant research attention over the past decade, and the majority of the techniques are global in nature. Although local techniques are widely used for general image registration, there are only limited studies on them for multimodal image registration. The scale invariant feature transform (SIFT) is a well-known general image registration technique. However, SIFT descriptors are not invariant to multimodality. We propose a SIFT-based technique that is modality invariant and still retains the strengths of local techniques. Moreover, our proposed histogram weighting strategies also improve the accuracy of descriptor matching, which is an important image registration step. As a result, our proposed strategies not only improve multimodal registration accuracy but also have the potential to improve the performance of all SIFT-based applications, e.g., general image registration and object recognition.

    A Novel Fast and Robust Binary Affine Invariant Descriptor for Image Matching

    Because current binary descriptors suffer from high computational complexity, a lack of affine invariance, and a high false matching rate under viewpoint changes, a new binary affine invariant descriptor, called BAND, is proposed. Unlike other descriptors, BAND uses an irregular pattern based on the local affine invariant region surrounding a feature point, and it has five orientations, which are effectively obtained by LBP. Finally, a 256-bit binary string is computed using a simple random sampling pattern. Experimental results demonstrate that BAND matches well under rotation, image zooming, noise, lighting changes, and small-scale perspective transformation. It achieves better matching performance than current mainstream descriptors while costing less time.
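
Binary descriptors such as the 256-bit string described here are typically compared with the Hamming distance. The sketch below shows that matching step only, assuming each descriptor is packed into 32 `uint8` bytes; it does not implement BAND's sampling pattern, and the random database and names are hypothetical.

```python
import numpy as np

def hamming_distances(query, database):
    """Bit differences between one packed descriptor and each database row.

    query: (32,) uint8 array (256 bits); database: (n, 32) uint8 array.
    """
    xor = np.bitwise_xor(database, query)
    # Unpack every byte into 8 bits and count the set bits per row.
    return np.unpackbits(xor, axis=1).sum(axis=1)

rng = np.random.default_rng(0)
database = rng.integers(0, 256, size=(100, 32), dtype=np.uint8)

# Simulate a noisy observation of descriptor 42 by flipping three bits.
query = database[42].copy()
query[0] ^= 0b00000111

distances = hamming_distances(query, database)
best = int(np.argmin(distances))
```

XOR followed by a bit count involves no floating-point arithmetic, which is one reason binary descriptors tend to be cheap to match.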