18,777 research outputs found
Investigation of new feature descriptors for image search and classification
Content-based image search, classification and retrieval is an active and important research area due to its broad applications as well as the complexity of the problem. Understanding the semantics and contents of images for recognition remains one of the most difficult and prevailing problems in the machine intelligence and computer vision community. With large variations in size, pose, illumination and occlusions, image classification is a very challenging task. A good classification framework should address the key issues of discriminatory feature extraction as well as efficient and accurate classification. Towards that end, this dissertation focuses on exploring new image descriptors by incorporating cues from the human visual system, and integrating local, texture, shape as well as color information to construct robust and effective feature representations for advancing content-based image search and classification.
Based on the Gabor wavelet transformation, whose kernels are similar to the 2D receptive field profiles of the mammalian cortical simple cells, a series of new image descriptors is developed. Specifically, first, a new color Gabor-HOG (GHOG) descriptor is introduced by concatenating the Histograms of Oriented Gradients (HOG) of the component images produced by applying Gabor filters in multiple scales and orientations to encode shape information. Second, the GHOG descriptor is analyzed in six different color spaces and grayscale to propose different color GHOG descriptors, which are further combined to present a new Fused Color GHOG (FC-GHOG) descriptor. Third, a novel GaborPHOG (GPHOG) descriptor is proposed which improves upon the Pyramid Histograms of Oriented Gradients (PHOG) descriptor, and subsequently a new FC-GPHOG descriptor is constructed by combining the multiple color GPHOG descriptors and employing the Principal Component Analysis (PCA). Next, the Gabor-LBP (GLBP) is derived by accumulating the Local Binary Patterns (LBP) histograms of the local Gabor filtered images to encode texture and local information of an image. Furthermore, a novel Gabor-LBPPHOG (GLP) image descriptor is proposed which integrates the GLBP and the GPHOG descriptors as a feature set and an innovative Fused Color Gabor-LBP-PHOG (FC-GLP) is constructed by fusing the GLP from multiple color spaces. Subsequently, The GLBP and the GHOG descriptors are then combined to produce the Gabor-LBP-HOG (GLH) feature vector which performs well on different object and scene image categories. The six color GLH vectors are further concatenated to form the Fused Color GLH (FC-GLH) descriptor. Finally, the Wigner based Local Binary Patterns (WLBP) descriptor is proposed that combines multi-neighborhood LBP, Pseudo-Wigner distribution of images and the popular bag of words model to effectively classify scene images.
To assess the feasibility of the proposed new image descriptors, two classification methods are used: one method applies the PCA and the Enhanced Fisher Model (EFM) for feature extraction and the nearest neighbor rule for classification, while the other method employs the Support Vector Machine (SVM). The classification performance of the proposed descriptors is tested on several publicly available popular image datasets. The experimental results show that the proposed new image descriptors achieve image search and classification results better than or at par with other popular image descriptors, such as the Scale Invariant Feature Transform (SIFT), the Pyramid Histograms of visual Words (PHOW), the Pyramid Histograms of Oriented Gradients (PHOG), the Spatial Envelope (SE), the Color SIFT four Concentric Circles (C4CC), the Object Bank (OB), the Context Aware Topic Model (CA-TM), the Hierarchical Matching Pursuit (HMP), the Kernel Spatial Pyramid Matching (KSPM), the SIFT Sparse-coded Spatial Pyramid Matching (Sc-SPM), the Kernel Codebook (KC) and the LBP
Identifying person re-occurrences for personal photo management applications
Automatic identification of "who" is present in individual digital images within a photo management system using only content-based analysis is an extremely difficult problem. The authors present a system which enables identification of person reoccurrences within a personal photo management application by combining image content-based analysis tools with context data from image capture. This combined system employs automatic face detection and body-patch matching techniques, which collectively facilitate identifying person re-occurrences within images grouped into events based on context data. The authors introduce a face detection approach combining a histogram-based skin detection model and a modified BDF face detection method to detect multiple frontal faces in colour images. Corresponding body patches are then automatically segmented relative to the size, location and orientation of the detected faces in the image. The authors investigate the suitability of using different colour descriptors, including MPEG-7 colour descriptors, color coherent vectors (CCV) and color correlograms for effective body-patch matching. The system has been successfully integrated into the MediAssist platform, a prototype Web-based system for personal photo management, and runs on over 13000 personal photos
Unsupervised Graph-based Rank Aggregation for Improved Retrieval
This paper presents a robust and comprehensive graph-based rank aggregation
approach, used to combine results of isolated ranker models in retrieval tasks.
The method follows an unsupervised scheme, which is independent of how the
isolated ranks are formulated. Our approach is able to combine arbitrary
models, defined in terms of different ranking criteria, such as those based on
textual, image or hybrid content representations.
We reformulate the ad-hoc retrieval problem as a document retrieval based on
fusion graphs, which we propose as a new unified representation model capable
of merging multiple ranks and expressing inter-relationships of retrieval
results automatically. By doing so, we claim that the retrieval system can
benefit from learning the manifold structure of datasets, thus leading to more
effective results. Another contribution is that our graph-based aggregation
formulation, unlike existing approaches, allows for encapsulating contextual
information encoded from multiple ranks, which can be directly used for
ranking, without further computations and post-processing steps over the
graphs. Based on the graphs, a novel similarity retrieval score is formulated
using an efficient computation of minimum common subgraphs. Finally, another
benefit over existing approaches is the absence of hyperparameters.
A comprehensive experimental evaluation was conducted considering diverse
well-known public datasets, composed of textual, image, and multimodal
documents. Performed experiments demonstrate that our method reaches top
performance, yielding better effectiveness scores than state-of-the-art
baseline methods and promoting large gains over the rankers being fused, thus
demonstrating the successful capability of the proposal in representing queries
based on a unified graph-based model of rank fusions
- ā¦