15,283 research outputs found
Effective and efficient kernel-based image representations for classification and retrieval
Image representation is a challenging task. In particular, in order to obtain better performances in different image processing applications such as video surveillance, autonomous driving, crime scene detection and automatic inspection, effective and efficient image representation is a fundamental need. The performance of these applications usually depends on how accurately images are classified into their corresponding groups or how precisely relevant images are retrieved from a database based on a query. Accuracy in image classification and precision in image retrieval depend on the effectiveness of image representation. Existing image representation methods have some limitations. For example, spatial pyramid matching, which is a popular method incorporating spatial information in image-level representation, has not been fully studied to date. In addition, the strengths of pyramid match kernel and spatial pyramid matching are not combined for better image matching. Kernel descriptors based on gradient, colour and shape overcome the limitations of histogram-based descriptors, but suffer from information loss, noise effects and high computational complexity. Furthermore, the combined performance of kernel descriptors has limitations related to computational complexity, higher dimensionality and lower effectiveness. Moreover, the potential of a global texture descriptor which is based on human visual perception has not been fully explored to date. Therefore, in this research project, kernel-based effective and efficient image representation methods are proposed to address the above limitations. An enhancement is made to spatial pyramid matching in terms of improved rotation invariance. This is done by investigating different partitioning schemes suitable to achieve rotation-invariant image representation and the proposal of a weight function for appropriate level contribution in image matching. In addition, the strengths of pyramid match kernel and spatial pyramid are combined to enhance matching accuracy between images. The existing kernel descriptors are modified and improved to achieve greater effectiveness, minimum noise effects, less dimensionality and lower computational complexity. A novel fusion approach is also proposed to combine the information related to all pixel attributes, before the descriptor extraction stage. Existing kernel descriptors are based only on gradient, colour and shape information. In this research project, a texture-based kernel descriptor is proposed by modifying an existing popular global texture descriptor. Finally, all the contributions are evaluated in an integrated system. The performances of the proposed methods are qualitatively and quantitatively evaluated on two to four different publicly available image databases. The experimental results show that the proposed methods are more effective and efficient in image representation than existing benchmark methods.Doctor of Philosoph
Linear Spatial Pyramid Matching Using Non-convex and non-negative Sparse Coding for Image Classification
Recently sparse coding have been highly successful in image classification
mainly due to its capability of incorporating the sparsity of image
representation. In this paper, we propose an improved sparse coding model based
on linear spatial pyramid matching(SPM) and Scale Invariant Feature Transform
(SIFT ) descriptors. The novelty is the simultaneous non-convex and
non-negative characters added to the sparse coding model. Our numerical
experiments show that the improved approach using non-convex and non-negative
sparse coding is superior than the original ScSPM[1] on several typical
databases
Temporal Extension of Scale Pyramid and Spatial Pyramid Matching for Action Recognition
Historically, researchers in the field have spent a great deal of effort to
create image representations that have scale invariance and retain spatial
location information. This paper proposes to encode equivalent temporal
characteristics in video representations for action recognition. To achieve
temporal scale invariance, we develop a method called temporal scale pyramid
(TSP). To encode temporal information, we present and compare two methods
called temporal extension descriptor (TED) and temporal division pyramid (TDP)
. Our purpose is to suggest solutions for matching complex actions that have
large variation in velocity and appearance, which is missing from most current
action representations. The experimental results on four benchmark datasets,
UCF50, HMDB51, Hollywood2 and Olympic Sports, support our approach and
significantly outperform state-of-the-art methods. Most noticeably, we achieve
65.0% mean accuracy and 68.2% mean average precision on the challenging HMDB51
and Hollywood2 datasets which constitutes an absolute improvement over the
state-of-the-art by 7.8% and 3.9%, respectively
Fast Low-rank Representation based Spatial Pyramid Matching for Image Classification
Spatial Pyramid Matching (SPM) and its variants have achieved a lot of
success in image classification. The main difference among them is their
encoding schemes. For example, ScSPM incorporates Sparse Code (SC) instead of
Vector Quantization (VQ) into the framework of SPM. Although the methods
achieve a higher recognition rate than the traditional SPM, they consume more
time to encode the local descriptors extracted from the image. In this paper,
we propose using Low Rank Representation (LRR) to encode the descriptors under
the framework of SPM. Different from SC, LRR considers the group effect among
data points instead of sparsity. Benefiting from this property, the proposed
method (i.e., LrrSPM) can offer a better performance. To further improve the
generalizability and robustness, we reformulate the rank-minimization problem
as a truncated projection problem. Extensive experimental studies show that
LrrSPM is more efficient than its counterparts (e.g., ScSPM) while achieving
competitive recognition rates on nine image data sets.Comment: accepted into knowledge based systems, 201
Automatic Classification of Human Epithelial Type 2 Cell Indirect Immunofluorescence Images using Cell Pyramid Matching
This paper describes a novel system for automatic classification of images
obtained from Anti-Nuclear Antibody (ANA) pathology tests on Human Epithelial
type 2 (HEp-2) cells using the Indirect Immunofluorescence (IIF) protocol. The
IIF protocol on HEp-2 cells has been the hallmark method to identify the
presence of ANAs, due to its high sensitivity and the large range of antigens
that can be detected. However, it suffers from numerous shortcomings, such as
being subjective as well as time and labour intensive. Computer Aided
Diagnostic (CAD) systems have been developed to address these problems, which
automatically classify a HEp-2 cell image into one of its known patterns (eg.
speckled, homogeneous). Most of the existing CAD systems use handpicked
features to represent a HEp-2 cell image, which may only work in limited
scenarios. We propose a novel automatic cell image classification method termed
Cell Pyramid Matching (CPM), which is comprised of regional histograms of
visual words coupled with the Multiple Kernel Learning framework. We present a
study of several variations of generating histograms and show the efficacy of
the system on two publicly available datasets: the ICPR HEp-2 cell
classification contest dataset and the SNPHEp-2 dataset.Comment: arXiv admin note: substantial text overlap with arXiv:1304.126
Image retrieval with hierarchical matching pursuit
A novel representation of images for image retrieval is introduced in this
paper, by using a new type of feature with remarkable discriminative power.
Despite the multi-scale nature of objects, most existing models perform feature
extraction on a fixed scale, which will inevitably degrade the performance of
the whole system. Motivated by this, we introduce a hierarchical sparse coding
architecture for image retrieval to explore multi-scale cues. Sparse codes
extracted on lower layers are transmitted to higher layers recursively. With
this mechanism, cues from different scales are fused. Experiments on the
Holidays dataset show that the proposed method achieves an excellent retrieval
performance with a small code length.Comment: 5 pages, 6 figures, conferenc
- …