15,119 research outputs found
Linear Spatial Pyramid Matching Using Non-convex and non-negative Sparse Coding for Image Classification
Recently sparse coding have been highly successful in image classification
mainly due to its capability of incorporating the sparsity of image
representation. In this paper, we propose an improved sparse coding model based
on linear spatial pyramid matching(SPM) and Scale Invariant Feature Transform
(SIFT ) descriptors. The novelty is the simultaneous non-convex and
non-negative characters added to the sparse coding model. Our numerical
experiments show that the improved approach using non-convex and non-negative
sparse coding is superior than the original ScSPM[1] on several typical
databases
Fast Low-rank Representation based Spatial Pyramid Matching for Image Classification
Spatial Pyramid Matching (SPM) and its variants have achieved a lot of
success in image classification. The main difference among them is their
encoding schemes. For example, ScSPM incorporates Sparse Code (SC) instead of
Vector Quantization (VQ) into the framework of SPM. Although the methods
achieve a higher recognition rate than the traditional SPM, they consume more
time to encode the local descriptors extracted from the image. In this paper,
we propose using Low Rank Representation (LRR) to encode the descriptors under
the framework of SPM. Different from SC, LRR considers the group effect among
data points instead of sparsity. Benefiting from this property, the proposed
method (i.e., LrrSPM) can offer a better performance. To further improve the
generalizability and robustness, we reformulate the rank-minimization problem
as a truncated projection problem. Extensive experimental studies show that
LrrSPM is more efficient than its counterparts (e.g., ScSPM) while achieving
competitive recognition rates on nine image data sets.Comment: accepted into knowledge based systems, 201
Temporal Extension of Scale Pyramid and Spatial Pyramid Matching for Action Recognition
Historically, researchers in the field have spent a great deal of effort to
create image representations that have scale invariance and retain spatial
location information. This paper proposes to encode equivalent temporal
characteristics in video representations for action recognition. To achieve
temporal scale invariance, we develop a method called temporal scale pyramid
(TSP). To encode temporal information, we present and compare two methods
called temporal extension descriptor (TED) and temporal division pyramid (TDP)
. Our purpose is to suggest solutions for matching complex actions that have
large variation in velocity and appearance, which is missing from most current
action representations. The experimental results on four benchmark datasets,
UCF50, HMDB51, Hollywood2 and Olympic Sports, support our approach and
significantly outperform state-of-the-art methods. Most noticeably, we achieve
65.0% mean accuracy and 68.2% mean average precision on the challenging HMDB51
and Hollywood2 datasets which constitutes an absolute improvement over the
state-of-the-art by 7.8% and 3.9%, respectively
Image retrieval with hierarchical matching pursuit
A novel representation of images for image retrieval is introduced in this
paper, by using a new type of feature with remarkable discriminative power.
Despite the multi-scale nature of objects, most existing models perform feature
extraction on a fixed scale, which will inevitably degrade the performance of
the whole system. Motivated by this, we introduce a hierarchical sparse coding
architecture for image retrieval to explore multi-scale cues. Sparse codes
extracted on lower layers are transmitted to higher layers recursively. With
this mechanism, cues from different scales are fused. Experiments on the
Holidays dataset show that the proposed method achieves an excellent retrieval
performance with a small code length.Comment: 5 pages, 6 figures, conferenc
- …