13,024 research outputs found
Fast Low-rank Representation based Spatial Pyramid Matching for Image Classification
Spatial Pyramid Matching (SPM) and its variants have achieved a lot of
success in image classification. The main difference among them is their
encoding schemes. For example, ScSPM incorporates Sparse Code (SC) instead of
Vector Quantization (VQ) into the framework of SPM. Although the methods
achieve a higher recognition rate than the traditional SPM, they consume more
time to encode the local descriptors extracted from the image. In this paper,
we propose using Low Rank Representation (LRR) to encode the descriptors under
the framework of SPM. Different from SC, LRR considers the group effect among
data points instead of sparsity. Benefiting from this property, the proposed
method (i.e., LrrSPM) can offer a better performance. To further improve the
generalizability and robustness, we reformulate the rank-minimization problem
as a truncated projection problem. Extensive experimental studies show that
LrrSPM is more efficient than its counterparts (e.g., ScSPM) while achieving
competitive recognition rates on nine image data sets.Comment: accepted into knowledge based systems, 201
Unconstrained Scene Text and Video Text Recognition for Arabic Script
Building robust recognizers for Arabic has always been challenging. We
demonstrate the effectiveness of an end-to-end trainable CNN-RNN hybrid
architecture in recognizing Arabic text in videos and natural scenes. We
outperform previous state-of-the-art on two publicly available video text
datasets - ALIF and ACTIV. For the scene text recognition task, we introduce a
new Arabic scene text dataset and establish baseline results. For scripts like
Arabic, a major challenge in developing robust recognizers is the lack of large
quantity of annotated data. We overcome this by synthesising millions of Arabic
text images from a large vocabulary of Arabic words and phrases. Our
implementation is built on top of the model introduced here [37] which is
proven quite effective for English scene text recognition. The model follows a
segmentation-free, sequence to sequence transcription approach. The network
transcribes a sequence of convolutional features from the input image to a
sequence of target labels. This does away with the need for segmenting input
image into constituent characters/glyphs, which is often difficult for Arabic
script. Further, the ability of RNNs to model contextual dependencies yields
superior recognition results.Comment: 5 page
KCRC-LCD: Discriminative Kernel Collaborative Representation with Locality Constrained Dictionary for Visual Categorization
We consider the image classification problem via kernel collaborative
representation classification with locality constrained dictionary (KCRC-LCD).
Specifically, we propose a kernel collaborative representation classification
(KCRC) approach in which kernel method is used to improve the discrimination
ability of collaborative representation classification (CRC). We then measure
the similarities between the query and atoms in the global dictionary in order
to construct a locality constrained dictionary (LCD) for KCRC. In addition, we
discuss several similarity measure approaches in LCD and further present a
simple yet effective unified similarity measure whose superiority is validated
in experiments. There are several appealing aspects associated with LCD. First,
LCD can be nicely incorporated under the framework of KCRC. The LCD similarity
measure can be kernelized under KCRC, which theoretically links CRC and LCD
under the kernel method. Second, KCRC-LCD becomes more scalable to both the
training set size and the feature dimension. Example shows that KCRC is able to
perfectly classify data with certain distribution, while conventional CRC fails
completely. Comprehensive experiments on many public datasets also show that
KCRC-LCD is a robust discriminative classifier with both excellent performance
and good scalability, being comparable or outperforming many other
state-of-the-art approaches
Fast and Robust Archetypal Analysis for Representation Learning
We revisit a pioneer unsupervised learning technique called archetypal
analysis, which is related to successful data analysis methods such as sparse
coding and non-negative matrix factorization. Since it was proposed, archetypal
analysis did not gain a lot of popularity even though it produces more
interpretable models than other alternatives. Because no efficient
implementation has ever been made publicly available, its application to
important scientific problems may have been severely limited. Our goal is to
bring back into favour archetypal analysis. We propose a fast optimization
scheme using an active-set strategy, and provide an efficient open-source
implementation interfaced with Matlab, R, and Python. Then, we demonstrate the
usefulness of archetypal analysis for computer vision tasks, such as codebook
learning, signal classification, and large image collection visualization
- …