Analysis Dictionary Learning: An Efficient and Discriminative Solution
Discriminative Dictionary Learning (DL) methods have been widely advocated
for image classification problems. To further sharpen their discriminative
capabilities, most state-of-the-art DL methods have additional constraints
included in the learning stages. These various constraints, however, lead to
additional computational complexity. We therefore propose an efficient
Discriminative Convolutional Analysis Dictionary Learning (DCADL) method as a
lower-cost discriminative DL framework that both characterizes image
structures and refines interclass structure representations. The proposed
DCADL jointly learns a convolutional analysis dictionary and a universal
classifier, greatly reducing time complexity in both the training and testing
phases while achieving competitive accuracy in experiments on standard
databases.

Comment: ICASSP 201
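The analysis-dictionary classification pipeline described above can be sketched minimally: apply a learned analysis operator to the signal, sparsify the coefficients, then score with a linear classifier. All dimensions, the soft-thresholding step, and the random matrices below are illustrative assumptions, not the paper's exact learned model:

```python
import numpy as np

def soft_threshold(z, lam):
    # Elementwise soft-thresholding, a common sparsifying step
    # applied to analysis coefficients (illustrative choice).
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def classify(x, Omega, W, lam=0.1):
    """Score a signal x with an analysis dictionary Omega and a
    linear classifier W: argmax over W @ T(Omega @ x)."""
    coeffs = soft_threshold(Omega @ x, lam)
    return int(np.argmax(W @ coeffs))

# Toy dimensions (hypothetical): 64-dim signals, 128 analysis atoms, 10 classes.
# In the paper, Omega is convolutional and W is learned jointly with it;
# here both are random placeholders just to show the classification path.
rng = np.random.default_rng(0)
Omega = rng.standard_normal((128, 64))
W = rng.standard_normal((10, 128))
x = rng.standard_normal(64)
label = classify(x, Omega, W)
```

Test-time cost is just two matrix-vector products and a thresholding pass, which is the source of the low testing complexity the abstract claims.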
Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues
Recognizing scene text is a challenging problem, even more so than the
recognition of scanned documents. This problem has gained significant attention
from the computer vision community in recent years, and several methods based
on energy minimization frameworks and deep learning approaches have been
proposed. In this work, we focus on the energy minimization framework and
propose a model that exploits both bottom-up and top-down cues for recognizing
cropped words extracted from street images. The bottom-up cues are derived from
individual character detections from an image. We build a conditional random
field model on these detections to jointly model the strength of the detections
and the interactions between them. These interactions are top-down cues
obtained from a lexicon-based prior, i.e., language statistics. The optimal
word represented by the text image is obtained by minimizing the energy
function corresponding to the random field model. We evaluate our proposed
algorithm extensively on several cropped scene text benchmarks, namely the
Street View Text, ICDAR 2003, ICDAR 2011, ICDAR 2013, and IIIT 5K-word
datasets, and show better performance than comparable methods. We perform a rigorous
analysis of all the steps in our approach and analyze the results. We also show
that state-of-the-art convolutional neural network features can be integrated
in our framework to further improve the recognition performance.