Sparse representations of image gradient orientations for visual recognition and tracking
Recent results [18] have shown that sparse linear representations of a query object with respect to an overcomplete basis formed by the entire gallery of objects of interest can result in powerful image-based object recognition schemes. In this paper, we propose a framework for visual recognition and tracking based on sparse representations of image gradient orientations. We show that minimal ℓ1 solutions to problems formulated with gradient orientations can be used for fast and robust object recognition even for probe objects corrupted by outliers. These solutions are obtained without the need for solving the extended problem considered in [18]. We further show that low-dimensional embeddings generated from gradient orientations perform equally well even when probe objects are corrupted by outliers, which, in turn, results in huge computational savings. We demonstrate experimentally that, compared to the baseline method in [18], our formulation results in better recognition rates without the need for block processing and even with a smaller number of training samples. Finally, based on our results, we also propose a robust and efficient ℓ1-based “tracking by detection” algorithm. We show experimentally that our tracker outperforms a recently proposed ℓ1-based tracking algorithm in terms of robustness, accuracy and speed.
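As a rough illustration of what a representation built from gradient orientations could look like, the sketch below maps each pixel's gradient orientation to a (cos, sin) pair and compares two images by the inner product of the resulting vectors. The embedding, the function name `orientation_features` and the random test images are assumptions made for illustration; the abstract does not specify the authors' exact formulation, and the ℓ1 minimization step is not shown.

```python
import numpy as np

def orientation_features(image):
    """Map an image to a unit-norm vector built from its gradient orientations.

    Hypothetical helper: the cos/sin embedding is an assumption for this sketch.
    """
    # np.gradient returns derivatives along axis 0 (rows) and axis 1 (columns).
    gy, gx = np.gradient(image.astype(float))
    phi = np.arctan2(gy, gx)  # per-pixel gradient orientation
    n = phi.size
    # cos^2 + sin^2 = 1 at every pixel, so dividing by sqrt(n) gives unit norm.
    return np.concatenate([np.cos(phi).ravel(), np.sin(phi).ravel()]) / np.sqrt(n)

rng = np.random.default_rng(0)
img = rng.random((16, 16))    # synthetic stand-in for a gallery image
other = rng.random((16, 16))  # synthetic stand-in for an unrelated image

f_img = orientation_features(img)
f_other = orientation_features(other)

# The inner product equals the mean cosine of the orientation differences.
self_sim = float(f_img @ f_img)
cross_sim = float(f_img @ f_other)
print(self_sim, cross_sim)
```

With this embedding, the similarity of an image with itself is 1, while unrelated orientation patterns tend to average toward 0, which gives one intuition for why such features can tolerate outliers.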
Cross-Paced Representation Learning with Partial Curricula for Sketch-based Image Retrieval
In this paper we address the problem of learning robust cross-domain
representations for sketch-based image retrieval (SBIR). While most SBIR
approaches focus on extracting low- and mid-level descriptors for direct
feature matching, recent works have shown the benefit of learning coupled
feature representations to describe data from two related sources. However,
cross-domain representation learning methods are typically cast into non-convex
minimization problems that are difficult to optimize, leading to unsatisfactory
performance. Inspired by self-paced learning, a learning methodology designed
to overcome convergence issues related to local optima by exploiting the
samples in a meaningful order (i.e. easy to hard), we introduce the cross-paced
partial curriculum learning (CPPCL) framework. Compared with existing
self-paced learning methods which only consider a single modality and cannot
deal with prior knowledge, CPPCL is specifically designed to assess the
learning pace by jointly handling data from dual sources and modality-specific
prior information provided in the form of partial curricula. Additionally,
thanks to the learned dictionaries, we demonstrate that the proposed CPPCL
embeds robust coupled representations for SBIR. Our approach is extensively
evaluated on four publicly available datasets (i.e. CUFS, Flickr15K, QueenMary
SBIR and TU-Berlin Extension datasets), showing superior performance over
competing SBIR methods.
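The easy-to-hard ordering behind self-paced learning can be sketched with the common hard-threshold rule: a sample takes part in the current round only if its loss is below a threshold, and the threshold is raised over time. This is a generic single-modality illustration, not the CPPCL formulation; the function name `self_paced_weights`, the loss values and the thresholds are all hypothetical.

```python
import numpy as np

def self_paced_weights(losses, threshold):
    """Return 1.0 for samples whose loss is below the threshold, else 0.0.

    Hypothetical helper implementing a hard-threshold self-paced rule.
    """
    return (np.asarray(losses) < threshold).astype(float)

# Made-up per-sample losses: low values stand for "easy" samples.
losses = np.array([0.1, 0.9, 0.4, 2.0])

# Raising the threshold gradually admits harder samples into training.
for threshold in (0.5, 1.0, 3.0):
    print(threshold, self_paced_weights(losses, threshold))
```

In a full training loop, the model would be refit on the selected samples after each threshold increase, so the losses themselves would change between rounds.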
Rotation-invariant features for multi-oriented text detection in natural images
Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human-computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes.