Robust visual tracking via nonlocal regularized multi-view sparse representation
Multi-view sparse representation based visual tracking has attracted increasing attention because the sparse representations of different object features can complement each other. Since different object features are not equally robust in challenging video sequences, a multi-view sparse representation may contain unreliable features (features with low robustness). In this case, highlighting the useful information in unreliable features for proper multi-feature fusion becomes difficult. To solve this problem, we propose a multi-view discriminant sparse representation method for robust visual tracking, in which we first divide the multi-view observations into different groups, and then estimate the sparse representations of the multi-view group projections to calculate the observation likelihood. The advantages of the proposed sparse representation method are twofold: 1) It can properly fuse the observation groups with reliable and unreliable features by using an online updated discriminant matrix to explore the group similarity in multi-feature space. 2) It introduces a nonlocal regularizer to enforce spatial smoothness among the sparse representations of different group projections, which can enhance the robustness of the multi-view sparse representation. Experimental results show that our method can achieve a better tracking performance than state-of-the-art tracking methods d
KCRC-LCD: Discriminative Kernel Collaborative Representation with Locality Constrained Dictionary for Visual Categorization
We consider the image classification problem via kernel collaborative
representation classification with locality constrained dictionary (KCRC-LCD).
Specifically, we propose a kernel collaborative representation classification
(KCRC) approach in which the kernel method is used to improve the discrimination
ability of collaborative representation classification (CRC). We then measure
the similarities between the query and atoms in the global dictionary in order
to construct a locality constrained dictionary (LCD) for KCRC. In addition, we
discuss several similarity measure approaches in LCD and further present a
simple yet effective unified similarity measure whose superiority is validated
in experiments. There are several appealing aspects associated with LCD. First,
LCD can be nicely incorporated under the framework of KCRC. The LCD similarity
measure can be kernelized under KCRC, which theoretically links CRC and LCD
under the kernel method. Second, KCRC-LCD becomes more scalable to both the
training set size and the feature dimension. An example shows that KCRC is able to
perfectly classify data with a certain distribution, while conventional CRC fails
completely. Comprehensive experiments on many public datasets also show that
KCRC-LCD is a robust discriminative classifier with both excellent performance
and good scalability, being comparable to or outperforming many other
state-of-the-art approaches
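The coding and classification steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the RBF kernel, the parameter names (`lam`, `gamma`, `k_local`), and the use of the raw kernel value as the locality similarity measure are all assumptions made for the example. The query is coded over the dictionary by ridge regression in kernel feature space, and the class with the smallest feature-space reconstruction residual wins. Many CRC variants also normalize the residual by the norm of the class coefficients; that step is omitted here.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kcrc_predict(X_train, y_train, x_query, lam=1e-2, gamma=1.0, k_local=None):
    """Classify one query by kernel collaborative representation.

    If k_local is given, only the k_local training samples with the largest
    kernel value to the query are kept (an assumed, simple locality rule).
    """
    k_q = rbf_kernel(X_train, x_query[None, :], gamma)[:, 0]
    if k_local is not None:
        keep = np.argsort(-k_q)[:k_local]
        X_train, y_train, k_q = X_train[keep], y_train[keep], k_q[keep]

    K = rbf_kernel(X_train, X_train, gamma)
    # Ridge-regularized coding in feature space: (K + lam * I) alpha = k_q
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), k_q)

    best_label, best_res = None, np.inf
    for c in np.unique(y_train):
        m = y_train == c
        a_c = alpha[m]
        # ||phi(x) - Phi_c a_c||^2 = k(x, x) - 2 a_c^T k_c + a_c^T K_cc a_c,
        # and k(x, x) = 1 for the RBF kernel.
        res = 1.0 - 2.0 * a_c @ k_q[m] + a_c @ K[np.ix_(m, m)] @ a_c
        if res < best_res:
            best_label, best_res = c, res
    return best_label
```

Restricting the dictionary with `k_local` shrinks the linear system from the full training set to `k_local` unknowns, which is one way to read the scalability claim above.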
Online multi-modal robust non-negative dictionary learning for visual tracking
© 2015 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Dictionary learning is a method of acquiring a collection of atoms for subsequent signal representation. Due to its excellent representation ability, dictionary learning has been widely applied in multimedia and computer vision. However, conventional dictionary learning algorithms fail to deal with multi-modal datasets. In this paper, we propose an online multi-modal robust non-negative dictionary learning (OMRNDL) algorithm to overcome this deficiency. Notably, OMRNDL casts visual tracking as a dictionary learning problem under the particle filter framework and captures the intrinsic knowledge about the target from multiple visual modalities, e.g., pixel intensity and texture information. To this end, OMRNDL adaptively learns an individual dictionary, i.e., template, for each modality from available frames, and then represents new particles over all the learned dictionaries by minimizing the fitting loss of data based on M-estimation. The resultant representation coefficient can be viewed as the common semantic representation of particles across multiple modalities, and can be utilized to track the target. OMRNDL incrementally learns the dictionary and the coefficient of each particle by using multiplicative update rules to respectively guarantee their non-negativity constraints. Experimental results on a popular challenging video benchmark validate the effectiveness of OMRNDL for visual tracking in both quantity and quality
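The multiplicative update rules mentioned in this abstract can be illustrated with the classical batch rules for non-negative matrix factorization under a squared-error loss. This sketch is deliberately simpler than OMRNDL: it is neither online nor multi-modal, and it does not use an M-estimation loss. The function name and parameters are assumptions made for the example. It only shows why multiplicative updates preserve non-negativity: each factor is multiplied elementwise by a ratio of non-negative quantities.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) as W @ H with W, H >= 0.

    Uses the classical multiplicative updates for the Frobenius loss;
    eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # dictionary (columns are atoms)
    H = rng.random((rank, m)) + eps   # non-negative coefficients
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio,
        # so W and H never leave the non-negative orthant.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

In a tracking setting, the columns of `V` would be vectorized image observations and the columns of `W` would play the role of templates; that mapping is an interpretation of the abstract, not a detail it states.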
Adaptive Image Denoising by Targeted Databases
We propose a data-dependent denoising procedure to restore noisy images.
Different from existing denoising algorithms which search for patches from
either the noisy image or a generic database, the new algorithm finds patches
from a database that contains only relevant patches. We formulate the denoising
problem as an optimal filter design problem and make two contributions. First,
we determine the basis function of the denoising filter by solving a group
sparsity minimization problem. The optimization formulation generalizes
existing denoising algorithms and offers systematic analysis of the
performance. Improvement methods are proposed to enhance the patch search
process. Second, we determine the spectral coefficients of the denoising filter
by considering a localized Bayesian prior. The localized prior leverages the
similarity of the targeted database, alleviates the intensive Bayesian
computation, and links the new method to the classical linear minimum mean
squared error estimation. We demonstrate applications of the proposed method in
a variety of scenarios, including text images, multiview images and face
images. Experimental results show the superiority of the new algorithm over
existing methods. Comment: 15 pages, 13 figures, 2 tables, journa
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
graph spectral domain. In this article, we overview recent graph spectral
techniques in GSP specifically for image / video processing. The topics covered
include image compression, image restoration, image filtering and image
segmentation
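The idea of treating an image patch as a graph signal can be sketched as follows. The construction below is one common choice and is an assumption of this example, not something the abstract specifies: pixels are connected to their 4-neighbours, edge weights decay with the intensity difference (a Gaussian with hypothetical parameter `sigma`), and the eigenvectors of the combinatorial Laplacian serve as the graph Fourier basis. Keeping only the lowest graph frequencies gives a simple low-pass filter.

```python
import numpy as np

def grid_graph_laplacian(patch, sigma=0.1):
    """Combinatorial Laplacian L = D - W of a 4-connected pixel graph.

    Edge weights are exp(-(intensity difference)^2 / (2 sigma^2)), so edges
    across strong intensity changes get small weights.
    """
    h, w = patch.shape
    n = h * w
    W = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            for dr, dc in ((0, 1), (1, 0)):      # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    j = rr * w + cc
                    diff = patch[r, c] - patch[rr, cc]
                    W[i, j] = W[j, i] = np.exp(-diff ** 2 / (2 * sigma ** 2))
    return np.diag(W.sum(axis=1)) - W

def graph_lowpass(patch, keep, sigma=0.1):
    """Keep the `keep` lowest graph-frequency components of the patch."""
    L = grid_graph_laplacian(patch, sigma)
    _, evecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    coeffs = evecs.T @ patch.ravel()      # graph Fourier transform
    coeffs[keep:] = 0.0                   # discard high graph frequencies
    return (evecs @ coeffs).reshape(patch.shape)
```

With `keep` equal to the number of pixels the patch is returned unchanged, and with `keep=1` on a connected graph the result is the constant patch at the mean intensity, since the lowest-frequency eigenvector is constant.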
How to compare noisy patches? Patch similarity beyond Gaussian noise
Many tasks in computer vision require matching image parts. While higher-level methods consider image features such as edges or robust descriptors, low-level approaches (so-called image-based) compare groups of pixels (patches) and provide dense matching. Patch similarity is a key ingredient in many techniques for image registration, stereo-vision, change detection or denoising. Recent progress in natural image modeling also makes intensive use of patch comparison. A fundamental difficulty when comparing two patches from "real" data is to decide whether the differences should be ascribed to noise or intrinsic dissimilarity. The Gaussian noise assumption leads to the classical definition of patch similarity based on the squared differences of intensities. For the case where noise departs from the Gaussian distribution, several similarity criteria have been proposed in the literature of image processing, detection theory and machine learning. By expressing patch (dis)similarity as a detection test under a given noise model, we introduce these criteria with a new one and discuss their properties. We then assess their performance for different tasks: patch discrimination, image denoising, stereo-matching and motion-tracking under gamma and Poisson noises. The proposed criterion based on the generalized likelihood ratio is shown to be both easy to derive and powerful in these diverse applications
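The contrast between the Gaussian criterion and a generalized likelihood ratio (GLR) criterion can be sketched for Poisson noise. The formula below is derived here from the standard GLR definition and may differ from the paper's exact normalization: for two counts a and b, the test compares the hypothesis of a shared intensity (whose maximum likelihood estimate is (a + b) / 2) against independent intensities, giving -log GLR = a log a + b log b - (a + b) log((a + b) / 2). The function names are hypothetical, and patches are taken as flat sequences of non-negative counts.

```python
import math

def _xlogx(k):
    """k * log(k) with the convention 0 * log(0) = 0."""
    return k * math.log(k) if k > 0 else 0.0

def gaussian_dissimilarity(p, q):
    """Classical criterion: sum of squared intensity differences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def poisson_glr_dissimilarity(p, q):
    """Negative log generalized likelihood ratio under Poisson noise.

    Zero when the patches are identical, positive otherwise.
    """
    total = 0.0
    for a, b in zip(p, q):
        s = a + b
        shared = s * math.log(s / 2.0) if s > 0 else 0.0
        total += _xlogx(a) + _xlogx(b) - shared
    return total
```

Because Poisson variance grows with intensity, the GLR criterion penalizes a difference of 10 counts less between bright pixels (100 versus 110) than between dark ones (10 versus 20), whereas the squared difference treats both cases identically.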