Learning Robust Representations for Computer Vision
Unsupervised learning techniques in computer vision often require learning
latent representations, such as low-dimensional linear and non-linear
subspaces. Noise and outliers in the data can frustrate these approaches by
obscuring the latent spaces.
Our main goal is deeper understanding and new development of robust
approaches for representation learning. We provide a new interpretation for
existing robust approaches and present two specific contributions: a new robust
PCA approach, which can separate foreground features from dynamic background,
and a novel robust spectral clustering method that can cluster facial images
with high accuracy. Both contributions show superior performance to standard
methods on real-world test sets.
Comment: 8 pages, 7 page
Kernelized Low Rank Representation on Grassmann Manifolds
Low rank representation (LRR) has recently attracted great interest due to
its pleasing efficacy in exploring low-dimensional subspace structures embedded
in data. One of its successful applications is subspace clustering, in which
data are clustered according to the subspaces they belong to. In this paper, at
a higher level, we intend to cluster subspaces into classes of subspaces. This
is naturally described as a clustering problem on the Grassmann manifold. The
novelty of this paper is to generalize LRR from Euclidean space to an LRR model
on the Grassmann manifold in a uniform kernelized framework. The new methods have
many applications in computer vision tasks. Several clustering experiments are
conducted on handwritten digit images, dynamic textures, human face clips and
traffic scene sequences. The experimental results show that the proposed
methods outperform a number of state-of-the-art subspace clustering methods.
Comment: 13 page
Modeling of Facial Aging and Kinship: A Survey
Computational facial models that capture properties of facial cues related to
aging and kinship increasingly attract the attention of the research community,
enabling the development of reliable methods for age progression, age
estimation, age-invariant facial characterization, and kinship verification
from visual data. In this paper, we review recent advances in modeling of
facial aging and kinship. In particular, we provide an up-to-date, complete
list of available annotated datasets and an in-depth analysis of geometric,
hand-crafted, and learned facial representations that are used for facial aging
and kinship characterization. Moreover, evaluation protocols and metrics are
reviewed and notable experimental results for each surveyed task are analyzed.
This survey allows us to identify challenges and discuss future research
directions for the development of robust facial models in real-world
conditions.
Visual Tracking via Dynamic Graph Learning
Existing visual tracking methods usually localize a target object with a
bounding box, in which the performance of the foreground object trackers or
detectors is often affected by the inclusion of background clutter. To handle
this problem, we learn a patch-based graph representation for visual tracking.
The tracked object is modeled with a graph that takes a set of
non-overlapping image patches as nodes, in which the weight of each node
indicates how likely it is to belong to the foreground and edges are weighted to
indicate the appearance compatibility of two neighboring nodes. This graph is
dynamically learned and applied in object tracking and model updating. During
the tracking process, the proposed algorithm performs three main steps in each
frame. First, the graph is initialized by assigning binary weights to some
image patches to indicate the object and background patches according to the
predicted bounding box. Second, the graph is optimized to refine the patch
weights by using a novel alternating direction method of multipliers. Third,
the object feature representation is updated by imposing the weights of patches
on the extracted image features. The object location is predicted by maximizing
the classification score in the structured support vector machine. Extensive
experiments show that the proposed tracking algorithm performs well against the
state-of-the-art methods on large-scale benchmark datasets.
Comment: Submitted to TPAMI 201
Recognizing Partial Biometric Patterns
Biometric recognition on partially captured targets is challenging because only
a few partial observations of objects are available for matching. In this
area, deep learning based methods are widely applied to match partially
captured objects caused by occlusions, posture variations, or subjects that are
partly out of view in person re-identification and partial face recognition.
However, most current methods cannot identify an individual when some
parts of the object are unobtainable, while the rest are specialized to
certain constrained scenarios. To this end, we propose a robust general
framework for arbitrary biometric matching scenarios without the limitations of
alignment as well as the size of inputs. We introduce a feature post-processing
step to handle the feature maps from FCN and a dictionary learning based
Spatial Feature Reconstruction (SFR) to match different sized feature maps in
this work. Moreover, the batch hard triplet loss function is applied to
optimize the model. The applicability and effectiveness of the proposed method
are demonstrated by the results from experiments on three person
re-identification datasets (Market1501, CUHK03, DukeMTMC-reID), two partial
person datasets (Partial REID and Partial iLIDS) and two partial face datasets
(CASIA-NIR-Distance and Partial LFW), on which it achieves state-of-the-art
performance compared with several leading approaches. The code is
released online and can be found on the website:
https://github.com/lingxiao-he/Partial-Person-ReID.
Comment: 13 pages, 11 figure
Low Rank Representation on Grassmann Manifolds: An Extrinsic Perspective
Many computer vision algorithms employ subspace models to represent data.
Low-rank representation (LRR) has been successfully applied in subspace
clustering, in which data are clustered according to their subspace structures.
The possibility of extending LRR to the Grassmann manifold is explored in this
paper. Rather than directly embedding the Grassmann manifold into a symmetric
matrix space, an extrinsic view is taken by building the self-representation of
LRR over the tangent space of each Grassmannian point. A new algorithm for
solving the proposed Grassmannian LRR model is designed and implemented.
Several clustering experiments are conducted on a handwritten digits dataset,
dynamic texture video clips and YouTube celebrity face video data. The
experimental results show our method outperforms a number of existing methods.
Comment: 9 page
Low-Rank Modeling and Its Applications in Image Analysis
Low-rank modeling generally refers to a class of methods that solve problems
by representing variables of interest as low-rank matrices. It has achieved
great success in various fields including computer vision, data mining, signal
processing and bioinformatics. Recently, much progress has been made in
theories, algorithms and applications of low-rank modeling, such as exact
low-rank matrix recovery via convex programming and matrix completion applied
to collaborative filtering. These advances have brought increasing
attention to this topic. In this paper, we review recent advances in
low-rank modeling, the state-of-the-art algorithms, and related applications in
image analysis. We first give an overview of the concept of low-rank modeling
and the challenging problems in this area. Then, we summarize the models and
algorithms for low-rank matrix recovery and illustrate their advantages and
limitations with numerical experiments. Next, we introduce a few applications
of low-rank modeling in the context of image analysis. Finally, we conclude
this paper with some discussion.
Comment: To appear in ACM Computing Survey
Leveraging the Power of Gabor Phase for Face Identification: A Block Matching Approach
Face identification is much more demanding than face verification.
To reach comparable performance, an identifier needs to be roughly N times
better than a verifier. To expect a breakthrough in face identification, we
need a fresh look at the fundamental building blocks of face recognition. In
this paper we focus on the selection of a suitable signal representation and
better matching strategy for face identification. We demonstrate how Gabor
phase could be leveraged to improve the performance of face identification by
using the Block Matching method. Compared to the existing approaches, the
proposed method features much lower algorithmic complexity: face images are
only filtered by a single-scale Gabor filter pair and the matching is performed
between any pairs of face images at hand without involving any training
process. Benchmark evaluations show that the proposed approach is fully
comparable to, and even better than, state-of-the-art algorithms, which are
typically based on more features extracted from a large set of Gabor faces
and/or rely on heavy training processes.
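The "roughly N times better" claim in the abstract above can be illustrated with a little arithmetic. The sketch below rests on assumptions that the abstract does not state: N is read as the number of gallery identities compared against one probe, each comparison is treated as an independent trial, and `fmr` (a hypothetical name) is the verifier's per-comparison false match rate. Under those assumptions the chance of at least one false match grows almost linearly with N, so the per-comparison error would need to shrink by about a factor of N to keep the overall error fixed.

```python
def identification_error(fmr, gallery_size):
    """Probability of at least one false match over `gallery_size`
    independent comparisons, each with false match rate `fmr`.
    The independence of comparisons is an illustrative assumption."""
    return 1.0 - (1.0 - fmr) ** gallery_size


if __name__ == "__main__":
    fmr = 1e-4  # hypothetical verifier error rate, chosen for illustration
    for n in (1, 10, 100, 1000):
        exact = identification_error(fmr, n)
        linear = n * fmr  # first-order approximation, valid while n * fmr is small
        print(f"N={n:5d}  exact={exact:.6f}  approx N*fmr={linear:.6f}")
```

The linear approximation is only meaningful while N times the per-comparison rate stays well below 1; for large galleries the exact expression saturates toward 1.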
A survey of sparse representation: algorithms and applications
Sparse representation has attracted much attention from researchers in the
fields of signal processing, image processing, computer vision and pattern
recognition. Sparse representation also has a good reputation in both
theoretical research and practical applications. Many different algorithms have
been proposed for sparse representation. The main purpose of this article is to
provide a comprehensive study and an updated review on sparse representation
and to provide guidance for researchers. The taxonomy of sparse representation
methods can be studied from various viewpoints. For example, in terms of
different norm minimizations used in sparsity constraints, the methods can be
roughly categorized into five groups: sparse representation with l0-norm
minimization, sparse representation with lp-norm (0 < p < 1) minimization,
sparse representation with l1-norm minimization and sparse representation
with l2,1-norm minimization. In this paper, a comprehensive overview of
sparse representation is provided. The available sparse representation
algorithms can also be empirically categorized into four groups: greedy
strategy approximation, constrained optimization, proximity algorithm-based
optimization, and homotopy algorithm-based sparse representation. The
rationales of different algorithms in each category are analyzed and a wide
range of sparse representation applications are summarized, which could
sufficiently reveal the potential nature of the sparse representation theory.
Specifically, an experimental comparative study of these sparse
representation algorithms is presented. The Matlab code used in this paper is
available at: http://www.yongxu.org/lunwen.html.
Comment: Published on IEEE Access, Vol. 3, pp. 490-530, 201
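As a rough illustration of the l1-norm minimization and proximity-algorithm categories named in the abstract above, the following is a minimal sketch of iterative soft-thresholding (ISTA) in plain Python. It is not the survey's Matlab code. The function names, the toy dictionary `A`, and the values of `lam` and `step` are assumptions chosen for the example, and the step size is assumed to be no larger than the reciprocal of the largest eigenvalue of A^T A.

```python
def soft_threshold(v, t):
    """Elementwise proximal operator of t * ||.||_1:
    shrink each entry toward zero by t, clipping at zero."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def ista(A, y, lam, step, iters=200):
    """Approximately minimize 0.5 * ||A x - y||^2 + lam * ||x||_1.
    A is a list of rows; step is assumed to satisfy step <= 1 / L,
    where L is the largest eigenvalue of A^T A."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - y
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient of the smooth term: g = A^T r
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by the l1 proximal step
        x = soft_threshold([x[j] - step * g[j] for j in range(n)], step * lam)
    return x


if __name__ == "__main__":
    # Toy problem with an identity dictionary (illustrative only).
    A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    y = [3.0, 0.05, -2.0]
    print(ista(A, y, lam=0.1, step=1.0))
```

With an identity dictionary the solution is simply the soft-thresholded signal, so the small middle entry is driven to zero while the large entries are shrunk by `lam`. A realistic dictionary would be overcomplete, and a practical implementation would use vectorized linear algebra.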
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki
Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers
on computer vision, pattern recognition, and related fields. For this
particular review, we focused on reading all 602 conference papers
presented at CVPR2015, the premier annual computer vision event held in
June 2015, in order to grasp the trends in the field. Further, we propose
"DeepSurvey" as a mechanism embodying the entire process, from reading
through all the papers and generating ideas to writing the paper.
Comment: Survey Pape