23,881 research outputs found
Taming Wild Faces: Web-Scale, Open-Universe Face Identification in Still and Video Imagery
With the increasing pervasiveness of digital cameras, the Internet, and social networking, there is a growing need to catalog and analyze large collections of photos and videos. In this dissertation, we explore unconstrained still-image and video-based face recognition in real-world scenarios, e.g. social photo sharing and movie trailers, where people of interest are recognized and all others are ignored. In such a scenario, we must obtain high precision in recognizing the known identities, while accurately rejecting those of no interest. Recent advancements in face recognition research has seen Sparse Representation-based Classification (SRC) advance to the forefront of competing methods. However, its drawbacks, slow speed and sensitivity to variations in pose, illumination, and occlusion, have hindered its wide-spread applicability. The contributions of this dissertation are three-fold: 1. For still-image data, we propose a novel Linearly Approximated Sparse Representation-based Classification (LASRC) algorithm that uses linear regression to perform sample selection for l1-minimization, thus harnessing the speed of least-squares and the robustness of SRC. On our large dataset collected from Facebook, LASRC performs equally to standard SRC with a speedup of 100-250x. 2. For video, applying the popular l1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive computationally, so we propose a new algorithm Mean Sequence SRC (MSSRC) that performs video face recognition using a joint optimization leveraging all of the available video data and employing the knowledge that the face track frames belong to the same individual. Employing MSSRC results in a speedup of 5x on average over SRC on a frame-by-frame basis. 3. Finally, we make the observation that MSSRC sometimes assigns inconsistent identities to the same individual in a scene that could be corrected based on their visual similarity. Therefore, we construct a probabilistic affinity graph combining appearance and co-occurrence similarities to model the relationship between face tracks in a video. Using this relationship graph, we employ random walk analysis to propagate strong class predictions among similar face tracks, while dampening weak predictions. Our method results in a performance gain of 15.8% in average precision over using MSSRC alone
Low-Rank Discriminative Least Squares Regression for Image Classification
Latest least squares regression (LSR) methods mainly try to learn slack
regression targets to replace strict zero-one labels. However, the difference
of intra-class targets can also be highlighted when enlarging the distance
between different classes, and roughly persuing relaxed targets may lead to the
problem of overfitting. To solve above problems, we propose a low-rank
discriminative least squares regression model (LRDLSR) for multi-class image
classification. Specifically, LRDLSR class-wisely imposes low-rank constraint
on the intra-class regression targets to encourage its compactness and
similarity. Moreover, LRDLSR introduces an additional regularization term on
the learned targets to avoid the problem of overfitting. These two improvements
are helpful to learn a more discriminative projection for regression and thus
achieving better classification performance. Experimental results over a range
of image databases demonstrate the effectiveness of the proposed LRDLSR method
Manifold Elastic Net: A Unified Framework for Sparse Dimension Reduction
It is difficult to find the optimal sparse solution of a manifold learning
based dimensionality reduction algorithm. The lasso or the elastic net
penalized manifold learning based dimensionality reduction is not directly a
lasso penalized least square problem and thus the least angle regression (LARS)
(Efron et al. \cite{LARS}), one of the most popular algorithms in sparse
learning, cannot be applied. Therefore, most current approaches take indirect
ways or have strict settings, which can be inconvenient for applications. In
this paper, we proposed the manifold elastic net or MEN for short. MEN
incorporates the merits of both the manifold learning based dimensionality
reduction and the sparse learning based dimensionality reduction. By using a
series of equivalent transformations, we show MEN is equivalent to the lasso
penalized least square problem and thus LARS is adopted to obtain the optimal
sparse solution of MEN. In particular, MEN has the following advantages for
subsequent classification: 1) the local geometry of samples is well preserved
for low dimensional data representation, 2) both the margin maximization and
the classification error minimization are considered for sparse projection
calculation, 3) the projection matrix of MEN improves the parsimony in
computation, 4) the elastic net penalty reduces the over-fitting problem, and
5) the projection matrix of MEN can be interpreted psychologically and
physiologically. Experimental evidence on face recognition over various popular
datasets suggests that MEN is superior to top level dimensionality reduction
algorithms.Comment: 33 pages, 12 figure
- …