40,524 research outputs found
Robust Face Representation and Recognition Under Low Resolution and Difficult Lighting Conditions
This dissertation focuses on different aspects of face image analysis for accurate face recognition under low resolution and poor lighting conditions. A novel resolution enhancement technique is proposed for enhancing a low resolution face image into a high resolution image for better visualization and improved feature extraction, especially in a video surveillance environment. This method performs kernel regression and component feature learning in local neighborhood of the face images. It uses directional Fourier phase feature component to adaptively lean the regression kernel based on local covariance to estimate the high resolution image. For each patch in the neighborhood, four directional variances are estimated to adapt the interpolated pixels. A Modified Local Binary Pattern (MLBP) methodology for feature extraction is proposed to obtain robust face recognition under varying lighting conditions. Original LBP operator compares pixels in a local neighborhood with the center pixel and converts the resultant binary string to 8-bit integer value. So, it is less effective under difficult lighting conditions where variation between pixels is negligible. The proposed MLBP uses a two stage encoding procedure which is more robust in detecting this variation in a local patch. A novel dimensionality reduction technique called Marginality Preserving Embedding (MPE) is also proposed for enhancing the face recognition accuracy. Unlike Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), which project data in a global sense, MPE seeks for a local structure in the manifold. This is similar to other subspace learning techniques but the difference with other manifold learning is that MPE preserves marginality in local reconstruction. Hence it provides better representation in low dimensional space and achieves lower error rates in face recognition. Two new concepts for robust face recognition are also presented in this dissertation. In the first approach, a neural network is used for training the system where input vectors are created by measuring distance from each input to its class mean. In the second approach, half-face symmetry is used, realizing the fact that the face images may contain various expressions such as open/close eye, open/close mouth etc., and classify the top half and bottom half separately and finally fuse the two results. By performing experiments on several standard face datasets, improved results were observed in all the new proposed methodologies. Research is progressing in developing a unified approach for the extraction of features suitable for accurate face recognition in a long range video sequence in complex environments
Linear subspace methods in face recognition
Despite over 30 years of research, face recognition is still one of the most difficult problems in the field of Computer Vision. The challenge comes from many factors affecting the performance of a face recognition system: noisy input, training data collection, speed-accuracy trade-off, variations in expression, illumination, pose, or ageing. Although relatively successful attempts have been made for special cases, such as frontal faces, no satisfactory methods exist that work under completely unconstrained conditions. This thesis proposes solutions to three important problems: lack of training data, speed-accuracy requirement, and unconstrained environments.
The problem of lacking training data has been solved in the worst case: single sample per person. Whitened Principal Component Analysis is proposed as a simple but effective solution. Whitened PCA performs consistently well on multiple face datasets.
Speed-accuracy trade-off problem is the second focus of this thesis. Two solutions are proposed to tackle this problem. The first solution is a new feature extraction method called Compact Binary Patterns which is about three times faster than Local Binary Patterns. The second solution is a multi-patch classifier which performs much better than a single classifier without compromising speed.
Two metric learning methods are introduced to solve the problem of unconstrained face recognition. The first method called Indirect Neighourhood Component Analysis combines the best ideas from Neighourhood Component Analysis and One-shot learning. The second method, Cosine Similarity Metric Learning, uses Cosine Similarity instead of the more popular Euclidean distance to form the objective function in the learning process. This Cosine Similarity Metric Learning method produces the best result in the literature on the state-of-the-art face dataset: the Labelled Faces in the Wild dataset.
Finally, a full face verification system based on our real experience taking part in ICPR 2010 Face Verification contest is described. Many practical points are discussed
Manifold Elastic Net: A Unified Framework for Sparse Dimension Reduction
It is difficult to find the optimal sparse solution of a manifold learning
based dimensionality reduction algorithm. The lasso or the elastic net
penalized manifold learning based dimensionality reduction is not directly a
lasso penalized least square problem and thus the least angle regression (LARS)
(Efron et al. \cite{LARS}), one of the most popular algorithms in sparse
learning, cannot be applied. Therefore, most current approaches take indirect
ways or have strict settings, which can be inconvenient for applications. In
this paper, we proposed the manifold elastic net or MEN for short. MEN
incorporates the merits of both the manifold learning based dimensionality
reduction and the sparse learning based dimensionality reduction. By using a
series of equivalent transformations, we show MEN is equivalent to the lasso
penalized least square problem and thus LARS is adopted to obtain the optimal
sparse solution of MEN. In particular, MEN has the following advantages for
subsequent classification: 1) the local geometry of samples is well preserved
for low dimensional data representation, 2) both the margin maximization and
the classification error minimization are considered for sparse projection
calculation, 3) the projection matrix of MEN improves the parsimony in
computation, 4) the elastic net penalty reduces the over-fitting problem, and
5) the projection matrix of MEN can be interpreted psychologically and
physiologically. Experimental evidence on face recognition over various popular
datasets suggests that MEN is superior to top level dimensionality reduction
algorithms.Comment: 33 pages, 12 figure
Hyperspectral colon tissue cell classification
A novel algorithm to discriminate between normal and malignant tissue cells of the human colon is presented. The microscopic level images of human colon tissue cells were acquired using hyperspectral imaging technology at contiguous wavelength intervals of visible light. While hyperspectral imagery data provides a wealth of information, its large size normally means high computational processing complexity. Several methods exist to avoid the so-called curse of dimensionality and hence reduce the computational complexity. In this study, we experimented with Principal Component Analysis (PCA) and two modifications of Independent Component Analysis (ICA). In the first stage of the algorithm, the extracted components are used to separate four constituent parts of the colon tissue: nuclei, cytoplasm, lamina propria, and lumen. The segmentation is performed in an unsupervised fashion using the nearest centroid clustering algorithm. The segmented image is further used, in the second stage of the classification algorithm, to exploit the spatial relationship between the labeled constituent parts. Experimental results using supervised Support Vector Machines (SVM) classification based on multiscale morphological features reveal the discrimination between normal and malignant tissue cells with a reasonable degree of accuracy
View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation
The primate brain contains a hierarchy of visual areas, dubbed the ventral
stream, which rapidly computes object representations that are both specific
for object identity and relatively robust against identity-preserving
transformations like depth-rotations. Current computational models of object
recognition, including recent deep learning networks, generate these properties
through a hierarchy of alternating selectivity-increasing filtering and
tolerance-increasing pooling operations, similar to simple-complex cells
operations. While simulations of these models recapitulate the ventral stream's
progression from early view-specific to late view-tolerant representations,
they fail to generate the most salient property of the intermediate
representation for faces found in the brain: mirror-symmetric tuning of the
neural population to head orientation. Here we prove that a class of
hierarchical architectures and a broad set of biologically plausible learning
rules can provide approximate invariance at the top level of the network. While
most of the learning rules do not yield mirror-symmetry in the mid-level
representations, we characterize a specific biologically-plausible Hebb-type
learning rule that is guaranteed to generate mirror-symmetric tuning to faces
tuning at intermediate levels of the architecture
DCTNet : A Simple Learning-free Approach for Face Recognition
PCANet was proposed as a lightweight deep learning network that mainly
leverages Principal Component Analysis (PCA) to learn multistage filter banks
followed by binarization and block-wise histograming. PCANet was shown worked
surprisingly well in various image classification tasks. However, PCANet is
data-dependence hence inflexible. In this paper, we proposed a
data-independence network, dubbed DCTNet for face recognition in which we adopt
Discrete Cosine Transform (DCT) as filter banks in place of PCA. This is
motivated by the fact that 2D DCT basis is indeed a good approximation for high
ranked eigenvectors of PCA. Both 2D DCT and PCA resemble a kind of modulated
sine-wave patterns, which can be perceived as a bandpass filter bank. DCTNet is
free from learning as 2D DCT bases can be computed in advance. Besides that, we
also proposed an effective method to regulate the block-wise histogram feature
vector of DCTNet for robustness. It is shown to provide surprising performance
boost when the probe image is considerably different in appearance from the
gallery image. We evaluate the performance of DCTNet extensively on a number of
benchmark face databases and being able to achieve on par with or often better
accuracy performance than PCANet.Comment: APSIPA ASC 201
- …