836 research outputs found

    Sparse Modeling for Image and Vision Processing

    In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, or computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts. Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision.
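As a rough illustration of the sparse coding idea in the abstract above (representing a signal with a few dictionary elements), the following sketch solves an l1-regularized least-squares problem with iterative soft thresholding (ISTA). The random dictionary `D`, the penalty `lam`, and the function names are assumptions made for this example; the monograph itself covers many formulations and learns the dictionary from data, which this sketch does not do.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink every entry of v toward zero by t (proximal operator of the l1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code_ista(D, x, lam=0.1, n_iter=500):
    """Approximately minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 over a."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)            # gradient of the quadratic data term
        a = soft_threshold(a - step * grad, step * lam)
    return a

# Hypothetical data: a random unit-norm dictionary and a signal built from 3 atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(50)
a_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
x = D @ a_true

a_hat = sparse_code_ista(D, x, lam=0.05, n_iter=2000)
print("non-zero coefficients:", np.count_nonzero(a_hat))
print("residual norm:", np.linalg.norm(x - D @ a_hat))
```

Larger values of `lam` give sparser codes at the cost of a larger reconstruction residual.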

    Fast ICA for Blind Source Separation and its Implementation

    Independent Component Analysis (ICA) is a statistical signal processing technique with emerging practical applications, such as blind separation of mixed signals (for example, mixed voices or images), analysis of several types of data, and feature extraction. Fast independent component analysis (Fast ICA) is one of the most efficient ICA techniques. The Fast ICA algorithm separates independent sources from their mixtures by measuring non-Gaussianity. Fast ICA is a common method for identifying artifacts and interference in mixed recordings such as electroencephalogram (EEG), magnetoencephalography (MEG), and electrocardiogram (ECG) signals. Therefore, it is valuable to implement Fast ICA for real-time signal processing. In this thesis, the Fast ICA algorithm is implemented in hand-written HDL code. In addition, to increase precision, floating-point (FP) arithmetic units are also implemented in HDL. To verify the algorithm, MATLAB simulations are performed for both offline and real-time signal processing.
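The separation step described in the abstract above can be sketched in software as follows. This is a minimal NumPy version of the symmetric fixed-point Fast ICA iteration with a tanh nonlinearity, not the thesis's HDL or MATLAB implementation; the synthetic sources, the mixing matrix `A`, and all function names are assumptions made for the example.

```python
import numpy as np

def whiten(X):
    """Center X (n_signals x n_samples) and transform it to identity covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    return (eigvecs / np.sqrt(eigvals)) @ eigvecs.T @ Xc

def sym_decorrelate(W):
    """W <- (W W^T)^(-1/2) W, which keeps the rows of W orthonormal."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W

def fast_ica(X, n_iter=200, tol=1e-8, seed=0):
    """Estimate independent components of X by maximizing non-Gaussianity."""
    Z = whiten(X)
    n = Z.shape[0]
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        # Fixed-point update per row w: E[z g(w^T z)] - E[g'(w^T z)] w
        W_new = G @ Z.T / Z.shape[1] - np.diag((1.0 - G**2).mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        # Converged when every row is parallel (up to sign) to its previous value.
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return W @ Z

# Hypothetical demo: two independent non-Gaussian sources, linearly mixed.
rng = np.random.default_rng(1)
S = np.vstack([rng.uniform(-1.0, 1.0, 5000), rng.laplace(0.0, 1.0, 5000)])
A = np.array([[1.0, 0.5], [0.6, 1.0]])  # assumed mixing matrix
Y = fast_ica(A @ S)

# Components are recovered only up to order, sign, and scale, so compare by correlation.
corr = np.abs(np.corrcoef(np.vstack([S, Y]))[:2, 2:])
print(np.round(corr, 3))
```

Each row of `corr` should contain one entry close to 1, showing which recovered component matches which source.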

    Eye detection using discriminatory features and an efficient support vector machine

    Accurate and efficient eye detection has broad applications in computer vision, machine learning, and pattern recognition. This dissertation presents a number of accurate and efficient eye detection methods using various discriminatory features and a new efficient Support Vector Machine (eSVM). This dissertation first introduces five popular image representation methods - the gray-scale image representation, the color image representation, the 2D Haar wavelet image representation, the Histograms of Oriented Gradients (HOG) image representation, and the Local Binary Patterns (LBP) image representation - and then applies these methods to derive five types of discriminatory features. Comparative assessments are then presented to evaluate the performance of these discriminatory features on the problem of eye detection. This dissertation further proposes two discriminatory feature extraction (DFE) methods for eye detection. The first DFE method, discriminant component analysis (DCA), improves upon the popular principal component analysis (PCA) method. The PCA method can derive the optimal features for data representation but not for classification. In contrast, the DCA method, which applies a new criterion vector that is defined on two novel measure vectors, derives the optimal discriminatory features in the whitened PCA space for two-class classification problems. The second DFE method, clustering-based discriminant analysis (CDA), improves upon the popular Fisher linear discriminant (FLD) method. A major disadvantage of the FLD is that it may not be able to extract adequate features in order to achieve satisfactory performance, especially for two-class problems. To address this problem, three CDA models (CDA-1, -2, and -3) are proposed by taking advantage of the clustering technique. For every CDA model a new between-cluster scatter matrix is defined. The CDA method thus can derive adequate features to achieve satisfactory performance for eye detection.
Furthermore, the clustering nature of the three CDA models and the nonparametric nature of the CDA-2 and -3 models can further improve the detection performance upon the conventional FLD method. This dissertation finally presents a new efficient Support Vector Machine (eSVM) for eye detection that improves the computational efficiency of the conventional Support Vector Machine (SVM). The eSVM first defines a Θ set that consists of the training samples on the wrong side of their margin derived from the conventional soft-margin SVM. The Θ set plays an important role in controlling the generalization performance of the eSVM. The eSVM then introduces only a single slack variable for all the training samples in the Θ set, and as a result, only a very small number of those samples in the Θ set become support vectors. The eSVM hence significantly reduces the number of support vectors and improves the computational efficiency without sacrificing the generalization performance. A modified Sequential Minimal Optimization (SMO) algorithm is then presented to solve the large Quadratic Programming (QP) problem defined in the optimization of the eSVM. Three large-scale face databases, the Face Recognition Grand Challenge (FRGC) version 2 database, the BioID database, and the FERET database, are used to evaluate the proposed eye detection methods. Experimental results show the effectiveness of the proposed methods, which improve upon some state-of-the-art eye detection methods.

    Linear subspace methods in face recognition

    Despite over 30 years of research, face recognition is still one of the most difficult problems in the field of Computer Vision. The challenge comes from many factors affecting the performance of a face recognition system: noisy input, training data collection, speed-accuracy trade-off, variations in expression, illumination, pose, or ageing. Although relatively successful attempts have been made for special cases, such as frontal faces, no satisfactory methods exist that work under completely unconstrained conditions. This thesis proposes solutions to three important problems: lack of training data, speed-accuracy requirement, and unconstrained environments. The problem of lacking training data has been solved in the worst case: single sample per person. Whitened Principal Component Analysis is proposed as a simple but effective solution. Whitened PCA performs consistently well on multiple face datasets. The speed-accuracy trade-off problem is the second focus of this thesis. Two solutions are proposed to tackle this problem. The first solution is a new feature extraction method called Compact Binary Patterns, which is about three times faster than Local Binary Patterns. The second solution is a multi-patch classifier, which performs much better than a single classifier without compromising speed. Two metric learning methods are introduced to solve the problem of unconstrained face recognition. The first method, called Indirect Neighbourhood Component Analysis, combines the best ideas from Neighbourhood Component Analysis and One-shot learning. The second method, Cosine Similarity Metric Learning, uses Cosine Similarity instead of the more popular Euclidean distance to form the objective function in the learning process. This Cosine Similarity Metric Learning method produces the best result in the literature on the state-of-the-art face dataset: the Labelled Faces in the Wild dataset.
Finally, a full face verification system based on our real experience taking part in the ICPR 2010 Face Verification contest is described. Many practical points are discussed.
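Two ingredients named in the abstract above, whitened PCA and cosine similarity, can be sketched with plain NumPy as follows. Combining them into one scoring pipeline is an assumption made for this illustration; it is not the thesis's Cosine Similarity Metric Learning method, which additionally learns a transformation. The random feature matrix `X` stands in for real face descriptors, and all names are hypothetical.

```python
import numpy as np

def fit_whitened_pca(X, n_components):
    """X: (n_samples, n_features). Return the mean and a whitening projection."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    eigvals = s[:n_components] ** 2 / (X.shape[0] - 1)  # variance along each axis
    P = Vt[:n_components].T / np.sqrt(eigvals)          # rescale axes to unit variance
    return mean, P

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical stand-in for face features: correlated random vectors.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 10))

mean, P = fit_whitened_pca(X, n_components=5)
Z = (X - mean) @ P  # whitened features with identity sample covariance

print(np.round(np.cov(Z, rowvar=False), 3))
print("similarity of samples 0 and 1:", cosine_similarity(Z[0], Z[1]))
```

In a verification setting, a pair of images would be accepted as the same person when the similarity of their whitened features exceeds a threshold chosen on held-out data.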

    Unsupervised spectral sub-feature learning for hyperspectral image classification

    Spectral pixel classification is one of the principal techniques used in hyperspectral image (HSI) analysis. In this article, we propose an unsupervised feature learning method for classification of hyperspectral images. The proposed method learns a dictionary of sub-feature basis representations from the spectral domain, which allows effective use of the correlated spectral data. The learned dictionary is then used in encoding convolutional samples from the hyperspectral input pixels to an expanded but sparse feature space. Expanded hyperspectral feature representations enable linear separation between object classes present in an image. To evaluate the proposed method, we performed experiments on several commonly used HSI data sets acquired at different locations and by different sensors. Our experimental results show that the proposed method outperforms other pixel-wise classification methods that make use of unsupervised feature extraction approaches. Additionally, even though our approach does not use any prior knowledge or labelled training data to learn features, it yields advantageous or comparable results in terms of classification accuracy with respect to recent semi-supervised methods.

    Biologically inspired feature extraction for rotation and scale tolerant pattern analysis

    Biologically motivated information processing has been an important area of scientific research for decades. The central topic addressed in this dissertation is the use of lateral inhibition and, more generally, linear networks with recurrent connectivity, along with complex-log conformal mapping, in machine-based implementations of information encoding, feature extraction and pattern recognition. The reasoning behind, and a method for, a spatially uniform implementation of the inhibitory/excitatory network model in the framework of the non-uniform log-polar transform are presented. For the space-invariant connectivity model characterized by a Toeplitz-Block-Toeplitz matrix, the overall network response is obtained without matrix inverse operations, provided the connection matrix generating function is bounded by unity. It is shown that for a network whose inter-neuron connection function is expandable in a Fourier series in polar angle, the overall network response is steerable. The decorrelating/whitening characteristics of networks with lateral inhibition are used to develop space-invariant pre-whitening kernels specialized for specific categories of input signals. These filters have an extremely small memory footprint and are successfully used to improve the performance of adaptive neural whitening algorithms. Finally, a method for feature extraction based on a localized Independent Component Analysis (ICA) transform in the log-polar domain, aided by the previously developed pre-whitening filters, is implemented. Since the output codes produced by ICA are very sparse, a small number of non-zero coefficients was sufficient to encode input data and obtain reliable pattern recognition performance.
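One generic way to obtain the steady-state response of a linear recurrent network without a matrix inverse is to iterate the network equation itself, which converges when the connection matrix is suitably bounded. The sketch below illustrates that idea only; it is an assumption that it resembles the dissertation's derivation, which works with the generating function of a Toeplitz-Block-Toeplitz matrix. The 1-D inhibitory kernel, the network size, and the function name are hypothetical.

```python
import numpy as np

def recurrent_response(W, x, n_iter=200):
    """Iterate y <- x + W y.

    The iteration converges to (I - W)^(-1) x when the spectral radius
    of W is below 1, so no explicit matrix inverse is needed.
    """
    y = x.copy()
    for _ in range(n_iter):
        y = x + W @ y
    return y

n = 16
idx = np.arange(n)
# Hypothetical space-invariant (Toeplitz) lateral-inhibition kernel:
# negative weights whose strength decays with distance between units.
W = -0.2 * np.exp(-np.abs(idx[:, None] - idx[None, :]) / 2.0)
np.fill_diagonal(W, 0.0)  # no self-connection

x = np.zeros(n)
x[n // 2] = 1.0           # a single active input unit

y = recurrent_response(W, x)
print(np.round(y, 3))
```

For this kernel the absolute row sums of `W` stay below 1, which is sufficient for the iteration to converge; the result can be compared against `np.linalg.solve(np.eye(n) - W, x)` as a check.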

    Features and statistical classifiers for face image analysis

    This thesis presents the systematic analysis of feature spaces and classification schemes for face image processing. Linear discriminants, probabilistic classifiers, and nearest neighbour classifiers are applied to face/nonface classification in various feature spaces including original grayscale space, face-image-whitened space, anything-image-whitened space, and double-whitened space. According to the classification error rates, the probabilistic classifiers performed the best, followed by nearest neighbour classifiers, and then the linear discriminant classifier. However, the former two kinds of classifiers are more computationally demanding. No matter what kind of classifier is used, the whitened space with reduced dimensionality improves classification performance. -- A new feature extraction technique, named dominant feature extraction, is invented and applied to face/nonface classification with encouraging results. This technique extracts the features corresponding to the mean-difference and variance-difference of two classes. Other classification schemes, including the repeated Fisher's Linear Discriminant (FLD) and a moving-centre scheme, are newly proposed and tested. The Maximum Likelihood (ML) classifier based on hyperellipsoid distribution is applied for the first time to face/nonface classification. -- Face images are conventionally represented by grayscales. This work presents a new representation that includes motion vectors, obtained through optical flow analysis between an input image and a neutral template, and a deformation residue that is the difference between the deformed input image and the template. The face images compose a convex cluster in this representation space. The viability of this space is tested and demonstrated through classification experiments on face detection, expression analysis, pose estimation, and face recognition. 
When the FLD is applied to face/nonface classification and smiling/nonsmiling face classification, the new representation of face images outperforms the traditional grayscale representation. Face recognition experiments using the nearest neighbour classifier on the Olivetti and Oracle Research Laboratory (ORL) face database show that the deformation residue representation is superior to all other representations. These promising results demonstrate that, as a general-purpose space, the derived representation space is suitable for almost all aspects of face image processing.