6 research outputs found

    PLDANet: Reasonable Combination of PCA and LDA Convolutional Networks

    Integrating deep learning with traditional machine learning methods is an intriguing research direction. For example, PCANet and LDANet adopt Principal Component Analysis (PCA) and Fisher Linear Discriminant Analysis (LDA), respectively, to learn convolutional kernels. However, it is not reasonable to adopt LDA to learn filter kernels in every convolutional layer, because local features of images from different classes may be similar, such as background areas. Therefore, it is meaningful to adopt LDA to learn filter kernels only when all the patches carry information from the whole image. To our knowledge, no existing works study how to combine PCA and LDA to learn convolutional kernels so as to achieve the best performance. In this paper, we propose a convolutional coverage theory. Furthermore, we propose the PLDANet model, which, based on the coverage theory, adopts PCA and LDA as appropriate in different convolutional layers. The experimental study has shown the effectiveness of the proposed PLDANet model.
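    As a rough illustration of the filter-learning step that PCANet and LDANet build on, the sketch below learns convolutional kernels as the leading principal components of mean-removed image patches. It is a minimal NumPy outline with our own function name and parameter choices, not the authors' code; the LDA variant would replace the plain eigen-decomposition with Fisher's between-class/within-class criterion computed on labelled patches.

        import numpy as np

        def pca_filters(images, k=5, n_filters=8):
            """Learn k x k convolutional kernels from image patches via PCA (PCANet-style sketch)."""
            patches = []
            for img in images:                              # images: (N, H, W) grayscale array
                H, W = img.shape
                for i in range(H - k + 1):
                    for j in range(W - k + 1):
                        p = img[i:i + k, j:j + k].ravel()
                        patches.append(p - p.mean())        # remove the patch mean
            X = np.stack(patches)                           # (num_patches, k*k)
            cov = X.T @ X / X.shape[0]                      # patch covariance matrix
            eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
            top = eigvecs[:, np.argsort(eigvals)[::-1][:n_filters]]
            return top.T.reshape(n_filters, k, k)           # one kernel per leading eigenvector

    Stacking several such layers, and deciding per layer whether PCA or LDA should learn the kernels, is exactly the design question the proposed coverage theory addresses.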

    Hyperparameter Optimization Of Deep Convolutional Neural Networks Architectures For Object Recognition

    Recent advances in Convolutional Neural Networks (CNNs) have obtained promising results in difficult deep learning tasks. However, the success of a CNN depends on finding an architecture that fits a given problem. Hand-crafting an architecture is a challenging, time-consuming process that requires expert knowledge and effort, owing to the large number of architectural design choices. In this dissertation, we present an efficient framework that automatically designs a high-performing CNN architecture for a given problem. In this framework, we introduce a new optimization objective function that combines the error rate and the information learnt by a set of feature maps using deconvolutional networks (deconvnet). The new objective function allows the hyperparameters of the CNN architecture to be optimized in a way that enhances performance by guiding the CNN through better visualization of learnt features via deconvnet. The actual optimization of the objective function is carried out via the Nelder-Mead Method (NMM). Furthermore, our new objective function results in much faster convergence towards a better architecture. The proposed framework can explore a CNN architecture's numerous design choices efficiently and also allows effective, distributed execution and synchronization via web services. Empirically, we demonstrate that the CNN architecture designed with our approach outperforms several existing approaches in terms of error rate. Our results are competitive with the state of the art on the MNIST dataset and perform reasonably against the state of the art on the CIFAR-10 and CIFAR-100 datasets. Our approach tends to increase the depth, reduce the stride sizes, and leave some convolutional layers without pooling layers in order to find a CNN architecture that produces high recognition performance. Moreover, we evaluate the effectiveness of reducing the size of the training set on CNNs, using a variety of instance selection methods to speed up the training time, and study how these methods impact classification accuracy. Many instance selection methods require a long run-time to obtain a representative subset of the dataset, especially if the training set is large and has a high dimensionality. One example of these algorithms is Random Mutation Hill Climbing (RMHC). We improve RMHC so that it performs faster than the original algorithm with the same accuracy.
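    To make the optimization loop concrete, the sketch below runs a Nelder-Mead search over two illustrative hyperparameters with SciPy. The bowl-shaped surrogate objective is a placeholder of our own; in the framework described above it would be replaced by a function that trains a CNN with the given hyperparameters and returns the combined error-rate/deconvnet score.

        import numpy as np
        from scipy.optimize import minimize

        def objective(x):
            """Placeholder for the combined objective (error rate + deconvnet feature term).
            A synthetic quadratic bowl keeps the sketch runnable without training a CNN."""
            log_lr, n_filters = x
            return (log_lr + 3.0) ** 2 + 0.01 * (n_filters - 64.0) ** 2

        x0 = np.array([-2.0, 32.0])          # initial guess: learning rate 1e-2, 32 filters
        result = minimize(objective, x0, method="Nelder-Mead",
                          options={"xatol": 1e-2, "fatol": 1e-3, "maxiter": 200})
        print("best hyperparameters:", result.x, "objective value:", result.fun)

    Because Nelder-Mead operates on continuous variables, integer-valued design choices such as the number of layers or filters would have to be rounded or otherwise encoded before each evaluation.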

    High Performance Techniques for Face Recognition

    The identification of individuals using face recognition techniques is a challenging task. This is due to the variations resulting from facial expressions, makeup, rotations, illuminations, gestures, etc. Also, facial images contain a great deal of redundant information, which negatively affects the performance of the recognition system. The dimensionality and the redundancy of the facial features have a direct effect on the face recognition accuracy. Not all the features in the feature vector space are useful. For example, non-discriminating features in the feature vector space not only degrade the recognition accuracy but also increase the computational complexity. In the fields of computer vision, pattern recognition, and image processing, face recognition has become a popular research topic. This is due to its widespread applications in security and control, which allow the identified individual to access secure areas, personal information, etc. The performance of any recognition system depends on three factors: 1) the storage requirements, 2) the computational complexity, and 3) the recognition rates. Two different recognition system families are presented and developed in this dissertation. Each family consists of several face recognition systems, and each system contains three main steps, namely, preprocessing, feature extraction, and classification. Several preprocessing steps, such as cropping, facial detection, and dividing the facial image into sub-images, are applied to the facial images. This reduces the effect of irrelevant information (background) and improves the system performance. In this dissertation, either a Neural Network (NN) based classifier or the Euclidean distance is used for classification purposes. Five widely used databases, namely, ORL, YALE, FERET, FEI, and LFW, each containing different facial variations, such as lighting conditions, rotations, facial expressions, facial details, etc., are used to evaluate the proposed systems. The experimental results of the proposed systems are analyzed using K-fold Cross Validation (CV). In family-1, several systems are proposed for face recognition. Each system employs different integrated tools in the feature extraction step. These tools, the Two Dimensional Discrete Multiwavelet Transform (2D DMWT), 2D Radon Transform (2D RT), 2D or 3D DWT, and Fast Independent Component Analysis (FastICA), are applied to the processed facial images to reduce the dimensionality and to obtain discriminating features. Each proposed system produces a unique representation, and achieves lower storage requirements and better performance than the existing methods. For further facial compression, there are three face recognition systems in the second family. Each system uses different integrated tools to obtain a better facial representation. The integrated tools, Vector Quantization (VQ), the Discrete Cosine Transform (DCT), and 2D DWT, are applied to the facial images for further facial compression and better facial representation. In the systems using the tools VQ/2D DCT and VQ/2D DWT, each pose in the databases is represented by one centroid with 4*4*16 dimensions. In the third system, VQ/Facial Part Detection (FPD), each person in the databases is represented by four centroids of 4*4*16 dimensions each. The systems in family-2 are proposed to further reduce the dimensions of the data compared to the systems in family-1 while attaining comparable results.
    For example, in family-1, the integrated tools FastICA/2D DMWT, applied to different combinations of sub-images in the FERET database with K-fold=5 (9 different poses used in the training mode), reduce the dimensions of the database by 97.22% and achieve 99% accuracy. In contrast, the integrated tools VQ/FPD in family-2 reduce the dimensions of the data by 99.31% and achieve 97.98% accuracy. In this example, the integrated tools VQ/FPD accomplished further data compression at the cost of lower accuracy compared to the FastICA/2D DMWT tools. Various experiments and simulations are carried out using MATLAB. The experimental results of both families confirm the improvements in the storage requirements, as well as the recognition rates, as compared to some recently reported methods.
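    As a simplified, assumed version of the feature-extraction and matching pipeline described above, the sketch below uses a standard Haar 2D DWT (via PyWavelets) in place of the 2D DMWT/RT tools, FastICA for the discriminating features, and Euclidean nearest-neighbour matching in place of the NN classifier; all function names and parameters are our own.

        import numpy as np
        import pywt
        from sklearn.decomposition import FastICA

        def extract_features(images, n_components=40):
            """2D DWT approximation sub-band per image, then FastICA on the flattened coefficients."""
            coeffs = []
            for img in images:                        # images: (N, H, W) grayscale array
                cA, _ = pywt.dwt2(img, "haar")        # keep only the low-frequency approximation band
                coeffs.append(cA.ravel())
            X = np.stack(coeffs)
            ica = FastICA(n_components=n_components, random_state=0)
            return ica.fit_transform(X), ica          # gallery features and fitted model for probe images

        def classify(probe_feature, gallery_features, gallery_labels):
            """Nearest-neighbour matching with Euclidean distance."""
            distances = np.linalg.norm(gallery_features - probe_feature, axis=1)
            return gallery_labels[np.argmin(distances)]

    The family-2 systems described above would additionally quantize the DCT or DWT coefficients into a small number of centroids (vector quantization) before matching, trading some accuracy for further compression.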

    Face recognition using Deep PCA

    No full text