5,720 research outputs found

    Novel image descriptors and learning methods for image classification applications

    Get PDF
    Image classification is an active and rapidly expanding research area in computer vision and machine learning due to its broad applications. With the advent of big data, the need for robust image descriptors and learning methods to process a large number of images for different kinds of visual applications has greatly increased. Towards that end, this dissertation focuses on exploring new image descriptors and learning methods by incorporating important visual aspects and enhancing the feature representation in the discriminative space for advancing image classification. First, an innovative sparse representation model using the complete marginal Fisher analysis (CMFA-SR) framework is proposed for improving the image classification performance. In particular, the complete marginal Fisher analysis method extracts the discriminatory features in both the column space of the local samples based within class scatter matrix and the null space of its transformed matrix. To further improve the classification capability, a discriminative sparse representation model is proposed by integrating a representation criterion such as the sparse representation and a discriminative criterion. Second, the discriminative dictionary distribution based sparse coding (DDSC) method is presented that utilizes both the discriminative and generative information to enhance the feature representation. Specifically, the dictionary distribution criterion reveals the class conditional probability of each dictionary item by using the dictionary distribution coefficients, and the discriminative criterion applies new within-class and between-class scatter matrices for discriminant analysis. Third, a fused color Fisher vector (FCFV) feature is developed by integrating the most expressive features of the DAISY Fisher vector (D-FV) feature, the WLD-SIFT Fisher vector (WS-FV) feature, and the SIFT-FV feature in different color spaces to capture the local, color, spatial, relative intensity, as well as the gradient orientation information. Furthermore, a sparse kernel manifold learner (SKML) method is applied to the FCFV features for learning a discriminative sparse representation by considering the local manifold structure and the label information based on the marginal Fisher criterion. Finally, a novel multiple anthropological Fisher kernel framework (M-AFK) is presented to extract and enhance the facial genetic features for kinship verification. The proposed method is derived by applying a novel similarity enhancement approach based on SIFT flow and learning an inheritable transformation on the multiple Fisher vector features that uses the criterion of minimizing the distance among the kinship samples and maximizing the distance among the non-kinship samples. The effectiveness of the proposed methods is assessed on numerous image classification tasks, such as face recognition, kinship verification, scene classification, object classification, and computational fine art painting categorization. The experimental results on popular image datasets show the feasibility of the proposed methods

    Mental state estimation for brain-computer interfaces

    Get PDF
    Mental state estimation is potentially useful for the development of asynchronous brain-computer interfaces. In this study, four mental states have been identified and decoded from the electrocorticograms (ECoGs) of six epileptic patients, engaged in a memory reach task. A novel signal analysis technique has been applied to high-dimensional, statistically sparse ECoGs recorded by a large number of electrodes. The strength of the proposed technique lies in its ability to jointly extract spatial and temporal patterns, responsible for encoding mental state differences. As such, the technique offers a systematic way of analyzing the spatiotemporal aspects of brain information processing and may be applicable to a wide range of spatiotemporal neurophysiological signals

    The Role of Riemannian Manifolds in Computer Vision: From Coding to Deep Metric Learning

    Get PDF
    A diverse number of tasks in computer vision and machine learning enjoy from representations of data that are compact yet discriminative, informative and robust to critical measurements. Two notable representations are offered by Region Covariance Descriptors (RCovD) and linear subspaces which are naturally analyzed through the manifold of Symmetric Positive Definite (SPD) matrices and the Grassmann manifold, respectively, two widely used types of Riemannian manifolds in computer vision. As our first objective, we examine image and video-based recognition applications where the local descriptors have the aforementioned Riemannian structures, namely the SPD or linear subspace structure. Initially, we provide a solution to compute Riemannian version of the conventional Vector of Locally aggregated Descriptors (VLAD), using geodesic distance of the underlying manifold as the nearness measure. Next, by having a closer look at the resulting codes, we formulate a new concept which we name Local Difference Vectors (LDV). LDVs enable us to elegantly expand our Riemannian coding techniques to any arbitrary metric as well as provide intrinsic solutions to Riemannian sparse coding and its variants when local structured descriptors are considered. We then turn our attention to two special types of covariance descriptors namely infinite-dimensional RCovDs and rank-deficient covariance matrices for which the underlying Riemannian structure, i.e. the manifold of SPD matrices is out of reach to great extent. %Generally speaking, infinite-dimensional RCovDs offer better discriminatory power over their low-dimensional counterparts. To overcome this difficulty, we propose to approximate the infinite-dimensional RCovDs by making use of two feature mappings, namely random Fourier features and the Nystrom method. As for the rank-deficient covariance matrices, unlike most existing approaches that employ inference tools by predefined regularizers, we derive positive definite kernels that can be decomposed into the kernels on the cone of SPD matrices and kernels on the Grassmann manifolds and show their effectiveness for image set classification task. Furthermore, inspired by attractive properties of Riemannian optimization techniques, we extend the recently introduced Keep It Simple and Straightforward MEtric learning (KISSME) method to the scenarios where input data is non-linearly distributed. To this end, we make use of the infinite dimensional covariance matrices and propose techniques towards projecting on the positive cone in a Reproducing Kernel Hilbert Space (RKHS). We also address the sensitivity issue of the KISSME to the input dimensionality. The KISSME algorithm is greatly dependent on Principal Component Analysis (PCA) as a preprocessing step which can lead to difficulties, especially when the dimensionality is not meticulously set. To address this issue, based on the KISSME algorithm, we develop a Riemannian framework to jointly learn a mapping performing dimensionality reduction and a metric in the induced space. Lastly, in line with the recent trend in metric learning, we devise end-to-end learning of a generic deep network for metric learning using our derivation

    Investigation of new learning methods for visual recognition

    Get PDF
    Visual recognition is one of the most difficult and prevailing problems in computer vision and pattern recognition due to the challenges in understanding the semantics and contents of digital images. Two major components of a visual recognition system are discriminatory feature representation and efficient and accurate pattern classification. This dissertation therefore focuses on developing new learning methods for visual recognition. Based on the conventional sparse representation, which shows its robustness for visual recognition problems, a series of new methods is proposed. Specifically, first, a new locally linear K nearest neighbor method, or LLK method, is presented. The LLK method derives a new representation, which is an approximation to the ideal representation, by optimizing an objective function based on a host of criteria for sparsity, locality, and reconstruction. The novel representation is further processed by two new classifiers, namely, an LLK based classifier (LLKc) and a locally linear nearest mean based classifier (LLNc), for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Second, a new generative and discriminative sparse representation (GDSR) method is proposed by taking advantage of both a coarse modeling of the generative information and a modeling of the discriminative information. The proposed GDSR method integrates two new criteria, namely, a discriminative criterion and a generative criterion, into the conventional sparse representation criterion. A new generative and discriminative sparse representation based classification (GDSRc) method is then presented based on the derived new representation. Finally, a new Score space based multiple Metric Learning (SML) method is presented for a challenging visual recognition application, namely, recognizing kinship relations or kinship verification. The proposed SML method, which goes beyond the conventional Mahalanobis distance metric learning, not only learns the distance metric but also models the generative process of features by taking advantage of the score space. The SML method is optimized by solving a constrained, non-negative, and weighted variant of the sparse representation problem. To assess the feasibility of the proposed new learning methods, several visual recognition tasks, such as face recognition, scene recognition, object recognition, computational fine art analysis, action recognition, fine grained recognition, as well as kinship verification are applied. The experimental results show that the proposed new learning methods achieve better performance than the other popular methods
    • …
    corecore