13,010 research outputs found

    Discriminative learning of local image descriptors

    Get PDF
    In this paper, we explore methods for learning local image descriptors from training data. We describe a set of building blocks for constructing descriptors which can be combined together and jointly optimized so as to minimize the error of a nearest-neighbor classifier. We consider both linear and nonlinear transforms with dimensionality reduction, and make use of discriminant learning techniques such as Linear Discriminant Analysis (LDA) and Powell minimization to solve for the parameters. Using these techniques, we obtain descriptors that exceed state-of-the-art performance with low dimensionality. In addition to new experiments and recommendations for descriptor learning, we are also making available a new and realistic ground truth data set based on multiview stereo data

    Supervised local descriptor learning for human action recognition

    Get PDF
    Local features have been widely used in computer vision tasks, e.g., human action recognition, but it tends to be an extremely challenging task to deal with large-scale local features of high dimensionality with redundant information. In this paper, we propose a novel fully supervised local descriptor learning algorithm called discriminative embedding method based on the image-to-class distance (I2CDDE) to learn compact but highly discriminative local feature descriptors for more accurate and efficient action recognition. By leveraging the advantages of the I2C distance, the proposed I2CDDE incorporates class labels to enable fully supervised learning of local feature descriptors, which achieves highly discriminative but compact local descriptors. The objective of our I2CDDE is to minimize the I2C distances from samples to their corresponding classes while maximizing the I2C distances to the other classes in the low-dimensional space. To further improve the performance, we propose incorporating a manifold regularization based on the graph Laplacian into the objective function, which can enhance the smoothness of the embedding by extracting the local intrinsic geometrical structure. The proposed I2CDDE for the first time achieves fully supervised learning of local feature descriptors. It significantly improves the performance of I2C-based methods by increasing the discriminative ability of local features while greatly reducing the computational burden by dimensionality reduction to handle large-scale data. We apply the proposed I2CDDE algorithm to human action recognition on four widely used benchmark datasets. The results have shown that I2CDDE can significantly improve I2C-based classifiers and achieves state-of-the-art performance

    Novel image descriptors and learning methods for image classification applications

    Get PDF
    Image classification is an active and rapidly expanding research area in computer vision and machine learning due to its broad applications. With the advent of big data, the need for robust image descriptors and learning methods to process a large number of images for different kinds of visual applications has greatly increased. Towards that end, this dissertation focuses on exploring new image descriptors and learning methods by incorporating important visual aspects and enhancing the feature representation in the discriminative space for advancing image classification. First, an innovative sparse representation model using the complete marginal Fisher analysis (CMFA-SR) framework is proposed for improving the image classification performance. In particular, the complete marginal Fisher analysis method extracts the discriminatory features in both the column space of the local samples based within class scatter matrix and the null space of its transformed matrix. To further improve the classification capability, a discriminative sparse representation model is proposed by integrating a representation criterion such as the sparse representation and a discriminative criterion. Second, the discriminative dictionary distribution based sparse coding (DDSC) method is presented that utilizes both the discriminative and generative information to enhance the feature representation. Specifically, the dictionary distribution criterion reveals the class conditional probability of each dictionary item by using the dictionary distribution coefficients, and the discriminative criterion applies new within-class and between-class scatter matrices for discriminant analysis. Third, a fused color Fisher vector (FCFV) feature is developed by integrating the most expressive features of the DAISY Fisher vector (D-FV) feature, the WLD-SIFT Fisher vector (WS-FV) feature, and the SIFT-FV feature in different color spaces to capture the local, color, spatial, relative intensity, as well as the gradient orientation information. Furthermore, a sparse kernel manifold learner (SKML) method is applied to the FCFV features for learning a discriminative sparse representation by considering the local manifold structure and the label information based on the marginal Fisher criterion. Finally, a novel multiple anthropological Fisher kernel framework (M-AFK) is presented to extract and enhance the facial genetic features for kinship verification. The proposed method is derived by applying a novel similarity enhancement approach based on SIFT flow and learning an inheritable transformation on the multiple Fisher vector features that uses the criterion of minimizing the distance among the kinship samples and maximizing the distance among the non-kinship samples. The effectiveness of the proposed methods is assessed on numerous image classification tasks, such as face recognition, kinship verification, scene classification, object classification, and computational fine art painting categorization. The experimental results on popular image datasets show the feasibility of the proposed methods
    • …
    corecore