329,892 research outputs found

    Novel image descriptors and learning methods for image classification applications

    Image classification is an active and rapidly expanding research area in computer vision and machine learning due to its broad applications. With the advent of big data, the need for robust image descriptors and learning methods that can process large numbers of images for different kinds of visual applications has greatly increased. Towards that end, this dissertation explores new image descriptors and learning methods that incorporate important visual aspects and enhance the feature representation in the discriminative space in order to advance image classification. First, an innovative sparse representation model using the complete marginal Fisher analysis (CMFA-SR) framework is proposed for improving image classification performance. In particular, the complete marginal Fisher analysis method extracts the discriminatory features in both the column space of the local-samples-based within-class scatter matrix and the null space of its transformed matrix. To further improve the classification capability, a discriminative sparse representation model is proposed by integrating a representation criterion, such as the sparse representation, with a discriminative criterion. Second, the discriminative dictionary-distribution-based sparse coding (DDSC) method is presented, which uses both discriminative and generative information to enhance the feature representation. Specifically, the dictionary distribution criterion reveals the class-conditional probability of each dictionary item by using the dictionary distribution coefficients, and the discriminative criterion applies new within-class and between-class scatter matrices for discriminant analysis.
Third, a fused color Fisher vector (FCFV) feature is developed by integrating the most expressive features of the DAISY Fisher vector (D-FV) feature, the WLD-SIFT Fisher vector (WS-FV) feature, and the SIFT-FV feature in different color spaces to capture local, color, spatial, relative-intensity, and gradient-orientation information. Furthermore, a sparse kernel manifold learner (SKML) method is applied to the FCFV features to learn a discriminative sparse representation that considers the local manifold structure and the label information based on the marginal Fisher criterion. Finally, a novel multiple anthropological Fisher kernel framework (M-AFK) is presented to extract and enhance the facial genetic features for kinship verification. The proposed method is derived by applying a novel similarity enhancement approach based on SIFT flow and by learning an inheritable transformation on the multiple Fisher vector features, using the criterion of minimizing the distance among the kinship samples and maximizing the distance among the non-kinship samples. The effectiveness of the proposed methods is assessed on numerous image classification tasks, such as face recognition, kinship verification, scene classification, object classification, and computational fine art painting categorization. The experimental results on popular image datasets show the feasibility of the proposed methods.
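
The abstract above relies on within-class and between-class scatter for discriminant analysis. As a rough illustration only, the sketch below computes the classical (LDA-style) scatter traces and their ratio in plain Python. It is not the dissertation's marginal or "new" scatter matrices, which the abstract does not define; the function names `scatter_traces` and `fisher_ratio` are hypothetical.

```python
def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def scatter_traces(samples, labels):
    """Return (trace of within-class scatter, trace of between-class scatter).

    Assumption: classical LDA definitions, used here as a stand-in for the
    scatter matrices mentioned in the abstract.
    """
    overall_mean = mean_vector(samples)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    within = 0.0
    between = 0.0
    for rows in by_class.values():
        class_mean = mean_vector(rows)
        # Within-class: spread of each sample around its own class mean.
        within += sum(sq_dist(x, class_mean) for x in rows)
        # Between-class: spread of class means around the overall mean,
        # weighted by class size.
        between += len(rows) * sq_dist(class_mean, overall_mean)
    return within, between


def fisher_ratio(samples, labels):
    """Trace-form Fisher criterion: larger means better class separation."""
    within, between = scatter_traces(samples, labels)
    return between / within
```

A feature space with tight, well-separated classes gives a large ratio; heavily overlapping classes give a ratio near zero.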

    Combining case based reasoning with neural networks

    This paper presents a neural network based technique for mapping problem situations to problem solutions for Case-Based Reasoning (CBR) applications. Both neural networks and CBR are instance-based learning techniques, although neural nets work with numerical data and CBR systems work with symbolic data. This paper discusses how the application scope of both paradigms could be enhanced by the use of hybrid concepts. To make the use of neural networks possible, the problem's situation and solution features are transformed into continuous features, using techniques similar to CBR's definition of similarity metrics. Radial Basis Function (RBF) neural nets are used to create a multivariable, continuous input-output mapping. As the mapping is continuous, this technique also provides generalisation between cases, replacing the domain-specific solution adaptation techniques required by conventional CBR. This continuous representation also allows, as in fuzzy logic, an associated membership measure to be output with each symbolic feature, aiding the prioritisation of various possible solutions. A further advantage is that, as the RBF neurons are only active in a limited area of the input space, the solution can be accompanied by local estimates of accuracy, based on the sufficiency of the cases present in that area as well as the results measured during testing. We describe how the application of this technique could benefit real-world problems such as sales advisory systems, among others.
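
The idea of a continuous case-to-solution mapping with a local confidence estimate can be sketched as follows. This is a minimal illustration, not the paper's network: it assumes a normalized Gaussian RBF model with one unit per stored case and a fixed, hand-picked width, and it uses the summed activation as a crude proxy for "sufficiency of cases" near the query. The names `rbf_activation` and `predict` are hypothetical.

```python
import math


def rbf_activation(x, center, width):
    """Gaussian RBF response: close to 1 near the center, close to 0 far away."""
    sq = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq / (2.0 * width ** 2))


def predict(cases, query, width=0.5):
    """Blend stored solutions, weighted by how strongly each case responds.

    cases: list of (situation_vector, solution_value) pairs, already encoded
           as continuous features (an assumption of this sketch).
    Returns (blended_solution, support). `support` is the summed activation;
    a low value means few stored cases lie near the query, so the answer
    should be trusted less. Returns (None, 0.0) if no unit responds at all.
    """
    activations = [rbf_activation(query, situation, width) for situation, _ in cases]
    support = sum(activations)
    if support == 0.0:
        return None, 0.0
    blended = sum(a * solution for a, (_, solution) in zip(activations, cases)) / support
    return blended, support
```

For example, with the two cases `([0.0], 0.0)` and `([1.0], 1.0)`, a query at `[0.5]` is answered by interpolating between both cases, while a query far from every stored case returns a very small support value, flagging the answer as unreliable.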

    Improving 3D Reconstruction using Deep Learning Priors

    Modeling the 3D geometry of shapes and the environment around us has many practical applications in mapping, navigation, virtual/augmented reality, and autonomous robots. In general, the acquisition of 3D models relies on passive images or on active depth sensors, such as structured light systems that use external infrared projectors. Although active methods provide very robust and reliable depth information, they have limited use cases and heavy power requirements, which makes passive techniques more suitable for day-to-day user applications. Image-based depth acquisition systems usually face challenges representing thin, textureless, or specular surfaces and regions in shadows or low-light environments. While scene depth information can be extracted from a set of passive images, fusing depth information from several views into a consistent 3D representation remains a challenging task. The most common challenges in 3D environment capture include finding an efficient scene representation that preserves details and thin structures and ensures the overall completeness of the reconstruction. In this thesis, we illustrate the use of deep learning techniques to resolve some of the challenges of image-based depth acquisition and 3D scene representation. We use a deep learning framework to learn priors over scene geometry and global scene context for solving several ambiguous and ill-posed problems, such as estimating depth on textureless surfaces and producing complete 3D reconstructions of partially observed scenes. More specifically, we propose that, using deep learning priors, a simple stereo camera system can be used to reconstruct a typical apartment-sized indoor scene with a fidelity that approaches the quality of a much more expensive state-of-the-art active depth-sensing system.
Furthermore, we describe how deep learning priors on local shapes can represent 3D environments more efficiently than traditional systems while at the same time preserving details and completing surfaces. Doctor of Philosoph
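
For context on why textureless surfaces are hard for a passive stereo system, the sketch below shows the standard pinhole relation between disparity and depth for a rectified stereo pair, Z = f * B / d. It is background geometry, not the thesis's learned method; the function and parameter names are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in metres for a rectified stereo pair: Z = f * B / d.

    Returns None when the disparity is not positive, which is the usual way
    a failed match is encoded (for example on a textureless wall, where no
    reliable correspondence can be found).
    """
    if disparity_px <= 0:
        return None
    return focal_length_px * baseline_m / disparity_px
```

With a focal length of 800 pixels and a baseline of 0.5 m, a disparity of 40 pixels corresponds to a depth of 10 m. Pixels without a valid match have no depth at all, and that gap is what a learned prior would have to fill.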