
    Interpretable Hyperspectral AI: When Non-Convex Modeling meets Hyperspectral Remote Sensing

    Hyperspectral imaging, also known as image spectrometry, is a landmark technique in geoscience and remote sensing (RS). In the past decade, enormous efforts have been made to process and analyze these hyperspectral (HS) products, mainly by seasoned experts. However, with the ever-growing volume of data, the bulk of costs in manpower and material resources poses new challenges in reducing the burden of manual labor and improving efficiency. It is therefore urgent to develop more intelligent and automatic approaches for various HS RS applications. Machine learning (ML) tools with convex optimization have successfully undertaken the tasks of numerous artificial intelligence (AI)-related applications. However, their ability to handle complex practical problems remains limited, particularly for HS data, due to the effects of various spectral variabilities in the process of HS imaging and the complexity and redundancy of higher-dimensional HS signals. Compared to convex models, non-convex modeling, which is capable of characterizing more complex real scenes and providing model interpretability technically and theoretically, has been proven to be a feasible solution to reduce the gap between challenging HS vision tasks and currently advanced intelligent data processing models.

    Robust Face Recognition based on Color and Depth Information

    One of the most important advantages of automatic human face recognition is its non-intrusiveness. Face images can sometimes be acquired without the user's knowledge or explicit cooperation. However, face images acquired in an uncontrolled environment can appear with varying imaging conditions. Traditionally, researchers have focused on tackling this problem using 2D gray-scale images due to the wide availability of 2D cameras and the low processing and storage cost of gray-scale data. Nevertheless, face recognition cannot be performed reliably with 2D gray-scale data due to insufficient information and its high sensitivity to pose, expression and illumination variations. Recent rapid development in hardware makes acquisition and processing of color and 3D data feasible. This thesis aims to improve face recognition accuracy and robustness using color and 3D information. In terms of color information usage, this thesis proposes several improvements over existing approaches. Firstly, the Block-wise Discriminant Color Space is proposed, which learns the discriminative color space based on local patches of a human face image instead of the holistic image, as human faces display different colors in different parts. Secondly, observing that most existing color spaces consist of at most three color components, while complementary information can be found in multiple color components across multiple color spaces, the Multiple Color Fusion model is proposed to search for and utilize multiple color components effectively. Lastly, two robust color face recognition algorithms are proposed. The Color Sparse Coding method can deal with face images with noise and occlusion. The Multi-linear Color Tensor Discriminant method harnesses multi-linear techniques to handle non-linear data.
Experiments show that all the proposed methods outperform their existing competitors. In terms of 3D information utilization, this thesis investigates the feasibility of face recognition using Kinect. Unlike traditional 3D scanners, which are too slow and too expensive for broad face recognition applications, Kinect trades data quality for high speed and low cost. An algorithm is proposed to show that Kinect data can be used for face recognition despite its noisy nature. In order to fully utilize Kinect data, a more sophisticated RGB-D face recognition algorithm is developed, which harnesses the Color Sparse Coding framework and 3D information to perform accurate face recognition robustly even under simultaneously varying conditions of pose, illumination, expression and disguise.

    MDLatLRR: A novel decomposition method for infrared and visible image fusion

    Image decomposition is crucial for many image processing tasks, as it allows salient features to be extracted from source images. A good image decomposition method could lead to better performance, especially in image fusion tasks. We propose a multi-level image decomposition method based on latent low-rank representation (LatLRR), which is called MDLatLRR. This decomposition method is applicable to many image processing fields. In this paper, we focus on the image fusion task. We develop a novel image fusion framework based on MDLatLRR, which is used to decompose source images into detail parts (salient features) and base parts. A nuclear-norm based fusion strategy is used to fuse the detail parts, and the base parts are fused by an averaging strategy. Compared with other state-of-the-art fusion methods, the proposed algorithm exhibits better fusion performance in both subjective and objective evaluation. Comment: IEEE Trans. Image Processing 2020, 14 pages, 17 figures, 3 tables
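
The two fusion rules named in this abstract (averaging for base parts, nuclear-norm weighting for detail parts) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the base and detail parts have already been produced by some decomposition, it uses non-overlapping square patches, and the function names and the `patch` parameter are hypothetical choices made for illustration only.

```python
import numpy as np


def nuclear_norm(patch):
    """Sum of singular values of a 2-D patch."""
    return np.linalg.norm(patch, ord="nuc")


def fuse_base(base_a, base_b):
    """Averaging strategy: equal weights for the two base parts."""
    return 0.5 * (base_a + base_b)


def fuse_detail(detail_a, detail_b, patch=8):
    """Patch-wise weighted sum, weights proportional to nuclear norms.

    Assumption: non-overlapping patches of size `patch` x `patch`
    (edge patches may be smaller). The paper's exact patch handling
    is not given in the abstract.
    """
    detail_a = np.asarray(detail_a, dtype=float)
    detail_b = np.asarray(detail_b, dtype=float)
    h, w = detail_a.shape
    fused = np.zeros((h, w), dtype=float)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            pa = detail_a[i:i + patch, j:j + patch]
            pb = detail_b[i:i + patch, j:j + patch]
            wa, wb = nuclear_norm(pa), nuclear_norm(pb)
            total = wa + wb
            # Fall back to equal weights when both patches are all zero.
            weight_a = 0.5 if total == 0 else wa / total
            fused[i:i + patch, j:j + patch] = weight_a * pa + (1.0 - weight_a) * pb
    return fused


def fuse(base_a, detail_a, base_b, detail_b, patch=8):
    """Recombine the fused base and fused detail parts into one image."""
    return fuse_base(base_a, base_b) + fuse_detail(detail_a, detail_b, patch)
```

With this weighting, a patch whose detail content has larger singular values contributes more to the fused result, and a source whose detail patch is all zero contributes nothing to that patch.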

    Robust subspace learning for static and dynamic affect and behaviour modelling

    Machine analysis of human affect and behavior in naturalistic contexts has attracted growing attention in the last decade from various disciplines ranging from social and cognitive sciences to machine learning and computer vision. Endowing machines with the ability to seamlessly detect, analyze, model, predict as well as simulate and synthesize manifestations of internal emotional and behavioral states in real-world data is deemed essential for the deployment of next-generation, emotionally- and socially-competent human-centered interfaces. In this thesis, we are primarily motivated by the problem of modeling, recognizing and predicting spontaneous expressions of non-verbal human affect and behavior manifested through either low-level facial attributes in static images or high-level semantic events in image sequences. Both visual data and annotations of naturalistic affect and behavior naturally contain noisy measurements of unbounded magnitude at random locations, commonly referred to as ‘outliers’. We present here machine learning methods that are robust to such gross, sparse noise. First, we deal with static analysis of face images, viewing the latter as a superposition of mutually-incoherent, low-complexity components corresponding to facial attributes, such as facial identity, expressions and activation of atomic facial muscle actions. We develop a robust, discriminant dictionary learning framework to extract these components from grossly corrupted training data and combine it with sparse representation to recognize the associated attributes. We demonstrate that our framework can jointly address interrelated classification tasks such as face and facial expression recognition. Inspired by the well-documented importance of the temporal aspect in perceiving affect and behavior, we direct the bulk of our research efforts into continuous-time modeling of dimensional affect and social behavior.
Having identified a gap in the literature, namely the lack of data containing annotations of social attitudes in continuous time and scale, we first curate a new audio-visual database of multi-party conversations from political debates annotated frame-by-frame in terms of real-valued conflict intensity and use it to conduct the first study on continuous-time conflict intensity estimation. Our experimental findings corroborate previous evidence indicating the inability of existing classifiers to capture the hidden temporal structures of affective and behavioral displays. We present here a novel dynamic behavior analysis framework which models temporal dynamics in an explicit way, based on the natural assumption that continuous-time annotations of smoothly-varying affect or behavior can be viewed as outputs of a low-complexity linear dynamical system when behavioral cues (features) act as system inputs. A novel robust structured rank minimization framework is proposed to estimate the system parameters in the presence of gross corruptions and partially missing data. Experiments on prediction of dimensional conflict and affect as well as multi-object tracking from detection validate the effectiveness of our predictive framework and demonstrate for the first time that complex human behavior and affect can be learned and predicted based on small training sets of person(s)-specific observations. Open Access