20 research outputs found

    From representation learning to thematic classification - Application to hierarchical analysis of hyperspectral images

    Numerous frameworks have been developed in order to analyze the increasing amount of available image data. Among those methods, supervised classification has received considerable attention, leading to the development of state-of-the-art classification methods. These methods aim at inferring the class of each observation given a specific class nomenclature by exploiting a set of labeled observations. Thanks to extensive research efforts of the community, classification methods have become very efficient. Nevertheless, the result of a classification remains a high-level interpretation of the scene since it only gives a single class to summarize all information in a given pixel. Contrary to classification methods, representation learning methods are model-based approaches designed especially to handle high-dimensional data and extract meaningful latent variables. By using physics-based models, these methods allow the user to extract very meaningful variables and get a very detailed interpretation of the considered image. The main objective of this thesis is to develop a unified framework for classification and representation learning. These two methods provide complementary approaches that make it possible to address the problem using a hierarchical modeling approach. The representation learning approach is used to build a low-level model of the data whereas classification is used to incorporate supervised information and may be seen as a high-level interpretation of the data. Two different paradigms, namely Bayesian models and optimization approaches, are explored to set up this hierarchical model. The proposed models are then tested in the specific context of hyperspectral imaging where the representation learning task is specified as a spectral unmixing proble

    Unmixing dynamic PET images with variable specific binding kinetics

    To analyze dynamic positron emission tomography (PET) images, various generic multivariate data analysis techniques have been considered in the literature, such as principal component analysis (PCA), independent component analysis (ICA), factor analysis and nonnegative matrix factorization (NMF). Nevertheless, these conventional approaches neglect any possible nonlinear variations in the time activity curves describing the kinetic behavior of tissues with specific binding, which limits their ability to recover a reliable, understandable and interpretable description of the data. This paper proposes an alternative analysis paradigm that accounts for spatial fluctuations in the exchange rate of the tracer between a free compartment and a specifically bound ligand compartment. The method relies on the concept of linear unmixing, usually applied in the hyperspectral domain, which combines NMF with a sum-to-one constraint that ensures an exhaustive description of the mixtures. The spatial variability of the signature corresponding to the specific binding tissue is explicitly modeled through a perturbed component. The performance of the method is assessed on both synthetic and real data and is shown to compare favorably with other conventional analysis methods
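
The linear unmixing idea mentioned in this abstract (nonnegative mixing coefficients that sum to one) can be illustrated with a minimal sketch. This is not the paper's method: it omits the perturbed component and treats the pure signatures as known. The function names `project_to_simplex` and `unmix`, the projected-gradient solver, and the toy signatures are all assumptions made for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_i >= 0, sum(a) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        candidate = (cumulative - 1.0) / i
        # The condition holds for a prefix of the sorted values;
        # the last index where it holds defines the threshold.
        if ui - candidate > 0:
            theta = candidate
    return [max(vi - theta, 0.0) for vi in v]


def unmix(y, signatures, n_iter=500):
    """Estimate abundances a minimizing ||y - sum_j a_j * signatures[j]||^2
    subject to nonnegativity and sum-to-one, by projected gradient descent.

    y          -- observed mixture, a list of length n_samples
    signatures -- hypothetical known pure signatures, one list per component
    """
    k = len(signatures)
    n = len(y)
    a = [1.0 / k] * k  # start from the uniform mixture
    # The trace of M^T M upper-bounds its largest eigenvalue,
    # so this step size is safe (if conservative).
    step = 1.0 / sum(v * v for s in signatures for v in s)
    for _ in range(n_iter):
        recon = [sum(a[j] * signatures[j][b] for j in range(k)) for b in range(n)]
        resid = [r - yb for r, yb in zip(recon, y)]
        grad = [sum(signatures[j][b] * resid[b] for b in range(n)) for j in range(k)]
        a = project_to_simplex([aj - step * gj for aj, gj in zip(a, grad)])
    return a


if __name__ == "__main__":
    # Toy example: a 30% / 70% mixture of two made-up signatures.
    signatures = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.5]]
    y = [0.3, 0.7, 0.41]
    print(unmix(y, signatures))
```

The projection step is what enforces both constraints at every iteration; a full NMF-based approach would also update the signatures rather than keeping them fixed.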

    Regularization approaches to hyperspectral unmixing

    We consider a few different approaches to hyperspectral unmixing of remotely sensed imagery which exploit and extend recent advances in sparse statistical regularization, handling of constraints and dictionary reduction. Hyperspectral unmixing methods often use a conventional least-squares based lasso which assumes that the data follow a Gaussian distribution; we use this as a starting point. In addition, we consider a robust approach to sparse spectral unmixing of remotely sensed imagery which reduces the sensitivity of the estimator to outliers. Due to water absorption and atmospheric effects that affect data collection, hyperspectral images are prone to have large outliers. The framework comprises several well-principled penalties. A non-convex, hyper-Laplacian prior is incorporated to induce sparsity in the number of active pure spectral components, and a total variation regularizer is included to exploit the spatial-contextual information of hyperspectral images. Enforcing the sum-to-one and non-negativity constraints on the model parameters is essential for obtaining realistic estimates. We consider two approaches to account for this: an iterative heuristic renormalization and projection onto the positive orthant, and a reparametrization of the coefficients which gives rise to a theoretically founded method. Since the large size of modern spectral libraries can not only present computational challenges but also introduce collinearities between regressors, we introduce a library reduction step. This uses the multiple signal classification (MUSIC) array processing algorithm, which both speeds up unmixing and yields superior results in scenarios where the library size is extensive. We show that although these problems are non-convex, they can be solved by a properly defined algorithm based on either trust region optimization or iteratively reweighted least squares. 
The performance of the different approaches is validated in several simulated and real hyperspectral data experiments

    Exploring Structural Consistency in Graph Regularized Joint Spectral-Spatial Sparse Coding for Hyperspectral Image Classification

    In hyperspectral image classification, both spectral and spatial data distributions are important in describing and identifying different materials and objects in the image. Furthermore, consistent spatial structures across bands can be useful in capturing inherent structural information of objects. These imply that three properties should be considered when reconstructing an image using sparse coding methods. First, the distribution of different ground objects leads to different coding coefficients across the spatial locations. Second, local spatial structures change slightly across bands due to different reflectance properties of various object materials. Finally and more importantly, some sort of structural consistency should be enforced across bands to reflect the fact that the same object appears at the same spatial location in all bands of an image. Based on these considerations, we propose a novel joint spectral-spatial sparse coding model that explores structural consistency for hyperspectral image classification. For each band image, we adopt a sparse coding step to reconstruct the structures in the band image. This allows different dictionaries to be generated to characterize the band-wise image variation. At the same time, we enforce the same coding coefficients at the same spatial location in different bands so as to maintain consistent structures across bands. To further promote the discriminating power of the model, we incorporate a graph Laplacian sparsity constraint into the model to ensure spectral consistency in the dictionary generation step. Experimental results show that the proposed method outperforms some state-of-the-art spectral-spatial sparse coding methods

    Low-Rank and Sparse Decomposition for Hyperspectral Image Enhancement and Clustering

    In this dissertation, some new algorithms are developed to enhance hyperspectral imaging (HSI) analysis. A tensor data format is applied to sparse and low-rank decomposition of hyperspectral datasets, which could enhance classification and detection performance. A multi-view learning technique is applied to hyperspectral image clustering. Furthermore, a kernel version of the multi-view learning technique is proposed, which could improve clustering performance. Most low-rank and sparse decomposition algorithms for HSI analysis are based on a matrix data format. As HSI has a high spectral dimension, tensor based extended low-rank and sparse decomposition (TELRSD) is proposed in this dissertation for better performance of HSI classification with the low-rank tensor part, and HSI detection with the sparse tensor part. With this tensor based method, HSI is processed in a 3D data format, and information between spectral bands and pixels remains integrated during the decomposition process. The proposed algorithm is compared with other state-of-the-art methods, and the experimental results show that TELRSD has the best performance among all the compared algorithms. HSI clustering is an unsupervised task, which aims to group pixels into different groups without labeled information. Low-rank sparse subspace clustering (LRSSC) is one of the most popular algorithms for this clustering task. The spatial-spectral based multi-view low-rank sparse subspace clustering (SSMLC) algorithm is proposed in this dissertation, which extends LRSSC with a multi-view learning technique. In this algorithm, spectral and spatial views are created to generate a multi-view dataset of HSI, where spectral partition, morphological component analysis (MCA) and principal component analysis (PCA) are applied to create other views. Furthermore, a kernel version of SSMLC (k-SSMLC) has also been investigated. 
The performance of SSMLC and k-SSMLC is compared with sparse subspace clustering (SSC), low-rank sparse subspace clustering (LRSSC), and spectral-spatial sparse subspace clustering (S4C). It is shown that SSMLC could improve the performance of LRSSC, and k-SSMLC has the best performance. Spectral clustering has been proven to be equivalent to a non-negative matrix factorization (NMF) problem. In this case, NMF could be applied to the clustering problem. In order to include local and nonlinear features of the data source, orthogonal NMF (ONMF), graph-regularized NMF (GNMF) and kernel NMF (k-NMF) have been proposed for better clustering performance. The non-linear orthogonal graph NMF (k-OGNMF) combines kernel, orthogonality and graph constraints in NMF, which further improves the clustering performance. In the HSI domain, kernel multi-view based orthogonal graph NMF (k-MOGNMF) is applied for subspace clustering, where k-OGNMF is extended with a multi-view algorithm, and it has better performance and computational efficiency

    Factor analysis of dynamic PET images

    Thanks to its ability to evaluate metabolic functions in tissues from the temporal evolution of a previously injected radiotracer, dynamic positron emission tomography (PET) has become a ubiquitous analysis tool to quantify biological processes. Several quantification techniques from the PET imaging literature require a previous estimation of global time-activity curves (TACs) (herein called factors) representing the concentration of tracer in a reference tissue or blood over time. To this end, factor analysis has often appeared as an unsupervised learning solution for the extraction of factors and their respective fractions in each voxel. Inspired by the hyperspectral unmixing literature, this manuscript addresses two main drawbacks of general factor analysis techniques applied to dynamic PET. The first one is the assumption that the elementary response of each tissue to tracer distribution is spatially homogeneous. Even though this homogeneity assumption has proven its effectiveness in several factor analysis studies, it may not always provide a sufficient description of the underlying data, in particular when abnormalities are present. To tackle this limitation, the models herein proposed introduce an additional degree of freedom to the factors related to specific binding. To this end, a spatially-variant perturbation affects a nominal and common TAC representative of the high-uptake tissue. This variation is spatially indexed and constrained with a dictionary that is either previously learned or explicitly modelled with convolutional nonlinearities affecting non-specific binding tissues. The second drawback is related to the noise distribution in PET images. Even though the positron decay process can be described by a Poisson distribution, the actual noise in reconstructed PET images is not expected to be simply described by Poisson or Gaussian distributions. 
Therefore, we propose to consider a popular and quite general loss function, called the β-divergence, that is able to generalize conventional loss functions such as the least-squares distance, Kullback-Leibler and Itakura-Saito divergences, respectively corresponding to Gaussian, Poisson and Gamma distributions. This loss function is applied to three factor analysis models in order to evaluate its impact on dynamic PET images with different reconstruction characteristics
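
To make the β-divergence mentioned above concrete, the following minimal sketch evaluates its standard element-wise definition and shows how the special cases arise: β = 2 gives half the squared Euclidean distance, β = 1 the (generalized) Kullback-Leibler divergence, and β = 0 the Itakura-Saito divergence. The function name `beta_divergence` and the requirement that all entries be strictly positive are assumptions of this sketch, which is not taken from the manuscript.

```python
import math


def beta_divergence(x, y, beta):
    """Sum of element-wise beta-divergences d_beta(x_i | y_i).

    Assumes every entry of x and y is strictly positive.
    """
    total = 0.0
    for xi, yi in zip(x, y):
        if beta == 1:
            # Generalized Kullback-Leibler divergence (limit beta -> 1).
            total += xi * math.log(xi / yi) - xi + yi
        elif beta == 0:
            # Itakura-Saito divergence (limit beta -> 0).
            total += xi / yi - math.log(xi / yi) - 1.0
        else:
            # General case; beta = 2 reduces to 0.5 * (xi - yi) ** 2.
            total += (xi ** beta + (beta - 1.0) * yi ** beta
                      - beta * xi * yi ** (beta - 1.0)) / (beta * (beta - 1.0))
    return total


if __name__ == "__main__":
    x, y = [1.0, 2.0], [2.0, 4.0]
    for beta in (0, 1, 2):
        print(beta, beta_divergence(x, y, beta))
```

Choosing β therefore amounts to choosing which noise model the data-fitting term is matched to, which is why a single loss can be evaluated across images with different reconstruction characteristics.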

    Deep Image Prior for Disentangling Mixed Pixels

    A mixed pixel in remotely sensed images measures the reflectance and emission from multiple target types (e.g., tree, grass, and building) from a certain area. Mixed pixels exist commonly in spaceborne hyper-/multi-spectral images due to sensor limitations, causing the signature ambiguity problem and impeding high-resolution remote sensing mapping. Disentangling mixed pixels into the underlying constituent components is a challenging ill-posed inverse problem, which requires efficient modeling of spatial prior information and other application-dependent prior knowledge concerning the mixed pixel generation process. The recent deep image prior (DIP) approach and other application-dependent prior information are integrated into a Bayesian framework in the research, which allows comprehensive usage of different prior knowledge. The research improves mixed pixel disentangling using the Bayesian DIP in three key applications: spectral unmixing (SU), subpixel mapping (SPM), and soil moisture product downscaling (SMD). The main contributions are summarized as follows. First, to improve the decomposition of mixed pixels into pure material spectra (i.e., endmembers) and their constituting fractions (i.e., abundances) in SU, a designed deep fully convolutional neural network (DCNN) and a new spectral mixture model (SMM) with heterogeneous noise are integrated into a Bayesian framework that is efficiently solved by a new iterative optimization algorithm. Second, to improve the decomposition of mixed pixels into class labels of subpixels in SPM, a dedicated DCNN architecture and a new discrete SMM are integrated into the Bayesian framework to allow the use of both spatial prior and the forward model. Third, to improve the decomposition of mixed pixels into soil moisture concentrations of subpixels in SMD, a new DIP architecture and a forward degradation model are integrated into the Bayesian framework that is solved by the stochastic gradient descent approach. 
These new Bayesian approaches improve the state-of-the-art in their respective applications (i.e., SU, SPM, and SMD), and can potentially be utilized for solving other ill-posed inverse problems where simultaneous modeling of the spatial prior and other prior knowledge is needed

    Multisource and Multitemporal Data Fusion in Remote Sensing

    The sharp and recent increase in the availability of data captured by different sensors combined with their considerably heterogeneous natures poses a serious challenge for the effective and efficient processing of remotely sensed data. Such an increase in remote sensing and ancillary datasets, however, opens up the possibility of utilizing multimodal datasets in a joint manner to further improve the performance of the processing approaches with respect to the application at hand. Multisource data fusion has, therefore, received enormous attention from researchers worldwide for a wide variety of applications. Moreover, thanks to the revisit capability of several spaceborne sensors, the integration of the temporal information with the spatial and/or spectral/backscattering information of the remotely sensed data is possible and helps to move from a representation of 2D/3D data to 4D data structures, where the time variable adds new information as well as challenges for the information extraction algorithms. There are a huge number of research works dedicated to multisource and multitemporal data fusion, but the methods for the fusion of different modalities have expanded in different paths according to each research community. This paper brings together the advances of multisource and multitemporal data fusion approaches with respect to different research communities and provides a thorough and discipline-specific starting point for researchers at different levels (i.e., students, researchers, and senior researchers) willing to conduct novel investigations on this challenging topic by supplying sufficient detail and references

    Multi-frame reconstruction using super-resolution, inpainting, segmentation and codecs

    In this thesis, different aspects of video and light field reconstruction are considered such as super-resolution, inpainting, segmentation and codecs. For this purpose, each of these strategies is analyzed based on a specific goal and a specific database. Accordingly, databases which are relevant to the film industry, sport videos, light fields and hyperspectral videos are used for the sake of improvement. This thesis is constructed around six related manuscripts, in which several approaches are proposed for multi-frame reconstruction. Initially, a novel multi-frame reconstruction strategy is proposed for light field super-resolution in which graph-based regularization is applied along with edge preserving filtering for improving the spatio-angular quality of the light field. Second, a novel video reconstruction is proposed which is built based on compressive sensing (CS), Gaussian mixture models (GMM) and sparse 3D transform-domain block matching. The motivation of the proposed technique is to improve the visual quality of the video frames and to decrease the reconstruction error in comparison with former video reconstruction methods. In the next approach, Student-t mixture models and edge preserving filtering are applied for the purpose of video super-resolution. The Student-t mixture model has a heavy tail which makes it robust and suitable as a video frame patch prior and rich in terms of log likelihood for information retrieval. In another approach, a hyperspectral video database is considered, and a Bayesian dictionary learning process is used for hyperspectral video super-resolution. To that end, a Beta process is used in Bayesian dictionary learning and a sparse coding is generated regarding the hyperspectral video super-resolution. 
The spatial super-resolution is followed by a spectral video restoration strategy, and the whole process leverages two different learned dictionaries, in which the first one is trained for spatial super-resolution and the second one is trained for the spectral restoration. Furthermore, in another approach, a novel framework is proposed for replacing advertisement contents in soccer videos in an automatic way by using deep learning strategies. For this purpose, a UNET architecture is applied (an image segmentation convolutional neural network technique) for content segmentation and detection. Subsequently, after reconstructing the segmented content in the video frames (considering the apparent loss in detection), the unwanted content is replaced by new content using a homography mapping procedure. In addition, in another research work, a novel video compression framework is presented using autoencoder networks that encode and decode videos by using less chroma information than luma information. For this purpose, instead of converting Y'CbCr 4:2:2/4:2:0 videos to and from RGB 4:4:4, the video is kept in Y'CbCr 4:2:2/4:2:0 and the luma and chroma channels are merged after the luma is downsampled to match the chroma size. An inverse function is performed for the decoder. The performance of these models is evaluated by using CPSNR, MS-SSIM, and VMAF metrics. The experiments reveal that, as compared to video compression involving conversion to and from RGB 4:4:4, the proposed method increases the video quality by about 5.5% for Y'CbCr 4:2:2 and 8.3% for Y'CbCr 4:2:0 while reducing the amount of computation by nearly 37% for Y'CbCr 4:2:2 and 40% for Y'CbCr 4:2:0. 
The thread that ties these approaches together is the reconstruction of video and light field frames in the presence of different problems such as loss of information, blur in the frames, residual noise after reconstruction, unwanted content, excessive data size and high computational overhead. In three of the proposed approaches, we have used a Plug-and-Play ADMM model for the first time for the reconstruction of videos and light fields in order to address information retrieval in the frames and noise/blur at the same time. In two of the proposed models, we applied sparse dictionary learning to reduce the data dimension and represent the data as an efficient linear combination of basis frame patches. Two of the proposed approaches are developed in collaboration with industry, in which deep learning frameworks are used to handle large sets of features and to learn high-level features from the data