thesis

Kernel Feature Extraction Methods for Remote Sensing Data Analysis

Abstract

Technological advances in the last decades have improved our capabilities of collecting and storing high data volumes. However, this makes that in some fields, such as remote sensing several problems are generated in the data processing due to the peculiar characteristics of their data. High data volume, high dimensionality, heterogeneity and their nonlinearity, make that the analysis and extraction of relevant information from these images could be a bottleneck for many real applications. The research applying image processing and machine learning techniques along with feature extraction, allows the reduction of the data dimensionality while keeps the maximum information. Therefore, developments and applications of feature extraction methodologies using these techniques have increased exponentially in remote sensing. This improves the data visualization and the knowledge discovery. Several feature extraction methods have been addressed in the literature depending on the data availability, which can be classified in supervised, semisupervised and unsupervised. In particular, feature extraction can use in combination with kernel methods (nonlinear). The process for obtaining a space that keeps greater information content is facilitated by this combination. One of the most important properties of the combination is that can be directly used for general tasks including classification, regression, clustering, ranking, compression, or data visualization. In this Thesis, we address the problems of different nonlinear feature extraction approaches based on kernel methods for remote sensing data analysis. Several improvements to the current feature extraction methods are proposed to transform the data in order to make high dimensional data tasks easier, such as classification or biophysical parameter estimation. This Thesis focus on three main objectives to reach these improvements in the current feature extraction methods: The first objective is to include invariances into supervised kernel feature extraction methods. Throughout these invariances it is possible to generate virtual samples that help to mitigate the problem of the reduced number of samples in supervised methods. The proposed algorithm is a simple method that essentially generates new (synthetic) training samples from available labeled samples. These samples along with original samples should be used in feature extraction methods obtaining more independent features between them that without virtual samples. The introduction of prior knowledge by means of the virtual samples could obtain classification and biophysical parameter estimation methods more robust than without them. The second objective is to use the generative kernels, i.e. probabilistic kernels, that directly learn by means of clustering techniques from original data by finding local-to-global similarities along the manifold. The proposed kernel is useful for general feature extraction purposes. Furthermore, the kernel attempts to improve the current methods because the kernel not only contains labeled data information but also uses the unlabeled information of the manifold. Moreover, the proposed kernel is parameter free in contrast with the parameterized functions such as, the radial basis function (RBF). Using probabilistic kernels is sought to obtain new unsupervised and semisupervised methods in order to reduce the number and cost of labeled data in remote sensing. Third objective is to develop new kernel feature extraction methods for improving the features obtained by the current methods. Optimizing the functional could obtain improvements in new algorithm. For instance, the Optimized Kernel Entropy Component Analysis (OKECA) method. The method is based on the Independent Component Analysis (ICA) framework resulting more efficient than the standard Kernel Entropy Component Analysis (KECA) method in terms of dimensionality reduction. In this Thesis, the methods are focused on remote sensing data analysis. Nevertheless, feature extraction methods are used to analyze data of several research fields whereas data are multidimensional. For these reasons, the results are illustrated into experimental sequence. First, the projections are analyzed by means of Toy examples. The algorithms are tested through standard databases with supervised information to proceed to the last step, the analysis of remote sensing images by the proposed methods

    Similar works