    Clothing and carrying condition invariant gait recognition based on rotation forest

    This paper proposes a gait recognition method that is invariant to a wide range of the challenging covariate factors in gait recognition, chiefly unpredictable variation in clothing and carrying conditions. The method introduces an averaged gait key-phase image (AGKI), which is computed by averaging each of the five key-phases over the gait periods of a gait sequence. It analyses the AGKIs using high-pass and low-pass Gaussian filters, each at three cut-off frequencies, to achieve robustness against unpredictable variation in clothing and carrying conditions as well as other covariate factors, e.g., walking speed, segmentation noise, shadows under the feet, and changes in hair style and ground surface. The optimal cut-off frequencies of the Gaussian filters are determined from an analysis of the focus values of the filtered silhouettes of human subjects. The method applies rotation forest ensemble learning to enhance both individual accuracy and diversity within the ensemble, thereby improving the identification rate. Extensive experiments on public datasets demonstrate the efficacy of the proposed method.
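
The averaging and filtering steps described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the array layout `(n_periods, 5, H, W)`, the function names, the cut-off values in `CUTOFFS`, and the random stand-in silhouettes are all hypothetical, and the Gaussian transfer function is the textbook frequency-domain form with the high-pass taken as its complement.

```python
import numpy as np

def averaged_key_phase_images(silhouettes):
    """Average each key-phase over all gait periods.

    silhouettes: array of shape (n_periods, 5, H, W), an assumed layout.
    Returns five averaged images, shape (5, H, W).
    """
    return np.asarray(silhouettes, dtype=float).mean(axis=0)

def gaussian_transfer(shape, cutoff):
    """Textbook Gaussian low-pass transfer function exp(-D^2 / (2 * D0^2))."""
    h, w = shape
    u = np.fft.fftfreq(h) * h  # integer frequency indices, zero frequency first
    v = np.fft.fftfreq(w) * w
    d2 = u[:, None] ** 2 + v[None, :] ** 2
    return np.exp(-d2 / (2.0 * cutoff ** 2))

def gaussian_filter_pair(image, cutoff):
    """Return (low-pass, high-pass) versions of one image."""
    lowpass = gaussian_transfer(image.shape, cutoff)
    spectrum = np.fft.fft2(image)
    low = np.fft.ifft2(spectrum * lowpass).real
    high = np.fft.ifft2(spectrum * (1.0 - lowpass)).real  # complement of low-pass
    return low, high

# Random binary arrays stand in for segmented silhouettes (hypothetical data).
rng = np.random.default_rng(0)
silhouettes = rng.integers(0, 2, size=(3, 5, 16, 12))
agki = averaged_key_phase_images(silhouettes)

CUTOFFS = (2.0, 4.0, 8.0)  # hypothetical; the paper selects them from focus values
bank = [gaussian_filter_pair(img, c) for img in agki for c in CUTOFFS]
low, high = gaussian_filter_pair(agki[0], cutoff=4.0)
```

Because the high-pass response is defined here as one minus the low-pass response, the two filtered images sum back to the input, which gives a convenient sanity check on the filter bank.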

    A new hybridized dimensionality reduction approach using genetic algorithm and folded linear discriminant analysis applied to hyperspectral imaging for effective rice seed classification

    Hyperspectral imaging (HSI) has been reported to produce promising results in the classification of rice seeds. However, HSI data often require dimensionality reduction techniques to remove redundant data. Folded linear discriminant analysis (F-LDA) is an extension of linear discriminant analysis (LDA), a commonly used technique for dimensionality reduction. It was recently proposed to address the limitations of LDA, particularly its poor performance when only a small number of training samples is available, which is a common scenario in HSI applications. This article presents an improved version of F-LDA, exploring the feasibility of hybridizing a genetic algorithm (GA) and F-LDA for effective dimensionality reduction in HSI-based rice seed classification. The proposed approach, inspired by the earlier combination of GA with principal component analysis, is evaluated on rice seed datasets containing 256 spectral bands. Experimental results show that, in addition to attaining promising classification accuracies of up to 96.21%, this novel combination of GA and F-LDA (GA + F-LDA) can further reduce the computational complexity and memory requirement of the standalone F-LDA. It is worth noting that these benefits come with a slight reduction in classification accuracy compared with the results reported for the standard F-LDA (up to 96.99%).
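
The abstract does not say how the GA and F-LDA are coupled. One plausible reading is that the GA searches over subsets of spectral bands before the discriminant analysis is applied. The sketch below illustrates only that generic GA band-selection idea and does not implement F-LDA. The per-band `SCORES`, the `PENALTY` term, and the toy `fitness` function are hypothetical stand-ins; in practice the fitness would more likely be a validation accuracy obtained on the reduced data.

```python
import random

# Hypothetical per-band relevance scores (8 bands instead of 256 for brevity).
SCORES = [0.9, 0.02, 0.4, 0.01, 0.7, 0.03, 0.05, 0.6]
PENALTY = 0.1  # hypothetical cost per selected band, favours small subsets

def fitness(mask):
    """Toy fitness: total relevance of the kept bands minus a size penalty."""
    kept = sum(s for s, keep in zip(SCORES, mask) if keep)
    return kept - PENALTY * sum(mask)

def tournament(pop, fitness_fn, rng, size=3):
    """Pick the fittest of a few randomly drawn individuals."""
    return max(rng.sample(pop, size), key=fitness_fn)

def evolve(n_bands, fitness_fn, pop_size=20, generations=40, p_mut=0.05, seed=0):
    """Binary GA with elitism, one-point crossover and bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_bands)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness_fn, reverse=True)
        history.append(fitness_fn(pop[0]))
        nxt = pop[:2]  # elitism: the two best masks survive unchanged
        while len(nxt) < pop_size:
            a = tournament(pop, fitness_fn, rng)
            b = tournament(pop, fitness_fn, rng)
            cut = rng.randrange(1, n_bands)
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness_fn)
    history.append(fitness_fn(best))
    return best, history

best, history = evolve(len(SCORES), fitness)
selected_bands = [i for i, keep in enumerate(best) if keep]
```

Elitism guarantees that the best fitness never decreases from one generation to the next, which is why `history` can be used as a simple convergence trace.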

    Mixture of Latent Variable Models for Remotely Sensed Image Processing

    The processing of remotely sensed data is innately an inverse problem in which properties of spatial processes are inferred from the observations based on a generative model. Meaningful data inversion relies on well-defined generative models that capture key factors in the relationship between the underlying physical process and the measurements. Unfortunately, the two mainstream data processing techniques, mixture models and latent variable models (LVMs), are each inadequate for describing the complex relationship between the spatial process and the remote sensing data. Specifically, mixture models, such as K-Means, Gaussian Mixture Model (GMM), Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA), characterize a class by statistics in the original space, ignoring the fact that a class can be better represented by discriminative signals in the hidden/latent feature space. LVMs, such as Principal Component Analysis (PCA), Independent Component Analysis (ICA) and Sparse Representation (SR), seek representational signals in the whole image scene, which involves multiple spatial processes, neglecting the fact that signal discovery for individual processes is more efficient. Although the combined use of mixture models and LVMs is required for remote sensing data analysis, there is still a lack of systematic exploration of this important topic in the remote sensing literature.
Driven by the above considerations, this thesis introduces a mixture of LVM (MLVM) framework for combining mixture models and LVMs, under which three models are developed to address different aspects of remote sensing data processing: (1) a mixture of probabilistic SR (MPSR) is proposed for supervised classification of hyperspectral remote sensing imagery, considering that SR is an emerging and powerful technique for feature extraction and data representation; (2) a mixture model of K “Purified” means (K-P-Means) is proposed for spectral endmember estimation, which is a fundamental issue in remote sensing data analysis; and (3) a clustering-based PCA model is introduced for SAR image denoising. Under a unified optimization scheme, all models are solved via the Expectation-Maximization (EM) algorithm, by iteratively estimating the two groups of parameters, i.e., the labels of pixels and the latent variables. Experiments on simulated data and real remote sensing data demonstrate the advantages of the proposed models in their respective applications.
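
The alternation between pixel labels and latent variables described above can be illustrated with a small hard-assignment sketch. It is not one of the thesis's three models. It is a generic "cluster, then fit a PCA subspace per cluster" loop, assuming hard labels instead of the posterior probabilities a full EM algorithm would use. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_pca(X, q):
    """Latent-variable step for one cluster: mean and top-q principal axes."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:q]

def residuals(X, model):
    """Squared reconstruction error of every sample under one cluster model."""
    mu, basis = model
    Z = X - mu
    return ((Z - Z @ basis.T @ basis) ** 2).sum(axis=1)

def mixture_of_pca(X, k, q, n_iter=25, seed=0):
    """Alternate between refitting per-cluster PCA and relabelling samples."""
    rng = np.random.default_rng(seed)
    labels = rng.permutation(np.arange(len(X)) % k)  # every cluster starts non-empty
    models = [None] * k
    history = []
    for _ in range(n_iter):
        for c in range(k):
            members = X[labels == c]
            if len(members) > 0:  # an emptied cluster keeps its previous model
                models[c] = fit_pca(members, q)
        errors = np.stack([residuals(X, m) for m in models], axis=1)
        new_labels = errors.argmin(axis=1)  # label step: best-fitting subspace
        history.append(errors.min(axis=1).sum())
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, models, history

# Synthetic stand-in for pixel vectors: two noisy lines in 3-D.
rng = np.random.default_rng(1)
t = rng.normal(size=(100, 1))
line_a = t * np.array([1.0, 0.0, 0.0]) + 0.01 * rng.normal(size=(100, 3))
line_b = t * np.array([0.0, 1.0, 1.0]) + 5.0 + 0.01 * rng.normal(size=(100, 3))
X = np.vstack([line_a, line_b])

labels, models, history = mixture_of_pca(X, k=2, q=1)
```

Each of the two steps can only lower (or keep) the total reconstruction error, so `history` is non-increasing. That is the same monotonicity argument that underlies EM-style alternating schemes.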

    The use of machine learning algorithms to assess the impacts of droughts on commercial forests in KwaZulu-Natal, South Africa.

Masters Degree. University of KwaZulu-Natal, Pietermaritzburg. Droughts are a non-selective natural disaster in that they can occur in both high and low precipitation areas. However, this study acknowledged that droughts are more recurrent, and a regular feature, in arid and semi-arid climates such as that of Southern Africa. Some countries in this region, especially South Africa and Mozambique, rely strongly on commercial forests for their gross domestic product (GDP), which means droughts pose a significant threat to their economies and to the societies that depend on them. The risks associated with droughts have consequently created an increased demand for an efficient method of analysing and investigating droughts and the impacts they impose on forest vegetation. Therefore, this study aimed to examine the effects of droughts on all commercial forests within the province of KwaZulu-Natal (KZN) at a catchment and provincial scale by employing Kernel Support Vector Machine (Kernel-SVM), Rotation Forests (RTF) and Extreme Gradient Boosting (XGBoost) algorithms. These were based on Landsat and MODIS derived vegetation and conditional drought indices. The main aim of this study was achieved through the following objectives: (i) to improve methods for classifying droughts; (ii) to achieve medium spatial resolution drought analysis using Landsat sensors; (iii) to determine the accuracy of machine learning algorithms (MLAs) when employed on remote sensing data; and (iv) to improve the usability of conditional drought indices and vegetation indices. The results demonstrated that the objectives of this study were met. The MLAs performed better when using conditional drought indices than when using vegetation indices, highlighting drawbacks already associated with vegetation indices.
At the catchment scale, Kernel-SVM produced an overall accuracy (OA) of 94.44% when based on conditional drought indices, compared to 81.48% when based on vegetation indices. On the same scale, RTF produced 96.30% and 81.84% when using conditional drought indices and vegetation indices, respectively. At a provincial scale, RTF produced an OA of 76.6% and 70.7% when using conditional drought indices and vegetation indices, respectively. This was compared to XGBoost, which produced an OA of 81.9% and 69.3% when using conditional drought indices and vegetation indices, respectively. These results also indicate that it is possible to analyse droughts at provincial and catchment scale. Although the results presented in this study were promising, more research is still required to improve the applicability of MLAs in drought analysis.

    A machine learning-remote sensing framework for modelling water stress in Shiraz vineyards

    Thesis (MA)--Stellenbosch University, 2018. ENGLISH ABSTRACT: Water is a limited natural resource and a major environmental constraint for crop production in viticulture. The unpredictability of rainfall patterns, combined with the potentially catastrophic effects of climate change, further compounds water scarcity, presenting dire future scenarios of undersupplied irrigation systems. Major water shortages could lead to devastating losses in grape production, which would negatively affect job security and national income. It is, therefore, imperative to develop management schemes and farming practices that optimise water usage and safeguard grape production. Hyperspectral remote sensing techniques provide a solution for the monitoring of vineyard water status. Hyperspectral data, combined with the quantitative analysis of machine learning ensembles, enable the detection of water-stressed vines, thereby facilitating precision irrigation practices and ensuring quality crop yields. To this end, the thesis set out to develop a machine learning–remote sensing framework for modelling water stress in a Shiraz vineyard. The thesis comprises two components. Component one assesses the utility of terrestrial hyperspectral imagery and machine learning ensembles to detect water-stressed Shiraz vines. The Random Forest (RF) and Extreme Gradient Boosting (XGBoost) ensembles were employed to discriminate between water-stressed and non-stressed Shiraz vines. Results showed that both ensemble learners could effectively discriminate between water-stressed and non-stressed vines. When using all wavebands (p = 176), RF yielded a test accuracy of 83.3% (KHAT = 0.67), with XGBoost producing a test accuracy of 80.0% (KHAT = 0.6). Component two explores semi-automated feature selection approaches and hyperparameter value optimisation to improve the developed framework.
The utility of the Kruskal-Wallis (KW) filter, the Sequential Floating Forward Selection (SFFS) wrapper, and a Filter-Wrapper (FW) approach was evaluated. When using optimised hyperparameter values, an increase in test accuracy ranging from 0.8% to 5.0% was observed for both RF and XGBoost. In general, RF was found to outperform XGBoost. In terms of predictive competency and computational efficiency, the developed FW approach was the most successful feature selection method implemented. The developed machine learning–remote sensing framework warrants further investigation to confirm its efficacy. However, the thesis answered key research questions, with the developed framework providing a point of departure for future studies. AFRIKAANS SUMMARY: Water is a limited natural resource and a major environmental constraint for crop production in viticulture. The unpredictability of rainfall patterns, combined with the potentially catastrophic consequences of climate change, foreshadows a future of water shortages for irrigation systems. Major water shortages can lead to large losses in grape production, which will have a negative effect on job security and national income. It is therefore essential to develop management schemes and farming practices that optimise water use and protect grape production. Hyperspectral remote sensing techniques offer a solution for monitoring vineyard water status. Hyperspectral data, combined with the quantitative analysis of machine learning classifiers, facilitate the detection of water-stressed vines, thereby ensuring precise irrigation practices and quality crop yields. To this end, the thesis sought to develop a machine learning–remote sensing framework for modelling water stress in a Shiraz vineyard. The thesis consists of two components.
Component one used terrestrial hyperspectral images and machine learning classifiers to detect water-stressed Shiraz vines. The Random Forest (RF) and Extreme Gradient Boosting (XGBoost) algorithms were used to distinguish between water-stressed and non-stressed Shiraz vines. Results showed that both RF and XGBoost can effectively discriminate between water-stressed and non-stressed vines. Using all wavebands (p = 176), RF achieved a test accuracy of 83.3% (KHAT = 0.67) and XGBoost a test accuracy of 80.0% (KHAT = 0.6). Component two investigated semi-automated feature selection approaches and hyperparameter value optimisation to improve the developed framework. The utility of the Kruskal-Wallis (KW) filter, the Sequential Floating Forward Selection (SFFS) wrapper and a Filter-Wrapper (FW) approach was evaluated. Using optimised hyperparameter values led to an increase in test accuracy (from 0.8% to 5.0%) for both RF and XGBoost. Overall, RF outperformed XGBoost. In terms of predictive competency and computational efficiency, the developed FW approach was the most successful feature selection method. The developed machine learning–remote sensing framework requires further research to confirm its efficacy. However, the thesis answered key research questions, with the developed framework providing a point of departure for future studies.
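
The accuracy and KHAT (Cohen's kappa) pairs reported in this abstract can be related with a short worked example. The confusion matrices below are hypothetical: the abstract does not give the test-set size or class balance, so a balanced 30-sample test set is assumed purely because it reproduces the reported figures.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def kappa(cm):
    """Cohen's kappa: agreement beyond what the row/column totals predict."""
    total = sum(sum(row) for row in cm)
    observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(col) for col in zip(*cm)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1.0 - expected)

# Hypothetical matrices (rows: true class, columns: predicted class),
# 15 stressed and 15 non-stressed vines each.
rf_cm = [[13, 2], [3, 12]]   # 25 of 30 correct
xgb_cm = [[12, 3], [3, 12]]  # 24 of 30 correct

rf_oa, rf_kappa = overall_accuracy(rf_cm), kappa(rf_cm)
xgb_oa, xgb_kappa = overall_accuracy(xgb_cm), kappa(xgb_cm)
```

With two balanced classes the chance agreement is 0.5, so kappa reduces to `2 * accuracy - 1`. This is consistent with 83.3% accuracy pairing with 0.67 and 80.0% with 0.6.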

    Automated Remote Sensing Image Interpretation with Limited Labeled Training Data

Automated remote sensing image interpretation has been investigated for more than a decade. In the early years, most work was based on the assumption that sufficient labeled samples are available for training. However, ground-truth collection is a tedious, time-consuming and sometimes very expensive task, especially in remote sensing, which usually relies on field surveys to collect ground truth. In recent years, with the development of advanced machine learning techniques, remote sensing image interpretation with limited ground truth has caught the attention of researchers in both remote sensing and computer science. Three approaches that focus on different aspects of the interpretation process, i.e., feature extraction, classification, and segmentation, are proposed to deal with the limited ground truth problem. First, feature extraction techniques, which usually serve as a pre-processing step for remote sensing image classification, are explored. Instead of focusing only on feature extraction, a joint feature extraction and classification framework is proposed based on ensemble local manifold learning. Second, classifiers for the case of limited labeled training data are investigated, and an enhanced ensemble learning method that outperforms state-of-the-art classification methods is proposed. Third, image segmentation techniques are investigated with the aid of unlabeled samples and spatial information. A semi-supervised self-training method is proposed, which is capable of expanding the number of training samples on its own and hence improving classification performance iteratively. Experiments show that the proposed approaches outperform state-of-the-art techniques in terms of classification accuracy on benchmark remote sensing datasets.
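
The self-training idea mentioned above, in which a classifier enlarges its own training set with confidently predicted unlabeled samples, can be sketched in a few lines. This is a generic illustration, not the thesis's method: it assumes a nearest-centroid classifier and a distance-margin confidence rule (`min_margin`), both hypothetical, and it ignores the spatial information the thesis also exploits.

```python
import math

def centroids(points, labels):
    """Mean position of the 2-D points carrying each label."""
    sums = {}
    for (x, y), lab in zip(points, labels):
        sx, sy, n = sums.get(lab, (0.0, 0.0, 0))
        sums[lab] = (sx + x, sy + y, n + 1)
    return {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}

def predict(cents, point):
    """Return (label, margin): nearest centroid and its lead over the runner-up."""
    ranked = sorted((math.dist(point, c), lab) for lab, c in cents.items())
    return ranked[0][1], ranked[1][0] - ranked[0][0]

def self_train(labeled, labels, unlabeled, min_margin=1.0, max_rounds=10):
    """Repeatedly add confidently classified unlabeled points to the training set."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled, labels)
        confident, rest = [], []
        for p in pool:
            lab, margin = predict(cents, p)
            (confident if margin >= min_margin else rest).append((p, lab))
        if not confident:
            break  # nothing left that the current model is sure about
        for p, lab in confident:
            labeled.append(p)
            labels.append(lab)
        pool = [p for p, _ in rest]
    return centroids(labeled, labels), labeled, labels, pool

# Two labeled seeds, four easy unlabeled points and one ambiguous point.
seeds = [(0.0, 0.0), (10.0, 10.0)]
seed_labels = ["a", "b"]
unlabeled = [(1.0, 0.0), (0.0, 1.0), (9.0, 10.0), (10.0, 9.0), (5.0, 5.0)]

model, train_points, train_labels, leftover = self_train(seeds, seed_labels, unlabeled)
```

In this toy run the four easy points are absorbed into the training set, while the point midway between the classes stays unlabeled because its margin never reaches the threshold. Keeping such a threshold is one common way to limit the error propagation that self-training is prone to.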