129 research outputs found

    Distributed Spacing Stochastic Feature Selection and its Application to Textile Classification

    Many situations require quickly and accurately locating dismounted individuals in a variety of environments. In conjunction with other dismount detection techniques, the ability to detect and classify clothing (textiles) provides a more complete dismount characterization capability. Because textile classification depends on distinguishing between different material types, hyperspectral data, which consists of several hundred spectral channels sampled from a continuous electromagnetic spectrum, is used as a data source. However, a hyperspectral image generates vast amounts of information and can be computationally intractable to analyze. A primary means of reducing the computational complexity is to use feature selection to identify a reduced set of features that effectively represents a specific class. While many feature selection methods exist, applying them to continuous data results in closely clustered feature sets that offer little redundancy and fail in the presence of noise. This dissertation presents a novel feature selection method that limits feature redundancy and improves classification. This method uses a stochastic search algorithm in conjunction with a heuristic that combines measures of distance and dependence to select features. Comparison testing between the presented feature selection method and existing methods uses hyperspectral data and image wavelet decompositions. The presented method produces feature sets with an average correlation of 0.40-0.54. This is significantly lower than the 0.70-0.99 of the existing feature selection methods. In terms of classification accuracy, the feature sets produced outperform those of other methods, to a significance of 0.025, and show greater robustness under noise representative of a hyperspectral imaging system.
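The "average correlation" figure quoted in this abstract can be illustrated with a minimal sketch. It assumes the figure is the mean absolute Pearson correlation over all pairs of selected features; the abstract does not state the exact definition, and the function names and toy numbers below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_abs_correlation(samples, selected):
    """Average |r| over all pairs of selected feature columns.

    samples  : list of rows, one row of feature values per observation
    selected : indices of the selected features (a hypothetical selection)
    """
    columns = {j: [row[j] for row in samples] for j in selected}
    pairs = list(combinations(selected, 2))
    return sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)


# Toy spectra (made-up numbers): bands 0 and 1 are adjacent and nearly
# identical, while band 3 is spectrally distant and behaves differently.
toy = [
    [1.0, 1.1, 2.0, 5.0],
    [2.0, 2.1, 1.0, 3.0],
    [3.0, 3.2, 4.0, 6.0],
    [4.0, 4.1, 3.0, 2.0],
]
clustered = mean_abs_correlation(toy, [0, 1])  # closely clustered selection
spread = mean_abs_correlation(toy, [0, 3])     # spaced-out selection
```

On this toy data the clustered pair is almost perfectly correlated, while the spaced pair is much less so, which is the kind of contrast the reported ranges describe.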

    Feature Selection on Hyperspectral Data for Dismount Skin Analysis

    Many security applications require the ability to accurately identify dismounts based on their distinctive identification properties. A dismount can be identified by many personal characteristics, including clothing, height, and gait. In particular, a dismount's skin can be used as an identifying feature because of the vast variability of skin pigmentation amongst individuals. Hyperspectral data, which comprises hundreds of spectral channels sampled from a nearly contiguous electromagnetic spectrum, is used to detect skin spectral variability amongst dismounts. However, hyperspectral data is often highly correlated and computationally expensive to process. Feature selection methods can be employed to reduce the data to a manageable size. This thesis presents the results of applying the fast correlation based filter (FCBF) [51] to a data set that contains hyperspectral data from the forearms of 62 subjects. The reduced data is used to train an artificial neural network (ANN) to discriminate a dismount of interest (DOI) amongst a group of 4 non-DOIs. The trained model is then tested to find the same DOI amongst a group of 62 new non-DOIs. The FCBF selected four features (1014, 1024, 1033, and 1348 nm) to discriminate amongst the dismounts. Using these four features, the ANN on average misclassified dismounts amongst four separate DOI validation tests. More specifically, the number of possible DOI suspects was reduced from 62 to 4 dismounts. The FCBF outperformed three other feature selection methods, with four times fewer misclassified instances.
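FCBF ranks features by symmetrical uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The sketch below computes that measure for discrete values only. It assumes the features have already been discretised, since the thesis's own preprocessing is not described here, and the toy labels are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # mutual information I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


# Hypothetical class labels and two made-up discretised features.
labels = [0, 0, 1, 1]
su_copy = symmetrical_uncertainty([0, 0, 1, 1], labels)   # mirrors the label
su_noise = symmetrical_uncertainty([0, 1, 0, 1], labels)  # independent of it
```

A feature that mirrors the label scores 1 and an independent feature scores 0. FCBF keeps features with high SU against the class and drops those that are more strongly tied to an already-selected feature than to the class.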

    A scalable saliency-based Feature selection method with instance level information

    Classic feature selection techniques remove features that are either irrelevant or redundant, yielding a subset of relevant features that supports better knowledge extraction. This allows the creation of compact models that are easier to interpret. Most of these techniques work over the whole dataset, but they cannot give the user useful information when only instance-level information is needed. In short, given any example, classic feature selection algorithms say nothing about which features are the most relevant for that particular sample. This work aims to overcome this handicap by developing a novel feature selection method, called Saliency-based Feature Selection (SFS), based on deep-learning saliency techniques. Our experimental results show that this algorithm can be used successfully not only with neural networks, but also with any architecture trained using gradient descent techniques.

    Textile Fingerprinting for Dismount Analysis in the Visible, Near, and Shortwave Infrared Domain

    The ability to accurately and quickly locate an individual, or a dismount, is useful in a variety of situations and environments. A dismount's characteristics, such as gender, height, weight, build, and ethnicity, could be used as discriminating factors. Hyperspectral imaging (HSI) is widely used in efforts to identify materials based on their spectral signatures. More specifically, HSI has been used for skin and clothing classification and detection. The ability to detect textiles (clothing) provides a discriminating factor that can aid in a more comprehensive detection of dismounts. This thesis demonstrates the application of several feature selection methods (e.g., support vector machines with recursive feature reduction, fast correlation based filter) to high-dimensional data collected from a spectroradiometer. The data is classified using the selected features and artificial neural networks. A model for uniquely identifying (fingerprinting) textiles is designed, in which color and composition are determined in order to fingerprint a specific textile. An artificial neural network is created based on the knowledge of the textile's color and composition, providing a uniquely identifying fingerprint of a textile. Results show 100% accuracy for color and composition classification, and 98% accuracy for the overall textile fingerprinting process.

    Spectral Optimization of Airborne Multispectral Camera for Land Cover Classification: Automatic Feature Selection and Spectral Band Clustering

    Hyperspectral imagery consists of hundreds of contiguous spectral bands. However, most of them are redundant. Thus a subset of well-chosen bands is generally sufficient for a specific problem, making it possible to design adapted superspectral sensors dedicated to specific land cover classification. Related to both feature selection and extraction, spectral optimization identifies the most relevant band subset for specific applications, involving a band subset relevance score as well as a method to optimize it. This study first focuses on the choice of such a relevance score. Several criteria are compared through both quantitative and qualitative analyses. To ensure a fair comparison, all tested criteria are compared on classic hyperspectral data sets using the same optimization heuristics: an incremental one to assess the impact of the number of selected bands, and a stochastic one to obtain several possible good band subsets and to derive band importance measures from intermediate good band subsets. Last, a specific approach is proposed to cope with the optimization of bandwidth. It consists of building a hierarchy of groups of adjacent bands, according to a score that decides which adjacent bands must be merged, before band selection is performed at the different levels of this hierarchy.
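The hierarchy of adjacent-band groups described at the end of this abstract can be sketched as a greedy bottom-up merge. The abstract does not name the merging score, so this sketch assumes the Pearson correlation between the mean signals of neighbouring groups. The function names and toy spectra are hypothetical, and this is not the study's actual criterion.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def band_hierarchy(samples):
    """Greedily merge the most similar pair of adjacent band groups.

    samples : list of spectra, each a list of band values
    Returns a list of levels; each level is a list of
    (first_band, last_band) groups covering all bands.
    """
    n_bands = len(samples[0])
    groups = [(b, b) for b in range(n_bands)]
    levels = [list(groups)]

    def group_signal(group):
        first, last = group
        width = last - first + 1
        # Mean response of the merged (wider) band for every spectrum.
        return [sum(s[first:last + 1]) / width for s in samples]

    while len(groups) > 1:
        # Assumed score: correlation between neighbouring group signals.
        scores = [
            pearson(group_signal(groups[i]), group_signal(groups[i + 1]))
            for i in range(len(groups) - 1)
        ]
        best = max(range(len(scores)), key=scores.__getitem__)
        merged = (groups[best][0], groups[best + 1][1])
        groups = groups[:best] + [merged] + groups[best + 2:]
        levels.append(list(groups))
    return levels


# Made-up spectra: 4 observations x 4 bands.
toy = [
    [1.0, 1.1, 2.0, 5.0],
    [2.0, 2.1, 1.0, 3.0],
    [3.0, 3.2, 4.0, 6.0],
    [4.0, 4.1, 3.0, 2.0],
]
levels = band_hierarchy(toy)
```

Each entry of `levels` is one candidate sensor layout, from single narrow bands down to one wide band. Band selection would then be run at each level.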

    Spectral Detection of Acute Mental Stress with VIS-SWIR Hyperspectral Imagery

    The ability to identify a stressed person is becoming an important aspect across different work environments. Especially in higher-stress career fields, such as first responders and air traffic controllers, mental stress can inhibit a person's ability to accomplish their job. A person's efficiency and psychological state in the work environment can be impeded by poor mental health. Stress can result in harmful effects on the body, both physically and mentally, including depression, lack of sleep, and fatigue, which can lead to reduced work productivity. Research is being conducted to detect stress in workload-intensive environments. This thesis implements an imaging approach that utilizes hyperspectral data across the visible through shortwave infrared electromagnetic spectrum. The data is applied to the feature selection algorithms ReliefF, Support Vector Machine Attribute Evaluator (SVM AE), and Non-Correlated Aided Simulated Annealing Feature Selection-Integrated Distribution Function (NASAFS-IDF) to obtain features that discriminate between the classes, stress and non-stress. This data is classified using naive Bayes, Support Vector Machine (SVM), and decision tree methodologies. The feature set and classifier that produce the highest classification results are identified using percent accuracy and area under the curve (AUC). The reported results are divided into contact and non-contact (NC) validation sets. The contact validation returned a high accuracy of 96.30% and high AUC of 0.979. Validation on NC models returned a high accuracy of 99.64% and high AUC of 0.998.

    Integration of Spatial and Spectral Information for Hyperspectral Image Classification

    Hyperspectral imaging has become a powerful tool in the biomedical and agricultural fields in recent years, and interest amongst researchers has increased immensely. Hyperspectral imaging combines conventional imaging and spectroscopy to acquire both spatial and spectral information from an object. Consequently, hyperspectral image data contains not only spectral information about objects, but also their spatial arrangement. Information captured in neighboring locations may provide useful supplementary knowledge for analysis. Therefore, this dissertation investigates the integration of information from both the spectral and spatial domains to enhance hyperspectral image classification performance. The major impediment to the combined spatial and spectral approach is that most spatial methods were developed only for a single image band. Based on the traditional single-image local Geary measure, this dissertation proposes a Multidimensional Local Spatial Autocorrelation (MLSA) measure for hyperspectral image data. Based on the proposed spatial measure, this research develops a collaborative band selection strategy that combines the spectral separability measure (divergence) and the spatial homogeneity measure (MLSA) for the hyperspectral band selection task. To calculate the divergence more efficiently, a set of recursive equations for the calculation of divergence with an additional band is derived to overcome the computational restrictions. Moreover, this dissertation proposes a collaborative classification method that integrates the spectral distance and spatial autocorrelation during the decision-making process. This method therefore fully utilizes the spatial-spectral relationships inherent in the data, and thus improves the classification performance. In addition, the usefulness of the proposed band selection and classification method is evaluated with four case studies. The case studies cover the detection and identification of tumors on poultry carcasses, fecal contamination on apple surfaces, cancer on mouse skin, and crops in agricultural fields using hyperspectral imagery. Through the case studies, the performance of the proposed methods is assessed. The results show the necessity and efficiency of integrating spatial information for hyperspectral image processing.

    Hyperspectral Image Analysis of Food for Nutritional Intake

    The primary objective of this dissertation is to investigate the application of hyperspectral technology to accommodate the growing demand in automatic dietary assessment applications. Food intake is one of the main factors that contribute to human health. In other words, it is necessary to obtain information about the amount of nutrition and vitamins that a human body requires through a daily diet. Manual dietary assessments are time-consuming and not precise enough, especially when the information is used for the care and treatment of hospitalized patients. Moreover, the data must be analyzed by nutritional experts. Therefore, researchers have developed various semiautomatic or automatic dietary assessment systems; most of them are based on conventional color images such as RGB. The main disadvantage of such systems is their inability to differentiate foods of similar color, the same ingredients in various colors, or different forms such as cooked or mixed forms. Although adding features such as shape, size, and texture improves the overall performance, these features are sensitive to changes in illumination, rotation, scale, etc. A balance between the quality and quantity of feature representation and system efficiency must also be considered. Hyperspectral technology combines conventional imaging technology with spectroscopy in a three-dimensional data-cube to obtain both the spatial and spectral information of the objects. However, the high dimensionality of hyperspectral data, in addition to the redundancy between spectral bands, limits performance, especially in online or onboard data processing applications. Thus, various feature selection/extraction methods are also used to select the optimal feature subsets. The results are promising and verify the feasibility of using hyperspectral technology in dietary assessment applications.

    On the selection of dimension reduction techniques for scientific applications


    A machine learning-remote sensing framework for modelling water stress in Shiraz vineyards

    Thesis (MA)--Stellenbosch University, 2018. ENGLISH ABSTRACT: Water is a limited natural resource and a major environmental constraint for crop production in viticulture. The unpredictability of rainfall patterns, combined with the potentially catastrophic effects of climate change, further compounds water scarcity, presenting dire future scenarios of undersupplied irrigation systems. Major water shortages could lead to devastating losses in grape production, which would negatively affect job security and national income. It is, therefore, imperative to develop management schemes and farming practices that optimise water usage and safeguard grape production. Hyperspectral remote sensing techniques provide a solution for the monitoring of vineyard water status. Hyperspectral data, combined with the quantitative analysis of machine learning ensembles, enables the detection of water-stressed vines, thereby facilitating precision irrigation practices and ensuring quality crop yields. To this end, the thesis set out to develop a machine learning–remote sensing framework for modelling water stress in a Shiraz vineyard. The thesis comprises two components. Component one assesses the utility of terrestrial hyperspectral imagery and machine learning ensembles to detect water-stressed Shiraz vines. The Random Forest (RF) and Extreme Gradient Boosting (XGBoost) ensembles were employed to discriminate between water-stressed and non-stressed Shiraz vines. Results showed that both ensemble learners could effectively discriminate between water-stressed and non-stressed vines. When using all wavebands (p = 176), RF yielded a test accuracy of 83.3% (KHAT = 0.67), with XGBoost producing a test accuracy of 80.0% (KHAT = 0.6). Component two explores semi-automated feature selection approaches and hyperparameter value optimisation to improve the developed framework. 
The utility of the Kruskal-Wallis (KW) filter, Sequential Floating Forward Selection (SFFS) wrapper, and a Filter-Wrapper (FW) approach was evaluated. When using optimised hyperparameter values, an increase in test accuracy ranging from 0.8% to 5.0% was observed for both RF and XGBoost. In general, RF was found to outperform XGBoost. In terms of predictive competency and computational efficiency, the developed FW approach was the most successful feature selection method implemented. The developed machine learning–remote sensing framework warrants further investigation to confirm its efficacy. However, the thesis answered key research questions, with the developed framework providing a point of departure for future studies.
AFRIKAANSE OPSOMMING (Afrikaans summary): Water is a limited natural resource and a major environmental constraint for crop production in viticulture. The unpredictability of rainfall patterns, combined with the potentially catastrophic consequences of climate change, points to a future of water shortages for irrigation systems. Major water shortages could lead to large losses in grape production, which would have a negative effect on job security and national income. It is therefore essential to develop management schemes and farming practices that optimise the use of water and protect grape production. Hyperspectral remote sensing techniques offer a solution for monitoring vineyard water status. Hyperspectral data, combined with the quantitative analysis of machine learning classifiers, facilitates the detection of water-stressed vines, thereby ensuring precise irrigation practices and quality crop yields. To this end, the thesis set out to develop a machine learning–remote sensing framework for modelling water stress in a Shiraz vineyard. The thesis consists of two components. Component one assessed the utility of terrestrial hyperspectral images and machine learning classifiers for detecting water-stressed Shiraz vines. The Random Forest (RF) and Extreme Gradient Boosting (XGBoost) algorithms were used to distinguish between water-stressed and non-stressed Shiraz vines. Results showed that both RF and XGBoost can effectively discriminate between water-stressed and non-stressed vines. Using all wavebands (p = 176), RF achieved a test accuracy of 83.3% (KHAT = 0.67) and XGBoost delivered a test accuracy of 80.0% (KHAT = 0.6). Component two investigated the use of semi-automated variable selection approaches and hyperparameter value optimisation to improve the developed framework. The utility of the Kruskal-Wallis (KW) filter, the Sequential Floating Forward Selection (SFFS) wrapper, and a Filter-Wrapper (FW) approach was evaluated. Using optimised hyperparameter values led to an increase in test accuracy (from 0.8% to 5.0%) for both RF and XGBoost. Overall, RF performed better than XGBoost. In terms of predictive competency and computational efficiency, the developed FW approach was the most successful variable selection method. The developed machine learning–remote sensing framework requires further research to confirm its efficacy. However, the thesis answered key research questions, with the developed framework providing a point of departure for future studies.
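The KHAT values reported alongside the accuracies are Cohen's kappa statistics. The sketch below shows how accuracy and KHAT relate. The confusion matrices are hypothetical (a balanced 30-sample test set is assumed, not taken from the thesis), chosen only because they reproduce accuracies of 83.3% and 80.0%.

```python
def cohen_kappa(confusion):
    """Cohen's kappa (KHAT) from a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n  # overall accuracy
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical confusion matrices (NOT from the thesis): 30 test vines,
# 15 water-stressed and 15 non-stressed.
rf_like = [[13, 2], [3, 12]]   # 25/30 correct = 83.3% accuracy
xgb_like = [[12, 3], [3, 12]]  # 24/30 correct = 80.0% accuracy
kappa_rf = cohen_kappa(rf_like)
kappa_xgb = cohen_kappa(xgb_like)
```

With two equally sized classes the chance agreement is 0.5, so kappa reduces to (accuracy - 0.5) / 0.5. This is consistent with the reported pairs of 83.3% with 0.67 and 80.0% with 0.6.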