467 research outputs found

    Support vector machines in hyperspectral imaging spectroscopy with application to material identification

    This paper presents a processing methodology based on Support Vector Machines (SVMs) for the classification of hyperspectral spectroscopic images. Accurate classification of the images is used to perform on-line material identification in industrial environments. Each hyperspectral image consists of the diffuse reflectance of the material under study along all the points of a line of vision. The images are acquired with two imaging spectrographs, one operating in the Vis-NIR range (400 to 1000 nm) and the other in the NIR range (1000 to 2400 nm). The aim of this work is to demonstrate the robustness of SVMs in recognising certain spectral features of the target. Furthermore, research has been carried out to find an adequate SVM configuration for this hyperspectral application, so that anomaly detection and material identification can be performed efficiently. A classifier combining a Gaussian kernel with nonlinear Principal Component Analysis (kernel PCA, k-PCA) is concluded to be the best option in this particular case. Finally, experimental tests have been carried out with materials typical of the tobacco industry (tobacco leaves mixed with unwanted spurious materials, such as leathers, plastics, etc.) to demonstrate the suitability of the proposed technique
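
The abstract above names kernel PCA with a Gaussian kernel as the feature-extraction step in front of the classifier. The following is a minimal NumPy sketch of that general technique, not the authors' implementation: the function names, the `gamma` value, the number of components, and the random array standing in for reflectance spectra are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq_dist = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def kernel_pca(X, n_components, gamma):
    """Project the training samples onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    # Centre the kernel matrix in feature space.
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one
    # Eigen-decomposition of the symmetric centred kernel matrix.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[idx], eigvecs[:, idx]
    # Training projections are the eigenvectors scaled by sqrt(eigenvalue).
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

# Hypothetical data: 40 spectra with 100 bands each (random stand-in values).
rng = np.random.default_rng(0)
spectra = rng.random((40, 100))
features = kernel_pca(spectra, n_components=5, gamma=0.1)
```

The resulting `features` array (one row per spectrum) is what would then be passed to an SVM or another classifier; the choice of `gamma` and the number of retained components would normally be tuned by cross-validation.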

    Algorithms for feature selection and pattern recognition on Grassmann manifolds

    Includes bibliographical references. 2015 Summer. This dissertation presents three distinct application-driven research projects united by ideas and topics from geometric data analysis, optimization, computational topology, and machine learning. We first consider the hyperspectral band selection problem, solved using sparse support vector machines (SSVMs). A supervised embedded approach is proposed that exploits a property of SSVMs: the model exhibits a clearly identifiable gap between zero and non-zero feature weights, which permits important bands to be definitively selected in conjunction with the classification problem. An SSVM is trained using bootstrap aggregating to obtain a sample of SSVM models and so reduce variability in the band selection process. This preliminary sample approach is followed by a secondary band selection, which involves retraining the SSVM to further reduce the set of bands retained. We propose and compare three adaptations of the SSVM band selection algorithm for the multiclass problem. We illustrate the performance of these methods on two benchmark hyperspectral data sets. Second, we propose an approach for capturing the signal variability in data using the framework of the Grassmann manifold (Grassmannian). Labeled points from each class are sampled and used to form abstract points on the Grassmannian. The resulting points have representations as orthonormal matrices and as such do not reside in Euclidean space in the usual sense. There are a variety of metrics which allow us to determine distance matrices that can be used to realize the Grassmannian as an embedding in Euclidean space. Multidimensional scaling (MDS) determines a low-dimensional Euclidean embedding of the manifold, preserving or approximating the Grassmannian geometry based on the distance measure.
We illustrate that we can achieve an isometric embedding of the Grassmann manifold using the chordal metric while this is not the case with other distances. However, non-isometric embeddings generated by using the smallest principal angle pseudometric on the Grassmannian lead to the best classification results: we observe that as the dimension of the Grassmannian grows, the accuracy of the classification grows to 100% in binary classification experiments. To build a classification model, we use SSVMs to perform simultaneous dimension selection. The resulting classifier selects a subset of dimensions of the embedding without loss in classification performance. Lastly, we present an application of persistent homology to the detection of chemical plumes in hyperspectral movies. The pixels of the raw hyperspectral data cubes are mapped to the geometric framework of the Grassmann manifold where they are analyzed, contrasting our approach with the more standard framework in Euclidean space. An advantage of this approach is that it allows the time slices in a hyperspectral movie to be collapsed to a sequence of points in such a way that some of the key structure within and between the slices is encoded by the points on the Grassmannian. This motivates the search for topological structure, associated with the evolution of the frames of a hyperspectral movie, within the corresponding points on the manifold. The proposed framework affords the processing of large data sets, such as the hyperspectral movies explored in this investigation, while retaining valuable discriminative information. For a particular choice of a distance metric on the Grassmannian, it is possible to generate topological signals that capture changes in the scene after a chemical release
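
The Grassmannian distances and the MDS embedding mentioned in the abstract above can be illustrated with a short NumPy sketch. It assumes the standard definitions (principal angles obtained from the singular values of `U.T @ V`, chordal distance as the 2-norm of the sines of those angles, and classical MDS by double centring). The function names, subspace sizes, and random data are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def orthonormal_basis(samples):
    """Orthonormal basis (n x k) for the span of the k sample columns."""
    q, _ = np.linalg.qr(samples)
    return q

def principal_angles(U, V):
    """Principal angles between the subspaces spanned by U and V."""
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

def chordal_distance(U, V):
    """Chordal distance: 2-norm of the sines of the principal angles."""
    return np.linalg.norm(np.sin(principal_angles(U, V)))

def smallest_angle(U, V):
    """Smallest principal angle (a pseudometric on the Grassmannian)."""
    return principal_angles(U, V).min()

def classical_mds(D, n_components):
    """Classical MDS embedding computed from a distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.full((n, n), 1.0 / n)
    B = -0.5 * J @ (D ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))

# Hypothetical data: six 2-dimensional subspaces of R^10.
rng = np.random.default_rng(0)
points = [orthonormal_basis(rng.standard_normal((10, 2))) for _ in range(6)]

# Pairwise chordal distance matrix (diagonal left at zero).
n = len(points)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = chordal_distance(points[i], points[j])

embedding = classical_mds(D, n_components=n - 1)
```

With the chordal distance, the pairwise Euclidean distances between rows of `embedding` reproduce `D` up to rounding, which is consistent with the isometric embedding the abstract describes. Swapping in `smallest_angle` gives a distance matrix that classical MDS can in general only approximate.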

    An evaluation of hyperspectral and multispectral data for mapping invasive species in an African Savanna.

    Master of Science in Geography and Environmental Sciences. Invasive alien plant (IAP) species affect a range of ecosystem types in various regions of the world and are therefore now considered one of the main phenomena causing global change. IAPs have considerable impacts on ecosystem processes and functions, biodiversity, agriculture and human well-being. Parthenium hysterophorus is an IAP that is widely spread across the globe. It is difficult to control and eradicate, and has detrimental impacts on the natural environment and human health. However, there is no record of accurate and up-to-date information on the distribution and extent of P. hysterophorus. This study evaluated the capability of hyperspectral and multispectral data for mapping P. hysterophorus in northern KwaZulu-Natal province, South Africa. First, the study sought to determine an optimal subset of bands from canopy hyperspectral data for discriminating P. hysterophorus from its co-existing species. A novel hierarchical approach that integrates statistical filters and a wrapper technique is proposed to select optimal bands, addressing the problem of high spectral dimensionality and improving classification accuracy. A non-parametric algorithm, Support Vector Machines (SVM), showed inferior classification accuracy, i.e. 76.19% and 78.57%, when using the 20 best spectral bands from SVM – Recursive Feature Elimination (SVM-RFE) and the entire dataset (n = 1633), respectively. On the other hand, a superior overall accuracy of 83.33% was achieved when using ten spectral bands identified by the hierarchical approach. Next, an SVM classifier was adopted to evaluate the capability of multispectral data (i.e. Operational Land Imager, OLI, and SPOT 6) for determining the distribution and patch sizes of P. hysterophorus. The results showed that SPOT 6 had a higher overall accuracy (83.33%) than OLI (76.39%).
While SPOT 6’s higher spatial resolution was useful for better characterisation of the distribution and patch sizes, the study found that the spectral configuration of OLI was more important in identifying possible locations infested by P. hysterophorus. Overall, the study demonstrated that fewer spectral bands selected by the proposed hierarchical approach have the greatest potential for reliably discriminating IAP species using airborne and satellite hyperspectral sensors. The study also demonstrated that the current information needs on IAPs can be addressed using accessible multispectral data, valuable for effective land management, site-specific weed management, and site prioritisation

    TerraSenseTK: a toolkit for remote soil nutrient estimation

    Intensive farming endangers soil quality in various ways. Researchers show that if these practices continue, humanity will be faced with food production issues. Earth Observation, and more concretely Soil Sensing, can be employed together with Machine Learning to monitor several indicators of soil degradation, such as soil salinity, soil heavy metal contamination and soil nutrient levels. Soil nutrients are of particular importance: for instance, to understand which crop best suits the land, the soil nutrients must be identified. However, sampling soil is a laborious and expensive task, which Remote Sensing and Machine Learning can help alleviate. Several studies have already been carried out on this matter, although many gaps still exist. Among them are the lack of cross-dataset evaluations of existing algorithms and the steep learning curve of the Earth Observation domain, which prevents many researchers from embracing this field. We therefore propose the TerraSense ToolKit (TSTK), a Python toolkit that addresses these challenges. In this work, the possibility of using Remote Sensing along with Machine Learning algorithms to perform soil nutrient estimation is explored; additionally, a nutrient estimation toolkit is proposed, and its effectiveness is tested in a soil nutrient estimation case study. The toolkit is capable of simplifying Remote Sensing experiments and aims at reducing the barrier to entry to the field of Earth Observation. It comes with a preconfigured case study which implements a soil sensing pipeline. To evaluate the usability of the toolkit, experiments with five different crops were executed, namely Wheat, Barley, Maize, Sunflower and Vineyards. This case study gave visibility to an underlying unbalanced data problem, which is not well addressed in the current state of the art

    Classifying multisensor remote sensing data : Concepts, Algorithms and Applications

    Today, a large part of the Earth’s land surface has been affected by human-induced land cover changes. Detailed knowledge of the land cover is elementary for several decision support and monitoring systems. Earth-observation (EO) systems have the potential to frequently provide information on land cover. Thus many land cover classifications are performed based on remotely sensed EO data. In this context, it has been shown that the performance of remote sensing applications is further improved by multisensor data sets, such as combinations of synthetic aperture radar (SAR) and multispectral imagery. The two systems operate in different wavelength domains and therefore provide different yet complementary information on land cover. Considering the increase in revisit times and better spatial resolutions of recent and upcoming systems like TerraSAR-X (11 days; up to 1 m), Radarsat-2 (24 days; up to 3 m), or the RapidEye constellation (up to 1 day; 5 m), multisensor approaches become even more promising. However, these data sets with high spatial and temporal resolution might become very large and complex. Commonly used statistical pattern recognition methods are usually not appropriate for the classification of multisensor data sets. Hence, one of the greatest challenges in remote sensing might be the development of adequate concepts for classifying multisensor imagery. The presented study aims at an adequate classification of multisensor data sets, including SAR data and multispectral images. Different conventional classifiers and recent developments are used, such as support vector machines (SVM) and random forests (RF), which are well known in the field of machine learning and pattern recognition. Furthermore, the impact of image segmentation on the classification accuracy is investigated and the value of a multilevel concept is discussed.
To increase the performance of the algorithms in terms of classification accuracy, the concept of SVM is modified and combined with RF for optimized decision making. The results clearly demonstrate that the use of multisensor imagery is worthwhile. Irrespective of the classification method used, classification accuracies increase when SAR and multispectral imagery are combined. Nevertheless, SVM and RF are more adequate for classifying multisensor data sets and significantly outperform conventional classifier algorithms in terms of accuracy. The multisensor-multilevel classification strategy introduced at the end, which is based on the sequential use of SVM and RF, outperforms all other approaches. The proposed concept achieves an accuracy of 84.9%. This is significantly higher than all single-source results and also better than the results achieved on any other combination of data. Both aspects, i.e. the fusion of SAR and multispectral data as well as the integration of multiple segmentation scales, improve the results. In contrast to the high accuracy achieved by the proposed concept, pixel-based classification on single-source data sets achieves a maximum accuracy of 65% (SAR) and 69.8% (multispectral), respectively. The findings and good performance of the presented strategy are underlined by the successful application of the approach to data sets from a second year. Based on the results from this work, it can be concluded that the suggested strategy is particularly interesting with regard to recent and future satellite missions

    Novel pattern recognition methods for classification and detection in remote sensing and power generation applications


    Tree species classification from AVIRIS-NG hyperspectral imagery using convolutional neural networks

    This study focuses on the automatic classification of tree species using a three-dimensional convolutional neural network (CNN) based on field-sampled ground reference data, a LiDAR point cloud and AVIRIS-NG airborne hyperspectral remote sensing imagery with 2 m spatial resolution acquired on 14 June 2021. I created a tree species map for my 10.4 km² study area, which is located in the Jurapark Aargau, a Swiss regional park of national interest. I collected ground reference data for six major tree species present in the study area (Quercus robur, Fagus sylvatica, Fraxinus excelsior, Pinus sylvestris, Tilia platyphyllos, total n = 331). To match the sampled ground reference to the AVIRIS-NG 425-band hyperspectral imagery, I delineated individual tree crowns (ITCs) from a canopy height model (CHM) based on LiDAR point cloud data. After matching the ground reference data to the hyperspectral imagery, I split the extracted image patches into training, validation, and testing subsets. The amount of training, validation and testing data was increased by applying image augmentation through rotating, flipping, and changing the brightness of the original input data. The classifier is a CNN trained on the first 32 principal components (PCs) extracted from the AVIRIS-NG data. The CNN uses image patches of 5 × 5 pixels and consists of two convolutional layers and two fully connected layers, the latter of which are responsible for the final classification using the softmax activation function. The results show that the CNN classifier outperforms comparable conventional classification methods. The CNN model is able to predict the correct tree species with an overall accuracy of 70% and an average F1-score of 0.67. A random forest classifier reached an overall accuracy of 67% and an average F1-score of 0.61, while a support-vector machine classified the tree species with an overall accuracy of 66% and an average F1-score of 0.62.
This work highlights that CNNs based on imaging spectroscopy data can produce highly accurate high resolution tree species distribution maps based on a relatively small set of training data thanks to the high dimensionality of hyperspectral images and the ability of CNNs to utilize spatial and spectral features of the data. These maps provide valuable input for modelling the distributions of other plant and animal species and ecosystem services. In addition, this work illustrates the importance of direct collaboration with environmental practitioners to ensure user needs are met. This aspect will be evaluated further in future work by assessing how these products are used by environmental practitioners and as input for modelling purposes
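
The preprocessing described in the abstract above (reduction to the first 32 principal components, then 5 × 5 pixel patches as CNN input) can be sketched in NumPy as follows. This is an illustrative sketch only: the function names, the reflect padding at image borders, the tiny 20 × 20 scene, and the random values standing in for the 425 bands are assumptions, and the CNN itself is not shown.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Reduce a (rows, cols, bands) cube to its leading principal components."""
    rows, cols, bands = cube.shape
    flat = cube.reshape(-1, bands).astype(float)  # astype makes a copy
    flat -= flat.mean(axis=0)  # centre each band
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return (flat @ vt[:n_components].T).reshape(rows, cols, n_components)

def extract_patch(image, row, col, size=5):
    """Return the size x size patch centred on pixel (row, col)."""
    half = size // 2
    # Reflect-pad so that border pixels also get a full patch (an assumption).
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="reflect")
    return padded[row:row + size, col:col + size, :]

# Hypothetical data: a 20 x 20 pixel scene with 425 bands of random values.
rng = np.random.default_rng(0)
cube = rng.random((20, 20, 425))
reduced = pca_reduce(cube, n_components=32)
patch = extract_patch(reduced, row=10, col=10)
```

Each `patch` has shape `(5, 5, 32)` and would serve as one input sample. Augmentation by rotation or flipping, as mentioned in the abstract, could be applied to such patches with `np.rot90` and `np.flip`.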

    Techniques for the extraction of spatial and spectral information in the supervised classification of hyperspectral imagery for land-cover applications

    The objective of this PhD thesis is the development of spatial-spectral information extraction techniques for supervised classification tasks, both by means of classical models and those based on deep learning, to be used in the classification of land use or land cover (LULC) multi- and hyperspectral images obtained by remote sensing. The main goal is the efficient application of these techniques, so that they obtain satisfactory classification results with low computational resource usage and short execution times

    A Random Forest Based Method for Urban Land Cover Classification using LiDAR Data and Aerial Imagery

    Urban land cover classification has always been crucial due to its ability to link many elements of human and physical environments. Timely, accurate, and detailed knowledge of urban land cover derived from remote sensing data is increasingly required among a wide variety of communities. This surge of interest has been predominantly driven by recent innovations in data, technologies, and theories in urban remote sensing. The development of light detection and ranging (LiDAR) systems, especially when combined with a high-resolution camera component, has shown great potential for urban classification. However, the performance of traditional and widely used classification methods is limited in this context, due to image interpretation complexity. On the other hand, random forests (RF), a newly developed machine learning algorithm, is receiving considerable attention in the field of image classification and pattern recognition. Several studies have shown the advantages of RF in land cover classification. However, few have focused on urban areas through the fusion of LiDAR data and aerial images. The performance of RF-based feature selection and classification methods for urban areas was explored and compared to other popular feature selection approaches and classifiers. Evaluation was based on several criteria: classification accuracy, impact of different training sample sizes, and computational speed. LiDAR data and aerial imagery with 0.5-m resolution were used to classify four land categories in the study area, located in the City of Niagara Falls (ON, Canada). The results clearly demonstrate that the use of RF improved the classification performance in terms of accuracy and speed. Support vector machine (SVM) based and RF-based classifiers showed similar accuracies. However, RF-based classifiers were much quicker than SVM-based methods.
Based on the results from this work, it can be concluded that the RF-based method holds great potential for recent and future urban land cover classification problems with LiDAR data and aerial images