
    Learning the Structure of High-Dimensional Manifolds with Self-Organizing Maps for Accurate Information Extraction

    This paper was submitted by the author prior to the final official version. For the official version, please see http://hdl.handle.net/1911/70515. This work aims to improve the capability of accurate information extraction from high-dimensional data, with a specific neural learning paradigm, the Self-Organizing Map (SOM). The SOM is an unsupervised learning algorithm that can faithfully sense the manifold structure and support supervised learning of relevant information from the data. Yet open problems regarding SOM learning exist. We focus on the following two issues. 1. Evaluation of topology preservation. Topology preservation is essential for SOMs in faithful representation of manifold structure. However, in reality, topology violations are not unusual, especially when the data have complicated structure. Measures capable of accurately quantifying and informatively expressing topology violations are lacking. One contribution of this work is a new measure, the Weighted Differential Topographic Function (WDTF), which differentiates an existing measure, the Topographic Function (TF), and incorporates detailed data distribution as an importance weighting of violations to distinguish severe violations from insignificant ones. Another contribution is an interactive visual tool, TopoView, which facilitates the visual inspection of violations on the SOM lattice. We show the effectiveness of the combined use of the WDTF and TopoView through a simple two-dimensional data set and two hyperspectral images. 2. Learning multiple latent variables from high-dimensional data. We use an existing two-layer SOM-hybrid supervised architecture, which captures the manifold structure in its SOM hidden layer and then uses its output layer to perform the supervised learning of latent variables. In the customary way, the output layer only uses the strongest output of the SOM neurons. This severely limits the learning capability.
We instead allow multiple (k) strongest responses of the SOM neurons to be used for the supervised learning. Moreover, the fact that different latent variables can be best learned with different values of k motivates a new neural architecture, the Conjoined Twins, which extends the existing architecture with additional copies of the output layer, for preferential use of different values of k in the learning of different latent variables. We also automate the customization of k for different variables with the statistics derived from the SOM. The Conjoined Twins shows its effectiveness in the inference of two physical parameters from Near-Infrared spectra of planetary ices.
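The idea of passing the k strongest SOM responses to the output layer, rather than only the single winner, can be illustrated with a minimal sketch. It assumes that "strongest response" means the smallest Euclidean distance between an input and a neuron's prototype vector; the function name `k_strongest_units` and the toy prototypes are hypothetical and are not taken from the thesis.

```python
import math


def k_strongest_units(prototypes, x, k):
    """Return the indices of the k prototypes closest to x.

    Assumption: a neuron's response is stronger the smaller the
    Euclidean distance between its prototype and the input x.
    """
    if not 1 <= k <= len(prototypes):
        raise ValueError("k must be between 1 and the number of prototypes")
    distances = [math.dist(w, x) for w in prototypes]
    # Sort neuron indices by distance and keep the k nearest.
    return sorted(range(len(prototypes)), key=lambda i: distances[i])[:k]


# Toy 2x2 lattice of prototypes in a two-dimensional input space.
prototypes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
sample = (0.9, 0.2)

single_winner = k_strongest_units(prototypes, sample, k=1)
winners = k_strongest_units(prototypes, sample, k=2)
```

With k=1 this reduces to the customary single best-matching unit; larger k exposes more of the local neighbourhood to the supervised layer, which is the degree of freedom the abstract describes tuning per latent variable.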

    Data Mining and Machine Learning in Astronomy

    We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black-box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those where data mining techniques directly resulted in improved science, and important current and future directions, including probability density functions, parallel algorithms, petascale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm, and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box. Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra figures, some minor additions to the text

    Machine learning approaches to star-galaxy classification

    Accurate star-galaxy classification has many important applications in modern precision cosmology. However, the vast number of faint sources detected in current and next-generation ground-based surveys may suffer from poor star-galaxy classification. Thus, we explore a variety of machine learning approaches to improve star-galaxy classification in ground-based photometric surveys. In Chapter 2, we present a meta-classification framework that combines existing star-galaxy classifiers, and demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method. In Chapter 3, we show that a deep learning algorithm, the convolutional neural network, is able to produce accurate and well-calibrated classifications by learning directly from the pixel values of photometric images. In Chapter 4, we study another deep learning technique, the generative adversarial network, in a semi-supervised setting, and demonstrate that our semi-supervised method produces competitive classifications using only a small number of labeled examples.

    Hydrocarbon quantification using neural networks and deep learning based hyperspectral unmixing

    Hydrocarbon (HC) spills are a global issue that can seriously impact human life and the environment, so early identification and early remedial measures are important. Current research efforts therefore aim at remotely quantifying incipient quantities of HC mixed with soils. The increased spectral and spatial resolution of hyperspectral sensors has opened ground-breaking perspectives in many industries, including remote inspection of large areas and the environment. The use of subpixel detection algorithms, and in particular of mixture models, has been identified as a future advance that needs to be incorporated in remote sensing. However, some tasks remain challenging, since the spectral signatures of the targets of interest may not be immediately available. Moreover, real-time processing and analysis is required to support fast decision-making. Progressing in this direction, this thesis pioneers and researches novel methodologies for HC quantification capable of exceeding the limitations of existing systems in terms of reduced cost and processing time with improved accuracy. The goal of this research is therefore to develop, implement and test different methods for improving HC detection and quantification using spectral unmixing and machine learning. An efficient hybrid switch method employing neural networks and hyperspectral unmixing is proposed and investigated. This robust method switches between state-of-the-art linear and nonlinear hyperspectral unmixing models. This procedure is well suited for the quantification of small quantities of substances within a pixel with high accuracy, as the most appropriate model is employed. Central to the proposed approach is a novel method for extracting parameters to characterise the non-linearity of the data. These parameters are fed into a feedforward neural network, which decides pixel by pixel which model is more suitable.
The quantification process is fully automated by applying further classification techniques to the acquired hyperspectral images. A deep learning neural network model is designed for the quantification of HC quantities mixed with soils. A three-term backpropagation algorithm with dropout is proposed to avoid overfitting and reduce the computational complexity of the model. The above methods have been evaluated using classical repository datasets from the literature and a laboratory-controlled dataset. For the latter, an experimental procedure has been designed to produce a labelled dataset. The data was obtained by mixing and homogenizing different soil types with HC substances and measuring the reflectance with a hyperspectral sensor. Findings from the research study reveal that the two proposed models have high performance, are suitable for the detection and quantification of HC mixed with soils, and surpass existing methods. Improvements in sensitivity, accuracy and computational time are achieved. Thus, the proposed approaches can be used to detect HC spills at an early stage in order to mitigate significant pollution from the spill areas.

    Hybrid spectral unmixing : using artificial neural networks for linear/non-linear switching

    Spectral unmixing is a key process in identifying the spectral signatures of materials and quantifying their spatial distribution over an image. The linear model is expected to provide acceptable results when two assumptions are satisfied: (1) the mixing process occurs at the macroscopic level and (2) photons interact with a single material before reaching the sensor. However, these assumptions do not always hold, and more complex nonlinear models are required. This study proposes a new hybrid method for switching between linear and nonlinear spectral unmixing of hyperspectral data based on artificial neural networks. The neural network was trained with parameters computed within a window around the pixel under consideration. These parameters represent the diversity of the neighboring pixels and are based on the Spectral Angular Distance, Covariance and a nonlinearity parameter. The endmembers were extracted using Vertex Component Analysis, while the abundances were estimated using the method identified by the neural network (Vertex Component Analysis, Fully Constrained Least Squares Method, Polynomial Post Nonlinear Mixing Model or Generalized Bilinear Model). Results show that the hybrid method performs better than each of the individual techniques, with high overall accuracy, while the abundance estimation error is significantly lower than that obtained using the individual methods. Experiments on both a synthetic dataset and real hyperspectral images demonstrated that the proposed hybrid switch method is efficient for solving spectral unmixing of hyperspectral images as compared to individual algorithms.
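Two ingredients named in this abstract, the spectral angular distance and abundance estimation under the linear mixing model, can be illustrated with a minimal sketch. It assumes the standard definitions: the angle is the arccosine of the normalized dot product of two spectra, and a linearly mixed pixel is a non-negative, sum-to-one combination of endmember spectra. To stay short it covers only the two-endmember case, where the fully constrained least-squares solution has a closed form; the function names and toy spectra are hypothetical and are not the authors' implementation.

```python
import math


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] so rounding error cannot break acos.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)


def two_endmember_abundances(pixel, e1, e2):
    """Least-squares abundances (a1, a2) with a1, a2 >= 0 and a1 + a2 = 1.

    Model: pixel ~ a1 * e1 + (1 - a1) * e2. The objective is a convex
    quadratic in a1, so clipping the unconstrained minimizer to [0, 1]
    gives the constrained optimum.
    """
    diff = [u - v for u, v in zip(e1, e2)]
    residual = [p - v for p, v in zip(pixel, e2)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("endmembers must differ")
    a1 = sum(r * d for r, d in zip(residual, diff)) / denom
    a1 = max(0.0, min(1.0, a1))
    return a1, 1.0 - a1


# Hypothetical three-band endmember spectra and a synthetic 30/70 mixture.
grass = [0.1, 0.6, 0.3]
soil = [0.5, 0.4, 0.2]
mixed = [0.3 * g + 0.7 * s for g, s in zip(grass, soil)]

abundances = two_endmember_abundances(mixed, grass, soil)
angle_to_grass = spectral_angle(mixed, grass)
```

With more than two endmembers the constrained problem no longer has this closed form and is usually solved iteratively; the nonlinear models mentioned in the abstract add interaction terms that this sketch does not attempt to represent.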

    Adaptive Similarity Measures for Material Identification in Hyperspectral Imagery

    Remotely-sensed hyperspectral imagery has become one of the most advanced tools for analyzing the processes that shape the Earth and other planets. Effective, rapid analysis of high-volume, high-dimensional hyperspectral image data sets demands efficient, automated techniques to identify signatures of known materials in such imagery. In this thesis, we develop a framework for automatic material identification in hyperspectral imagery using adaptive similarity measures. We frame the material identification problem as a multiclass similarity-based classification problem, where our goal is to predict material labels for unlabeled target spectra based upon their similarities to source spectra with known material labels. As differences in capture conditions affect the spectral representations of materials, we divide the material identification problem into intra-domain (i.e., source and target spectra captured under identical conditions) and inter-domain (i.e., source and target spectra captured under different conditions) settings. The first component of this thesis develops adaptive similarity measures for intra-domain settings that measure the relevance of spectral features to the given classification task using small amounts of labeled data. We propose a technique based on multiclass Linear Discriminant Analysis (LDA) that combines several distinct similarity measures into a single hybrid measure capturing the strengths of each of the individual measures. We also provide a comparative survey of techniques for low-rank Mahalanobis metric learning, and demonstrate that regularized LDA yields results competitive with the state of the art, at substantially lower computational cost. The second component of this thesis shifts the focus to inter-domain settings, and proposes a multiclass domain adaptation framework that reconciles systematic differences between spectra captured under similar, but not identical, conditions.
Our framework computes a similarity-based mapping that captures structured, relative relationships between classes shared between source and target domains, allowing us to apply a classifier trained using labeled source spectra to classify target spectra. We demonstrate improved domain adaptation accuracy in comparison to recently proposed multitask learning and manifold alignment techniques in several case studies involving state-of-the-art synthetic and real-world hyperspectral imagery.

    The Challenge of Machine Learning in Space Weather Nowcasting and Forecasting

    The numerous recent breakthroughs in machine learning (ML) make it imperative to carefully ponder how the scientific community can benefit from a technology that, although not necessarily new, is today living its golden age. This Grand Challenge review paper is focused on the present and future role of machine learning in space weather. The purpose is twofold. On one hand, we will discuss previous works that use ML for space weather forecasting, focusing in particular on the few areas that have seen the most activity: the forecasting of geomagnetic indices, of relativistic electrons at geosynchronous orbits, of solar flare occurrence, of coronal mass ejection propagation time, and of solar wind speed. On the other hand, this paper serves as a gentle introduction to the field of machine learning tailored to the space weather community and as a pointer to a number of open challenges that we believe the community should undertake in the next decade. The recurring themes throughout the review are the need to shift our forecasting paradigm to a probabilistic approach focused on the reliable assessment of uncertainties, and the combination of physics-based and machine learning approaches, known as gray-box. Comment: under review

    Hyperspectral Image Analysis of Food Quality


    Earth Observation Open Science and Innovation

    geospatial analytics; social observatory; big earth data; open data; citizen science; open innovation; earth system science; crowdsourced geospatial data; science in society; data science