353 research outputs found

    Generalized topographic block model

    Co-clustering leads to parsimony in data visualisation, with the number of parameters dramatically reduced in comparison to the dimensions of the data sample. Herein, we propose a new generalized approach for nonlinear mapping by a re-parameterization of the latent block mixture model. The densities modeling the blocks belong to an exponential family, of which the Gaussian, Bernoulli and Poisson laws are particular cases. The inference of the parameters is derived from the block expectation–maximization algorithm with a Newton–Raphson procedure at the maximization step. Empirical experiments with textual data validate the interest of our generalized model.
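    As a rough illustration of what a Newton–Raphson update inside a maximization step can look like, the sketch below fits the natural parameter (log-rate) of a single Poisson block from responsibility-weighted counts. It is not the model from the abstract: the function name `newton_poisson_log_rate`, the weights `w`, and the single-parameter setup are assumptions made purely for illustration.

```python
import math

def newton_poisson_log_rate(x, w, eta0=0.0, tol=1e-10, max_iter=100):
    """Maximize sum_i w[i] * (x[i] * eta - exp(eta)) over eta by Newton-Raphson.

    x: observed counts; w: non-negative weights (hypothetical responsibilities).
    """
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    eta = eta0
    for _ in range(max_iter):
        rate = math.exp(eta)
        grad = swx - sw * rate   # first derivative of the weighted log-likelihood
        hess = -sw * rate        # second derivative, always negative
        step = grad / hess
        eta -= step              # Newton-Raphson update
        if abs(step) < tol:
            break
    return eta

x = [2, 0, 5, 3]
w = [0.9, 0.1, 0.7, 0.3]
eta_hat = newton_poisson_log_rate(x, w)
```

    In this toy case the maximizer is also available in closed form (the log of the weighted mean of `x`), which makes the iteration easy to check; a re-parameterized model generally has no such shortcut, which is where an iterative update becomes useful.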

    Graph-based Data Modeling and Analysis for Data Fusion in Remote Sensing

    Hyperspectral imaging provides increased sensitivity and discrimination over traditional imaging methods by combining standard digital imaging with spectroscopic methods. For each individual pixel in a hyperspectral image (HSI), a continuous spectrum is sampled as the spectral reflectance/radiance signature to facilitate identification of ground cover and surface material. The abundant spectral knowledge allows all available information from the data to be mined. These qualities give hyperspectral imaging wide applications such as mineral exploration, agricultural monitoring, and ecological surveillance. The processing of massive high-dimensional HSI datasets is a challenge, since many data processing techniques have a computational complexity that grows exponentially with the dimension. In addition, an HSI dataset may contain a limited number of degrees of freedom due to the high correlations between data points and among the spectra. On the other hand, merely taking advantage of the sampled spectrum of an individual HSI data point may produce inaccurate results due to the mixed nature of raw HSI data, such as mixed pixels and optical interference. Fusion strategies are widely adopted in data processing to achieve better performance, especially in the fields of classification and clustering. There are three main types of fusion strategy, namely low-level data fusion, intermediate-level feature fusion, and high-level decision fusion. Low-level data fusion combines multi-source data that is expected to be complementary or cooperative. Intermediate-level feature fusion aims at the selection and combination of features to remove redundant information. Decision-level fusion exploits a set of classifiers to provide more accurate results. These fusion strategies have wide applications, including HSI data processing. With the fast development of multiple remote sensing modalities, e.g. Very High Resolution (VHR) optical sensors, LiDAR, etc., fusion of multi-source data can in principle produce more detailed information than any single source. On the other hand, besides the abundant spectral information contained in HSI data, features such as texture and shape may be employed to represent data points from a spatial perspective. Furthermore, feature fusion also includes the strategy of removing redundant and noisy features in the dataset. One of the major problems in machine learning and pattern recognition is to develop appropriate representations for complex nonlinear data. In HSI processing, a particular data point is usually described as a vector with coordinates corresponding to the intensities measured in the spectral bands. This vector representation permits the application of linear and nonlinear transformations with linear algebra to find an alternative representation of the data. More generally, HSI is multi-dimensional in nature and the vector representation may lose the contextual correlations. Tensor representation provides a more sophisticated modeling technique and a higher-order generalization of linear subspace analysis. In graph theory, data points can be generalized as nodes with connectivities measured from the proximity of a local neighborhood. The graph-based framework efficiently characterizes the relationships among the data and allows for convenient mathematical manipulation in many applications, such as data clustering, feature extraction, feature selection and data alignment. In this thesis, graph-based approaches applied to multi-source feature and data fusion in remote sensing are explored. We will mainly investigate the fusion of spatial, spectral and LiDAR information with linear and multilinear algebra under a graph-based framework for data clustering and classification problems.
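    The idea of treating data points as graph nodes connected by local proximity can be sketched as follows. This is a minimal illustration, not the construction used in the thesis: the k-nearest-neighbor rule, the Gaussian edge weights, the parameters `k` and `sigma`, and the function name `knn_graph_laplacian` are all assumptions chosen for the example. Each row of `X` stands for one pixel's spectral vector.

```python
import numpy as np

def knn_graph_laplacian(X, k=3, sigma=1.0):
    """Build a symmetric k-nearest-neighbor graph with Gaussian weights.

    Returns (W, L): the weight matrix and the unnormalized Laplacian L = D - W.
    Requires k < number of rows in X.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between rows of X.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]  # indices of the k closest points
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * sigma**2))
    W = np.maximum(W, W.T)  # keep an edge if either endpoint selected it
    L = np.diag(W.sum(axis=1)) - W
    return W, L
```

    The eigenvectors of `L` associated with its smallest eigenvalues are what spectral clustering and Laplacian-based feature extraction typically operate on, which is one reason this matrix form is convenient to manipulate.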

    Master of Science

    Recent geothermal studies on sedimentary basins in Western Utah suggest the possibility of significant geothermal reservoirs at depths of 3 to 5 km. This research focuses on 3 areas (Crater Bench, Pavant Butte, and Thermo), located within sedimentary basins having high geothermal potential. New geophysical data which include 364 gravity stations and 247 magnetotelluric (MT) stations collected during the summers of 2010 to 2012 have been used to augment historical gravity, electromagnetic, and borehole data where coverage is poor or insufficient. Two-dimensional gravity and MT models were created for these study areas in order to gain insight on the subsurface structural controls and to understand better the geothermal systems and potential of each study area. At Crater Bench, gravity and MT models show overall basalt flow thicknesses of 60 to 160 m and inferred depth-to-basement estimates of 1.3 to 3.6 km and a buried horst structure which is interpreted to be the structural control of the hot springs fluid flow. In the Pavant Butte study area, gravity and MT models display an elongate, mostly two-dimensional basin with a corridor about 15 km wide having 2 km of sediments on top of basement rock and in some areas reaching depths up to 3 km. Deep conductive bodies observed in this area hint at the presence of hot, saline fluids throughout the basin. Thermo displays intersecting gravity low trends of 4 to 10 mGal amplitude which intersect adjacent to the surface manifestation of the hot spring system and are interpreted as the structural control. Gravity and MT models indicate shallow depth-to-basement values (200 m) near the hot springs and up to 2 km to the southwest accompanied by low resistivities. Geothermal waters are old; water chemistry supports the conceptual model of waters migrating from the southwest basin up deeply penetrating faults and fractures to produce the hot springs. This geophysically-based study has added to the understanding of these potential geothermal systems.

    Discriminant feature extraction: exploiting structures within each sample and across samples.

    Zhang, Wei. Thesis (M.Phil.)--Chinese University of Hong Kong, 2009. Includes bibliographical references (leaves 95-109). Abstract also in Chinese.

    Abstract
    Acknowledgement
    Chapter 1: Introduction
        1.1 Area of Machine Learning
            1.1.1 Types of Algorithms
            1.1.2 Modeling Assumptions
        1.2 Dimensionality Reduction
        1.3 Structure of the Thesis
    Chapter 2: Dimensionality Reduction
        2.1 Feature Extraction
            2.1.1 Linear Feature Extraction
            2.1.2 Nonlinear Feature Extraction
            2.1.3 Sparse Feature Extraction
            2.1.4 Nonnegative Feature Extraction
            2.1.5 Incremental Feature Extraction
        2.2 Feature Selection
            2.2.1 Viewpoint of Feature Extraction
            2.2.2 Feature-Level Score
            2.2.3 Subset-Level Score
    Chapter 3: Various Views of Feature Extraction
        3.1 Probabilistic Models
        3.2 Matrix Factorization
        3.3 Graph Embedding
        3.4 Manifold Learning
        3.5 Distance Metric Learning
    Chapter 4: Tensor linear Laplacian discrimination
        4.1 Motivation
        4.2 Tensor Linear Laplacian Discrimination
            4.2.1 Preliminaries of Tensor Operations
            4.2.2 Discriminant Scatters
            4.2.3 Solving for Projection Matrices
        4.3 Definition of Weights
            4.3.1 Contextual Distance
            4.3.2 Tensor Coding Length
        4.4 Experimental Results
            4.4.1 Face Recognition
            4.4.2 Texture Classification
            4.4.3 Handwritten Digit Recognition
        4.5 Conclusions
    Chapter 5: Semi-Supervised Semi-Riemannian Metric Map
        5.1 Introduction
        5.2 Semi-Riemannian Spaces
        5.3 Semi-Supervised Semi-Riemannian Metric Map
            5.3.1 The Discrepancy Criterion
            5.3.2 Semi-Riemannian Geometry Based Feature Extraction Framework
            5.3.3 Semi-Supervised Learning of Semi-Riemannian Metrics
        5.4 Discussion
            5.4.1 A General Framework for Semi-Supervised Dimensionality Reduction
            5.4.2 Comparison to SRDA
            5.4.3 Advantages over Semi-supervised Discriminant Analysis
        5.5 Experiments
            5.5.1 Experimental Setup
            5.5.2 Face Recognition
            5.5.3 Handwritten Digit Classification
        5.6 Conclusion
    Chapter 6: Summary
    Chapter A: The Relationship between LDA and LLD
    Chapter B: Coding Length
    Chapter C: Connection between SRDA and ANMM
    Chapter D: From S3RMM to Graph-Based Approaches
    Bibliography