9 research outputs found

    Topics in High-Dimensional Statistics and the Analysis of Large Hyperspectral Images.

    Advancement in imaging technology has made hyperspectral images gathered from remote sensing much more common. The high-dimensional nature of these large-scale data, coupled with wavelength and spatial dependency, necessitates high-dimensional and computationally efficient methods that produce results which are concise and easy to understand. The thesis addresses these issues by examining high-dimensional methods in the context of hyperspectral image classification, unmixing, and wavelength correlation estimation. Chapter 2 re-examines the sparse Bayesian learning (SBL) of linear models in a high-dimensional setting with a sparse signal. The hard-thresholded version of the SBL estimator, under orthogonal design, achieves a non-asymptotic error rate comparable to that of the LASSO. The chapter also establishes that, with high probability, the estimator recovers the sparsity structure of the signal. The ability to recover sparsity structures in high-dimensional settings is crucial for unmixing with high-dimensional libraries in the next chapter. In Chapter 3, the thesis investigates the application of SBL to the linear/bilinear unmixing and classification of hyperspectral images. The proposed model uses latent Markov random fields to classify pixels and to account for the spatial dependence between pixels; pixels belonging to the same group share the same mixture of pure endmembers. Unmixing and classification are performed simultaneously, but this method does not address wavelength dependence. Chapter 4 is a natural extension of the previous chapter, providing a framework that accounts for both spatial and wavelength dependence in the unmixing of hyperspectral images.
The classification of the images is performed using approximate spectral clustering, while the unmixing task is performed in tandem with sparse wavelength concentration matrix estimation.
Ph.D., Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/135893/1/chye_1.pd
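The support-recovery claim of Chapter 2 can be illustrated with a minimal sketch: under an orthogonal design the per-coordinate estimate is just a rescaled correlation with the response, and hard thresholding it recovers the sparsity pattern. The threshold choice and dimensions below are illustrative assumptions, not the thesis's actual SBL estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthogonal design: X^T X = n I, so the least-squares estimate
# decouples into per-coordinate quantities X^T y / n.
n, p, k = 200, 50, 5
X, _ = np.linalg.qr(rng.standard_normal((n, p)))   # orthonormal columns
X *= np.sqrt(n)                                    # rescale so X^T X = n I
beta = np.zeros(p)
beta[:k] = 3.0                                     # k true nonzero effects
y = X @ beta + rng.standard_normal(n)

beta_hat = X.T @ y / n                             # per-coordinate estimate
tau = np.sqrt(2 * np.log(p) / n)                   # universal threshold (assumed)
beta_ht = np.where(np.abs(beta_hat) > tau, beta_hat, 0.0)
```

With the noise level here, the thresholded estimate keeps all k true coordinates while zeroing out most of the null ones, which is the sparsity-recovery behaviour the abstract describes.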

    Using prior-data conflict to tune Bayesian regularized regression models

    In high-dimensional regression models, variable selection becomes challenging from both a computational and a theoretical perspective. Bayesian regularized regression via shrinkage priors such as the Laplace or spike-and-slab prior is an effective method for variable selection in p > n scenarios, provided the shrinkage priors are configured adequately. We propose configuring shrinkage priors using checks for prior-data conflict: tests that assess whether there is disagreement between the parameter information provided by the prior and by the data. We apply our proposed method to the Bayesian LASSO and spike-and-slab shrinkage priors and assess the variable selection performance of our prior configurations against competing models through a linear and logistic high-dimensional simulation study. Additionally, we apply our method to proteomic data collected from patients admitted to the Albany Medical Center in Albany, NY, in April 2020 with COVID-like respiratory issues. Simulation results suggest our proposed configurations may outperform competing models when the true regression effects are small.
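A prior-data conflict check of the kind this abstract describes can be sketched as a prior-predictive tail probability: simulate replicate data under the prior, and flag conflict when the observed statistic is extreme. The normal-mean setup, statistic, and conflict rule below are illustrative assumptions, not the paper's actual procedure for shrinkage priors.

```python
import numpy as np

rng = np.random.default_rng(1)

def prior_data_conflict_pvalue(y, tau, sigma=1.0, draws=10_000):
    """Prior-predictive check for a normal mean with prior N(0, tau^2).

    Draw theta ~ N(0, tau^2), then a replicate sample mean, and ask how
    extreme the observed mean is under this prior-predictive law.  A small
    p-value signals that the prior and the data disagree."""
    n = len(y)
    theta = rng.normal(0.0, tau, size=draws)
    rep_means = rng.normal(theta, sigma / np.sqrt(n))
    obs = np.mean(y)
    return np.mean(np.abs(rep_means) >= np.abs(obs))

y = rng.normal(5.0, 1.0, size=30)                 # data far from the prior centre
tight = prior_data_conflict_pvalue(y, tau=0.5)    # tight prior: conflict
wide = prior_data_conflict_pvalue(y, tau=10.0)    # diffuse prior: no conflict
```

In a tuning loop one would widen (or otherwise reconfigure) the prior until the check no longer signals conflict, which mirrors the paper's idea of using such checks to set shrinkage-prior hyperparameters.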

    Development of Statistical Models for Functional Near-infrared Spectroscopy Data Analysis Incorporating Anatomical and Probe Registration Prior Information

    Functional near-infrared spectroscopy (fNIRS) is a non-invasive technology that uses low levels of non-ionizing light in the range of 650–900 nm (red and near-infrared) to record changes in the optical absorption and scattering of tissue. In particular, oxy-hemoglobin (HbO) and deoxy-hemoglobin (HbR) have characteristic absorption spectra at these wavelengths, which are used to discriminate changes in blood flow and oxygen metabolism. Compared with functional magnetic resonance imaging (fMRI), fNIRS is less costly, more portable, and allows for a wider range of experimental scenarios, because it neither requires a dedicated scanner nor needs the subject to lie supine. Current challenges in fNIRS data analysis include: (i) a small change in brain anatomy or optical probe positioning can create huge differences in fNIRS measurements even though the underlying brain activity remains the same, due to the existence of "blind spots"; (ii) fNIRS image reconstruction is a high-dimensional, under-determined, and ill-posed problem, in which there are thousands of parameters to estimate while only tens of measurements are available, and existing methods notably overestimate the false positive rate; (iii) brain anatomical information has rarely been used in current fNIRS data analyses. This dissertation proposes two new methods aiming to improve fNIRS data analysis and overcome these challenges: one is a channel-space method based on anatomically defined regions of interest (ROI), and the other is an image reconstruction method incorporating anatomical and physiological prior information. The two methods are developed using advanced statistical models, combining regularization models and Bayesian hierarchical modeling. The performance of the two methods is validated via numerical simulations and evaluated using receiver operating characteristic (ROC)-based tools.
Statistical comparisons with conventional methods suggest significant improvements.
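The under-determined reconstruction problem in challenge (ii) is usually stabilised by regularisation. A generic Tikhonov (ridge) sketch for a system with tens of measurements and thousands of unknowns is shown below; the forward model, dimensions, and regularisation weight are assumptions, and this is not the dissertation's Bayesian hierarchical method.

```python
import numpy as np

rng = np.random.default_rng(2)

# Under-determined forward model: m measurements, p unknown image voxels.
m, p = 40, 2000
A = rng.standard_normal((m, p))                   # assumed sensitivity matrix
x_true = np.zeros(p)
x_true[rng.choice(p, 10, replace=False)] = 1.0    # localised activation
y = A @ x_true + 0.01 * rng.standard_normal(m)

# Tikhonov-regularised solution in its m x m "dual" form,
#   x = A^T (A A^T + lam I)^{-1} y,
# which only requires solving a small m x m system.
lam = 1.0
x_hat = A.T @ np.linalg.solve(A @ A.T + lam * np.eye(m), y)
```

Without the regularisation term, infinitely many images fit the data exactly; the penalty selects a unique, stable estimate, at the cost of the bias and inflated false positives the abstract mentions for existing methods.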

    Contributions to the Analysis of Multistate and Degradation Data

    Traditional methods in survival, reliability, actuarial science, risk, and other event-history applications are based on the analysis of the time to occurrence of some event of interest, generically called "failure". In the presence of high degrees of censoring, however, it is difficult to make inference about the underlying failure distribution using failure time data. Moreover, such data are not very useful in predicting failures of specific systems, a problem of interest when dealing with expensive or critical systems. As an alternative, there is an increasing trend towards collecting and analyzing richer types of data related to the states and performance of systems or subjects under study, including data on multistate and degradation processes. This dissertation makes several contributions to the analysis of multistate and degradation data. The first part deals with parametric inference for multistate processes with panel data, which involve interval, right, and left censoring arising naturally because the processes are not observed continuously. Most of the literature in this area deals with Markov models, for which inference with censored data can be handled without too much difficulty. The dissertation considers progressive semi-Markov models and develops methods and algorithms for general parametric inference, using a combination of Markov chain Monte Carlo techniques and stochastic approximation methods. A second topic is the comparison of the traditional method and the process method for inference about the time-to-failure distribution in the presence of multistate data. Here, time-to-failure is the time when the process enters an absorbing state. There is limited literature in this area. The gains in both estimation and prediction efficiency are quantified for various parametric models of interest.
The second part of the dissertation deals with the analysis of data on continuous measures of performance and degradation with missing data. In this case, time-to-failure is the time at which the degradation measure exceeds a certain threshold or the performance level goes below some threshold. Inference problems about the mean and variance of the degradation, and the imputation of the missing values, are studied under different settings.
Ph.D., Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/86286/1/yangcn_1.pd
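The degradation-based definition of failure above, the first time a degradation path crosses a threshold, can be sketched by simulation. The Wiener-process degradation model and all parameter values below are illustrative assumptions, not the dissertation's models; for a Wiener process with drift mu and threshold a, the mean first-passage time is a/mu.

```python
import numpy as np

rng = np.random.default_rng(3)

def first_passage_times(n_paths=2000, n_steps=500, dt=0.1,
                        drift=0.5, sigma=0.2, threshold=10.0):
    """Simulate Wiener-process degradation paths and record the first
    time each path exceeds the failure threshold (inf if it never does)."""
    steps = drift * dt + sigma * np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
    paths = np.cumsum(steps, axis=1)
    crossed = paths >= threshold
    first = np.where(crossed.any(axis=1),
                     crossed.argmax(axis=1) * dt, np.inf)
    return first

t_fail = first_passage_times()
mean_ttf = t_fail[np.isfinite(t_fail)].mean()     # close to threshold/drift = 20
```

This is the sense in which degradation data are richer than failure times alone: the whole path is observed, so the failure-time distribution is induced by the degradation model rather than estimated only from (possibly heavily censored) event times.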

    Some perspectives on the problem of model selection

    Ph.D. (Doctor of Philosophy)

    Bayesian computation in imaging inverse problems with partially unknown models

    Many imaging problems require solving a high-dimensional inverse problem that is ill-conditioned or ill-posed. Imaging methods typically address this difficulty by regularising the estimation problem to make it well-posed. This often requires setting the value of the so-called regularisation parameters that control the amount of regularisation enforced. These parameters are notoriously difficult to set a priori and can have a dramatic impact on the recovered estimates. In this thesis, we propose a general empirical Bayesian method for setting regularisation parameters in imaging problems that are convex w.r.t. the unknown image. Our method calibrates regularisation parameters directly from the observed data by maximum marginal likelihood estimation, and can simultaneously estimate multiple regularisation parameters. A main novelty is that this maximum marginal likelihood estimation problem is efficiently solved by using a stochastic proximal gradient algorithm that is driven by two proximal Markov chain Monte Carlo samplers, thus intimately combining modern high-dimensional optimisation and stochastic sampling techniques. Furthermore, the proposed algorithm uses the same basic operators as proximal optimisation algorithms, namely gradient and proximal operators, and it is therefore straightforward to apply to problems that are currently solved by using proximal optimisation techniques. We also present a detailed theoretical analysis of the proposed methodology, and demonstrate it with a range of experiments and comparisons with alternative approaches from the literature. The considered experiments include image denoising, non-blind image deconvolution, and hyperspectral unmixing, using synthesis and analysis priors involving the ℓ1, total-variation, combined total-variation and ℓ1, and total-generalised-variation pseudo-norms.
Moreover, we explore some other applications of the proposed method, including maximum marginal likelihood estimation in Bayesian logistic regression and audio compressed sensing, as well as an application to model selection based on residuals.
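The gradient and proximal operators this abstract refers to are the building blocks of proximal gradient schemes. A minimal deterministic sketch for an ℓ1-regularised least-squares problem (plain ISTA, not the thesis's stochastic algorithm with proximal MCMC samplers, and with an assumed fixed regularisation parameter rather than one estimated by marginal likelihood) is:

```python
import numpy as np

rng = np.random.default_rng(4)

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iter=500):
    """Proximal gradient (ISTA) for min_x 0.5||Ax - y||^2 + lam ||x||_1:
    alternate a gradient step on the smooth term with the l1 prox."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - step * (A.T @ (A @ x - y)), step * lam)
    return x

A = rng.standard_normal((100, 300))
x_true = np.zeros(300)
x_true[:4] = 2.0
y = A @ x_true + 0.01 * rng.standard_normal(100)
x_hat = ista(A, y, lam=0.1)
```

Because the method only ever touches the problem through these two operators, swapping in a different prior (total-variation, say) amounts to swapping in its proximal operator, which is the plug-compatibility with existing proximal optimisation pipelines that the abstract emphasises.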