Cramér-Rao-Type Bounds for Sparse Bayesian Learning
In this paper, we derive Hybrid, Bayesian and Marginalized Cram\'{e}r-Rao
lower bounds (HCRB, BCRB and MCRB) for the single and multiple measurement
vector Sparse Bayesian Learning (SBL) problem of estimating compressible
vectors and their prior distribution parameters. We assume the unknown vector
to be drawn from a compressible Student-t prior distribution. We derive CRBs
that encompass the deterministic or random nature of the unknown parameters of
the prior distribution and the regression noise variance. We extend the MCRB to
the case where the compressible vector is distributed according to a general
compressible prior distribution, of which the generalized Pareto distribution
is a special case. We use the derived bounds to uncover the relationship
between the compressibility and Mean Square Error (MSE) in the estimates.
Further, we illustrate the tightness and utility of the bounds through
simulations, by comparing them with the MSE performance of two popular
SBL-based estimators. It is found that the MCRB is generally the tightest among
the bounds derived and that the MSE performance of the Expectation-Maximization
(EM) algorithm coincides with the MCRB for the compressible vector. Through
simulations, we demonstrate the dependence of the MSE performance of SBL-based
estimators on the compressibility of the vector for several values of the
number of observations and at different signal powers.
Comment: Accepted for publication in the IEEE Transactions on Signal Processing, 11 pages, 10 figures
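The EM-based SBL estimator whose MSE the abstract compares against the MCRB can be sketched as follows. This is a minimal single-measurement-vector implementation of the standard EM iteration for SBL (Gaussian posterior E-step, per-coefficient variance and noise-variance M-step); all dimensions, noise levels, and the stopping rule are illustrative, not the authors' exact setup.

```python
import numpy as np

def sbl_em(Phi, y, n_iter=100, sigma2=1e-2, floor=1e-10):
    """Minimal EM-based Sparse Bayesian Learning (single measurement vector).

    Model: y = Phi @ x + noise, with x_i ~ N(0, gamma_i). EM alternates the
    Gaussian posterior of x (E-step) with hyperparameter updates (M-step).
    """
    N, M = Phi.shape
    gamma = np.ones(M)
    for _ in range(n_iter):
        # E-step: posterior covariance and mean of x given current gamma, sigma2
        Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + np.diag(1.0 / gamma))
        mu = Sigma @ Phi.T @ y / sigma2
        # M-step: noise variance (using the old gamma), then prior variances
        resid = y - Phi @ mu
        sigma2 = max((resid @ resid
                      + sigma2 * np.sum(1.0 - np.diag(Sigma) / gamma)) / N, 1e-8)
        gamma = np.maximum(mu**2 + np.diag(Sigma), floor)
    return mu, gamma, sigma2

rng = np.random.default_rng(0)
N, M, K = 60, 100, 5                       # measurements, dimension, sparsity
Phi = rng.standard_normal((N, M)) / np.sqrt(N)
x = np.zeros(M)
x[rng.choice(M, K, replace=False)] = rng.choice([-1.0, 1.0], K)
y = Phi @ x + 0.01 * rng.standard_normal(N)

x_hat, _, _ = sbl_em(Phi, y)
rel_mse = np.sum((x_hat - x) ** 2) / np.sum(x ** 2)
print(f"relative MSE: {rel_mse:.4f}")
```

In this low-noise regime the EM iterates drive most of the gamma_i toward zero, pruning the inactive coefficients, which is the mechanism behind the MSE behavior the bounds characterize.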
Sparse-Based Estimation Performance for Partially Known Overcomplete Large-Systems
We assume that the signal subspace is a direct sum of two subspaces. In a number of
post-measurement operational contexts, the LB-dimensional "interfering" subspace is
presupposed known a priori, and the goal is to estimate the LA amplitudes
corresponding to the complementary subspace of interest. Taking into account the
knowledge of the orthogonal complement of the "interfering" subspace, the Bayesian
estimation lower bound is derived for the LA-sparse vector in the doubly asymptotic
scenario, i.e., N, LA, LB -> \infty
with a finite asymptotic ratio. By jointly exploiting the Compressed Sensing
(CS) and the Random Matrix Theory (RMT) frameworks, closed-form expressions for
the lower bound on the estimation of the non-zero entries of a sparse vector of
interest are derived and studied. The derived closed-form expressions enjoy
several interesting features: (i) a simple interpretable expression, (ii) a
very low computational cost especially in the doubly asymptotic scenario, (iii)
an accurate prediction of the mean-square-error (MSE) of popular sparse-based
estimators and (iv) the lower bound remains true for any amplitudes vector
priors. Finally, several idealized scenarios are compared to the derived bound
for a common output signal-to-noise-ratio (SNR), which shows the interest of
the joint estimation/rejection methodology derived herein.
Comment: 10 pages, 5 figures, Journal of Signal Processing
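The joint estimation/rejection idea — project the measurement onto the orthogonal complement of the known interfering subspace before estimating the amplitudes of interest — can be illustrated in a small finite-dimensional example. This is a toy sketch (the names A and B and all sizes are illustrative), not the paper's asymptotic RMT analysis.

```python
import numpy as np

rng = np.random.default_rng(1)
N, LA, LB = 200, 5, 40                     # observations, signal dims, interference dims
A = rng.standard_normal((N, LA))           # subspace of interest (amplitudes to estimate)
B = rng.standard_normal((N, LB))           # known "interfering" subspace
a = rng.standard_normal(LA)                # amplitudes of interest
b = 3.0 * rng.standard_normal(LB)          # strong interference amplitudes
y = A @ a + B @ b + 0.1 * rng.standard_normal(N)

# Rejection: project onto the orthogonal complement of span(B), then least squares
P = np.eye(N) - B @ np.linalg.solve(B.T @ B, B.T)
a_rej, *_ = np.linalg.lstsq(P @ A, P @ y, rcond=None)

# Naive least squares on A alone, ignoring the known interference
a_naive, *_ = np.linalg.lstsq(A, y, rcond=None)

mse_rej = np.mean((a_rej - a) ** 2)
mse_naive = np.mean((a_naive - a) ** 2)
print(f"MSE with rejection: {mse_rej:.5f}, without: {mse_naive:.5f}")
```

The projection removes the interference exactly (at the cost of LB degrees of freedom), so the rejection-based estimate operates at the noise floor while the naive estimate absorbs the interference as extra noise.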
Bayesian Framework for Sparse Vector Recovery and Parameter Bounds with Application to Compressive Sensing
abstract: Signals compressed using classical compression methods can be acquired by brute force (i.e., searching for non-zero entries component-wise). However, such sparse solutions require combinatorial searches with high computational cost. In this thesis, instead, two Bayesian approaches are considered to recover a sparse vector from underdetermined noisy measurements. The first is constructed using a Bernoulli-Gaussian (BG) prior distribution and is assumed to be the true generative model. The second is constructed using a Gamma-Normal (GN) prior distribution and is, therefore, a different (i.e., misspecified) model. To estimate the posterior distribution in the correctly specified scenario, an algorithm based on generalized approximate message passing (GAMP) is constructed, while an algorithm based on sparse Bayesian learning (SBL) is used for the misspecified scenario. Recovering a sparse signal in a Bayesian framework is one class of algorithms for solving the sparse problem; all such classes aim to avoid the high computational cost of combinatorial searches. Compressive sensing (CS) is the widely used term for the sparse optimization problem and its applications, such as magnetic resonance imaging (MRI), image acquisition in radar imaging, and facial recognition. In the CS literature, the target vector can be recovered either by optimizing an objective function using point estimation, or by recovering a distribution of the sparse vector using Bayesian estimation. Although the Bayesian framework provides an extra degree of freedom to assume a distribution directly applicable to the problem of interest, it is hard to find a theoretical guarantee of convergence. This limitation has shifted some research toward non-Bayesian frameworks. This thesis tries to close this gap by proposing a Bayesian framework with a suggested theoretical bound for the assumed, not necessarily correct, distribution.
In the simulation study, a general Bayesian Cram\'er-Rao lower bound (BCRB) is derived along with the misspecified Bayesian Cram\'er-Rao bound (MBCRB) for the GN model. Both bounds are validated against the mean square error (MSE) performance of the aforementioned algorithms. Also, a quantification of the performance in terms of gains versus losses is introduced as one main finding of this report.
Dissertation/Thesis: Masters Thesis, Computer Engineering 201
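A core ingredient of GAMP under the Bernoulli-Gaussian prior is the scalar posterior-mean (MMSE) denoiser for the effective channel r = x + w. The sketch below implements that denoiser for a zero-mean BG prior (the activity rate and variances are illustrative) and checks by Monte Carlo that it beats the best linear (LMMSE) estimator, which is the payoff of using the correctly specified prior.

```python
import numpy as np

def bg_mmse_denoiser(r, lam, vx, vw):
    """Posterior mean E[x | r] for x ~ (1-lam)*delta_0 + lam*N(0, vx), r = x + w, w ~ N(0, vw)."""
    # Likelihood of r under the "active" and "inactive" hypotheses
    act = lam * np.exp(-r**2 / (2 * (vx + vw))) / np.sqrt(vx + vw)
    inact = (1 - lam) * np.exp(-r**2 / (2 * vw)) / np.sqrt(vw)
    p_active = act / (act + inact)           # posterior activity probability
    return p_active * (vx / (vx + vw)) * r   # shrink toward 0 when likely inactive

rng = np.random.default_rng(2)
lam, vx, vw, n = 0.1, 1.0, 0.1, 200_000
x = (rng.random(n) < lam) * rng.normal(0.0, np.sqrt(vx), n)
r = x + rng.normal(0.0, np.sqrt(vw), n)

mse_mmse = np.mean((bg_mmse_denoiser(r, lam, vx, vw) - x) ** 2)
mse_lmmse = np.mean((lam * vx / (lam * vx + vw) * r - x) ** 2)   # best linear estimator
print(f"MMSE denoiser: {mse_mmse:.4f}, LMMSE: {mse_lmmse:.4f}")
```

The LMMSE error here is lam*vx*vw/(lam*vx + vw) = 0.05; the nonlinear denoiser does better precisely because it exploits the spike-and-slab structure that a misspecified (e.g., Gamma-Normal) model only approximates.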
Law of Log Determinant of Sample Covariance Matrix and Optimal Estimation of Differential Entropy for High-Dimensional Gaussian Distributions
Differential entropy and log determinant of the covariance matrix of a
multivariate Gaussian distribution have many applications in coding,
communications, signal processing and statistical inference. In this paper we
consider in the high dimensional setting optimal estimation of the differential
entropy and the log-determinant of the covariance matrix. We first establish a
central limit theorem for the log determinant of the sample covariance matrix
in the high dimensional setting where the dimension p can grow with the
sample size n. An estimator of the differential entropy and the log
determinant is then considered, and the optimal rate of convergence is
obtained. The estimator is shown to be asymptotically sharp minimax. The
ultra-high dimensional setting where p > n is also discussed.
Comment: 19 pages
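The downward bias of the log determinant of the sample covariance when p grows with n, and the digamma-based correction that underlies estimators of this type, can be checked by Monte Carlo. This is a sketch with illustrative p and n, using the exact Wishart identity E[log det S] = log det Sigma + sum_i psi((n-i+1)/2) + p*log(2/n) for S = X^T X / n; the paper's estimator and its minimax analysis are more refined.

```python
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(3)
p, n, trials = 10, 100, 300

# Exact bias of log det S for Gaussian data with Sigma = I (Bartlett decomposition)
bias = sum(digamma((n - i + 1) / 2) for i in range(1, p + 1)) + p * np.log(2 / n)

raw, corrected = [], []
for _ in range(trials):
    X = rng.standard_normal((n, p))
    _, ld = np.linalg.slogdet(X.T @ X / n)
    raw.append(ld)               # biased: its mean equals `bias` since log det I = 0
    corrected.append(ld - bias)  # debiased estimate of log det Sigma

print(f"mean raw: {np.mean(raw):.3f}, mean corrected: {np.mean(corrected):.3f}")

# Plug-in differential entropy of N(0, Sigma): h = (p/2) log(2 pi e) + (1/2) log det Sigma
h_hat = 0.5 * p * np.log(2 * np.pi * np.e) + 0.5 * np.mean(corrected)
```

Even at p/n = 0.1 the uncorrected log determinant is biased low by roughly 0.55 nats here, which is exactly the effect the central limit theorem in the paper quantifies.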
Joint Covariance Estimation with Mutual Linear Structure
We consider the problem of joint estimation of structured covariance
matrices. Assuming the structure is unknown, estimation is achieved using
heterogeneous training sets. Namely, given groups of measurements coming from
centered populations with different covariances, our aim is to determine the
mutual structure of these covariance matrices and estimate them. Supposing that
the covariances span a low dimensional affine subspace in the space of
symmetric matrices, we develop a new efficient algorithm discovering the
structure and using it to improve the estimation. Our technique is based on the
application of principal component analysis in the matrix space. We also derive
an upper performance bound of the proposed algorithm in the Gaussian scenario
and compare it with the Cramér-Rao lower bound. Numerical simulations are
presented to illustrate the performance benefits of the proposed method.
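The core mechanism — discover a shared low-dimensional affine subspace of symmetric matrices via principal component analysis in matrix space, then project the per-group sample covariances onto it — can be sketched as follows. The sizes, subspace dimension, and synthetic data are illustrative, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(4)
d, K, n, r = 8, 20, 50, 2                  # matrix size, groups, samples/group, subspace dim

# Ground truth: covariances on an affine subspace C0 + span{B_1, ..., B_r}
C0 = 5.0 * np.eye(d)
basis = [0.1 * (M + M.T) for M in rng.standard_normal((r, d, d))]
Sigmas = [C0 + sum(rng.uniform(-1, 1) * B for B in basis) for _ in range(K)]

# Heterogeneous training sets -> per-group sample covariances
S = []
for Sig in Sigmas:
    X = rng.standard_normal((n, d)) @ np.linalg.cholesky(Sig).T
    S.append(X.T @ X / n)

# PCA in matrix space: vectorize, center, keep the top-r principal directions
V = np.stack([M.ravel() for M in S])
mean = V.mean(axis=0)
_, _, Vt = np.linalg.svd(V - mean, full_matrices=False)
proj = Vt[:r]                               # estimated mutual structure
S_hat = [(mean + proj.T @ (proj @ (v - mean))).reshape(d, d) for v in V]

err_raw = np.mean([np.linalg.norm(Sh - Sig) for Sh, Sig in zip(S, Sigmas)])
err_pca = np.mean([np.linalg.norm(Sh - Sig) for Sh, Sig in zip(S_hat, Sigmas)])
print(f"avg Frobenius error raw: {err_raw:.3f}, structured: {err_pca:.3f}")
```

The projection pools information across all K groups (through the matrix-space mean and the shared principal directions), so the structured estimates average out most of the per-group sampling noise that the raw sample covariances retain.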
A Scale Mixture Perspective of Multiplicative Noise in Neural Networks
Corrupting the input and hidden layers of deep neural networks (DNNs) with
multiplicative noise, often drawn from the Bernoulli distribution (or
'dropout'), provides regularization that has significantly contributed to deep
learning's success. However, understanding how multiplicative corruptions
prevent overfitting has been difficult due to the complexity of a DNN's
functional form. In this paper, we show that when a Gaussian prior is placed on
a DNN's weights, applying multiplicative noise induces a Gaussian scale
mixture, which can be reparameterized to circumvent the problematic likelihood
function. Analysis can then proceed by using a type-II maximum likelihood
procedure to derive a closed-form expression revealing how regularization
evolves as a function of the network's weights. Results show that
multiplicative noise forces weights to become either sparse or invariant to
rescaling. We find our analysis has implications for model compression as it
naturally reveals a weight pruning rule that starkly contrasts with the
commonly used signal-to-noise ratio (SNR). While the SNR prunes weights with
large variances, seeing them as noisy, our approach recognizes their robustness
and retains them. We empirically demonstrate our approach has a strong
advantage over the SNR heuristic and is competitive to retraining with soft
targets produced from a teacher model.
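The key observation — multiplicative noise on a Gaussian weight yields a Gaussian scale mixture, which is heavier-tailed than any single Gaussian — can be checked directly. A minimal sketch using Bernoulli ("dropout") noise; positive excess kurtosis is the signature of a scale mixture, whereas a plain Gaussian has excess kurtosis zero.

```python
import numpy as np

rng = np.random.default_rng(5)
n, keep = 1_000_000, 0.5

w = rng.standard_normal(n)                 # Gaussian prior sample, w ~ N(0, 1)
xi = (rng.random(n) < keep) / keep         # dropout noise rescaled so E[xi] = 1
wn = xi * w                                # multiplicative corruption

def excess_kurtosis(x):
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

# The dropout-induced mixture 0.5*delta_0 + 0.5*N(0, 4) has excess kurtosis 3,
# i.e., much heavier tails than the Gaussian it corrupts.
k_gauss, k_mix = excess_kurtosis(w), excess_kurtosis(wn)
print(f"Gaussian: {k_gauss:.2f}, with dropout: {k_mix:.2f}")
```

This heavy-tailed marginal is what makes the reparameterized type-II analysis tractable, and it is also why the induced prior drives weights toward sparsity.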
MODEL-FORM UNCERTAINTY QUANTIFICATION FOR PREDICTIVE PROBABILISTIC GRAPHICAL MODELS
In this thesis, we focus on Uncertainty Quantification and Sensitivity Analysis, which can provide performance guarantees for predictive models built from data in the presence of both aleatoric and epistemic uncertainties, and identify which components of a model have the most influence on predictions of our quantities of interest.
In the first part (Chapter 2), we propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depend critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.
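The effect described above can be reproduced in a toy setting (not the thesis's actual chemistry model): a parameter that does not enter the model directly can still be influential through its correlation with one that does, so correlation-blind and correlation-aware indices rank the parameters differently. The function, indices, and numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n, rho = 100_000, 0.9

# Correlated parameters: theta1 does not appear in f, but is correlated with theta2
cov = np.array([[1.0, rho], [rho, 1.0]])
theta = rng.multivariate_normal([0.0, 0.0], cov, size=n)
f = theta[:, 1] + 0.5 * theta[:, 1] ** 2       # model output depends on theta2 only

# Correlation-blind, gradient-based index: mean squared partial derivative
grad_idx = np.array([0.0, np.mean((1.0 + theta[:, 1]) ** 2)])  # df/dtheta1 = 0 everywhere

# Correlation-aware index: squared correlation of each parameter with the output
corr_idx = np.array([np.corrcoef(theta[:, i], f)[0, 1] ** 2 for i in range(2)])

print("gradient-based:", np.round(grad_idx, 3))
print("correlation-aware:", np.round(corr_idx, 3))
```

The gradient-based index declares theta1 irrelevant, while the correlation-aware index assigns it substantial influence — the same qualitative reversal observed for the correlated adsorption parameters.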
In the second part (Chapters 3-4), we develop new information-based uncertainty quantification and sensitivity analysis methods for Probabilistic Graphical Models. Probabilistic graphical models are an important class of methods for probabilistic modeling and inference, probabilistic machine learning, and probabilistic artificial intelligence. Their hierarchical structure allows us to bring together, in a systematic way, statistical and multi-scale physical modeling, different types of data, expert knowledge, correlations, and causal relationships. However, due to multi-scale modeling, learning from sparse data, and mechanisms without full knowledge, many predictive models will necessarily have diverse sources of uncertainty at different scales. The new model-form uncertainty quantification indices we developed can handle both parametric and non-parametric probabilistic graphical models, as well as small and large model/parameter perturbations, in a single, unified mathematical framework, and provide an envelope of model predictions for our quantities of interest. Moreover, we propose a model-form Sensitivity Index, which allows us to rank the impact of each component of the probabilistic graphical model, and provide a systematic methodology to close the experiment - model - simulation - prediction loop and improve the computational model iteratively based on our new uncertainty quantification and sensitivity analysis methods. To illustrate our ideas, we explore a physicochemical application on the Oxygen Reduction Reaction (ORR) in Chapter 4, whose optimization was identified as a key to the performance of fuel cells.
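One representative information-based index of this family bounds the model-form bias of a quantity of interest f by combining the baseline model's cumulant generating function with the divergence to the alternative model: |E_Q[f] - E_P[f]| <= inf_{c>0} (1/c)(Lambda_P(c; f - E_P[f]) + KL(Q||P)). The sketch below evaluates this goal-oriented bound for a Gaussian mean perturbation, where everything is available in closed form and the bound is tight; the model pair is an illustrative stand-in for a perturbed graphical-model component.

```python
import numpy as np

# Baseline P = N(0, 1), perturbed model Q = N(mu, 1); quantity of interest f(x) = x.
mu = 0.2
kl = 0.5 * mu**2                          # KL(Q || P) for a unit-variance mean shift
true_bias = mu                            # E_Q[f] - E_P[f]

# Centered CGF of f under P: Lambda_P(c) = log E_P[exp(c (x - E_P[x]))] = c^2 / 2
c = np.linspace(0.01, 5.0, 2000)
bound = np.min((c**2 / 2 + kl) / c)       # inf over c of (1/c)(Lambda_P(c) + KL)

print(f"true bias: {true_bias:.4f}, goal-oriented bound: {bound:.4f}")
```

The infimum here is sqrt(2*KL) = 0.2, exactly the true bias — the Gaussian mean shift is the case where this envelope of model predictions is attained.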
In the last part (Chapter 5), we complete our discussion of the uncertainty quantification and sensitivity analysis methods for probabilistic graphical models by introducing a new sensitivity analysis method for the case where we know the real model sits in a certain parametric family. Note that the uncertainty indices above may be too pessimistic (as they are inherently non-parametric) when studying uncertainty/sensitivity questions for models confined within a given parametric family. Therefore, we develop a method using the likelihood ratio and the Fisher information matrix, which can capture correlations and causal dependencies in the graphical models, and we show that it provides more accurate results for parametric probabilistic graphical models.
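A minimal sketch of the score/Fisher-information machinery behind such parametric sensitivity analysis, using a Gaussian N(mu, sigma^2) toy family rather than the thesis's graphical models: the Fisher information matrix estimated from score outer products matches the analytic FIM, and its inverse gives the Cramér-Rao covariance floor one can use to rank parameter sensitivities (off-diagonal entries would capture parameter correlations).

```python
import numpy as np

rng = np.random.default_rng(7)
mu, sigma, n = 1.0, 2.0, 500_000
x = rng.normal(mu, sigma, n)

# Score (gradient of the per-sample log-likelihood) for N(mu, sigma^2)
score = np.stack([(x - mu) / sigma**2,                     # d/d mu
                  (x - mu) ** 2 / sigma**3 - 1 / sigma])   # d/d sigma

fim_mc = score @ score.T / n               # Monte Carlo FIM: E[score score^T]
fim_exact = np.diag([1 / sigma**2, 2 / sigma**2])

print("Monte Carlo FIM:\n", np.round(fim_mc, 3))
crb = np.linalg.inv(fim_mc)                # Cramér-Rao lower bound on the covariance
```

For this family the exact FIM is diag(1/sigma^2, 2/sigma^2); in a graphical model the same score-outer-product estimate is computed from the joint log-likelihood, with correlations and causal dependencies showing up in the off-diagonal structure.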