Empirical Bayes Estimators for High-Dimensional Sparse Vectors
The problem of estimating a high-dimensional sparse vector from an observation in i.i.d. Gaussian noise is considered, with performance measured using squared-error loss. An empirical Bayes shrinkage estimator, derived using a Bernoulli-Gaussian prior, is analyzed and compared with the well-known soft-thresholding estimator. We obtain concentration inequalities for Stein's unbiased risk estimate and the loss function of both estimators. The results show that in the large-dimension limit, both the risk estimate and the loss function concentrate on deterministic values close to the true risk.
Depending on the underlying signal, either the proposed empirical Bayes (eBayes) estimator or soft-thresholding may have smaller loss. We consider a hybrid estimator that attempts to pick the better of the soft-thresholding estimator and the eBayes estimator by comparing their risk estimates. It is shown that: i) the loss of the hybrid estimator concentrates on the minimum of the losses of the two competing estimators, and ii) the risk of the hybrid estimator is within a lower-order term of the minimum of the two risks. Simulation results are provided to support the theoretical results. Finally, we use the eBayes and hybrid estimators as denoisers in the approximate message passing (AMP) algorithm for compressed sensing, and show that their performance is superior to the soft-thresholding denoiser in a wide range of settings. This work was supported in part by a Marie Curie Career Integration Grant (Grant Agreement Number 631489), an Isaac Newton Trust Research Grant, and EPSRC Grant EP/N013999/1.
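The two competing estimators can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the prior parameters `eps` (sparsity level) and `tau2` (slab variance) are assumed to be given, whereas the empirical Bayes approach of the paper estimates them from the data.

```python
import numpy as np

def soft_threshold(y, t):
    """Soft-thresholding estimator: shrink each coordinate toward zero by t."""
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

def ebayes_bernoulli_gaussian(y, eps, tau2, sigma2=1.0):
    """Posterior-mean shrinkage under a Bernoulli-Gaussian prior: each
    coordinate is nonzero with probability eps, drawn from N(0, tau2);
    the observation noise is N(0, sigma2)."""
    s2 = tau2 + sigma2
    # Posterior probability that a coordinate is nonzero, from the two
    # marginal densities N(0, s2) (nonzero) and N(0, sigma2) (zero).
    num = eps * np.exp(-y**2 / (2 * s2)) / np.sqrt(s2)
    den = num + (1 - eps) * np.exp(-y**2 / (2 * sigma2)) / np.sqrt(sigma2)
    p = num / den
    # Conditional on being nonzero, the posterior mean is (tau2/s2) * y.
    return p * (tau2 / s2) * y
```

Both maps shrink small observations heavily and leave large ones nearly intact; they differ in how sharply the shrinkage turns off, which is what drives the loss comparison in the abstract.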
Computational Methods for Matrix/Tensor Factorization and Deep Learning Image Denoising
Feature learning is a technique to automatically extract features from raw data. It is widely used in areas such as computer vision, image processing, data mining and natural language processing. In this thesis, we are interested in the computational aspects of feature learning. We focus on low-rank matrix and tensor factorization and deep neural network models for image denoising.
With respect to matrix and tensor factorization, we first present a technique to speed up alternating least squares (ALS) and gradient descent (GD) − two commonly used strategies for tensor factorization. We introduce an efficient, scalable and distributed algorithm that addresses the data explosion problem. Replacing a computationally challenging sub-step of ALS and GD, we implement the algorithm on parallel machines using only two sparse matrix-vector products. Not only is the algorithm scalable, but it is also on average 4 to 10 times faster than competing algorithms on various data sets. Next, we discuss our results on non-negative matrix factorization for hyperspectral image data in the presence of noise. We introduce a spectral total variation regularization and derive four variants of the alternating direction method of multipliers algorithm. While all four methods belong to the same family of algorithms, some perform better than others. Thus, we compare the algorithms on stimulated Raman spectroscopic image data.
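The ALS strategy mentioned above alternates ridge-regularized least-squares solves over the factors. The sketch below shows the matrix (rank-r) case only, under assumed defaults for the iteration count and regularizer; the thesis's contribution is a distributed tensor variant that replaces the expensive sub-step with two sparse matrix-vector products, which this toy version does not attempt.

```python
import numpy as np

def als_lowrank(X, r, iters=50, lam=1e-3):
    """Alternating least squares for a rank-r factorization X ≈ U @ V.T.
    Each half-step fixes one factor and solves a ridge-regularized
    least-squares problem for the other in closed form."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((m, r))
    V = rng.standard_normal((n, r))
    I = lam * np.eye(r)                       # small ridge for stability
    for _ in range(iters):
        U = X @ V @ np.linalg.inv(V.T @ V + I)    # fix V, solve for U
        V = X.T @ U @ np.linalg.inv(U.T @ U + I)  # fix U, solve for V
    return U, V
```

On an exactly low-rank input the iteration converges quickly to a near-exact factorization; the cost per sweep is dominated by the products with X, which is the part the thesis accelerates in the sparse, distributed setting.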
For deep neural network models, we focus on their application to image denoising. We first demonstrate how an optimal procedure leveraging deep neural networks and convex optimization can combine a given set of denoisers to produce an overall better result. The proposed framework estimates the mean squared error (MSE) of individual denoised outputs using a deep neural network; optimally combines the denoised outputs via convex optimization; and recovers lost details of the combined images using another deep neural network. The framework consistently improves denoising performance for both deterministic denoisers and neural network denoisers. Next, we apply deep neural networks to solve the image reconstruction issues of the Quanta Image Sensor (QIS), which is a single-photon image sensor that oversamples the light field to generate binary measurements.
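The combination stage of such a framework admits a simple closed form in the idealized case. As a hedged illustration only: if the denoisers' errors are assumed zero-mean and uncorrelated, the convex combination minimizing the combined MSE weights each output inversely to its estimated MSE. The thesis's actual pipeline estimates the MSEs with a neural network and solves a convex program; here the MSE estimates are simply taken as given.

```python
import numpy as np

def combine_denoisers(outputs, mse_estimates):
    """Convex combination of denoised outputs, weighting each output
    inversely to its estimated MSE. This is the minimizer of the combined
    MSE under the (idealized) assumption of uncorrelated, zero-mean errors:
    minimize sum_i w_i^2 * mse_i subject to sum_i w_i = 1, w_i >= 0."""
    w = 1.0 / np.asarray(mse_estimates, dtype=float)
    w /= w.sum()                       # normalize onto the simplex
    combined = sum(wi * out for wi, out in zip(w, outputs))
    return combined, w
```

With two outputs of estimated MSE 1 and 3, the better denoiser receives weight 0.75 and the worse one 0.25, so the combination leans toward, but does not discard, the weaker estimate.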
Sharp Time--Data Tradeoffs for Linear Inverse Problems
In this paper we characterize sharp time-data tradeoffs for optimization problems used for solving linear inverse problems. We focus on the minimization of a least-squares objective subject to a constraint defined as the sub-level set of a penalty function. We present a unified convergence analysis of the gradient projection algorithm applied to such problems. We sharply characterize the convergence rate associated with a wide variety of random measurement ensembles in terms of the number of measurements and the structural complexity of the signal with respect to the chosen penalty function. The results apply to both convex and nonconvex constraints, demonstrating that a linear convergence rate is attainable even though the least-squares objective is not strongly convex in these settings. When specialized to Gaussian measurements, our results show that such linear convergence occurs when the number of measurements is merely 4 times the minimal number required to recover the desired signal at all (a.k.a. the phase transition). We also achieve a slower but geometric rate of convergence precisely above the phase transition point. Extensive numerical results suggest that the derived rates exactly match the empirical performance.
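The gradient projection scheme analyzed above alternates a gradient step on the least-squares objective with a projection onto the constraint set. A minimal sketch for one concrete nonconvex constraint, the sparsity level ‖x‖₀ ≤ k (whose projection is hard thresholding, giving iterative hard thresholding), with an assumed step size of 1/‖A‖²:

```python
import numpy as np

def projected_gradient(A, y, k, iters=500, step=None):
    """Gradient projection for min ||Ax - y||^2 subject to ||x||_0 <= k,
    a nonconvex sub-level-set constraint. Each iteration takes a gradient
    step on the least-squares objective, then projects by keeping the k
    largest-magnitude entries (iterative hard thresholding)."""
    m, n = A.shape
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/||A||^2 for stability
    x = np.zeros(n)
    for _ in range(iters):
        x = x + step * A.T @ (y - A @ x)         # gradient step
        keep = np.argsort(np.abs(x))[-k:]        # projection onto ||x||_0 <= k
        mask = np.zeros(n)
        mask[keep] = 1.0
        x *= mask
    return x
```

With Gaussian measurements and a measurement count comfortably above the phase transition, the iterates converge geometrically to the planted sparse signal, consistent with the linear-rate regime the abstract describes.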
Approximate Message Passing for the Matrix Tensor Product Model
We propose and analyze an approximate message passing (AMP) algorithm for the
matrix tensor product model, which is a generalization of the standard spiked
matrix models that allows for multiple types of pairwise observations over a
collection of latent variables. A key innovation for this algorithm is a method
for optimally weighing and combining multiple estimates in each iteration.
Building upon an AMP convergence theorem for non-separable functions, we prove
a state evolution for non-separable functions that provides an asymptotically
exact description of its performance in the high-dimensional limit. We leverage
this state evolution result to provide necessary and sufficient conditions for
recovery of the signal of interest. Such conditions depend on the singular
values of a linear operator derived from an appropriate generalization of a
signal-to-noise ratio for our model. Our results recover as special cases a
number of recently proposed methods for contextual models (e.g., covariate
assisted clustering) as well as inhomogeneous noise models
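For intuition about the AMP template being generalized here, the sketch below runs AMP on the simplest special case: the rank-one spiked Wigner model with a ±1 signal. The `tanh` denoiser and the informative initialization are illustrative assumptions, not the paper's construction; the distinctive ingredient is the Onsager correction, which keeps the iterate's effective noise approximately Gaussian and is what makes the state-evolution description exact in the large-n limit.

```python
import numpy as np

def amp_spiked_wigner(Y, x0, iters=10):
    """AMP for the rank-one spiked Wigner model Y = (lam/n) v v^T + W,
    with v in {-1, +1}^n, using tanh as the (assumed ±1-prior) denoiser.
    The Onsager term -b * f_prev corrects for the correlation the matrix
    multiply would otherwise reintroduce between iterations."""
    n = len(x0)
    x, f_prev = x0.copy(), np.zeros(n)
    for _ in range(iters):
        f = np.tanh(x)
        b = np.mean(1.0 - f**2)   # Onsager coefficient: (1/n) sum f'(x_i)
        x, f_prev = Y @ f - b * f_prev, f
    return x
```

Above the recovery threshold, the sign pattern of the final iterate aligns strongly with the planted signal, which is the scalar-overlap quantity that state evolution tracks.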
Optimal denoising of rotationally invariant rectangular matrices
In this manuscript we consider denoising of large rectangular matrices: given a noisy observation of a signal matrix, what is the best way of recovering the signal matrix itself? For Gaussian noise and rotationally invariant signal priors, we completely characterize the optimal denoiser and its performance in the high-dimensional limit, in which the size of the signal matrix goes to infinity with fixed aspect ratio, under the Bayes-optimal setting, that is, when the statistician knows how the signal and the observations were generated. Our results generalise previous works that considered only symmetric matrices to the more general case of non-symmetric and rectangular ones. We explore analytically and numerically a particular choice of factorized signal prior that models cross-covariance matrices and the matrix factorization problem. As a byproduct of our analysis, we provide an explicit asymptotic evaluation of the rectangular Harish-Chandra-Itzykson-Zuber integral in a special case.
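A rotationally invariant denoiser necessarily acts through the SVD of the observation: it keeps the singular vectors and applies a scalar shrinker to the singular values. The sketch below uses a crude stand-in shrinker, hard thresholding at the Marchenko-Pastur bulk edge, rather than the Bayes-optimal shrinker the paper derives; the noise model `Y = X + G` with i.i.d. N(0, sigma²/n) entries in G is an assumption of this illustration.

```python
import numpy as np

def svd_denoise(Y, sigma=1.0):
    """Rotationally invariant matrix denoiser: SVD of the observation,
    scalar shrinkage of the singular values, singular vectors kept as-is.
    Here the shrinker is a hard threshold at the noise bulk edge
    sigma * (sqrt(m) + sqrt(n)) / sqrt(n), below which singular values
    are indistinguishable from pure noise."""
    m, n = Y.shape
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    edge = sigma * (np.sqrt(m) + np.sqrt(n)) / np.sqrt(n)
    s = np.where(s > edge, s, 0.0)    # zero out bulk (noise-only) values
    return U @ np.diag(s) @ Vt
```

For a low-rank signal with singular values well above the bulk edge, this already removes most of the noise energy; the optimal shrinker of the paper additionally debiases the retained singular values.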