On Stein's Identity and Near-Optimal Estimation in High-dimensional Index Models
We consider estimating the parametric components of semi-parametric multiple
index models in a high-dimensional and non-Gaussian setting. Such models form a
rich class of non-linear models with applications to signal processing, machine
learning and statistics. Our estimators leverage the score-function-based
first- and second-order Stein's identities and do not require the covariates to
satisfy the Gaussian or elliptical symmetry assumptions common in the literature.
Moreover, to handle score functions and responses that are heavy-tailed, our
estimators are constructed by carefully thresholding their empirical
counterparts. We show that our estimators achieve a near-optimal statistical rate
of convergence in several settings. We supplement our theoretical results with
simulation experiments that confirm the theory.
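
As a rough illustration of the first-order Stein's identity mentioned above, the sketch below recovers the direction of a single-index model y = f(<x, beta>) + noise from the truncated empirical mean of y * S(x), where S(x) = -grad log p(x) is the score of the covariate density. This is not the paper's estimator: the logistic covariate distribution, the link function, the noise law, the truncation level `tau`, and the helper names `logistic_score` and `stein_direction` are all assumptions chosen for illustration, and the sparse high-dimensional and second-order cases are not covered.

```python
import numpy as np


def logistic_score(X):
    # Assumption: coordinates are i.i.d. standard logistic, for which
    # -d/dx log p(x) = tanh(x / 2), applied coordinate-wise.
    return np.tanh(X / 2.0)


def stein_direction(X, y, score, tau):
    # First-order Stein's identity: E[y * S(x)] = E[f'(<x, beta>)] * beta.
    # Each product is clipped to [-tau, tau] before averaging to limit
    # the influence of heavy-tailed responses.
    products = y[:, None] * score(X)
    truncated = np.clip(products, -tau, tau)
    theta = truncated.mean(axis=0)
    return theta / np.linalg.norm(theta)


rng = np.random.default_rng(0)
n, d = 50_000, 20

# Hypothetical ground-truth direction (unit norm).
beta = rng.normal(size=d)
beta /= np.linalg.norm(beta)

# Non-Gaussian covariates with a known score function.
X = rng.logistic(size=(n, d))

# Hypothetical monotone link f(t) = t + tanh(t), with Student-t noise.
t = X @ beta
y = t + np.tanh(t) + rng.standard_t(df=5, size=n)

beta_hat = stein_direction(X, y, logistic_score, tau=10.0)
cosine = float(beta_hat @ beta)
print(f"cosine similarity with the true direction: {cosine:.4f}")
```

The estimate is only identified up to scale, which is why the sketch normalizes it and reports a cosine similarity rather than an error in beta itself.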
Some Algorithms and Paradigms for Big Data
The reality of big data poses both opportunities and challenges to modern researchers. Its key features -- large sample sizes, high-dimensional feature spaces, and structural complexity -- demand new paradigms for designing data analysis algorithms that are both effective and algorithmically efficient. In this dissertation, we illustrate a few such paradigms through the analysis of three new algorithms. The first two algorithms consider the problem of phase retrieval, in which we seek to recover a signal from random rank-one quadratic measurements. We first show that an adaptation of the randomized Kaczmarz method provably exhibits linear convergence so long as our sample size is linear in the signal dimension. Next, we show that the standard SDP relaxation of sparse PCA yields an algorithm that recovers the signal in sparse, model-misspecified phase retrieval with a sample complexity that scales with the square of the sparsity parameter. Finally, our third algorithm addresses the problem of Non-Gaussian Component Analysis, in which we try to identify the non-Gaussian marginals of a high-dimensional distribution. We prove that our algorithm exhibits polynomial-time convergence with polynomial sample complexity.
PHD
Mathematics
University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/145895/1/yanshuo_1.pd
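
The randomized Kaczmarz adaptation for phase retrieval described in the dissertation abstract can be sketched as follows, for real-valued signals and noiseless magnitude measurements b_i = |<a_i, x>|. This is a minimal illustration, not the dissertation's exact algorithm: the spectral initialization, uniform row sampling, problem sizes, and the helper names `spectral_init` and `kaczmarz_phase_retrieval` are assumptions made here.

```python
import numpy as np


def spectral_init(A, b):
    # Assumed initialization: leading eigenvector of
    # (1/m) * sum_i b_i^2 a_i a_i^T, rescaled to the estimated signal norm.
    m = A.shape[0]
    Y = (A * (b ** 2)[:, None]).T @ A / m
    _, eigvecs = np.linalg.eigh(Y)  # eigenvalues in ascending order
    return eigvecs[:, -1] * np.sqrt(np.mean(b ** 2))


def kaczmarz_phase_retrieval(A, b, x0, n_iter, rng):
    # Each step picks a random measurement, guesses its sign from the
    # current iterate, and projects onto the hyperplane
    # <a_i, x> = sign(<a_i, x>) * b_i.
    x = x0.copy()
    row_norms_sq = np.sum(A ** 2, axis=1)
    for i in rng.integers(0, A.shape[0], size=n_iter):
        inner = A[i] @ x
        residual = inner - np.sign(inner) * b[i]
        x -= (residual / row_norms_sq[i]) * A[i]
    return x


rng = np.random.default_rng(0)
d, m = 20, 400  # sample size linear in the dimension (here m = 20 * d)

x_true = rng.normal(size=d)      # hypothetical signal
A = rng.normal(size=(m, d))      # Gaussian measurement vectors
b = np.abs(A @ x_true)           # magnitude-only measurements

x0 = spectral_init(A, b)
x_hat = kaczmarz_phase_retrieval(A, b, x0, n_iter=20_000, rng=rng)

# The signal is only identifiable up to a global sign.
err = min(np.linalg.norm(x_hat - x_true), np.linalg.norm(x_hat + x_true))
rel_err = err / np.linalg.norm(x_true)
print(f"relative error up to sign: {rel_err:.2e}")
```

Each update touches a single measurement, so the per-iteration cost is proportional to the signal dimension.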