Estimation with Norm Regularization
Analysis of non-asymptotic estimation error and structured statistical
recovery based on norm regularized regression, such as Lasso, needs to consider
four aspects: the norm, the loss function, the design matrix, and the noise
model. This paper presents generalizations of such estimation error analysis on
all four aspects compared to the existing literature. We characterize the
restricted error set where the estimation error vector lies, establish
relations between error sets for the constrained and regularized problems, and
present an estimation error bound applicable to any norm. Precise
characterizations of the bound are presented for isotropic as well as
anisotropic subGaussian design matrices, subGaussian noise models, and convex
loss functions, including least squares and generalized linear models. Generic
chaining and associated results play an important role in the analysis. A key
result from the analysis is that the sample complexity of all such estimators
depends on the Gaussian width of a spherical cap corresponding to the
restricted error set. Further, once the number of samples $n$ crosses the
required sample complexity, the estimation error decreases as
$c/\sqrt{n}$, where $c$ depends on the Gaussian width of the unit norm
ball.
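The Gaussian width of a unit norm ball, mentioned above, can be approximated numerically. The following is a minimal sketch, not code from the paper: it relies on the standard fact that the supremum of a linear functional over a unit norm ball equals the dual norm, so the width is the expected dual norm of a standard Gaussian vector. The function names, the dimension `p`, the trial count, and the seed are all illustrative assumptions.

```python
import math
import random


def gaussian_width_unit_ball(dual_norm, p, trials=500, seed=0):
    """Monte Carlo estimate of E[sup_{||u|| <= 1} <g, u>] = E[dual_norm(g)].

    dual_norm: function computing the dual of the norm defining the ball.
    p: ambient dimension; trials and seed are illustrative choices.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Draw a standard Gaussian vector g in R^p.
        g = [rng.gauss(0.0, 1.0) for _ in range(p)]
        total += dual_norm(g)
    return total / trials


def linf_norm(v):
    # Dual of the L1 norm.
    return max(abs(x) for x in v)


def l2_norm(v):
    # The L2 norm is its own dual.
    return math.sqrt(sum(x * x for x in v))


p = 200
w_l1_ball = gaussian_width_unit_ball(linf_norm, p)  # width of the unit L1 ball
w_l2_ball = gaussian_width_unit_ball(l2_norm, p)    # width of the unit L2 ball
print(w_l1_ball, w_l2_ball)
```

Under these assumptions the L1-ball estimate grows roughly like the square root of log p, while the L2-ball estimate grows like the square root of p, which is one way to see why the width of the norm ball matters for the constant in the error rate.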
A Geometric View on Constrained M-Estimators
We study the estimation error of constrained M-estimators, and derive
explicit upper bounds on the expected estimation error determined by the
Gaussian width of the constraint set. Both of the cases where the true
parameter is on the boundary of the constraint set (matched constraint), and
where the true parameter is strictly in the constraint set (mismatched
constraint) are considered. For both cases, we derive novel universal
estimation error bounds for regression in a generalized linear model with the
canonical link function. Our error bound for the mismatched constraint case is
minimax optimal in terms of its dependence on the sample size, for Gaussian
linear regression by the Lasso.
High Dimensional Data Enrichment: Interpretable, Fast, and Data-Efficient
The high-dimensional structured data-enriched model describes groups of
observations by shared and per-group individual parameters, each with its own
structure such as sparsity or group sparsity. In this paper, we consider the
general form of data enrichment where data comes in a fixed but arbitrary
number of groups G. Any convex function, e.g., norms, can characterize the
structure of both shared and individual parameters. We propose an estimator for
the high-dimensional data-enriched model and provide conditions under which it
consistently estimates both shared and individual parameters. We also delineate
the sample complexity of the estimator and present a high-probability
non-asymptotic bound on the estimation error of all parameters. Interestingly, the sample
complexity of our estimator translates to conditions on both per-group sample
sizes and the total number of samples. We propose an iterative estimation
algorithm with linear convergence rate and supplement our theoretical analysis
with synthetic and real experimental results. In particular, we show the
predictive power of the data-enriched model along with its interpretable results in
anticancer drug sensitivity analysis.
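To make the shared and per-group parameter structure concrete, here is a small simulation sketch. It assumes a linear model in which each group's coefficient vector is the sum of a shared sparse vector and a group-specific sparse vector; the abstract does not spell out this additive form, so treat it, along with every name and size below, as an illustrative assumption. It shows only the data-generating side and does not implement the paper's estimator.

```python
import random


def sparse_vector(p, nonzeros, rng):
    """Return a length-p vector with `nonzeros` Gaussian entries, zeros elsewhere."""
    v = [0.0] * p
    for j in rng.sample(range(p), nonzeros):
        v[j] = rng.gauss(0.0, 1.0)
    return v


def simulate_group(n, shared, individual, noise_std, rng):
    """Draw n observations with y = <x, shared + individual> + noise (assumed form)."""
    p = len(shared)
    combined = [s + b for s, b in zip(shared, individual)]
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [
        sum(x_j * c_j for x_j, c_j in zip(row, combined)) + rng.gauss(0.0, noise_std)
        for row in X
    ]
    return X, y


rng = random.Random(0)
p, G = 20, 3                       # dimension and number of groups (illustrative)
shared = sparse_vector(p, 4, rng)  # parameter shared by all groups
individuals = [sparse_vector(p, 2, rng) for _ in range(G)]  # per-group parameters
# Noise-free data so the structure is easy to check by hand.
data = [simulate_group(15, shared, individuals[g], 0.0, rng) for g in range(G)]
```

Each entry of `data` holds one group's design rows and responses; an estimator for this setting would need to recover `shared` and every element of `individuals` from them.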