146 research outputs found
Convergence of the Exponentiated Gradient Method with Armijo Line Search
Consider the problem of minimizing a convex differentiable function on the
probability simplex, spectrahedron, or set of quantum density matrices. We
prove that the exponentiated gradient method with Armijo line search always
converges to the optimum, if the sequence of iterates possesses a strictly
positive limit point (element-wise for the vector case, and with respect to the
Löwner partial ordering for the matrix case). To the best of our knowledge, this
is the first convergence result for a mirror descent-type method that only
requires differentiability. The proof exploits self-concordant likeness of the
log-partition function, which is of independent interest. Comment: 18 pages
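A minimal sketch of the vector (probability simplex) case may help fix ideas: exponentiated gradient steps combined with a backtracking Armijo-type line search. The function names, the sufficient-decrease test, and the parameter values below are illustrative assumptions, not the exact procedure analyzed in the paper.

```python
import numpy as np

def eg_armijo(f, grad_f, x0, n_iters=100, eta0=1.0, beta=0.5, sigma=1e-4):
    """Exponentiated gradient with a backtracking (Armijo-type) line search.

    Minimizes a differentiable convex f over the probability simplex.
    The sufficient-decrease test below is one common Armijo variant and is
    only illustrative; the paper's exact criterion may differ.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad_f(x)
        eta = eta0
        while True:
            # Multiplicative (mirror) update with the entropy mirror map.
            z = x * np.exp(-eta * g)
            x_new = z / z.sum()
            # Armijo-type sufficient decrease along the feasible arc.
            if f(x_new) <= f(x) + sigma * g.dot(x_new - x) or eta < 1e-12:
                break
            eta *= beta  # backtrack
        x = x_new
    return x

# Example: minimize a convex quadratic over the two-dimensional simplex.
if __name__ == "__main__":
    A = np.array([[2.0, 0.5], [0.5, 1.0]])
    b = np.array([1.0, 0.2])
    f = lambda x: 0.5 * x @ A @ x - b @ x
    grad_f = lambda x: A @ x - b
    print(eg_armijo(f, grad_f, np.array([0.5, 0.5])))
```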
A Geometric View on Constrained M-Estimators
We study the estimation error of constrained M-estimators, and derive
explicit upper bounds on the expected estimation error determined by the
Gaussian width of the constraint set. We consider both the case where the true
parameter lies on the boundary of the constraint set (matched constraint) and
the case where it lies strictly inside the constraint set (mismatched
constraint). For both cases, we derive novel universal
estimation error bounds for regression in a generalized linear model with the
canonical link function. Our error bound for the mismatched constraint case is
minimax optimal in its dependence on the sample size for Gaussian linear
regression by the Lasso.
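To make "constrained M-estimator" concrete, here is a minimal sketch of one instance discussed above: Gaussian linear regression with an ℓ1-ball constraint (the constrained form of the Lasso), fit by projected gradient descent. The radius, step-size rule, and helper names are illustrative choices, not taken from the paper.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection onto the l1-ball of the given radius."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    k = np.nonzero(u * np.arange(1, len(v) + 1) > css - radius)[0][-1]
    tau = (css[k] - radius) / (k + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def constrained_lasso(X, y, radius, n_iters=500):
    """Constrained M-estimator: least squares subject to ||theta||_1 <= radius,
    solved by projected gradient descent. The step size is set from the
    Lipschitz constant of the averaged quadratic loss."""
    n, d = X.shape
    step = n / (np.linalg.norm(X, 2) ** 2)
    theta = np.zeros(d)
    for _ in range(n_iters):
        grad = X.T @ (X @ theta - y) / n
        theta = project_l1_ball(theta - step * grad, radius)
    return theta
```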
Two Polyak-Type Step Sizes for Mirror Descent
We propose two Polyak-type step sizes for mirror descent and prove their
convergence for minimizing convex, locally Lipschitz functions. Both step
sizes, unlike the original Polyak step size, do not need the optimal value of
the objective function. Comment: 13 pages
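For contrast, a minimal sketch of mirror descent on the probability simplex with the classical Polyak step size, which needs the optimal value f_star; the two step sizes proposed in the paper are designed to avoid exactly this requirement, and their formulas are not reproduced here. The entropy mirror map and the infinity-norm normalization below are illustrative choices.

```python
import numpy as np

def mirror_descent_polyak(f, subgrad, x0, f_star, n_iters=200):
    """Mirror descent on the probability simplex (entropy mirror map) with the
    classical Polyak step size eta_t = (f(x_t) - f_star) / ||g_t||_inf^2.
    Note: this classical rule requires the optimal value f_star; the paper's
    two Polyak-type step sizes are designed to avoid that requirement."""
    x = np.asarray(x0, dtype=float)
    best = x.copy()
    for _ in range(n_iters):
        g = subgrad(x)
        gap = f(x) - f_star
        if gap <= 0:
            break
        eta = gap / max(np.linalg.norm(g, np.inf) ** 2, 1e-12)
        # Multiplicative update induced by the entropy mirror map.
        z = x * np.exp(-eta * g)
        x = z / z.sum()
        if f(x) < f(best):
            best = x.copy()
    return best
```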
Learning without Smoothness and Strong Convexity
Recent advances in statistical learning and convex optimization have inspired many successful practices. Standard theories assume smoothness---bounded gradient, Hessian, etc.---and strong convexity of the loss function. Unfortunately, such conditions may not hold in important real-world applications, and sometimes fulfilling the conditions incurs unnecessary performance degradation. Below are three examples.
1. The standard theory for variable selection via L_1-penalization only considers the linear regression model, as the corresponding quadratic loss function has a constant Hessian and allows for exact second-order Taylor series expansion. In practice, however, non-linear regression models are often chosen to match data characteristics.
2. The standard theory for convex optimization considers almost exclusively smooth functions. Important applications such as portfolio selection and quantum state estimation, however, correspond to loss functions that violate the smoothness assumption; existing convergence guarantees for optimization algorithms hence do not apply.
3. The standard theory for compressive magnetic resonance imaging (MRI) guarantees the restricted isometry property (RIP; its standard form is recalled after this abstract)---a smoothness and strong convexity condition on the quadratic loss restricted to the set of sparse vectors---via random uniform sampling. The random uniform sampling strategy, however, yields unsatisfactory signal reconstruction performance empirically, in comparison to heuristic sampling approaches.
In this thesis, we provide rigorous solutions to the three examples above and other related problems. For the first two problems above, our key idea is to instead consider weaker, localized versions of the smoothness condition. For the third, our solution is to propose a new theoretical framework for compressive MRI: We pose compressive MRI as a statistical learning problem, and solve it by empirical risk minimization. Interestingly, the RIP is not required in this framework.
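For reference, the display below recalls the standard textbook form of the restricted isometry property mentioned in item 3; the order s and the constant δ_s are the usual generic parameters, not values taken from the thesis.

```latex
% Standard statement of the restricted isometry property (RIP) of order s,
% as referenced in item 3 above; A is the measurement matrix and
% \delta_s \in (0, 1) is the restricted isometry constant.
(1 - \delta_s)\,\|x\|_2^2 \;\le\; \|A x\|_2^2 \;\le\; (1 + \delta_s)\,\|x\|_2^2
\qquad \text{for all } s\text{-sparse } x .
```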
Online Self-Concordant and Relatively Smooth Minimization, With Applications to Online Portfolio Selection and Learning Quantum States
Consider an online convex optimization problem where the loss functions are
self-concordant barriers, smooth relative to a convex function h, and possibly
non-Lipschitz. We analyze the regret of online mirror descent with h as the
regularizer. Then, based on the result, we prove the following in a unified
manner. Denote by T the time horizon and by d the parameter dimension. 1. For
online portfolio selection, the regret of a variant of exponentiated gradient
due to Helmbold et al. improves on its original regret bound when T is
sufficiently large relative to d. 2. For online portfolio selection, the regret
of online mirror descent with the logarithmic barrier matches that of
Soft-Bayes due to Orseau et al. up to logarithmic terms. 3. For online learning
quantum states with the logarithmic loss, the regret of online mirror descent
with the log-determinant function is of the same order. Its per-iteration time
is shorter than that of all existing algorithms we know. Comment: 19 pages, 1
figure
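For context on item 1, a minimal sketch of the classical exponentiated-gradient update of Helmbold et al. for online portfolio selection. This is the baseline multiplicative update only; it is not the specific variant, the mirror-descent methods, or the step sizes analyzed in the paper, and the constant learning rate is an illustrative assumption.

```python
import numpy as np

def eg_portfolio(price_relatives, eta=0.05):
    """Classical exponentiated-gradient (EG) online portfolio selection
    (Helmbold et al.-style update). price_relatives has shape (T, d); entry
    (t, i) is the closing/opening price ratio of asset i in round t.
    Returns the sequence of portfolios and the cumulative log-wealth."""
    T, d = price_relatives.shape
    x = np.full(d, 1.0 / d)         # start from the uniform portfolio
    log_wealth = 0.0
    portfolios = []
    for t in range(T):
        r = price_relatives[t]
        portfolios.append(x.copy())
        gain = x @ r                # wealth factor in this round
        log_wealth += np.log(gain)  # the log-loss is -log(x . r)
        # Multiplicative EG update: assets that did well get more weight.
        w = x * np.exp(eta * r / gain)
        x = w / w.sum()
    return np.array(portfolios), log_wealth
```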
Consistency of ℓ1-Regularized Maximum-Likelihood for Compressive Poisson Regression
We consider Poisson regression with the canonical link function. This regression model is widely used in regression analysis involving count data; one important application in electrical engineering is transmission tomography. In this paper, we establish the variable selection consistency and estimation consistency of the ℓ1-regularized maximum-likelihood estimator in this regression model, and characterize the asymptotic sample complexity that ensures consistency even under the compressive sensing setting (i.e., the high-dimensional setting in statistics).
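A minimal sketch of the estimator under study, assuming the canonical (log) link: the ℓ1-penalized Poisson negative log-likelihood minimized by proximal gradient (ISTA) with soft-thresholding. The constant step size and regularization weight are illustrative; the paper studies statistical consistency of the estimator, not a particular solver.

```python
import numpy as np

def soft_threshold(v, t):
    """Soft-thresholding operator, the proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_poisson_mle(A, y, lam, step=1e-2, n_iters=1000):
    """l1-regularized maximum likelihood for Poisson regression with the
    canonical (log) link: minimize
        (1/n) * sum_i [exp(a_i . theta) - y_i * (a_i . theta)] + lam * ||theta||_1
    by proximal gradient (ISTA). The constant step size is illustrative; the
    Poisson log-likelihood is not globally smooth, so a line search may be
    preferable in practice."""
    n, d = A.shape
    theta = np.zeros(d)
    for _ in range(n_iters):
        mu = np.exp(A @ theta)           # model means under the log link
        grad = A.T @ (mu - y) / n        # gradient of the averaged negative log-likelihood
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta
```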