
    Convergence of the Exponentiated Gradient Method with Armijo Line Search

    Consider the problem of minimizing a convex differentiable function on the probability simplex, spectrahedron, or set of quantum density matrices. We prove that the exponentiated gradient method with Armijo line search always converges to the optimum, provided the sequence of iterates possesses a strictly positive limit point (element-wise for the vector case, and with respect to the Löwner partial ordering for the matrix case). To the best of our knowledge, this is the first convergence result for a mirror descent-type method that requires only differentiability. The proof exploits self-concordant likeness of the log-partition function, which is of independent interest. Comment: 18 pages
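
    For concreteness, here is a minimal Python sketch of an exponentiated gradient iteration with Armijo backtracking on the probability simplex. It is not the paper's exact procedure; f, grad_f, and the parameters eta0, beta, and sigma are illustrative placeholders.

    # Minimal sketch (not the paper's exact procedure) of exponentiated gradient
    # with Armijo backtracking on the probability simplex.
    import numpy as np

    def eg_armijo(f, grad_f, x0, eta0=1.0, beta=0.5, sigma=1e-4, max_iter=500):
        x = np.asarray(x0, dtype=float)
        x = x / x.sum()                      # start on the simplex
        for _ in range(max_iter):
            g = grad_f(x)
            eta = eta0
            while True:
                y = x * np.exp(-eta * g)     # multiplicative (exponentiated) step
                y /= y.sum()                 # renormalize onto the simplex
                # Armijo condition: sufficient decrease along the EG step
                if f(y) <= f(x) + sigma * g.dot(y - x) or eta < 1e-12:
                    break
                eta *= beta                  # backtrack
            x = y
        return x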

    A Geometric View on Constrained M-Estimators

    We study the estimation error of constrained M-estimators and derive explicit upper bounds on the expected estimation error in terms of the Gaussian width of the constraint set. We consider both the case where the true parameter lies on the boundary of the constraint set (matched constraint) and the case where it lies strictly inside the constraint set (mismatched constraint). For both cases, we derive novel universal estimation error bounds for regression in a generalized linear model with the canonical link function. Our error bound for the mismatched constraint case is minimax optimal in its dependence on the sample size for Gaussian linear regression with the Lasso.
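
    For reference, the Gaussian width of a set K that drives bounds of this kind is the standard quantity

    \[
    w(K) \;=\; \mathbb{E}_{g \sim \mathcal{N}(0, I_d)} \, \sup_{x \in K} \langle g, x \rangle ,
    \]

    and such bounds typically scale like $w(\cdot)/\sqrt{n}$ in the sample size $n$; the paper's exact constants and localizations are not reproduced here.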

    Two Polyak-Type Step Sizes for Mirror Descent

    We propose two Polyak-type step sizes for mirror descent and prove their convergence for minimizing convex, locally Lipschitz functions. Unlike the original Polyak step size, neither of the proposed step sizes requires the optimal value of the objective function. Comment: 13 pages
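
    As a reference point, here is a Python sketch of entropic mirror descent on the simplex with the classical Polyak step size, which requires the optimal value f_opt; the paper's two step sizes are variants designed to avoid that requirement and are not reproduced here.

    # Reference sketch: entropic mirror descent with the classical Polyak step.
    import numpy as np

    def md_polyak(f, subgrad, x0, f_opt, max_iter=1000, eps=1e-12):
        x = np.asarray(x0, dtype=float)
        x = x / x.sum()
        for _ in range(max_iter):
            g = subgrad(x)
            gap = f(x) - f_opt
            if gap <= eps:
                break
            # Polyak step; the dual norm for the entropy mirror map is l_inf
            eta = gap / (np.linalg.norm(g, np.inf) ** 2 + eps)
            x = x * np.exp(-eta * g)     # entropic mirror step
            x /= x.sum()
        return x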

    Learning without Smoothness and Strong Convexity

    Recent advances in statistical learning and convex optimization have inspired many successful practices. Standard theories assume smoothness---bounded gradient, Hessian, etc.---and strong convexity of the loss function. Unfortunately, such conditions may not hold in important real-world applications, and fulfilling them sometimes incurs unnecessary performance degradation. Three examples follow. 1. The standard theory for variable selection via L_1-penalization considers only the linear regression model, as the corresponding quadratic loss function has a constant Hessian and admits an exact second-order Taylor expansion. In practice, however, non-linear regression models are often chosen to match data characteristics. 2. The standard theory for convex optimization considers almost exclusively smooth functions. Important applications such as portfolio selection and quantum state estimation, however, correspond to loss functions that violate the smoothness assumption, so existing convergence guarantees for optimization algorithms do not apply. 3. The standard theory for compressive magnetic resonance imaging (MRI) guarantees the restricted isometry property (RIP)---a smoothness and strong convexity condition on the quadratic loss restricted to the set of sparse vectors---via random uniform sampling. Empirically, however, random uniform sampling yields unsatisfactory signal reconstruction compared with heuristic sampling approaches. In this thesis, we provide rigorous solutions to the three problems above and other related problems. For the first two, our key idea is to consider weaker, localized versions of the smoothness condition. For the third, we propose a new theoretical framework for compressive MRI: we pose compressive MRI as a statistical learning problem and solve it by empirical risk minimization. Interestingly, the RIP is not required in this framework.
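
    For reference, the restricted isometry property mentioned in the third example is the standard condition that, for some constant $\delta_s \in (0,1)$ and all $s$-sparse vectors $x$,

    \[
    (1 - \delta_s)\,\|x\|_2^2 \;\le\; \|A x\|_2^2 \;\le\; (1 + \delta_s)\,\|x\|_2^2 .
    \]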

    Online Self-Concordant and Relatively Smooth Minimization, With Applications to Online Portfolio Selection and Learning Quantum States

    Consider an online convex optimization problem where the loss functions are self-concordant barriers, smooth relative to a convex function $h$, and possibly non-Lipschitz. We analyze the regret of online mirror descent with $h$. Then, based on the result, we prove the following in a unified manner. Denote by $T$ the time horizon and $d$ the parameter dimension. 1. For online portfolio selection, the regret of $\widetilde{\text{EG}}$, a variant of exponentiated gradient due to Helmbold et al., is $\tilde{O}(T^{2/3} d^{1/3})$ when $T > 4 d / \log d$. This improves on the original $\tilde{O}(T^{3/4} d^{1/2})$ regret bound for $\widetilde{\text{EG}}$. 2. For online portfolio selection, the regret of online mirror descent with the logarithmic barrier is $\tilde{O}(\sqrt{T d})$. The regret bound is the same as that of Soft-Bayes due to Orseau et al. up to logarithmic terms. 3. For online learning of quantum states with the logarithmic loss, the regret of online mirror descent with the log-determinant function is also $\tilde{O}(\sqrt{T d})$. Its per-iteration time is shorter than that of all existing algorithms we know. Comment: 19 pages, 1 figure
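
    For context, here is a Python sketch of the classical exponentiated-gradient update for online portfolio selection due to Helmbold et al.; the paper analyzes a variant of EG and online mirror descent with the logarithmic barrier, which are not shown here, and the learning rate eta is an illustrative placeholder.

    # Classical EG update for online portfolio selection (sketch).
    import numpy as np

    def eg_portfolio(price_relatives, eta=0.05):
        # price_relatives: (T, d) array; entry (t, i) is the price ratio of asset i in round t
        T, d = price_relatives.shape
        w = np.full(d, 1.0 / d)              # start from the uniform portfolio
        wealth = 1.0
        for t in range(T):
            x = price_relatives[t]
            wealth *= w.dot(x)               # incur the round's return (log loss = -log(w.x))
            grad = x / w.dot(x)              # gradient of log(w.x) with respect to w
            w = w * np.exp(eta * grad)       # multiplicative update toward higher return
            w /= w.sum()
        return w, wealth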

    Consistency of $\ell_1$-Regularized Maximum-Likelihood for Compressive Poisson Regression

    We consider Poisson regression with the canonical link function. This regression model is widely used in regression analysis involving count data; one important application in electrical engineering is transmission tomography. In this paper, we establish the variable selection consistency and estimation consistency of the $\ell_1$-regularized maximum-likelihood estimator in this regression model, and characterize the asymptotic sample complexity that ensures consistency even under the compressive sensing setting (or the $n \ll p$ setting in high-dimensional statistics).
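
    For concreteness, here is a minimal Python sketch of $\ell_1$-regularized maximum likelihood for Poisson regression with the canonical (log) link, solved by proximal gradient (ISTA); the step size and iteration count are illustrative and not taken from the paper.

    # l1-regularized Poisson MLE via proximal gradient (sketch).
    import numpy as np

    def soft_threshold(z, tau):
        return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

    def l1_poisson_mle(A, y, lam, step=1e-3, max_iter=2000):
        n, p = A.shape
        beta = np.zeros(p)
        for _ in range(max_iter):
            mu = np.exp(A @ beta)            # mean under the canonical log link
            grad = A.T @ (mu - y) / n        # gradient of the average negative log-likelihood
            beta = soft_threshold(beta - step * grad, step * lam)  # proximal (soft-threshold) step
        return beta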