
    Graphical Convergence of Subgradients in Nonconvex Optimization and Learning

    We investigate the stochastic optimization problem of minimizing population risk, where the loss defining the risk is assumed to be weakly convex. Compositions of Lipschitz convex functions with smooth maps are the primary examples of such losses. We analyze the estimation quality of such nonsmooth and nonconvex problems by their sample average approximations. Our main results establish dimension-dependent rates on subgradient estimation in full generality and dimension-independent rates when the loss is a generalized linear model. As an application of the developed techniques, we analyze the nonsmooth landscape of a robust nonlinear regression problem. (Comment: 36 pages)
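Chain-rule subgradients of such composite losses are easy to form in practice. The sketch below is illustrative only: the quadratic inner map, the data, and all function names are assumptions, not taken from the paper. It computes a subgradient of the composite loss |c(x) - y| and its sample-average-approximation (SAA) estimate:

```python
import numpy as np

def composite_subgradient(x, a, y):
    """Subgradient of the robust regression loss |<a, x>^2 - y|: a
    composition of the Lipschitz convex h(t) = |t| with the smooth map
    c(x) = <a, x>^2.  Chain rule: g = sign(c(x) - y) * grad c(x)."""
    r = np.dot(a, x) ** 2 - y                    # residual c(x) - y
    grad_c = 2.0 * np.dot(a, x) * np.asarray(a)  # gradient of the inner map
    return np.sign(r) * grad_c                   # one subdifferential element

def saa_subgradient(x, data):
    """Sample-average-approximation estimate of the population subgradient."""
    return np.mean([composite_subgradient(x, a, y) for a, y in data], axis=0)
```

The paper's results concern how well such SAA subgradient estimates track their population counterparts.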

    Convergence analysis of sampling-based decomposition methods for risk-averse multistage stochastic convex programs

    We consider a class of sampling-based decomposition methods to solve risk-averse multistage stochastic convex programs. We prove a formula for the computation of the cuts necessary to build the outer linearizations of the recourse functions. This formula can be used to obtain an efficient implementation of Stochastic Dual Dynamic Programming (SDDP) applied to convex nonlinear problems. We prove the almost sure convergence of these decomposition methods when the relatively complete recourse assumption holds. We also prove the almost sure convergence of these algorithms when applied to risk-averse multistage stochastic linear programs that do not satisfy the relatively complete recourse assumption. The analysis is first done assuming the underlying stochastic process is interstage independent and discrete, with a finite set of possible realizations at each stage. We then indicate two ways of extending the methods and convergence analysis to the case when the process is interstage dependent.
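A cut of the kind discussed above can be sketched for a two-stage linear problem: solving the second-stage LP at a trial first-stage decision and reading off the dual multipliers yields an outer linearization of the recourse function. The toy problem shape and all names below are illustrative assumptions, not the paper's formula; `scipy.optimize.linprog` (HiGHS) handles the inner solve.

```python
import numpy as np
from scipy.optimize import linprog

def benders_cut(x, q, W, h, T):
    """Solve the second-stage LP  min q·u  s.t.  W u >= h - T x,  u >= 0,
    and return (alpha, beta, value) so that  Q(x') >= alpha + beta·x'
    is a valid cut for the recourse function Q, by LP duality."""
    rhs = h - T @ x
    # linprog expects A_ub u <= b_ub, so the >= rows are negated
    res = linprog(q, A_ub=-W, b_ub=-rhs, method="highs")
    pi = -np.asarray(res.ineqlin.marginals)  # duals of W u >= rhs (>= 0)
    alpha = pi @ h                           # intercept of the cut
    beta = -T.T @ pi                         # slope of the cut
    return alpha, beta, res.fun
```

Accumulating such cuts over sampled scenarios is the basic building block of SDDP-type methods.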

    A stochastic approximation method for approximating the efficient frontier of chance-constrained nonlinear programs

    We propose a stochastic approximation method for approximating the efficient frontier of chance-constrained nonlinear programs. Our approach is based on a bi-objective viewpoint of chance-constrained programs that seeks solutions on the efficient frontier of optimal objective value versus risk of constraint violation. To this end, we construct a reformulated problem whose objective is to minimize the probability of constraint violation subject to deterministic convex constraints (which include a bound on the objective function value). We adapt existing smoothing-based approaches for chance-constrained problems to derive a convergent sequence of smooth approximations of our reformulated problem, and apply a projected stochastic subgradient algorithm to solve it. In contrast with exterior sampling-based approaches (such as sample average approximation) that approximate the original chance-constrained program with one having finite support, our proposal converges to stationary solutions of a smooth approximation of the original problem, thereby avoiding poor local solutions that may be an artefact of a fixed sample. Our proposal also includes a tailored implementation of the smoothing-based approach that chooses key algorithmic parameters based on problem data. Computational results on four test problems from the literature indicate that our proposed approach can efficiently determine good approximations of the efficient frontier.
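As a rough illustration of the smoothing idea (not the authors' tailored implementation), one can replace the indicator of constraint violation by a sigmoid and run a projected gradient loop. The constraint g(x, ξ) = ξ − x, the box projection, and the parameter values below are all hypothetical choices; for simplicity the full fixed sample is used at every step rather than a single draw.

```python
import numpy as np

def smoothed_prob_grad(x, xis, tau):
    """Gradient in x of the sigmoid-smoothed sample estimate of
    P[g(x, xi) > 0], for the illustrative constraint g(x, xi) = xi - x[0]."""
    g = xis - x[0]
    s = 1.0 / (1.0 + np.exp(-g / tau))       # smooth surrogate of 1{g > 0}
    return np.array([np.mean(s * (1.0 - s) / tau) * (-1.0)])

def projected_sgd(x0, xis, lb, ub, steps=200, lr=0.5, tau=1.0):
    """Projected gradient descent on the smoothed violation probability,
    with projection onto the deterministic box [lb, ub]."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x - lr * smoothed_prob_grad(x, xis, tau)
        x = np.clip(x, lb, ub)               # projection step
    return x
```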

    Stochastic model-based minimization of weakly convex functions

    We consider a family of algorithms that successively sample and minimize simple stochastic models of the objective function. We show that under reasonable conditions on approximation quality and regularity of the models, any such algorithm drives a natural stationarity measure to zero at the rate $\mathcal{O}(k^{-1/4})$. As a consequence, we obtain the first complexity guarantees for the stochastic proximal point, proximal subgradient, and regularized Gauss-Newton methods for minimizing compositions of convex functions with smooth maps. The guiding principle, underlying the complexity guarantees, is that all algorithms under consideration can be interpreted as approximate descent methods on an implicit smoothing of the problem, given by the Moreau envelope. Specializing to classical circumstances, we obtain the long-sought convergence rate of the stochastic projected gradient method, without batching, for minimizing a smooth function on a closed convex set. (Comment: 33 pages, 4 figures)
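For intuition, here is a minimal sketch of one member of this family, the stochastic proximal point method, on the illustrative sample losses f(x, ξ) = (x − ξ)²/2, for which each proximal subproblem has a closed-form solution. The loss choice and the stepsize rule are assumptions made only for the sketch.

```python
import numpy as np

def stochastic_prox_point(x0, samples, lam=1.0):
    """Stochastic proximal point method for the illustrative sample losses
    f(x, xi) = (x - xi)^2 / 2: each iteration solves
        argmin_x f(x, xi_k) + (x - x_k)^2 / (2 t_k)
    in closed form, with proximal parameter t_k = lam / sqrt(k)."""
    x = float(x0)
    for k, xi in enumerate(samples, start=1):
        t = lam / np.sqrt(k)
        x = (t * xi + x) / (t + 1.0)   # exact minimizer of the model step
    return x
```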

    Dual Dynamic Programming with cut selection: convergence proof and numerical experiments

    We consider convex optimization problems formulated using dynamic programming equations. Such problems can be solved using the Dual Dynamic Programming algorithm combined with the Level 1 cut selection strategy or the Territory algorithm to select the most relevant Benders cuts. We propose a limited memory variant of Level 1 and show the convergence of DDP combined with the Territory algorithm, Level 1, or its variant for nonlinear optimization problems. In the special case of linear programs, we show convergence in a finite number of iterations. Numerical simulations illustrate the benefits of our variant and show that it can be much quicker than a simplex algorithm on some large instances of portfolio selection and inventory problems.
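The Level 1 idea can be sketched compactly: among the stored Benders cuts, keep only those that attain the point-wise maximum at at least one previously visited trial point. The sketch below assumes one-dimensional affine cuts for readability; the function name is ours, not the paper's.

```python
import numpy as np

def level1_selection(cuts, trial_points):
    """Level 1 cut selection (sketch): among affine cuts x -> a + b * x,
    keep only those attaining the point-wise maximum at some trial point."""
    keep = set()
    for x in trial_points:
        values = [a + b * x for (a, b) in cuts]
        keep.add(int(np.argmax(values)))     # index of the active cut at x
    return [cuts[i] for i in sorted(keep)]
```

Cuts that are dominated everywhere on the visited points are pruned, keeping the cutting-plane model small.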

    Accelerated Point-wise Maximum Approach to Approximate Dynamic Programming

    We describe an approximate dynamic programming approach to compute lower bounds on the optimal value function for a discrete time, continuous space, infinite horizon setting. The approach iteratively constructs a family of lower bounding approximate value functions by using the so-called Bellman inequality. The novelty of our approach is that, at each iteration, we aim to compute an approximate value function that maximizes the point-wise maximum taken with the family of approximate value functions computed thus far. This leads to a non-convex objective, and we propose a gradient ascent algorithm to find stationary points by solving a sequence of convex optimization problems. We provide convergence guarantees for our algorithm and an interpretation for how the gradient computation relates to the state relevance weighting parameter appearing in related approximate dynamic programming approaches. We demonstrate through numerical examples that, when compared to existing approaches, the algorithm we propose computes tighter sub-optimality bounds with less computation time. (Comment: 14 pages, 3 figures)
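A minimal sketch of the gradient structure described above, under the simplifying (hypothetical) assumptions of affine candidate value functions and a finite set of states: the gradient of the point-wise maximum objective only collects contributions from states where the new candidate is active.

```python
import numpy as np

def pointwise_max_grad(theta, family, states):
    """Gradient w.r.t. theta of  sum_x max(V_theta(x), max_j V_j(x))  for
    affine candidates V_theta(x) = theta[0] + theta[1] * x; only states
    where the new candidate attains the max contribute to the gradient."""
    best = np.max([a + b * states for (a, b) in family], axis=0)
    v_new = theta[0] + theta[1] * states
    active = v_new >= best                   # where V_theta is on top
    return np.array([np.sum(active), np.sum(states[active])])
```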

    Dynamical behavior of a stochastic forward-backward algorithm using random monotone operators

    The purpose of this paper is to study the dynamical behavior of the sequence produced by a forward-backward algorithm involving two random maximal monotone operators and a sequence of decreasing step sizes. Defining a mean monotone operator as an Aumann integral, and assuming that the sum of the two mean operators is maximal (sufficient maximality conditions are provided), it is shown that with probability one, the interpolated process obtained from the iterates is an asymptotic pseudo trajectory in the sense of Benaïm and Hirsch of the differential inclusion involving the sum of the mean operators. The convergence of the empirical means of the iterates towards a zero of the sum of the mean operators is shown, as well as the convergence of the sequence itself to such a zero under a demipositivity assumption. These results find applications in a wide range of optimization or variational inequality problems in random environments.
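The iteration under study has the familiar forward-backward shape. Below is a sketch with an illustrative random operator A(x, ξ) = x − ξ and g the indicator of [0, ∞), whose resolvent (backward step) is a projection; both choices are assumptions for the sketch, not the paper's setting.

```python
import numpy as np

def forward_backward(x0, xis, gamma0=1.0):
    """Stochastic forward-backward iteration
        x_{k+1} = prox_g( x_k - gamma_k * A(x_k, xi_k) )
    with the illustrative random operator A(x, xi) = x - xi and
    g the indicator of [0, inf), whose prox is the projection max(0, .)."""
    x = float(x0)
    for k, xi in enumerate(xis, start=1):
        gamma = gamma0 / np.sqrt(k)          # decreasing step sizes
        x = max(0.0, x - gamma * (x - xi))   # forward step, then backward
    return x
```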

    Efficiency of minimizing compositions of convex functions and smooth maps

    We consider global efficiency of algorithms for minimizing a sum of a convex function and a composition of a Lipschitz convex function with a smooth map. The basic algorithm we rely on is the prox-linear method, which in each iteration solves a regularized subproblem formed by linearizing the smooth map. When the subproblems are solved exactly, the method has efficiency $\mathcal{O}(\varepsilon^{-2})$, akin to gradient descent for smooth minimization. We show that when the subproblems can only be solved by first-order methods, a simple combination of smoothing, the prox-linear method, and a fast-gradient scheme yields an algorithm with complexity $\widetilde{\mathcal{O}}(\varepsilon^{-3})$. The technique readily extends to minimizing an average of $m$ composite functions, with complexity $\widetilde{\mathcal{O}}(m/\varepsilon^{2}+\sqrt{m}/\varepsilon^{3})$ in expectation. We round off the paper with an inertial prox-linear method that automatically accelerates in the presence of convexity.
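For a single absolute-value outer function h = |·| with smooth scalar inner map c, the prox-linear subproblem described above even has a closed form. The sketch below is illustrative (one term only, and the test map c(x) = x² − 4 is an assumption); each step minimizes the linearized model plus a quadratic proximal penalty.

```python
import numpy as np

def prox_linear_step(x, c, grad_c, t=1.0):
    """One prox-linear step for minimizing |c(x)| with smooth scalar c:
    the subproblem  min_d |c(x) + grad_c(x)·d| + ||d||^2 / (2 t)
    has the closed-form solution used below."""
    r, g = c(x), grad_c(x)
    s = g @ g
    if s == 0.0:
        return x
    if abs(r) <= t * s:
        d = -(r / s) * g             # linearized residual driven to zero
    else:
        d = -np.sign(r) * t * g      # step capped by the proximal term
    return x + d
```

Iterating the step on c(x) = x² − 4 from x = 3 converges quickly to the root x = 2.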

    Recursive Optimization of Convex Risk Measures: Mean-Semideviation Models

    We develop recursive, data-driven, stochastic subgradient methods for optimizing a new, versatile, and application-driven class of convex risk measures, termed here mean-semideviations, strictly generalizing the well-known and popular mean-upper-semideviation. We introduce the MESSAGEp algorithm, an efficient compositional subgradient procedure for iteratively solving convex mean-semideviation risk-averse problems to optimality. We analyze the asymptotic behavior of the MESSAGEp algorithm under a flexible and structure-exploiting set of problem assumptions. In particular: 1) Under appropriate stepsize rules, we establish pathwise convergence of the MESSAGEp algorithm in a strong technical sense, confirming its asymptotic consistency. 2) Assuming a strongly convex cost, we show that, for fixed semideviation order $p>1$ and for $\epsilon\in[0,1)$, the MESSAGEp algorithm achieves a squared-${\cal L}_{2}$ solution suboptimality rate of order ${\cal O}(n^{-(1-\epsilon)/2})$, where $n$ is the number of iterations and where, for $\epsilon>0$, pathwise convergence is simultaneously guaranteed. This result establishes a rate arbitrarily close to ${\cal O}(n^{-1/2})$, while ensuring strongly stable pathwise operation. For $p\equiv1$, the rate order improves to ${\cal O}(n^{-2/3})$, which also suffices for pathwise convergence and matches previous results. 3) Likewise, in the general case of a convex cost, we show that, for any $\epsilon\in[0,1)$, the MESSAGEp algorithm with iterate smoothing achieves an ${\cal L}_{1}$ objective suboptimality rate of order ${\cal O}(n^{-(1-\epsilon)/(4\mathbf{1}_{\{p>1\}}+4)})$. This result provides maximal rates of ${\cal O}(n^{-1/4})$ if $p\equiv1$ and ${\cal O}(n^{-1/8})$ if $p>1$, matching the state of the art. (Comment: 90 pages, 3 figures. Update: substantial revision of the technical content, with an additional fully detailed analysis of the rate of convergence of the MESSAGEp algorithm.)
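The baseline member of this risk family, the mean-upper-semideviation of order p, is easy to estimate from samples. A minimal sketch (the formula is the classical one; the function and parameter names are ours):

```python
import numpy as np

def mean_upper_semideviation(z, c=0.5, p=2):
    """Sample estimate of the mean-upper-semideviation of order p,
        rho(Z) = E[Z] + c * ( E[ (Z - E[Z])_+^p ] )^(1/p),  0 <= c <= 1,
    the classical member of the mean-semideviation family."""
    z = np.asarray(z, dtype=float)
    m = z.mean()
    upper = np.maximum(z - m, 0.0)           # upper deviations only
    return m + c * np.mean(upper ** p) ** (1.0 / p)
```

The composition of the expectation inside the outer power is what makes stochastic subgradient methods for such measures "compositional", as in the MESSAGEp algorithm.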

    Subdifferentials of Nonconvex Integral Functionals in Banach Spaces with Applications to Stochastic Dynamic Programming

    The paper investigates nonconvex and nondifferentiable integral functionals on general Banach spaces, which may be neither reflexive nor separable. Considering two major subdifferentials of variational analysis, we derive nonsmooth versions of the Leibniz rule on subdifferentiation under the integral sign, where the integral of the subdifferential set-valued mappings generated by Lipschitzian integrands is understood in the Gelfand sense. Besides examining integration over complete measure spaces and also over those with nonatomic measures, our special attention is drawn to a stronger version of measure nonatomicity, known as saturation, to invoke the recent results of the Lyapunov convexity theorem type for the Gelfand integral of the subdifferential mappings. The main results are applied to the subdifferential study of the optimal value functions and to deriving the corresponding necessary optimality conditions in nonconvex problems of stochastic dynamic programming with discrete time on the infinite horizon.