    Second-Order Stochastic Optimization for Machine Learning in Linear Time

    First-order stochastic methods are the state of the art in large-scale machine learning optimization owing to their efficient per-iteration complexity. Second-order methods, while able to provide faster convergence, have been much less explored due to the high cost of computing the second-order information. In this paper we develop second-order stochastic methods for optimization problems in machine learning that match the per-iteration cost of gradient-based methods, and in certain settings improve upon the overall running time of popular first-order methods. Furthermore, our algorithm has the desirable property of being implementable in time linear in the sparsity of the input data.

    A deep cut ellipsoid algorithm for convex programming

    This paper proposes a deep cut version of the ellipsoid algorithm for solving a general class of continuous convex programming problems. In each step the algorithm does not require more computational effort to construct these deep cuts than its corresponding central cut version. Rules that prevent some of the numerical instabilities and theoretical drawbacks usually associated with the algorithm are also provided. Moreover, for a large class of convex programs a simple proof of its rate of convergence is given and the relation with previously known results is discussed. Finally, some computational results of the deep and central cut versions of the algorithm applied to a min-max stochastic queue location problem are reported.
    Keywords: location theory; convex programming; deep cut ellipsoid algorithm; min-max programming; rate of convergence
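To make the deep-cut idea concrete, here is a minimal sketch of the textbook deep-cut ellipsoid update for unconstrained convex minimization. It is not the paper's algorithm or its stabilization rules: the function name `deep_cut_ellipsoid`, its parameters, and the test objective are assumptions chosen for illustration, and the cut depth `alpha` is taken from the best objective value seen so far. The sketch assumes the minimizer lies inside the initial ball and that the dimension is at least 2.

```python
import math

def deep_cut_ellipsoid(f, subgrad, center, radius, iters=200, tol=1e-8):
    """Minimize a convex f with the standard deep-cut ellipsoid update (sketch).

    The ellipsoid is {z : (z - x)^T P^{-1} (z - x) <= 1}; it starts as a ball.
    """
    n = len(center)
    if n < 2:
        raise ValueError("these update formulas need dimension n >= 2")
    x = [float(c) for c in center]
    P = [[radius ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]
    best_x, best_f = list(x), f(x)
    for _ in range(iters):
        fx = f(x)
        if fx < best_f:
            best_x, best_f = list(x), fx
        g = subgrad(x)
        Pg = [sum(P[i][j] * g[j] for j in range(n)) for i in range(n)]
        gPg = sum(g[i] * Pg[i] for i in range(n))
        if gPg <= tol ** 2:
            break  # sqrt(g^T P g) bounds f(x) - f* over the current ellipsoid
        s = math.sqrt(gPg)
        # Cut depth: alpha = 0 is a central cut, alpha > 0 removes more volume.
        alpha = (fx - best_f) / s
        if alpha >= 1.0:
            break  # the cut would leave (numerically) nothing to search
        tau = (1 + n * alpha) / (n + 1)
        sigma = 2 * (1 + n * alpha) / ((n + 1) * (1 + alpha))
        delta = n * n * (1 - alpha * alpha) / (n * n - 1)
        u = [v / s for v in Pg]  # P g / sqrt(g^T P g)
        x = [x[i] - tau * u[i] for i in range(n)]
        P = [[delta * (P[i][j] - sigma * u[i] * u[j]) for j in range(n)]
             for i in range(n)]
    fx = f(x)
    if fx < best_f:
        best_x, best_f = list(x), fx
    return best_x, best_f

# Hypothetical test problem: a smooth convex quadratic with minimizer (1, -0.5).
quad = lambda z: (z[0] - 1) ** 2 + 2 * (z[1] + 0.5) ** 2
quad_grad = lambda z: [2 * (z[0] - 1), 4 * (z[1] + 0.5)]
x_best, f_best = deep_cut_ellipsoid(quad, quad_grad, [0.0, 0.0], radius=3.0)
```

Setting `alpha = 0` in every step recovers the central-cut version, which costs the same per iteration; the deep cut only changes the three scalar coefficients.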

    Minimax and Adaptive Inference in Nonparametric Function Estimation

    Since Stein's 1956 seminal paper, shrinkage has played a fundamental role in both parametric and nonparametric inference. This article discusses minimaxity and adaptive minimaxity in nonparametric function estimation. Three interrelated problems are considered: function estimation under global integrated squared error, estimation under pointwise squared error, and nonparametric confidence intervals. Shrinkage is pivotal in the development of both the minimax theory and the adaptation theory. While the three problems are closely connected and the minimax theories bear some similarities, the adaptation theories are strikingly different. For example, in sharp contrast to adaptive point estimation, in many common settings there do not exist nonparametric confidence intervals that adapt to the unknown smoothness of the underlying function. A concise account of these theories is given. The connections as well as differences among these problems are discussed and illustrated through examples.
    Comment: Published at http://dx.doi.org/10.1214/11-STS355 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    A joint time-invariant filtering approach to the linear Gaussian relay problem

    In this paper, the linear Gaussian relay problem is considered. Under the linear time-invariant (LTI) model the problem is formulated in the frequency domain based on the Toeplitz distribution theorem. Under the further assumption of realizable input spectra, the LTI Gaussian relay problem is converted to a joint design problem of source and relay filters under two power constraints, one at the source and the other at the relay, and a practical solution to this problem is proposed based on the projected subgradient method. Numerical results show that the proposed method yields a noticeable gain over the instantaneous amplify-and-forward (AF) scheme in inter-symbol interference (ISI) channels. Also, the optimality of the AF scheme within the class of one-tap relay filters is established in flat-fading channels.
    Comment: 30 pages, 10 figure
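The projected subgradient method mentioned in this abstract can be illustrated with a toy problem that has the same shape: two variable blocks, each under its own power (squared-norm) constraint. The sketch below is not the paper's filter design. The objective (an L1 distance to hypothetical targets `a` and `b`), the function names, and the step size `1/sqrt(k)` are all assumptions made for illustration.

```python
import math

def project_ball(v, radius):
    """Euclidean projection of v onto the ball {z : ||z||_2 <= radius}."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm <= radius:
        return list(v)
    return [c * radius / norm for c in v]

def sign(t):
    """Sign of t, with sign(0) = 0 (a valid subgradient of |t| at 0)."""
    return (t > 0) - (t < 0)

def projected_subgradient(a, b, p_src, p_rel, iters=5000):
    """Minimize ||s - a||_1 + ||r - b||_1 s.t. ||s||^2 <= p_src, ||r||^2 <= p_rel."""
    def objective(s, r):
        return (sum(abs(si - ai) for si, ai in zip(s, a))
                + sum(abs(ri - bi) for ri, bi in zip(r, b)))

    s, r = [0.0] * len(a), [0.0] * len(b)  # feasible starting point
    best_val, best_s, best_r = objective(s, r), list(s), list(r)
    for k in range(1, iters + 1):
        step = 1.0 / math.sqrt(k)  # diminishing, non-summable step size
        gs = [sign(si - ai) for si, ai in zip(s, a)]
        gr = [sign(ri - bi) for ri, bi in zip(r, b)]
        # The constraint set is a product of two balls, so the projection
        # is applied to each block separately.
        s = project_ball([si - step * gi for si, gi in zip(s, gs)], math.sqrt(p_src))
        r = project_ball([ri - step * gi for ri, gi in zip(r, gr)], math.sqrt(p_rel))
        val = objective(s, r)
        if val < best_val:  # subgradient steps are not monotone; track the best
            best_val, best_s, best_r = val, list(s), list(r)
    return best_val, best_s, best_r

value, s_opt, r_opt = projected_subgradient([2.0, 1.0], [0.5, 3.0], p_src=1.0, p_rel=4.0)
```

Because the iterates are not guaranteed to decrease the objective at every step, the sketch returns the best feasible point seen rather than the last one.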

    The CMA Evolution Strategy: A Tutorial

    This tutorial introduces the CMA Evolution Strategy (ES), where CMA stands for Covariance Matrix Adaptation. The CMA-ES is a stochastic, or randomized, method for real-parameter (continuous domain) optimization of non-linear, non-convex functions. We try to motivate and derive the algorithm from intuitive concepts and from requirements of non-linear, non-convex search in continuous domain.
    Comment: ArXiv e-prints, arXiv:1604.xxxx

    Random Coordinate Descent Methods for Minimizing Decomposable Submodular Functions

    Submodular function minimization is a fundamental optimization problem that arises in several applications in machine learning and computer vision. The problem is known to be solvable in polynomial time, but general-purpose algorithms have high running times and are unsuitable for large-scale problems. Recent work has used convex optimization techniques to obtain very practical algorithms for minimizing functions that are sums of "simple" functions. In this paper, we use random coordinate descent methods to obtain algorithms with faster linear convergence rates and cheaper iteration costs. Compared to alternating projection methods, our algorithms do not rely on full-dimensional vector operations and they converge in significantly fewer iterations.
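As a generic illustration of why random coordinate descent has cheap iterations, the sketch below applies it to a small strongly convex quadratic, 0.5 x^T A x - b^T x. This is a swapped-in toy problem, not the paper's algorithm for decomposable submodular functions; the function name `random_coordinate_descent`, the matrix `A`, and the vector `b` are assumptions. Each step touches one row of `A` and one entry of `x`, with no full-dimensional vector operation.

```python
import random

def random_coordinate_descent(A, b, iters=500, seed=0):
    """Minimize 0.5 * x^T A x - b^T x for symmetric positive definite A.

    Each iteration picks one coordinate uniformly at random and minimizes
    the objective exactly along that coordinate.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        i = rng.randrange(n)
        # Setting the i-th partial derivative to zero gives
        # A[i][i] * x[i] = b[i] - sum_{j != i} A[i][j] * x[j].
        rest = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - rest) / A[i][i]
    return x

# Hypothetical 2x2 example; the minimizer solves A x = b.
A_demo = [[4.0, 1.0], [1.0, 3.0]]
b_demo = [1.0, 2.0]
x_demo = random_coordinate_descent(A_demo, b_demo)
```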